CN112682203B - Vehicle control device, system, method, learning device, and storage medium

Publication number
CN112682203B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011090364.5A
Other languages
Chinese (zh)
Other versions
CN112682203A (en
Inventor
桥本洋介
片山章弘
大城裕太
杉江和纪
冈尚哉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toyota Motor Corp
Original Assignee
Toyota Motor Corp

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/0098 Details of control systems ensuring comfort, safety or stability not otherwise provided for
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W10/00 Conjoint control of vehicle sub-units of different type or different function
    • B60W10/04 Conjoint control of vehicle sub-units of different type or different function including control of propulsion units
    • B60W10/06 Conjoint control of vehicle sub-units of different type or different function including control of propulsion units including control of combustion engines
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/04 Monitoring the functioning of the control system
    • B60W50/045 Monitoring control system parameters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001 Details of the control system
    • B60W2050/0043 Signal treatments, identification of variables or parameters, parameter estimation or state estimation
    • B60W2050/0052 Filtering, filters
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0062 Adapting control system settings
    • B60W2050/0075 Automatic parameter input, automatic initialising or calibrating means
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0062 Adapting control system settings
    • B60W2050/0075 Automatic parameter input, automatic initialising or calibrating means
    • B60W2050/0083 Setting, resetting, calibration
    • B60W2050/0088 Adaptive recalibration
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2510/00 Input parameters relating to a particular sub-units
    • B60W2510/06 Combustion engines, Gas turbines
    • B60W2510/0604 Throttle position
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2510/00 Input parameters relating to a particular sub-units
    • B60W2510/06 Combustion engines, Gas turbines
    • B60W2510/0666 Engine power
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00 Input parameters relating to occupants
    • B60W2540/10 Accelerator pedal position
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2556/00 Input parameters relating to data
    • B60W2556/10 Historical data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/60 Other road transportation technologies with climate change mitigation effect
    • Y02T10/62 Hybrid vehicles

Abstract

Provided are a vehicle control device, a vehicle control system, a vehicle learning device, a vehicle control method, and a storage medium. When the detection process detects that a function recovery process has been performed, the switching process switches the relationship specifying data used in the operation process to post-treatment data. The switching process includes a process of using, as the post-treatment data, initial data, which is the relationship specifying data as it was before the update process accompanying the running of the vehicle was executed.

Description

Vehicle control device, system, method, learning device, and storage medium
Technical Field
The present disclosure relates to a vehicle control device, a vehicle control system, a vehicle learning device, a vehicle control method, and a storage medium.
Background
For example, Japanese Laid-Open Patent Publication No. 2016-6327 describes a control device that operates a throttle valve, which is an operation unit of an internal combustion engine mounted on a vehicle, based on a value obtained by filtering the operation amount of an accelerator pedal.
However, the above-described filter must set the operation amount of the throttle valve of the internal combustion engine mounted on the vehicle to an appropriate value in accordance with the operation amount of the accelerator pedal. Adapting the filter therefore requires many man-hours from skilled persons. More generally, skilled persons have conventionally spent many man-hours adapting the operation amounts of electronic devices in a vehicle to the state of the vehicle.
Disclosure of Invention
The following describes aspects of the present disclosure.
Aspect 1 is a vehicle control device including an execution device and a storage device. The storage device stores relationship specifying data that specifies a relationship between a state of a vehicle and an action variable, the action variable being a variable related to an operation of an electronic device in the vehicle. The execution device is configured to execute: an acquisition process of acquiring a detection value of a sensor that detects the state of the vehicle; an operation process of operating the electronic device based on a value of the action variable, the value being determined from the detection value acquired by the acquisition process and the relationship specifying data; a reward calculation process of giving, based on the detection value acquired by the acquisition process, a larger reward when a characteristic of the vehicle satisfies a criterion than when it does not; an update process of updating the relationship specifying data by using, as inputs to a predetermined update map, the state of the vehicle based on the detection value acquired by the acquisition process, the value of the action variable used in the operation of the electronic device, and the reward corresponding to that operation; a detection process of detecting that a function recovery process has been performed on a component that, among the components in the vehicle, affects the state of the vehicle resulting from an operation by the operation process; and a switching process of switching the relationship specifying data used in the operation process to post-treatment data when the detection process detects that the function recovery process has been performed. The update map outputs the relationship specifying data updated so as to increase the expected return on the reward obtained when the electronic device is operated in accordance with the relationship specifying data.
The switching process includes a process of using, as the post-treatment data, initial data, which is the relationship specifying data as it was before the update process accompanying the running of the vehicle was executed.
In the above configuration, calculating the reward associated with an operation of the electronic device makes it possible to grasp what reward that operation obtains. Further, updating the relationship specifying data through the update map based on reinforcement learning, using that reward, allows the relationship between the state of the vehicle and the action variable to be set appropriately while the vehicle is running. This reduces the man-hours required of skilled persons to set that relationship appropriately.
However, when a component in the vehicle deteriorates, reinforcement learning updates the relationship specifying data so that it is appropriate for operation with the deteriorated component. Consequently, when the function recovery process is subsequently performed, the relationship specifying data may no longer be appropriate for increasing the expected return. In the above configuration, when the function recovery process is performed, the switching process switches the relationship specifying data used in the operation process to the initial data. This suppresses the decrease in the expected return that would otherwise follow the function recovery process.
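The switching to initial data described for aspect 1 can be illustrated with a minimal Python sketch. The class name `RelationStore`, its methods, and the dictionary representation of the relationship specifying data are assumptions made only for illustration; the text does not prescribe any data layout or API.

```python
import copy


class RelationStore:
    """Hypothetical holder for the learned data (DR) and the initial data (DR0)."""

    def __init__(self, initial_data):
        # DR0: kept unchanged while the vehicle runs.
        self.dr0 = copy.deepcopy(initial_data)
        # DR: the copy that the update process modifies during driving.
        self.dr = copy.deepcopy(initial_data)

    def update(self, key, value):
        """Stand-in for the update process: overwrite one learned entry."""
        self.dr[key] = value

    def on_function_recovery(self):
        """Switching process: replace the learned data with the initial data."""
        self.dr = copy.deepcopy(self.dr0)


store = RelationStore({("s0", "a0"): 0.0})
store.update(("s0", "a0"), 1.5)   # learned while a component was degraded
store.on_function_recovery()      # component restored, fall back to DR0
```

Deep copies are used so that later updates to the working data can never alter the stored initial data.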
Aspect 2 is the vehicle control device according to aspect 1, wherein the execution device is configured to execute a past data maintenance process. The past data maintenance process stores in the storage device, separately from the relationship specifying data that continues to be updated by the update process, the relationship specifying data as updated by the update process until a predetermined condition is satisfied, and does not subject the stored data to the update process after the predetermined condition is satisfied. The switching process includes a process of selecting whether to use, as the post-treatment data, the relationship specifying data that was updated by the update process before the predetermined condition was satisfied or the initial data.
Consider relationship specifying data that has been updated by the update process accompanying the running of the vehicle, but only up to a point before the component deteriorates to the extent that the function recovery process is required. Such data is more likely than the initial data from before the vehicle started running to specify an appropriate value of the action variable for the state of the vehicle after the function recovery process. In the above configuration, the post-treatment data is selected from the relationship specifying data maintained by the past data maintenance process, that is, the data updated before the predetermined condition was satisfied, and the initial data. Compared with a case in which the post-treatment data is always the initial data, the relationship specifying data after the switching process can therefore yield a more appropriate value of the action variable for the state of the vehicle after the function recovery process.
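As a rough sketch of the past data maintenance process of aspect 2, the Python below keeps a snapshot that follows the learned data until a condition is met and is frozen afterwards. The class `SnapshotStore`, the `condition_met` flag (standing in for the unspecified predetermined condition), and the `use_snapshot` selector are all hypothetical; how the selection is actually made is not stated in this passage.

```python
import copy


class SnapshotStore:
    """Hypothetical store with initial data, learned data, and a frozen snapshot."""

    def __init__(self, initial_data):
        self.dr0 = copy.deepcopy(initial_data)  # initial data
        self.dr = copy.deepcopy(initial_data)   # data updated while driving
        self.snapshot = None                    # past data, frozen once the condition holds
        self.frozen = False

    def update(self, key, value, condition_met=False):
        """Update DR; mirror it into the snapshot only while the condition is unmet."""
        self.dr[key] = value
        if condition_met:
            self.frozen = True
        if not self.frozen:
            self.snapshot = copy.deepcopy(self.dr)

    def switch(self, use_snapshot):
        """Switching process: choose the snapshot or the initial data."""
        if use_snapshot and self.snapshot is not None:
            self.dr = copy.deepcopy(self.snapshot)
        else:
            self.dr = copy.deepcopy(self.dr0)


s = SnapshotStore({"k": 0.0})
s.update("k", 1.0)                      # mirrored into the snapshot
s.update("k", 2.0, condition_met=True)  # condition satisfied: snapshot is frozen
s.update("k", 3.0)                      # only DR changes from here on
```

In this sketch the update that coincides with the condition being met is not mirrored; that boundary choice is an assumption, not something the text specifies.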
Aspect 3 is the vehicle control device according to aspect 1 or 2, wherein the execution device is configured to execute, when the detection process detects that the function recovery process has been performed: a post-treatment data request process of transmitting a signal requesting the post-treatment data; and a post-treatment data reception process of receiving the post-treatment data transmitted in response to the post-treatment data request process. The switching process includes a process of switching the relationship specifying data used in the operation process to the received post-treatment data.
In the above configuration, the post-treatment data request process and the post-treatment data reception process are executed. Therefore, the post-treatment data can be acquired even when, for example, the control device does not hold it.
Aspect 4 is a vehicle control system including the execution device and the storage device of the vehicle control device according to aspect 1 or 2, wherein the execution device includes a 1st execution device mounted on the vehicle and a 2nd execution device that is separate from any in-vehicle device. The 2nd execution device is configured to execute at least a post-treatment data transmission process of transmitting the post-treatment data when the detection process detects that the function recovery process has been performed. The 1st execution device is configured to execute at least the acquisition process, the operation process, and a post-treatment data reception process. The post-treatment data reception process receives the data transmitted by the post-treatment data transmission process.
In the above configuration, the 2nd execution device, which is separate from any in-vehicle device, executes the post-treatment data transmission process. Therefore, the post-treatment data can be acquired even when, for example, the 1st execution device does not hold it. Here, "separate from any in-vehicle device" means that the 2nd execution device is not an in-vehicle device.
Aspect 5 is the vehicle control system according to aspect 4, wherein the 1st execution device is configured to execute the detection process and a post-treatment data request process. The post-treatment data request process transmits a signal requesting the post-treatment data when the detection process detects that the function recovery process has been performed.
In the above configuration, the post-treatment data request process and the post-treatment data reception process are executed. Therefore, the post-treatment data can be acquired even when, for example, the 1st execution device does not hold it.
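The request, transmission, and reception steps of aspects 3 to 5 can be pictured with the following Python sketch, in which an in-process method call stands in for the communication link. The class names, the message fields, and the vehicle identifier are illustrative assumptions; the text does not define a message format or transport.

```python
class LearningServer:
    """Hypothetical stand-in for the 2nd execution device (not on the vehicle)."""

    def __init__(self, post_treatment_data):
        self._post_treatment_data = dict(post_treatment_data)

    def handle_request(self, request):
        """Post-treatment data transmission process: answer a request signal."""
        if request.get("type") != "post_treatment_data_request":
            raise ValueError("unexpected request type")
        return {
            "vehicle_id": request["vehicle_id"],
            "data": dict(self._post_treatment_data),
        }


class VehicleController:
    """Hypothetical stand-in for the 1st execution device (on the vehicle)."""

    def __init__(self, vehicle_id, relation_data, server):
        self.vehicle_id = vehicle_id
        self.relation_data = dict(relation_data)
        self.server = server

    def on_function_recovery(self):
        # Post-treatment data request process: send the request signal.
        request = {"type": "post_treatment_data_request", "vehicle_id": self.vehicle_id}
        # Post-treatment data reception process: receive the reply.
        reply = self.server.handle_request(request)
        # Switching process: use the received data from now on.
        self.relation_data = reply["data"]


server = LearningServer({("s0", "a0"): 0.0})
vehicle = VehicleController("VC1", {("s0", "a0"): 4.2}, server)
vehicle.on_function_recovery()
```

Because the vehicle side only needs the reply, it does not have to store the post-treatment data itself, which is the benefit the text attributes to this arrangement.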
Aspect 6 is the vehicle control system according to aspect 4 or 5, wherein the update process is executed by the 1st execution device.
Aspect 7 is a vehicle control device including the 1st execution device of the vehicle control system according to any one of aspects 4 to 6.
Aspect 8 is a vehicle learning device including the 2nd execution device of the vehicle control system according to any one of aspects 4 to 6.
The present disclosure may also be implemented as a vehicle control method that executes the various processes described in the above aspects.
The present disclosure may also be implemented as a non-transitory computer-readable storage medium storing a program that causes various devices to execute the various processes described in the above aspects.
Drawings
Fig. 1 is a diagram showing a control device and a drive system thereof according to embodiment 1.
Fig. 2 is a flowchart showing steps of processing executed by the control device according to the embodiment.
Fig. 3 is a flowchart showing detailed steps of a part of the processing executed by the control device according to the embodiment.
Fig. 4 is a flowchart showing steps of processing executed by the control device according to the embodiment.
Fig. 5 is a diagram showing a configuration of a vehicle control system according to embodiment 2.
Parts (a) and (b) of fig. 6 are flowcharts showing steps of the process executed by the vehicle control system of fig. 5.
Detailed Description
Embodiment 1
Embodiment 1 of a vehicle control device is described below with reference to fig. 1 to 4.
Fig. 1 shows a configuration of a drive system and a control device of a vehicle VC1 according to the present embodiment.
As shown in fig. 1, a throttle valve 14 and a fuel injection valve 16 are provided in this order from the upstream side in an intake passage 12 of an internal combustion engine 10. The air taken into the intake passage 12 and the fuel injected from the fuel injection valve 16 flow into a combustion chamber 24, which is defined by the cylinder 20 and the piston 22, when the intake valve 18 opens. In the combustion chamber 24, the mixture of fuel and air is burned by the spark discharge of the ignition device 26, and the energy generated by the combustion is converted into rotational energy of the crankshaft 28 via the piston 22. The burned mixture is discharged as exhaust gas to the exhaust passage 32 when the exhaust valve 30 opens. The exhaust passage 32 is provided with a catalyst 34 as an aftertreatment device for purifying the exhaust gas.
The crankshaft 28 is mechanically coupled to an input shaft 52 of a transmission 50 via a torque converter 40 including a lockup clutch 42. The transmission 50 is a device that changes a gear ratio, which is a ratio between the rotational speed of the input shaft 52 and the rotational speed of the output shaft 54. The output shaft 54 is mechanically coupled to a drive wheel 60.
The control device 70 controls the internal combustion engine 10 and, in order to control its control amounts such as the torque and the exhaust gas component ratio, operates operation units of the internal combustion engine 10 such as the throttle valve 14, the fuel injection valve 16, and the ignition device 26. The control device 70 also controls the torque converter 40 and operates the lockup clutch 42 in order to control its engagement state. The control device 70 further controls the transmission 50 and operates the transmission 50 in order to control the gear ratio, which is its control amount. Fig. 1 shows the operation signals MS1 to MS5 for the throttle valve 14, the fuel injection valve 16, the ignition device 26, the lockup clutch 42, and the transmission 50, respectively.
In order to control the control amounts, the control device 70 refers to the intake air amount Ga detected by the airflow meter 80, the opening degree of the throttle valve 14 (throttle opening degree TA) detected by the throttle sensor 82, and the output signal Scr of the crank angle sensor 84. The control device 70 also refers to the amount of depression of the accelerator pedal 86 (accelerator operation amount PA) detected by the accelerator sensor 88 and the acceleration Gx in the front-rear direction of the vehicle VC1 detected by the acceleration sensor 90. The control device 70 further refers to position data Pgps based on the global positioning system (GPS 92).
The control device 70 includes a CPU72, a ROM74, an electrically rewritable nonvolatile memory (storage device 76), and a peripheral circuit 78, and these components can communicate via a local network 79. The peripheral circuit 78 includes a circuit that generates a clock signal defining internal operation, a power supply circuit, a reset circuit, and the like.
The ROM74 stores a control program 74a and a learning program 74b. The storage device 76 stores relationship specifying data DR, which specifies the relationship between the accelerator operation amount PA on the one hand and the command value of the throttle opening degree TA (throttle opening degree command value TA*) and the retard amount aop of the ignition device 26 on the other, together with initial data DR0. Here, the retard amount aop is the amount of retardation relative to a predetermined reference ignition timing, and the reference ignition timing is the more retarded of the MBT ignition timing and the knock limit point. The MBT ignition timing is the ignition timing at which the maximum torque is obtained (maximum torque ignition timing). The knock limit point is the advance limit of the ignition timing at which knocking stays within an allowable level under the assumed best conditions when a high-octane fuel with a high knock limit is used. The storage device 76 also stores torque output map data DT. The torque output map defined by the torque output map data DT takes as inputs the rotational speed NE of the crankshaft 28, the filling efficiency η, and the ignition timing, and outputs the torque Trq.
Fig. 2 shows steps of processing executed by the control device 70 according to the present embodiment. The processing shown in fig. 2 is realized by the CPU72 repeatedly executing the control program 74a and the learning program 74b stored in the ROM74, for example at a predetermined cycle. In the following, the step number of each process is denoted by a number prefixed with "S".
In the series of processes shown in fig. 2, the CPU72 first acquires, as the state s, time-series data consisting of six sampled values "PA(1), PA(2), ..., PA(6)" of the accelerator operation amount PA (S10). The sampled values that make up the time-series data are sampled at mutually different timings. In the present embodiment, the time-series data consists of six sampled values that are adjacent in time when sampling is performed at a constant sampling period.
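One way to picture the state acquisition of S10 is a sliding window over the six most recent accelerator samples. The Python sketch below is only an illustration: the class name `AcceleratorHistory` and the choice to return `None` until six samples exist are assumptions, not details from the text.

```python
from collections import deque

WINDOW = 6  # number of consecutive PA samples that form the state s


class AcceleratorHistory:
    """Hypothetical buffer holding the latest accelerator operation amounts."""

    def __init__(self):
        # A bounded deque drops the oldest sample automatically.
        self._samples = deque(maxlen=WINDOW)

    def push(self, pa):
        """Record one sample taken at the fixed sampling period."""
        self._samples.append(pa)

    def state(self):
        """Return (PA(1), ..., PA(6)), or None while fewer than six samples exist."""
        if len(self._samples) < WINDOW:
            return None
        return tuple(self._samples)


history = AcceleratorHistory()
for pa in [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
    history.push(pa)
state = history.state()  # the six most recent samples, oldest first
```

Returning a tuple makes the state hashable, so it can serve directly as a key of a table-type function.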
Next, the CPU72 sets an action a, consisting of the throttle opening degree command value TA* and the retard amount aop, corresponding to the state s acquired in S10, in accordance with the policy π specified by the relationship specifying data DR (S12).
In the present embodiment, the relationship specifying data DR is data that specifies the action value function Q and the policy π. The action value function Q is a table-type function that gives the expected return for the 8-dimensional independent variable made up of the state s and the action a. The policy π defines the following rule: given a state s, the action a that maximizes the action value function Q for that state (the greedy action) is selected preferentially, but another action a is selected with a predetermined probability ε.
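The rule defined by the policy π corresponds to what is commonly called ε-greedy selection over a tabular action value function. The Python sketch below assumes a dictionary keyed by (state, action) pairs, a default value of 0.0 for missing entries, and uniform choice among the non-greedy actions; none of these details come from the text.

```python
import random


def select_action(q_table, state, actions, epsilon, rng=random):
    """Pick the greedy action, or another action with probability epsilon.

    q_table maps (state, action) to an expected return; missing entries
    are treated as 0.0 (an assumption made for this sketch).
    """
    greedy = max(actions, key=lambda a: q_table.get((state, a), 0.0))
    if rng.random() < epsilon:
        others = [a for a in actions if a != greedy]
        if others:
            return rng.choice(others)
    return greedy


# State: six PA samples. Action: (TA*, aop). All values are illustrative.
s = (10.0, 12.0, 15.0, 18.0, 20.0, 22.0)
candidate_actions = [(20.0, 0.0), (25.0, 2.0), (30.0, 4.0)]
q = {(s, (25.0, 2.0)): 1.0, (s, (30.0, 4.0)): 0.3}

greedy_action = select_action(q, s, candidate_actions, epsilon=0.0)
explore_action = select_action(q, s, candidate_actions, epsilon=1.0)
```

With `epsilon=0.0` the function always returns the greedy action, and with `epsilon=1.0` it always returns one of the other candidates.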
Specifically, the number of values that the independent variable of the action value function Q can take in the present embodiment is obtained by removing, based on human knowledge and the like, some of all the combinations of values that the state s and the action a can take. For example, a situation in which one of two adjacent sampled values in the time-series data of the accelerator operation amount PA is the minimum value of the accelerator operation amount PA and the other is the maximum value is considered not to arise from a person's operation of the accelerator pedal 86, so the action value function Q is not defined for it. In the present embodiment, dimension reduction based on human knowledge and the like limits the number of values of the state s for which the action value function Q is defined to not more than the power of 10, more preferably not more than the power of 10.
Next, based on the set throttle opening degree command value TA* and retard amount aop, the CPU72 outputs the operation signal MS1 to the throttle valve 14 to operate the throttle opening degree TA, and outputs the operation signal MS3 to the ignition device 26 to operate the ignition timing (S14). The present embodiment exemplifies feedback control of the throttle opening degree TA to the throttle opening degree command value TA*. Therefore, even when the throttle opening degree command value TA* is the same, the operation signal MS1 may differ. Further, when, for example, well-known knock control (KCS) is performed, the ignition timing is the value obtained by retarding the reference ignition timing by the retard amount aop and then applying feedback correction by the KCS. Here, the reference ignition timing is variably set by the CPU72 according to the rotational speed NE of the crankshaft 28 and the filling efficiency η. The rotational speed NE is calculated by the CPU72 based on the output signal Scr of the crank angle sensor 84. The filling efficiency η is calculated by the CPU72 based on the rotational speed NE and the intake air amount Ga.
Next, the CPU72 acquires the torque Trq of the internal combustion engine 10, the torque command value Trq* for the internal combustion engine 10, and the acceleration Gx (S16). Here, the CPU72 calculates the torque Trq by inputting the rotational speed NE, the filling efficiency η, and the ignition timing into the torque output map. The CPU72 sets the torque command value Trq* according to the accelerator operation amount PA.
Next, the CPU72 determines whether the transient flag F is "1" (S18). A value of "1" indicates that the vehicle is in transient operation, and a value of "0" indicates that it is not. When the CPU72 determines that the transient flag F is "0" (S18: NO), it determines whether the absolute value of the change amount ΔPA per unit time of the accelerator operation amount PA is equal to or greater than a predetermined amount ΔPAth (S20). The change amount ΔPA may be, for example, the difference between the latest accelerator operation amount PA at the time S20 is executed and the accelerator operation amount PA one unit time earlier.
When the CPU72 determines that the absolute value of the change amount ΔPA is equal to or greater than the predetermined amount ΔPAth (S20: YES), it substitutes "1" into the transient flag F (S22).
In contrast, when the CPU72 determines that the transient flag F is "1" (S18: YES), it determines whether a predetermined period has elapsed since the process of S22 was executed (S24). Here, the predetermined period is the period until a state in which the absolute value of the change amount ΔPA per unit time of the accelerator operation amount PA is equal to or less than a specified amount smaller than the predetermined amount ΔPAth has continued for a predetermined time. When the CPU72 determines that the predetermined period has elapsed (S24: YES), it substitutes "0" into the transient flag F (S26).
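The flag handling of S18 to S26 behaves like a small two-state machine. The following Python sketch is one possible reading: the class name `TransientDetector`, the parameter names, and the use of a step counter to represent "has continued for a predetermined time" are assumptions introduced for illustration.

```python
class TransientDetector:
    """Hypothetical tracker for the flag F (0: steady, 1: transient)."""

    def __init__(self, dpa_th, dpa_settle, settle_steps):
        if dpa_settle >= dpa_th:
            raise ValueError("dpa_settle must be smaller than dpa_th")
        self.dpa_th = dpa_th              # threshold used in S20
        self.dpa_settle = dpa_settle      # smaller threshold used in S24
        self.settle_steps = settle_steps  # consecutive quiet steps required
        self.flag = 0
        self._quiet = 0

    def step(self, dpa):
        """Process one control cycle; return True when the flag changes."""
        if self.flag == 0:
            if abs(dpa) >= self.dpa_th:   # S20: YES
                self.flag = 1             # S22
                self._quiet = 0
                return True
            return False
        # flag == 1: count consecutive cycles with a small change amount (S24)
        if abs(dpa) <= self.dpa_settle:
            self._quiet += 1
        else:
            self._quiet = 0
        if self._quiet >= self.settle_steps:
            self.flag = 0                 # S26
            return True
        return False


detector = TransientDetector(dpa_th=10.0, dpa_settle=2.0, settle_steps=3)
changes = [detector.step(d) for d in [0.0, 12.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 0.0]]
```

In this reading, each `True` marks a cycle on which S22 or S26 ran, which is the point at which the text treats one episode as finished.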
When the process of S22 or S26 is completed, the CPU72 regards one episode as having ended and updates the action value function Q by reinforcement learning (S28).
Fig. 3 shows the details of the processing of S28.
In the series of processes shown in fig. 3, the CPU72 acquires time-series data consisting of sets of three sampled values, namely the torque command value Trq*, the torque Trq, and the acceleration Gx, in the most recently ended episode, together with the time-series data of the state s and the action a (S30). Here, the most recent episode is a period during which the transient flag F stayed at "0" when the process of S30 follows the process of S22, and a period during which the transient flag F stayed at "1" when the process of S30 follows the process of S26.
In fig. 3, variables with different numbers in parentheses represent values of the variable at different sampling timings. For example, the torque command value Trq*(1) and the torque command value Trq*(2) have different sampling timings. The time-series data of the action a belonging to the most recent episode is defined as an action set Aj, and the time-series data of the state s belonging to that episode is defined as a state set Sj.
Next, the CPU72 determines whether or not the logical product of the condition (a) in which the absolute value of the difference between the torque Trq and the torque command value Trq is equal to or smaller than the predetermined amount Δtrq and the condition (B) in which the acceleration Gx is equal to or larger than the lower limit GxL and equal to or smaller than the upper limit GxH, which belong to the latest scenario, is true (S32).
Here, the CPU72 variably sets the predetermined amount Δtrq according to the change amount Δpa per unit time of the accelerator operation amount PA at the start of the scenario. That is, when the absolute value of the change amount Δpa is large, the CPU72 treats the scenario as one concerning a transient state and sets the predetermined amount Δtrq to a value larger than in a scenario concerning a steady state.
Further, the CPU72 variably sets the lower limit value GxL according to the change amount Δpa of the accelerator operation amount PA at the start of the scenario. That is, when the scenario concerns a transient state and the change amount Δpa is positive, the CPU72 sets the lower limit value GxL to a value larger than in a scenario concerning a steady state. When the scenario concerns a transient state and the change amount Δpa is negative, the CPU72 sets the lower limit value GxL to a value smaller than in a scenario concerning a steady state.
In addition, the CPU72 variably sets the upper limit value GxH according to the change amount Δpa per unit time of the accelerator operation amount PA at the start of the scenario. That is, when the scenario concerns a transient state and the change amount Δpa is positive, the CPU72 sets the upper limit value GxH to a value larger than in a scenario concerning a steady state. When the scenario concerns a transient state and the change amount Δpa is negative, the CPU72 sets the upper limit value GxH to a value smaller than in a scenario concerning a steady state.
When the logical product is determined to be true (yes in S32), the CPU72 substitutes "10" into the reward r (S34), and when the logical product is determined to be false (no in S32), the CPU72 substitutes "-10" into the reward r (S36). When the processing of S34 or S36 is completed, the CPU72 updates the relationship specification data DR stored in the storage device 76 shown in fig. 1. In the present embodiment, the ε-soft on-policy Monte Carlo method is used for updating the relationship specification data DR.
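The reward rule of S32 to S36 can be sketched as below. The sketch assumes that conditions (A) and (B) must hold for every sample of the scenario and that a scenario is treated as transient when the absolute change amount at its start reaches a threshold. The function names and every numeric value in `episode_thresholds` are hypothetical placeholders chosen only to respect the larger/smaller relations described above.

```python
def episode_thresholds(delta_pa_start, delta_pa_th):
    """Return (delta_trq, gx_low, gx_high) for one scenario.

    All numbers are hypothetical; only their ordering follows the text.
    """
    delta_trq, gx_low, gx_high = 10.0, -0.2, 0.2     # steady-state values
    if abs(delta_pa_start) >= delta_pa_th:            # transient scenario (assumed test)
        delta_trq = 30.0                              # wider torque tolerance
        if delta_pa_start > 0:                        # accelerator being pressed
            gx_low, gx_high = -0.1, 0.6               # both limits raised
        else:                                         # accelerator being released
            gx_low, gx_high = -0.6, 0.1               # both limits lowered
    return delta_trq, gx_low, gx_high


def episode_reward(trq_cmd, trq, gx, delta_trq, gx_low, gx_high):
    """Return +10 if conditions (A) and (B) hold for all samples, else -10."""
    cond_a = all(abs(t - c) <= delta_trq for t, c in zip(trq, trq_cmd))  # (A)
    cond_b = all(gx_low <= g <= gx_high for g in gx)                     # (B)
    return 10 if cond_a and cond_b else -10                              # S34 / S36
```

A single reward is thus assigned to the whole scenario rather than to individual samples.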
That is, the CPU72 adds the reward r to each benefit (return) R(Sj, Aj) determined by each state read out by the processing of S30 and the action paired with it (S38). Here, "R(Sj, Aj)" collectively denotes the benefit R for which one element of the state set Sj is the state and one element of the action set Aj is the action. Next, the CPU72 averages each benefit R(Sj, Aj) determined by each state read out by the processing of S30 and the action paired with it, and substitutes the averaged value into the corresponding action cost function Q(Sj, Aj) (S40). Here, the averaging may be a process of dividing the benefit R calculated by the processing of S38 by a number determined from the number of times the processing of S38 has been performed. The initial value of the benefit R may be set to the initial value of the corresponding action cost function Q.
Next, for each state read out by the processing of S30, the CPU72 substitutes into the action Aj the action, that is, the set of the throttle opening degree command value TA and the delay amount aop, at which the expected benefit is the maximum in the corresponding action cost function Q(Sj, A) (S42). Here, "A" denotes any action that can be taken. The action Aj takes a separate value for each type of state read out by the processing of S30, but the same symbol is used here to simplify the notation.
Next, the CPU72 updates the corresponding policy π(Aj|Sj) for each state read out by the processing of S30 (S44). That is, when the total number of actions is "|A|", the selection probability of the action Aj selected in S42 is set to "1 - ε + ε/|A|", and the selection probability of each of the "|A| - 1" actions other than the action Aj is set to "ε/|A|". The processing of S44 is based on the action cost function Q updated by the processing of S40. The relationship specification data DR, which specifies the relationship between the state s and the action a, is thereby updated so as to increase the benefit R.
Further, the CPU72 temporarily ends the series of processing shown in fig. 3 when the processing of S44 is completed.
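The update of S38 to S44 can be sketched in table form as follows. This is a simplified illustration, not the embodiment's implementation: it keeps a running sum and a visit count per state-action pair, averages by the plain visit count, and starts the sums at zero. The container names and the function name are assumptions.

```python
from collections import defaultdict


def update_relationship_data(Q, returns_sum, visits, policy,
                             episode, reward, actions, epsilon):
    """epsilon-soft on-policy Monte Carlo update for one finished scenario.

    episode: list of (state, action) pairs read out in S30
    reward:  the single reward r assigned to the scenario (S34 / S36)
    actions: list of all selectable actions
    """
    for s, a in episode:
        returns_sum[(s, a)] += reward                      # S38: accumulate
        visits[(s, a)] += 1
        Q[(s, a)] = returns_sum[(s, a)] / visits[(s, a)]   # S40: average
    for s in {s for s, _ in episode}:
        greedy = max(actions, key=lambda a: Q[(s, a)])     # S42: best action
        for a in actions:                                  # S44: epsilon-soft policy
            policy[(s, a)] = epsilon / len(actions)
        policy[(s, greedy)] = 1 - epsilon + epsilon / len(actions)
```

With `Q`, `returns_sum` as `defaultdict(float)` and `visits` as `defaultdict(int)`, the function can be called once per scenario; every selectable action keeps a probability of at least ε/|A|, so exploration never stops.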
Returning to fig. 2, the CPU72 temporarily ends the series of processing shown in fig. 2 when the processing of S28 is completed or when a negative determination is made in the processing of S20 or S24. The processing of S10 to S26 is realized by the CPU72 executing the control program 74a, and the processing of S28 is realized by the CPU72 executing the learning program 74b. The relationship specification data DR at the time of shipment of the vehicle VC1 is the same as the initial data DR0. The initial data DR0 is data that has been learned in advance by executing processing similar to that shown in fig. 2 while, for example, simulating the running of the vehicle on a test bed.
Fig. 4 shows steps of processing executed by the control device 70 according to the present embodiment. The processing shown in fig. 4 is realized by the CPU72 repeatedly executing the learning program 74b stored in the ROM74, for example, at a predetermined cycle.
In the series of processing shown in fig. 4, the CPU72 first determines whether the travel distance RL of the vehicle VC1 has reached a predetermined distance (S50). Here, the predetermined distance is any of a plurality of distances expressed as multiples of a predetermined amount, such as 1 km, 2 km, 3 km, and so on. When it is determined that the travel distance RL of the vehicle VC1 has reached a predetermined distance (yes in S50), the CPU72 stores the relationship specification data DR at that time in the storage device 76 as updated data DR1 (S52). For example, if the predetermined amount is "1 km" and the travel distance RL is 2 km, two different pieces of data are stored in the storage device 76 as updated data DR1. That is, each time it is determined that the travel distance RL has reached a predetermined distance, the relationship specification data DR at that time is stored in the storage device 76 as new updated data DR1, so that the number of pieces of updated data DR1 increases.
The CPU72 determines whether or not the function recovery processing has been performed in the case where the processing of S52 is completed and in the case where a negative determination is made in the processing of S50 (S54). In this embodiment, it is assumed that: when the function recovery process of the component mounted on the vehicle VC1 is performed with the maintenance of the vehicle VC1, a signal indicating that the function recovery process is performed is input from the scanning tool to the control device 70. Therefore, when a signal indicating that the function recovery process is performed is input to the CPU72, the CPU72 determines that the function recovery process is performed.
When it is determined that the function recovery process has been performed (yes in S54), the CPU72 determines whether or not there is updated data DR1 stored at a time point at which the travel distance was shorter than the current travel distance RL by a predetermined amount ΔL or more (S56). When it is determined that such updated data DR1 exists (yes in S56), the CPU72 substitutes that updated data DR1 into the post-treatment data DRp (S58). When a plurality of pieces of data satisfying the condition of S56 are stored in the storage device 76 as updated data DR1, the CPU72 substitutes the piece with the longest travel distance RL into the post-treatment data DRp. In contrast, when it is determined that no such updated data DR1 exists (no in S56), the CPU72 substitutes the initial data DR0 into the post-treatment data DRp (S60).
When the processing of S58 or S60 is completed, the CPU72 rewrites the relationship specification data DR used in the processing of S12 with the post-treatment data DRp (S62).
Further, the CPU72 temporarily ends the series of processing shown in fig. 4 in the case where the processing of S62 is completed and in the case where a negative determination is made in the processing of S54.
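The snapshot and rollback logic of S50 to S60 can be sketched as follows. The dictionary keyed by travel distance, the function names, and the rule of storing at the first cycle after a multiple of the step has been passed are assumptions made for illustration.

```python
import copy


def store_snapshot_if_due(snapshots, travel_distance_km, step_km, dr):
    """S50/S52: keep a copy of DR at each multiple of step_km that is reached."""
    milestone = int(travel_distance_km // step_km) * step_km
    if milestone > 0 and milestone not in snapshots:
        snapshots[milestone] = copy.deepcopy(dr)   # new piece of updated data DR1


def select_post_treatment_data(snapshots, current_distance_km, delta_l_km, initial_data):
    """S56 to S60: choose the data that replaces DR after a function recovery."""
    eligible = [d for d in snapshots if d <= current_distance_km - delta_l_km]
    if eligible:                          # S56: yes
        return snapshots[max(eligible)]   # S58: longest eligible travel distance
    return initial_data                   # S60: fall back to DR0
```

Choosing the eligible snapshot with the longest travel distance keeps as much of the learning as possible while still predating the deterioration that the function recovery removed.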
Here, the operation and effects of the present embodiment will be described.
The CPU72 acquires time-series data of the accelerator operation amount PA in response to the user's operation of the accelerator pedal 86, and sets the action a, consisting of the throttle opening degree command value TA and the delay amount aop, in accordance with the policy π. Here, the CPU72 basically selects the action a that maximizes the expected benefit based on the action cost function Q defined by the relationship specification data DR. However, with a predetermined probability ε the CPU72 selects an action other than the one that maximizes the expected benefit, thereby exploring for the action a that maximizes the expected benefit. Accordingly, the relationship specification data DR can be updated by reinforcement learning as the user drives the vehicle VC1. Therefore, the throttle opening degree command value TA and the delay amount aop corresponding to the accelerator operation amount PA can be set to appropriate values while the vehicle VC1 is running, without excessively increasing the man-hours of skilled persons.
In this way, the relationship specification data DR, which is the same as the initial data DR0 when the vehicle VC1 leaves the factory, is updated as the vehicle VC1 travels. Here, for example, even if the throttle opening degree TA is the same, the flow path cross-sectional area of the intake passage 12 becomes smaller when deposits accumulate on the throttle valve 14 and the intake passage 12, so that the intake air amount Ga becomes smaller. Therefore, the throttle opening degree command value TA that maximizes the expected benefit for given time-series data of the accelerator operation amount PA, as defined by the relationship specification data DR, may be updated so as to compensate for the change in the flow path cross-sectional area of the intake passage 12 caused by deposits on the throttle valve 14. If the relationship specification data DR has been learned in this way so as to compensate for the aged deterioration of components of the vehicle VC1, and a function recovery process such as replacement or cleaning of those components is then performed during maintenance, the relationship specification data DR may no longer be appropriate for determining actions that increase the expected benefit.
For this reason, on condition that it is determined that the function recovery process has been performed, the CPU72 rewrites the relationship specification data DR with the initial data DR0 as the post-treatment data DRp. The initial data DR0 is data that has not been updated to compensate for degradation of components. Therefore, by rewriting the relationship specification data DR with the initial data DR0, the throttle valve 14 and the ignition device 26 can be operated after the function recovery process using data that is more appropriate than if the relationship specification data DR from before the function recovery process continued to be used.
According to the present embodiment described above, the following operational effects can be further obtained.
(1) The CPU72 stores the relationship specification data DR at that time as updated data DR1 (S52) every time the travel distance RL increases by a predetermined amount (yes in S50). When it is determined that the function recovery process has been performed (yes in S54) and there is updated data DR1 stored at a travel distance shorter than the current travel distance RL by a predetermined amount ΔL or more (yes in S56), the CPU72 takes that updated data DR1 as the post-treatment data DRp (S58) and rewrites the relationship specification data DR used for setting the throttle opening degree command value TA and the delay amount aop with it (S62). Here, the updated data DR1 is data obtained by updating the initial data DR0, which is the relationship specification data DR when the vehicle VC1 leaves the factory, through the actual running of the vehicle VC1. Moreover, because this updated data DR1 is the relationship specification data DR at a travel distance shorter, by the predetermined amount ΔL or more, than the travel distance RL at which the function recovery process is performed, it is considered to be little affected by the component deterioration present at the time of the function recovery process. Therefore, by rewriting with the updated data DR1, relationship specification data DR appropriate for the vehicle VC1 after the function recovery process can be set.
(2) The argument of the action cost function Q contains time-series data of the accelerator operation amount PA. Thus, for example, the value of the action a can be finely adjusted with respect to various changes in the accelerator operation amount PA, as compared with a case where only a single sampling value is used as an argument with respect to the accelerator operation amount PA.
(3) The argument of the action cost function Q contains the throttle opening degree command value TA itself. This makes it easier to increase the degree of freedom of exploration by reinforcement learning than, for example, a case where parameters of a model equation that models the behavior of the throttle opening degree command value TA are used as the variables related to the throttle opening degree.
< embodiment 2 >
Hereinafter, embodiment 2 will be described with reference to fig. 5 and 6, focusing on differences from embodiment 1.
Fig. 5 shows a structure of a control system that performs reinforcement learning in the present embodiment. In fig. 5, for convenience of explanation, the same reference numerals are given to the components corresponding to those shown in fig. 1.
In addition to the control program 74a, a main program 74c for learning is stored in the ROM74 in the vehicle VC1 shown in fig. 5. The storage device 76 in the vehicle VC1 stores the torque output map data DT and the relationship specifying data DR, but does not store the initial data DR0. The control device 70 further includes a communicator 77. The communicator 77 is a device for communicating with the data analysis center 110 via the network 100 outside the vehicle VC 1.
The data analysis center 110 analyzes data transmitted from the plurality of vehicles VC1, VC2, … …. The data analysis center 110 includes a CPU112, a ROM114, and an electrically rewritable nonvolatile memory (storage device 116), a peripheral circuit 118, and a communication device 117, and those components can communicate via a local network 119. A learning subroutine 114a is stored in the ROM 114. The initial data DR0 is stored in the storage 116.
Fig. 6 shows a processing procedure for handling the function recovery processing according to the present embodiment. The processing shown in part (a) of fig. 6 is realized by the CPU72 executing the learning main program 74c stored in the ROM74 shown in fig. 5. The processing shown in part (b) of fig. 6 is realized by the CPU112 executing the learning subroutine 114a stored in the ROM 114. In fig. 6, the same step numbers are given for convenience of explanation as for the processing corresponding to the processing shown in fig. 4. The processing shown in fig. 6 will be described below along the time series.
In a series of processes shown in part (a) of fig. 6, the CPU72 first operates the communicator 77 to transmit the identification information ID, the travel distance RL, and the position data Pgps of the vehicle VC1 (S70).
In contrast, as shown in part (b) of fig. 6, the CPU112 receives the identification information ID, the travel distance RL, and the position data Pgps (S80). Then, the CPU112 updates the travel distance RL and the position data Pgps associated with the identification information ID stored in the storage device 116 to the value received by the processing of S80 (S82).
On the other hand, as shown in part (a) of fig. 6, the CPU72 executes the processing of S54, and when an affirmative determination is made, by operating the communication device 77, transmits a signal requesting post-treatment data DRp that is suitable as the relationship specification data DR used in the processing of S12 (S72).
In contrast, as shown in part (b) of fig. 6, the CPU112 determines whether or not there is a request for the post-treatment data DRp (S84). When it is determined that there is such a request (yes in S84), the CPU112 searches for a vehicle that is located close to the vehicle VC1 that transmitted the request signal and that has a shorter travel distance (S86). Here, a vehicle located close to the vehicle VC1 is a vehicle whose distance from the vehicle VC1 that transmitted the request signal is equal to or less than a predetermined distance, judged from the position data Pgps of each vehicle stored by the processing of S82. A vehicle having a shorter travel distance is a vehicle whose travel distance is shorter than the travel distance RL of the vehicle VC1 by a predetermined amount ΔL or more, with the difference from the travel distance RL of the vehicle VC1 being a predetermined amount ΔH or less.
Here, the search is restricted to vehicles whose distance from the vehicle VC1 is equal to or less than the predetermined distance because the relationship specification data DR of a vehicle located sufficiently far from the vehicle VC1 may not be appropriate for increasing the expected benefit in the vehicle VC1, owing to differences in the environment in which the vehicles are placed and the like. The search is restricted to vehicles whose travel distance is shorter than the travel distance RL of the vehicle VC1 by the predetermined amount ΔL or more, but by no more than the predetermined amount ΔH, in order to identify a vehicle whose condition resembles that of the vehicle VC1 before its components deteriorated.
When it is determined that there is a matching vehicle (yes in S88), the CPU112 operates the communication device 117 to request transmission of the relationship specification data DR of the matching vehicle, and receives the relationship specification data DR transmitted from that vehicle as other-vehicle specification data DRa (S90). Next, the CPU112 substitutes the other-vehicle specification data DRa into the post-treatment data DRp (S92). On the other hand, when it is determined that there is no matching vehicle (no in S88), the CPU112 substitutes the initial data DR0 into the post-treatment data DRp (S94). When the processing of S92 or S94 is completed, the CPU112 operates the communication device 117 to transmit the post-treatment data DRp to the vehicle VC1 that issued the request (S96). The CPU112 temporarily ends the series of processing shown in part (b) of fig. 6 when the processing of S96 is completed or when a negative determination is made in the processing of S84.
In contrast, as shown in fig. 6 (a), the CPU72 receives the transmitted post-treatment data DRp (S74) and executes the processing of S62.
Further, the CPU72 temporarily ends the series of processing shown in part (a) of fig. 6 in the case where the processing of S62 is completed and in the case where a negative determination is made in the processing of S54.
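The search of S86 and S88 on the data analysis center side can be sketched as below. Several details are assumptions made for illustration: the position data Pgps is taken to be a (latitude, longitude) pair in degrees, the gap between vehicles is a great-circle distance, and when several vehicles match, the nearest one is chosen. The field names of the `fleet` records are hypothetical.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def find_matching_vehicle(fleet, requester_id, max_gap_km, delta_l, delta_h):
    """Return the ID of a nearby vehicle with a suitably shorter travel distance, or None."""
    me = fleet[requester_id]
    candidates = []
    for vid, v in fleet.items():
        if vid == requester_id:
            continue
        shortfall = me["travel_distance"] - v["travel_distance"]
        if not (delta_l <= shortfall <= delta_h):      # shorter by ΔL or more, at most ΔH
            continue
        gap = haversine_km(me["position"], v["position"])
        if gap <= max_gap_km:                          # located close enough
            candidates.append((gap, vid))
    return min(candidates)[1] if candidates else None  # None corresponds to "no" in S88
```

When the function returns `None`, the flow falls through to S94 and the initial data DR0 is sent instead.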
< correspondence relation >
The correspondence between the matters in the above embodiment and the matters described in the column of the "summary of the invention" is as follows. In the following, the corresponding relation is indicated by each number of the aspects described in the column of "summary of the invention".
[1] The execution means corresponds to the CPU72 and the ROM74, and the storage means corresponds to the storage means 76. The acquisition processing corresponds to the processing of S10 and S16, and the operation processing corresponds to the processing of S14. The reward calculation processing corresponds to the processing of S32 to S36, and the update processing corresponds to the processing of S38 to S44. The detection process corresponds to the process of S54, and the switching process corresponds to the process of S62. The update map corresponds to a map specified by an instruction to execute the processing of S38 to S44 in the learning program 74 b.
[2] The past data maintenance process corresponds to the process of S52.
[3], [5] The post-treatment data request processing corresponds to the processing of S72, and the post-treatment data reception processing corresponds to the processing of S74.
[4], [6] to [8] The 1st execution means corresponds to the CPU72 and the ROM74, and the 2nd execution means corresponds to the CPU112 and the ROM114. The post-treatment data transmission process corresponds to the process of S96, and the post-treatment data reception process corresponds to the process of S74.
< other embodiments >
The present embodiment can be modified as follows. The present embodiment and the following modifications can be combined and implemented within a range that is not technically contradictory.
"about detection Process"
In the above embodiment, the fact that the function recovery process has been performed is detected when a signal indicating that fact is input from the scanning tool to the control device 70 while the control device 70 is connected to the scanning tool, but the detection process is not limited thereto. For example, when the function recovery process is performed in a repair facility or the like, the data analysis center 110 may be notified of the function recovery process via the network 100. Even in this case, the post-treatment data DRp can be transmitted to the control device 70 by executing, in the data analysis center 110, processing in accordance with the processing of S80, S82, and S86 to S96 in part (b) of fig. 6.
Of course, the detection process is not limited to the process performed by any one of the control device 70 and the data analysis center 110. For example, as described in the column "related to the vehicle control system" below, when the vehicle control system is configured by providing a mobile terminal, the mobile terminal may execute the detection process. Here, when the control device 70, the mobile terminal, and the data analysis center 110 constitute a vehicle control system, the mobile terminal may send a signal requesting the post-treatment data DRp to the data analysis center 110 after the detection process is performed by the mobile terminal.
Further, the detection process is not limited to a process of directly detecting a signal of a repair shop or the like. For example, as the detection process, the following process may be used: when a signal indicating that the function recovery process is performed is transmitted to the mobile terminal and the mobile terminal transmits the signal indicating that the function recovery process is performed to the control device 70, the control device 70 receives the signal from the mobile terminal.
"about past data maintenance Process"
In the above embodiment, the relationship specification data DR at that time is stored as the updated data DR1 every time the travel distance RL increases by a predetermined distance, but is not limited thereto. For example, the amount of deposit around the throttle valve 14 may be quantified by an average value of the intake air amount Ga per "1%" in the case where the fully opened state of the throttle valve opening degree TA is 100%, and the relationship specification data DR at that time may be regarded as updated data DR1 at the point in time when the average value changes by a predetermined value. Here, the predetermined value may be set to an upper limit value that can neglect the influence on the intake air amount Ga.
"about post-treatment data transmission processing"
The data required for the control device 70 to execute the switching process is not limited to the post-treatment data DRp. For example, as described in the "related to detection processing" column, when a signal indicating that the function recovery process has been performed is transmitted from the repair facility to the data analysis center 110 via the network 100, data indicating that fact may be transmitted from the data analysis center 110 to the control device 70 together with the post-treatment data DRp.
The processing of S86 to S92 may be omitted, so that the post-treatment data DRp transmitted from the data analysis center 110 to the control device 70 is always the initial data DR0.
For example, the processing of S28 in fig. 2 and the processing of S50, S52, and S56 to S62 in fig. 4 may be executed by the data analysis center 110, and the post-treatment data DRp generated by the processing of S62 may be transmitted to the control device 70.
"about initial data"
The initial data DR0 is not limited to data subjected to reinforcement learning on a test bed or the like. For example, it may be data obtained by reinforcement learning during the running of a test vehicle that is different from the vehicle shipped from the factory. Of course, the initial data DR0 is not limited to data generated by reinforcement learning, and may be, for example, data generated based on the control logic of a vehicle that has been adapted by a conventional method. Even in this case, by updating the data through reinforcement learning after the vehicle leaves the factory, data that increases the expected benefit beyond that of the initial data DR0 can be generated without increasing the man-hours of skilled persons.
"about action variables"
In the above embodiment, the throttle opening degree command value TA is exemplified as the variable related to the opening degree of the throttle valve as the action variable, but the present invention is not limited thereto. For example, the responsiveness of the throttle opening degree command value TA to the accelerator operation amount PA may be expressed by a dead time and a second order lag filter, and a total of three variables of the dead time and two variables defining the second order lag filter may be used as the variables relating to the throttle opening degree. However, in this case, it is preferable that the state variable is set to a change amount per unit time of the accelerator operation amount PA instead of time-series data of the accelerator operation amount PA.
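The dead-time plus second-order-lag description of the throttle response can be illustrated with a small discrete-time simulation. This is only a sketch: taking the two filter variables to be a natural frequency `omega` and a damping ratio `zeta`, expressing the dead time as a whole number of samples, and integrating with the explicit Euler method are all assumptions not stated in the text.

```python
from collections import deque


def throttle_response(pa_series, dead_steps, omega, zeta, dt):
    """Pass an accelerator signal through a dead time and a second-order lag.

    pa_series:  sampled accelerator operation amount (normalised, assumed)
    dead_steps: dead time expressed in samples
    omega, zeta: natural frequency [rad/s] and damping ratio of the lag filter
    dt:         sampling period [s]
    """
    delay = deque([0.0] * dead_steps)     # FIFO holding the delayed samples
    x, v = 0.0, 0.0                       # filter output and its rate of change
    out = []
    for pa in pa_series:
        delay.append(pa)
        u = delay.popleft()               # input delayed by dead_steps samples
        acc = omega ** 2 * (u - x) - 2 * zeta * omega * v
        x, v = x + dt * v, v + dt * acc   # explicit Euler step
        out.append(x)
    return out
```

With this parameterisation the action has three components (`dead_steps`, `omega`, `zeta`) instead of a throttle opening degree command value chosen directly at each step.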
In the above embodiment, the delay amount aop is exemplified as the variable related to the ignition timing serving as an action variable, but the present invention is not limited thereto. For example, the ignition timing itself, which is the target of correction by KCS, may be used.
In the above embodiment, the variable relating to the opening degree of the throttle valve and the variable relating to the ignition timing are exemplified as the action variable, but not limited thereto. For example, the fuel injection amount may be used in addition to the variable relating to the opening degree of the throttle valve and the variable relating to the ignition timing. Regarding those three variables, only the variable relating to the opening degree of the throttle valve and the fuel injection amount may be used as the action variable, or only the variable relating to the ignition timing and the fuel injection amount may be used as the action variable. Further, regarding those three variables, only one of those may be employed as an action variable.
As described below in the column "related to the internal combustion engine", in the case of a compression ignition type internal combustion engine, a variable related to the injection amount may be used instead of the variable related to the opening degree of the throttle valve, and a variable related to the injection timing may be used instead of the variable related to the ignition timing. In that case, in addition to the variable related to the injection timing, it is preferable to add a variable related to the number of injections in one combustion cycle and a variable related to the time interval between the end timing of one of two fuel injections that are adjacent in time series for one cylinder within one combustion cycle and the start timing of the other.
In addition, for example, when the transmission 50 is a stepped transmission, a current value of a solenoid valve for adjusting an engagement state of a clutch by hydraulic pressure or the like may be used as the action variable.
In addition, for example, as described in the column "related to vehicle" below, when a hybrid vehicle, an electric vehicle, or a fuel cell vehicle is used as the vehicle, the torque and the output of the rotating electrical machine may be used as action variables. In addition, for example, in the case of a vehicle-mounted air conditioner having a compressor that is rotated by the rotational power of the crankshaft of the internal combustion engine, the load torque of the compressor may be included in the action variables. In the case of an electric vehicle-mounted air conditioner, the power consumption of the air conditioner may be included in the action variables.
"about status"
In the above embodiment, the time-series data of the accelerator operation amount PA is data consisting of 6 values sampled at equal intervals, but is not limited thereto. Any data consisting of 2 or more sampled values with different sampling timings is sufficient; in that case, it is more preferable that the data consist of 3 or more sampled values and that the sampling interval be equal.
The state variable related to the accelerator operation amount is not limited to the time-series data of the accelerator operation amount PA, and may be, for example, the amount of change per unit time of the accelerator operation amount PA as described in the column "related to the action variable".
For example, as described in the column "related to the action variable", when the current value of the solenoid valve is used as the action variable, the rotation speed of the input shaft 52, the rotation speed of the output shaft 54, and the hydraulic pressure adjusted by the solenoid valve may be included in the state. For example, as described in the column "related to the action variable", when the torque and the output of the rotating electrical machine are used as action variables, the state may include the charging rate and the temperature of the battery. For example, as described in the column "related to the action variable", when the load torque of the compressor or the power consumption of the air conditioner is included in the action variables, the temperature in the vehicle interior may be included in the state.
"about dimension reduction of table-form data"
The dimension reduction method for the table-form data is not limited to the method exemplified in the above embodiment. For example, the accelerator operation amount PA rarely reaches its maximum value. Thus, the action cost function Q may be left undefined for states in which the accelerator operation amount PA is equal to or greater than a predetermined amount, and the throttle opening degree command value TA and the like for the case where the accelerator operation amount PA is equal to or greater than the predetermined amount may be adapted separately. Further, for example, the dimension may be reduced by excluding, from the values that the action can take, throttle opening degree command values TA that are equal to or greater than a predetermined value.
"data about relation specification"
In the above embodiment, the action cost function Q is set as a table-type function, but the present invention is not limited thereto. For example, a function approximator may also be used.
For example, instead of using the action cost function Q, the policy pi may be represented by a function approximator that takes the state s and the action a as independent variables and the probability of taking the action a as the dependent variable, and the parameters that determine the function approximator may be updated based on the reward r.
"about operation process"
For example, as described in the column "relation specification data", when the action cost function Q is a function approximator, the following is sufficient: each of the sets of discrete action values that serve as independent variables of the table-type function in the above embodiment is input to the action cost function Q together with the state s, and the action a that maximizes the action cost function Q is selected.
For example, as described in the column "relation specification data", when the policy pi is a function approximator having the state s and the action a as independent variables and the probability of taking the action a as the dependent variable, the action a may be selected based on the probability indicated by the policy pi.
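The two selection schemes above can be sketched in Python as follows. The linear feature form, the discrete action set, and the softmax parameterization of the policy are assumptions chosen for illustration; the source only requires some function approximator.

```python
import math
import random

ACTIONS = [0.0, 25.0, 50.0, 75.0, 100.0]  # hypothetical discrete action values

def q_approx(state, action, w):
    """Hypothetical linear approximator of Q(s, a); the features are illustrative."""
    features = [1.0, state, action, state * action, action * action]
    return sum(wi * fi for wi, fi in zip(w, features))

def greedy_action(state, w):
    """Evaluate Q for every discrete action together with s; return the maximizer."""
    return max(ACTIONS, key=lambda a: q_approx(state, a, w))

def policy_probs(state, theta):
    """Softmax policy pi(a | s) over the discrete actions (an assumed form)."""
    prefs = [q_approx(state, a, theta) for a in ACTIONS]
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(state, theta, rng):
    """Select the action according to the probabilities indicated by the policy."""
    return rng.choices(ACTIONS, weights=policy_probs(state, theta), k=1)[0]

w = [0.0, 0.0, 1.0, 0.0, -0.01]  # hypothetical parameters: Q = a - 0.01 * a^2
best = greedy_action(0.3, w)
probs = policy_probs(0.3, w)
sampled = sample_action(0.3, w, random.Random(0))
```

The greedy variant corresponds to the function-approximator version of Q, while the sampling variant corresponds to the case in which the policy itself is the approximator.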
"about update map"
In the processing of S38 to S44, processing based on the epsilon-soft on-policy Monte Carlo method is exemplified, but the processing is not limited thereto. For example, processing based on an off-policy Monte Carlo method may be used. Of course, the method is not limited to Monte Carlo methods; for example, an off-policy TD method may be used, an on-policy TD method such as the SARSA method may be used, or an eligibility trace method may be used as on-policy learning.
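To make the on-policy versus off-policy distinction concrete, the sketch below shows one tabular TD update of each kind. SARSA is named in the text; Q-learning is used here as a familiar example of an off-policy TD method and is not named in the source. The learning rate `alpha`, the discount `gamma`, and the state and action labels are hypothetical.

```python
from collections import defaultdict

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD: the target uses the action actually taken in the next state."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])

def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD: the target uses the greedy action in the next state."""
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])

actions = ["lo", "hi"]

q_on = defaultdict(float)
q_on[(1, "hi")], q_on[(1, "lo")] = 2.0, 1.0
sarsa_update(q_on, 0, "lo", 1.0, 1, "lo")  # next action "lo" was actually taken

q_off = defaultdict(float)
q_off[(1, "hi")], q_off[(1, "lo")] = 2.0, 1.0
q_learning_update(q_off, 0, "lo", 1.0, 1, actions)  # bootstraps from max over actions
```

Starting from the same table and the same transition, the two updates differ only in which next-state value enters the target.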
For example, as described in the column "relation specification data", when the policy pi is expressed by using the function approximator and the function approximator is directly updated based on the reward r, the update map may be formed by using a policy gradient method or the like.
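A minimal sketch of one policy gradient step is given below, assuming a softmax policy with one preference parameter per state and action (a REINFORCE-style update). The parameter layout, the learning rate, and the return value are assumptions for illustration; the source does not fix a particular policy gradient method.

```python
import math

def softmax(prefs):
    """Convert action preferences into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def policy_gradient_step(theta, state, action_idx, ret, lr=0.1):
    """theta[state][i] is the preference for action i in that state.

    For a softmax policy, d log pi(a|s) / d theta[s][i] = 1[i == a] - pi(i|s),
    so each preference moves by lr * return * that gradient.
    """
    probs = softmax(theta[state])
    for i, p in enumerate(probs):
        grad_log = (1.0 if i == action_idx else 0.0) - p
        theta[state][i] += lr * ret * grad_log
    return theta

theta = {0: [0.0, 0.0, 0.0]}          # hypothetical: one state, three actions
p_before = softmax(theta[0])[1]
policy_gradient_step(theta, state=0, action_idx=1, ret=1.0)
p_after = softmax(theta[0])[1]
```

A positive return raises the probability of the action that was taken, which is the direction that increases the expected return.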
In addition, the direct update target based on the reward r is not limited to only one of the action cost function Q and the policy pi. For example, the action cost function Q and the policy pi may be updated separately as in the Actor-Critic method. In the Actor-Critic method, the cost function V may be updated instead of the action cost function Q, for example.
The epsilon that determines the policy pi is not limited to a fixed value, and may be changed according to a predetermined rule as learning progresses.
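One possible "predetermined rule" is an exponential decay with a lower bound, sketched below. The starting value, the floor, and the decay factor are hypothetical and would need to be adapted.

```python
def epsilon_schedule(episode, eps_start=0.2, eps_min=0.01, decay=0.99):
    """Exploration rate that shrinks as learning progresses, never below eps_min."""
    return max(eps_min, eps_start * decay ** episode)

eps_values = [epsilon_schedule(n) for n in (0, 100, 10_000)]
```

Keeping a nonzero floor preserves some exploration, which matters if the vehicle characteristics drift over time.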
"about rewards calculation Process"
In the process of S32 of fig. 3, the reward is provided according to whether or not the logical product of the condition (A) and the condition (B) is true, but the process is not limited thereto. For example, a process of providing a reward according to whether the condition (A) is satisfied and a process of providing a reward according to whether the condition (B) is satisfied may both be performed. Alternatively, only one of these two processes may be executed.
For example, instead of uniformly providing the same reward whenever the condition (A) is satisfied, the following process may be adopted: when the absolute value of the difference between the torque Trq and the torque command value Trq* is small, a larger reward is provided than when the absolute value of the difference is large. Likewise, instead of uniformly providing the same reward whenever the condition (A) is not satisfied, the following process may be adopted: when the absolute value of the difference between the torque Trq and the torque command value Trq* is large, a smaller reward is provided than when the absolute value of the difference is small.
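The graded reward described above might look like the following sketch. The tolerance that stands in for condition (A) and the reward scale are hypothetical values chosen only so that the reward decreases monotonically with the torque error.

```python
def torque_tracking_reward(trq, trq_cmd, tolerance=10.0):
    """Reward that depends on |Trq - Trq*| instead of a single pass/fail value."""
    err = abs(trq - trq_cmd)
    if err <= tolerance:
        # Condition satisfied: between +10 (no error) and +5 (at the tolerance).
        return 10.0 - 5.0 * err / tolerance
    # Condition not satisfied: increasingly negative, floored at -10.
    return max(-10.0, -5.0 * err / tolerance)

rewards = [torque_tracking_reward(t, 100.0) for t in (100.0, 105.0, 110.0, 115.0, 120.0)]
```

Compared with a fixed reward per condition, this gives the update map a gradient toward smaller tracking error even among operations that already satisfy the condition.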
For example, instead of uniformly providing the same reward whenever the condition (B) is satisfied, the magnitude of the reward may be varied according to the magnitude of the acceleration Gx. Likewise, instead of uniformly providing the same reward whenever the condition (B) is not satisfied, the magnitude of the reward may be varied according to the magnitude of the acceleration Gx.
In the above embodiment, the reward r is provided according to whether or not the criterion regarding drivability is satisfied. The criterion regarding drivability is not limited to the above, and may be set based on, for example, whether or not the noise and the vibration intensity satisfy their criteria. Of course, it is not limited to this either; any one or more of whether the acceleration satisfies its criterion, whether the followability of the torque Trq satisfies its criterion, whether the noise satisfies its criterion, and whether the vibration intensity satisfies its criterion may serve as the criterion regarding drivability.
The reward calculation process is not limited to a process of providing the reward r according to whether or not the criterion regarding drivability is satisfied. For example, a larger reward may be provided when the fuel consumption rate satisfies a criterion than when it does not. Further, for example, a larger reward may be provided when the exhaust characteristic satisfies a criterion than when it does not. Two or all three of the following processes may also be included: a process of providing a larger reward when the criterion regarding drivability is satisfied than when it is not, a process of providing a larger reward when the fuel consumption rate satisfies its criterion than when it does not, and a process of providing a larger reward when the exhaust characteristic satisfies its criterion than when it does not.
For example, when the current value of the solenoid valve of the transmission 50 is set as the action variable as described in the column "regarding the action variable", at least one of the following three processes (a) to (c) may be included in the reward calculation process.
(a) A process of providing a larger reward when the time required for switching the gear ratio of the transmission is within a predetermined time than when the predetermined time is exceeded.
(b) A process of providing a larger reward when the absolute value of the change speed of the rotational speed of the input shaft 52 of the transmission is equal to or less than an input-side predetermined value than when the input-side predetermined value is exceeded.
(c) A process of providing a larger reward when the absolute value of the change speed of the rotational speed of the output shaft 54 of the transmission is equal to or less than an output-side predetermined value than when the output-side predetermined value is exceeded.
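Processes (a) to (c) can be sketched as three additive reward terms. The text only requires at least one of them; this illustration includes all three, and every threshold and reward magnitude in it is a hypothetical placeholder.

```python
def shift_reward(shift_time_s, input_speed_rate, output_speed_rate,
                 max_shift_time_s=0.5, input_limit=3000.0, output_limit=500.0):
    """Sum of three reward terms for one gear shift (thresholds are hypothetical)."""
    reward = 0.0
    # (a) the gear ratio was switched within the predetermined time
    reward += 1.0 if shift_time_s <= max_shift_time_s else -1.0
    # (b) |change speed of the input-shaft rotational speed| within the input-side value
    reward += 1.0 if abs(input_speed_rate) <= input_limit else -1.0
    # (c) |change speed of the output-shaft rotational speed| within the output-side value
    reward += 1.0 if abs(output_speed_rate) <= output_limit else -1.0
    return reward

smooth_shift = shift_reward(0.4, -2000.0, 300.0)
harsh_shift = shift_reward(0.6, 4000.0, 300.0)
```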
For example, when the torque and output of the rotating electrical machine are set as the action variables as described in the column "action variables", the following processes may be included: a process of providing a larger reward when the charging rate of the battery is within a predetermined range than when it is not within the predetermined range; and a process of providing a larger reward when the temperature of the battery is within a predetermined range than when it is not within the predetermined range. For example, when the load torque of the compressor and the power consumption of the air conditioner are included in the action variables as described in the column "action variables", the following process may be added: a process of providing a larger reward when the temperature in the vehicle interior is within a predetermined range than when it is not within the predetermined range.
"about vehicle control system"
The vehicle control system is not limited to the system constituted by the control device 70 and the data analysis center 110. For example, instead of the data analysis center 110, a mobile terminal held by the user may be used, and the control device 70 and the mobile terminal may constitute a vehicle control system. Further, for example, the control device 70, the mobile terminal, and the data analysis center 110 may constitute a vehicle control system.
"about execution device"
The execution device is not limited to a device that includes the CPU72 (112) and the ROM74 (114) and executes software processing. For example, a dedicated hardware circuit such as an ASIC may be provided to perform, by hardware, at least part of the processing performed by software in the above embodiment. That is, the execution device may have any one of the following structures (a) to (c). (a) A processing device that executes all of the above processing in accordance with a program, and a program storage device such as a ROM that stores the program (which may include a non-transitory computer-readable storage medium). (b) A processing device and a program storage device that execute part of the above processing in accordance with a program, and a dedicated hardware circuit that executes the rest of the processing. (c) A dedicated hardware circuit that executes all of the above processing. Here, there may be a plurality of software execution devices each including a processing device and a program storage device, and a plurality of dedicated hardware circuits.
"about storage device"
In the above embodiment, the storage device storing the relationship specifying data DR and the storage device (ROM 74) storing the learning program 74b and the control program 74a are provided as different storage devices, but the present invention is not limited thereto.
"about internal Combustion Engine"
The internal combustion engine is not limited to the internal combustion engine provided with the port injection valve that injects fuel into intake passage 12 as the fuel injection valve, and may be an internal combustion engine provided with an in-cylinder injection valve that directly injects fuel into combustion chamber 24. Further, for example, an internal combustion engine having both a port injection valve and an in-cylinder injection valve may be used.
The internal combustion engine is not limited to a spark ignition type internal combustion engine, and may be, for example, a compression ignition type internal combustion engine using diesel fuel (light oil) or the like as fuel.
"about vehicle"
The vehicle is not limited to a vehicle in which the thrust generation device is only an internal combustion engine, and may be a so-called hybrid vehicle including an internal combustion engine and a rotating electrical machine. For example, the present invention may be applied to a so-called electric vehicle or a fuel cell vehicle that does not include an internal combustion engine, but includes a rotating electric machine as a thrust generation device.

Claims (13)

1. A vehicle control device includes an execution device and a storage device,
the storage means stores relationship specification data that specifies a relationship between a state of a vehicle and an action variable that is a variable related to an operation of an electronic device in the vehicle,
the execution device is configured to execute:
an acquisition process of acquiring a detection value of a sensor that detects a state of the vehicle;
an operation process of operating the electronic device based on a value of the action variable, the value of the action variable being a value determined from the detected value acquired by the acquisition process and the relationship specifying data;
a reward calculation process of giving a larger reward when the characteristic of the vehicle satisfies a criterion based on the detection value acquired by the acquisition process than when the characteristic of the vehicle does not satisfy the criterion;
an update process of updating the relationship specifying data by using, as input to a predetermined update map, a state of the vehicle based on the detection value acquired by the acquisition process, a value of the action variable used in an operation of the electronic device, and the reward corresponding to the operation;
a detection process of detecting that a function recovery process of a component that affects a state of the vehicle due to an operation based on the operation process among components in the vehicle has been performed; and
a switching process of switching the relationship specification data to be used in the operation process to post-treatment data when the function recovery process is detected by the detection process,
the update map outputs the relationship specification data updated in such a manner that expected benefits with respect to the rewards in the case where the electronic apparatus is operated in accordance with the relationship specification data are increased,
the switching process includes the following processes: using initial data, which is the relationship specification data before the update process accompanying the vehicle running is performed, as the post-treatment data.
2. The control device for a vehicle according to claim 1,
the execution device is configured to execute a past data maintenance process of keeping stored in the storage device, separately from the relationship specification data that is updated by the update process, relationship specification data that was updated by the update process until a predetermined condition was satisfied and has not been updated by the update process since the predetermined condition was satisfied,
the switching process includes the following process: selecting, as the post-treatment data, either the relationship specification data that was updated by the update process until the predetermined condition was satisfied or the initial data.
3. The control device for a vehicle according to claim 1 or 2,
the execution means is configured to execute, when the detection processing detects that the function recovery processing is performed:
a post-treatment data request process of transmitting a signal requesting the post-treatment data; and
a post-treatment data reception process of receiving the post-treatment data transmitted as a result of the post-treatment data request process,
the switching process includes the following processes: the relationship specifying data to be used in the operation processing is switched to the received post-treatment data.
4. A vehicle control system comprising the execution device and the storage device of the vehicle control device according to claim 1 or 2,
the execution device comprises a 1 st execution device which is carried on the vehicle and a 2 nd execution device which is different from the vehicle-mounted device,
the 2 nd execution means is configured to execute at least a post-treatment data transmission process of transmitting the post-treatment data when the detection process detects that the function recovery process is performed,
the 1 st execution means is configured to execute at least the acquisition processing, the operation processing, and the post-treatment data reception processing for receiving the data transmitted by the post-treatment data transmission processing.
5. The control system for a vehicle according to claim 4,
the 1 st execution means is configured to execute the detection processing and the post-treatment data request processing,
the post-treatment data request processing transmits a signal requesting the post-treatment data when the detection processing detects that the function recovery processing is performed.
6. The control system for a vehicle according to claim 4 or 5,
the update processing is performed by the 1 st execution device.
7. A vehicle control device is provided with a 1 st execution device in a vehicle control system, the vehicle control system is provided with an execution device and a storage device,
the storage means stores relationship specification data that specifies a relationship between a state of a vehicle and an action variable that is a variable related to an operation of an electronic device in the vehicle,
the execution device is configured to execute:
an acquisition process of acquiring a detection value of a sensor that detects a state of the vehicle;
an operation process of operating the electronic device based on a value of the action variable, the value of the action variable being a value determined from the detected value acquired by the acquisition process and the relationship specifying data;
a reward calculation process of giving a larger reward when the characteristic of the vehicle satisfies a criterion based on the detection value acquired by the acquisition process than when the characteristic of the vehicle does not satisfy the criterion;
an update process of updating the relationship specifying data by using, as input to a predetermined update map, a state of the vehicle based on the detection value acquired by the acquisition process, a value of the action variable used in an operation of the electronic device, and the reward corresponding to the operation;
a detection process of detecting that a function recovery process of a component that affects a state of the vehicle due to an operation based on the operation process among components in the vehicle has been performed; and
a switching process of switching the relationship specification data to be used in the operation process to post-treatment data when the function recovery process is detected by the detection process,
the update map outputs the relationship specification data updated in such a manner that expected benefits with respect to the rewards in the case where the electronic apparatus is operated in accordance with the relationship specification data are increased,
the switching process includes the following processes: using initial data, which is the relationship specification data before the update process accompanying the vehicle running is performed, as the post-treatment data,
the execution device includes the 1 st execution device and a 2 nd execution device different from an in-vehicle device which are mounted on the vehicle,
the 2 nd execution means is configured to execute at least a post-treatment data transmission process of transmitting the post-treatment data when the detection process detects that the function recovery process is performed,
the 1 st execution means is configured to execute at least the acquisition processing, the operation processing, and the post-treatment data reception processing for receiving the data transmitted by the post-treatment data transmission processing.
8. A vehicle control device is provided with a 1 st execution device in a vehicle control system, the vehicle control system is provided with an execution device and a storage device, the storage device stores relationship specification data, the relationship specification data specifies the relationship between the state of a vehicle and an action variable, the action variable is a variable related to the operation of an electronic device in the vehicle,
the execution device is configured to execute:
an acquisition process of acquiring a detection value of a sensor that detects a state of the vehicle;
an operation process of operating the electronic device based on a value of the action variable, the value of the action variable being a value determined from the detected value acquired by the acquisition process and the relationship specifying data;
a reward calculation process of giving a larger reward when the characteristic of the vehicle satisfies a criterion based on the detection value acquired by the acquisition process than when the characteristic of the vehicle does not satisfy the criterion;
an update process of updating the relationship specifying data by using, as input to a predetermined update map, a state of the vehicle based on the detection value acquired by the acquisition process, a value of the action variable used in an operation of the electronic device, and the reward corresponding to the operation;
a detection process of detecting that a function recovery process of a component that affects a state of the vehicle due to an operation based on the operation process among components in the vehicle has been performed; and
a switching process of switching the relationship specification data to be used in the operation process to post-treatment data when the function recovery process is detected by the detection process,
the update map outputs the relationship specification data updated in such a manner that expected benefits with respect to the rewards in the case where the electronic apparatus is operated in accordance with the relationship specification data are increased,
the switching process includes the following processes: using initial data, which is the relationship specification data before the update process accompanying the vehicle running is performed, as the post-treatment data,
the execution device includes the 1 st execution device and a 2 nd execution device different from an in-vehicle device which are mounted on the vehicle,
the 2 nd execution means is configured to execute at least a post-treatment data transmission process of transmitting the post-treatment data when the detection process detects that the function recovery process is performed,
the 1 st execution means is configured to execute at least the acquisition process, the operation process, and a post-treatment data reception process for receiving data transmitted by the post-treatment data transmission process,
the 1 st execution means is configured to execute the detection processing and the post-treatment data request processing,
the post-treatment data request processing transmits a signal requesting the post-treatment data when the detection processing detects that the function recovery processing is performed.
9. A vehicle control device is provided with a 1 st execution device in a vehicle control system,
the vehicle control system includes an execution device and a storage device that stores relationship specifying data that specifies a relationship between a state of a vehicle and an action variable that is a variable related to an operation of an electronic device in the vehicle,
the execution device is configured to execute:
an acquisition process of acquiring a detection value of a sensor that detects a state of the vehicle;
an operation process of operating the electronic device based on a value of the action variable, the value of the action variable being a value determined from the detected value acquired by the acquisition process and the relationship specifying data;
a reward calculation process of giving a larger reward when the characteristic of the vehicle satisfies a criterion based on the detection value acquired by the acquisition process than when the characteristic of the vehicle does not satisfy the criterion;
an update process of updating the relationship specifying data by using, as input to a predetermined update map, a state of the vehicle based on the detection value acquired by the acquisition process, a value of the action variable used in an operation of the electronic device, and the reward corresponding to the operation;
a detection process of detecting that a function recovery process of a component that affects a state of the vehicle due to an operation based on the operation process among components in the vehicle has been performed; and
a switching process of switching the relationship specification data to be used in the operation process to post-treatment data when the function recovery process is detected by the detection process,
the update map outputs the relationship specification data updated in such a manner that expected benefits with respect to the rewards in the case where the electronic apparatus is operated in accordance with the relationship specification data are increased,
the switching process includes the following processes: using initial data, which is the relationship specification data before the update process accompanying the vehicle running is performed, as the post-treatment data,
the execution device includes the 1 st execution device and a 2 nd execution device different from an in-vehicle device which are mounted on the vehicle,
the 2 nd execution means is configured to execute at least a post-treatment data transmission process of transmitting the post-treatment data when the detection process detects that the function recovery process is performed,
the 1 st execution means is configured to execute at least the acquisition process, the operation process, and the post-treatment data reception process for receiving the data transmitted by the post-treatment data transmission process, and the update process is executed by the 1 st execution means.
10. The control device for a vehicle according to claim 9,
the 1 st execution means is configured to execute the detection processing and the post-treatment data request processing,
the post-treatment data request processing transmits a signal requesting the post-treatment data when the detection processing detects that the function recovery processing is performed.
11. A learning device for a vehicle is provided with a 2 nd execution device in a vehicle control system,
the vehicle control system includes an execution device and a storage device that stores relationship specifying data that specifies a relationship between a state of a vehicle and an action variable that is a variable related to an operation of an electronic device in the vehicle,
the execution device is configured to execute:
an acquisition process of acquiring a detection value of a sensor that detects a state of the vehicle;
an operation process of operating the electronic device based on a value of the action variable, the value of the action variable being a value determined from the detected value acquired by the acquisition process and the relationship specifying data;
a reward calculation process of giving a larger reward when the characteristic of the vehicle satisfies a criterion based on the detection value acquired by the acquisition process than when the characteristic of the vehicle does not satisfy the criterion;
an update process of updating the relationship specifying data by using, as input to a predetermined update map, a state of the vehicle based on the detection value acquired by the acquisition process, a value of the action variable used in an operation of the electronic device, and the reward corresponding to the operation;
a detection process of detecting that a function recovery process of a component that affects a state of the vehicle due to an operation based on the operation process among components in the vehicle has been performed; and
a switching process of switching the relationship specification data to be used in the operation process to post-treatment data when the function recovery process is detected by the detection process,
the update map outputs the relationship specification data updated in such a manner that expected benefits with respect to the rewards in the case where the electronic apparatus is operated in accordance with the relationship specification data are increased,
the switching process includes the following processes: using initial data, which is the relationship specification data before the update process accompanying the vehicle running is performed, as the post-treatment data,
the execution device includes a 1 st execution device mounted on the vehicle and the 2 nd execution device different from an in-vehicle device,
the 2 nd execution means is configured to execute at least a post-treatment data transmission process of transmitting the post-treatment data when the detection process detects that the function recovery process is performed,
the 1 st execution means is configured to execute at least the acquisition processing, the operation processing, and the post-treatment data reception processing for receiving the data transmitted by the post-treatment data transmission processing.
12. A control method for a vehicle, which is executed by an execution device and a storage device, the control method comprising:
storing, by the storage device, relationship specification data that specifies a relationship between a state of a vehicle and an action variable that is a variable related to an operation of an electronic device in the vehicle;
by means of the said execution means,
acquiring a detection value of a sensor that detects a state of the vehicle;
operating the electronic device based on the value of the action variable, the value of the action variable being determined from the acquired detection value and the relationship specification data;
based on the obtained detection value, when the characteristic of the vehicle satisfies a criterion, a larger reward is given than when the characteristic of the vehicle does not satisfy the criterion;
updating the relationship specifying data by using, as input to a predetermined update map, the state of the vehicle based on the acquired detection value, the value of the action variable used in the operation of the electronic device, and the reward corresponding to the operation;
detecting that a function recovery process is performed, the function recovery process being a function recovery process of a component that affects a state of the vehicle due to an operation of the electronic device, among components in the vehicle; and
when it is detected that the function recovery processing is performed, the relationship specification data to be used in the operation of the electronic device is switched to post-treatment data,
the update map outputs the relationship specification data updated in such a manner that expected benefits with respect to the rewards in the case where the electronic apparatus is operated in accordance with the relationship specification data are increased,
switching the relationship specification data includes: using initial data, which is the relationship specification data before the update of the relationship specification data accompanying the vehicle running is performed, as the post-treatment data.
13. A non-transitory computer-readable storage medium storing a program for causing an execution device and a storage device to execute a control process for a vehicle,
the control process for a vehicle includes:
storing, by the storage device, relationship specification data that specifies a relationship between a state of a vehicle and an action variable that is a variable related to an operation of an electronic device in the vehicle;
by means of the said execution means,
acquiring a detection value of a sensor that detects a state of the vehicle;
operating the electronic device based on the value of the action variable, the value of the action variable being determined from the acquired detection value and the relationship specification data;
giving, based on the acquired detection value, a larger reward when a characteristic of the vehicle satisfies a criterion than when the characteristic does not satisfy the criterion;
updating the relationship specification data by using, as input to a predetermined update map, the state of the vehicle based on the acquired detection value, the value of the action variable used in the operation of the electronic device, and the reward corresponding to the operation;
detecting that a function recovery process has been performed, the function recovery process restoring the function of a component, among the components in the vehicle, that affects the state of the vehicle as a result of the operation of the electronic device; and
switching, when it is detected that the function recovery process has been performed, the relationship specification data to be used in the operation of the electronic device to post-processing data,
wherein the update map outputs the relationship specification data updated so as to increase the expected return with respect to the reward obtained when the electronic device is operated in accordance with the relationship specification data, and
switching the relationship specification data includes: setting the post-processing data to initial data, the initial data being the relationship specification data before any update of the relationship specification data accompanying travel of the vehicle is performed.
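The control process recited in the claims above can be illustrated with a minimal sketch. It assumes a tabular Q-learning rule as the "predetermined update map" and a Q-table as the relationship specification data; the claims do not prescribe either choice. The class name `RelationshipController`, its methods, the reward values, and the parameters `alpha`, `gamma`, and `epsilon` are all hypothetical and introduced only for illustration.

```python
import copy
import random


class RelationshipController:
    """Hypothetical sketch: a Q-table stands in for the relationship specification data."""

    def __init__(self, initial_q, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        # Initial data: the table as it was before any update made while driving.
        self.initial_q = copy.deepcopy(initial_q)
        # Working copy that is updated as the vehicle travels.
        self.q = copy.deepcopy(initial_q)
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def select_action(self, state):
        """Determine the action variable from the state and the current table."""
        if self.rng.random() < self.epsilon:  # occasional exploration (assumption)
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    @staticmethod
    def reward(meets_criterion):
        """Give a larger reward when the vehicle characteristic meets the criterion."""
        return 1.0 if meets_criterion else -1.0

    def update(self, state, action, reward, next_state):
        """Assumed update map: one Q-learning step toward a higher expected return."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def on_function_recovery(self):
        """Switch to post-processing data, here set to the stored initial data."""
        self.q = copy.deepcopy(self.initial_q)
```

In this sketch, `on_function_recovery` would be called once the function recovery process is detected, so that values learned while the component was degraded are discarded and learning restarts from the initial data.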
CN202011090364.5A 2019-10-18 2020-10-13 Vehicle control device, system, method, learning device, and storage medium Active CN112682203B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-191357 2019-10-18
JP2019191357A JP6705547B1 (en) 2019-10-18 2019-10-18 Vehicle control device, vehicle control system, and vehicle learning device

Publications (2)

Publication Number Publication Date
CN112682203A (en) 2021-04-20
CN112682203B (en) 2023-08-04

Family

ID=70858155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011090364.5A Active CN112682203B (en) 2019-10-18 2020-10-13 Vehicle control device, system, method, learning device, and storage medium

Country Status (3)

Country Link
US (1) US11453375B2 (en)
JP (1) JP6705547B1 (en)
CN (1) CN112682203B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6809588B1 (en) * 2019-10-18 2021-01-06 トヨタ自動車株式会社 Vehicle control system, vehicle control device, and vehicle learning device
JP6744597B1 (en) * 2019-10-18 2020-08-19 トヨタ自動車株式会社 Vehicle control data generation method, vehicle control device, vehicle control system, and vehicle learning device
KR20210076223A (en) * 2019-12-13 2021-06-24 현대자동차주식회사 Hybrid vehicle and method of controlling the same
JP7302522B2 (en) * 2020-04-22 2023-07-04 トヨタ自動車株式会社 hybrid vehicle
JP7331789B2 (en) 2020-06-25 2023-08-23 トヨタ自動車株式会社 Vehicle control device, vehicle control system, vehicle learning device, and vehicle learning method
JP7439680B2 (en) * 2020-07-28 2024-02-28 トヨタ自動車株式会社 Shift control data generation method, shift control device, shift control system, and vehicle learning device
US11480436B2 (en) * 2020-12-02 2022-10-25 Here Global B.V. Method and apparatus for requesting a map update based on an accident and/or damaged/malfunctioning sensors to allow a vehicle to continue driving
US11341847B1 (en) 2020-12-02 2022-05-24 Here Global B.V. Method and apparatus for determining map improvements based on detected accidents
KR20220095286A (en) * 2020-12-29 2022-07-07 현대자동차주식회사 Apparatus and method for determining optimal velocity of vehicle

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11139295A (en) * 1997-11-05 1999-05-25 Mazda Motor Corp Control device for vehicle
JP2000250604A (en) * 1999-03-02 2000-09-14 Yamaha Motor Co Ltd Cooperation method of optimization for characteristic optimization method
US6549815B1 (en) * 1999-03-02 2003-04-15 Yamaha Hatsudoki Kabushiki Kaisha Method and apparatus for optimizing overall characteristics of device, using heuristic method
CN103538537A (en) * 2012-07-12 2014-01-29 雅马哈发动机株式会社 Vehicle information management system
CN110096045A (en) * 2018-01-30 2019-08-06 现代自动车株式会社 Vehicle Predictive Control System and its method based on big data
CN110126826A (en) * 2018-02-09 2019-08-16 本田技研工业株式会社 Controlling device for vehicle running
JP2019138273A (en) * 2018-02-15 2019-08-22 株式会社明電舎 Vehicle speed control device and vehicle speed control method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4975158B2 (en) * 2010-11-08 2012-07-11 本田技研工業株式会社 Plant control equipment
JP6026612B2 (en) * 2015-09-22 2016-11-16 本田技研工業株式会社 Control device for internal combustion engine for vehicle
JP6547991B1 (en) * 2019-02-20 2019-07-24 トヨタ自動車株式会社 Catalyst temperature estimation device, catalyst temperature estimation system, data analysis device, and control device for internal combustion engine
JP6590097B1 (en) * 2019-02-20 2019-10-16 トヨタ自動車株式会社 PM amount estimation device, PM amount estimation system, data analysis device, control device for internal combustion engine, and reception device

Also Published As

Publication number Publication date
US20210114580A1 (en) 2021-04-22
JP2021067201A (en) 2021-04-30
CN112682203A (en) 2021-04-20
JP6705547B1 (en) 2020-06-03
US11453375B2 (en) 2022-09-27

Similar Documents

Publication Publication Date Title
CN112682203B (en) Vehicle control device, system, method, learning device, and storage medium
CN112682182B (en) Vehicle control device, vehicle control system, and vehicle control method
CN112682197B (en) Method for generating control data for vehicle, control device for vehicle, and control system
CN112682181B (en) Vehicle control device, vehicle control system, and vehicle control method
CN112682198B (en) Vehicle control system, vehicle control device, and vehicle control method
JP2021116783A (en) Vehicular control device and vehicular control system
CN113103972B (en) Method and device for generating control data for vehicle, control device and system for vehicle, and storage medium
CN113090400B (en) Vehicle control device and control system, vehicle learning device and learning method, vehicle control method, and storage medium
CN112682184A (en) Vehicle control device, vehicle control system, and vehicle control method
JP7211375B2 (en) vehicle controller
CN113006951B (en) Method for generating vehicle control data, vehicle control device, vehicle control system, and vehicle learning device
CN113103971B (en) Method for generating vehicle control data, vehicle control device, vehicle control system, and vehicle learning device
CN113217204B (en) Vehicle control method, vehicle control device, and server
CN113266481A (en) Vehicle control method, vehicle control device, and server
JP2021067257A (en) Vehicle control device, vehicle control system, and vehicle learning device
CN112682202B (en) Control system, device, control method, and storage medium for vehicle
CN112682204B (en) Vehicle control device, vehicle control system, learning device, learning method, and storage medium
JP2021066417A (en) Vehicle control device, vehicle control system, and vehicle learning device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant