CN106842925A - Intelligent locomotive operation method and system based on deep reinforcement learning
- Publication number
- CN106842925A (application number CN201710045758.0A)
- Authority
- CN
- China
- Prior art keywords
- locomotive
- study
- learning
- train
- module
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Feedback Control In General (AREA)
Abstract
The present invention relates to an intelligent locomotive operation method and system based on deep reinforcement learning. The system comprises a data source module, a locomotive operating environment learning module, an evaluation mechanism learning module, and a control strategy learning module. The data source module supplies the input data required by the locomotive operating environment learning module and the evaluation mechanism learning module, which output the learned specific operating environment and the reward function value, respectively, to the control strategy learning module. Based on a deep reinforcement learning algorithm, the locomotive operating environment model uses the real-time evaluation of each locomotive control action as feedback: by rewarding or punishing the current control action, the evaluation mechanism feeds a reward function value back to the control strategy as a reward evaluation, and the control strategy, combined with the running state, iteratively updates and optimizes itself. The invention achieves better intelligent optimized locomotive operation and considerably reduces manual involvement.
Description
Technical field
The present invention relates to a locomotive operation method and system, in particular to an intelligent locomotive operation method and system based on deep reinforcement learning, and belongs to the field of locomotive control.
Background technology
Automatic driving and optimized operation of railway locomotives play an important role in freeing manpower, reducing energy consumption, and improving locomotive punctuality and safety. Because the train operating environment is complex and subject to many influencing factors, scholars have studied locomotive operation optimization algorithms extensively. These algorithms fall roughly into three classes: analytical methods, numerical optimization methods, and heuristic optimization algorithms. Analytical methods are generally applied in two variants: one for locomotives whose traction and braking force inputs are discrete, the other for locomotives whose traction and braking force inputs are continuous. However, the constraints in analytical methods are too simple to fit actual locomotive operation well; numerical optimization has poor real-time performance and is hard to use for real-time optimal control of a locomotive; and heuristic optimization algorithms depend too heavily on human input. Current real-time control algorithms for locomotive operation are generally designed under specific assumptions, which makes them hard to apply to the complex operating conditions of a locomotive and therefore hard to rely on for operational safety.
In recent years, locomotive optimization control based on machine learning and artificial intelligence has also become a research hotspot. Luo Hengyu and Xu Hongze proposed an integrated intelligent control system for the automatic operation control of express locomotives. The system contains multiple fuzzy neural network controllers, and an expert decision system automatically selects the best controller according to the locomotive's running state to achieve effective control. Heqing Sun et al. proposed an iterative learning algorithm for tracking the locomotive's running trajectory; the algorithm is based on a locomotive dynamics model combined with an error feedback mechanism, and they proved its convergence by theoretical analysis. Lixing Yang et al., addressing real-time locomotive control under uncertain disturbances, proposed two RTO algorithms and one online learning algorithm based on expert learning; the algorithms account for uncertain disturbances and satisfy multi-objective requirements. Jia TengYin et al. extended existing ATO algorithms with a heuristic locomotive stopping algorithm (HSA) based on data mining and expert learning, forming an optimized STO algorithm. These studies draw to some extent on human driving experience and use expert systems supplemented by machine learning to optimize locomotive operation, but they still require excessive manual involvement and cannot guarantee the optimization result.
The development of deep reinforcement learning (Deep Reinforcement Learning) has also caused a sensation in the machine learning field. Research teams led by DeepMind first proposed a deep reinforcement learning method based on DQN (Deep Q-Network) and tested it on a subset of Atari 2600 games, where the results could exceed those of human players. This breakthrough in machine learning was subsequently published in Nature and attracted wide attention across the field. The theoretical development can be traced back to the work of Lange in 2010, who proposed a deep auto-encoder for vision-based control. In 2011, Cuccu et al. and Abtahi et al. both carried out research in related areas; Abtahi proposed replacing the function approximator in traditional reinforcement learning with a DBN, which is very close to the idea of deep reinforcement learning. In 2012, Lange went on to applications and proposed Deep Fitted Q learning for vehicle control. In 2013, the DeepMind team published their paper at NIPS, combining convolutional neural networks with reinforcement learning: raw image data is the input, the value function of each action is the output, and Atari 2600 games serve as the test. The method was found to exceed human level in 6 of the 7 games tested. The DeepMind team later published an improved DQN paper in Nature, which drew widespread attention. Experiments show that the method is well suited to optimal control processes such as games and locomotive control, offering new ideas and opportunities for the optimized operation of railway locomotives.
Summary of the invention
Building on the important breakthrough of deep reinforcement learning in the machine learning field, the present invention achieves optimized operation of railway locomotives entirely through machine learning and artificial intelligence. To that end, the invention focuses on a deep reinforcement learning algorithm for optimized locomotive operation. The locomotive operating environment and the evaluation mechanism for real-time locomotive operation that the deep reinforcement learning process requires are themselves learned with machine learning methods, and the design takes into account uncertainty in the environment as well as non-standard operations that affect operational safety.
An intelligent locomotive operation system based on deep reinforcement learning, characterized in that the system comprises a data source module, a locomotive operating environment learning module, an evaluation mechanism learning module, and a control strategy learning module;
The data source module preprocesses the acquired data sources, which include locomotive operation logs, train route data, train operation energy consumption information, and train timetable information; the preprocessing delivers the locomotive operation logs and the train route data to the locomotive operating environment learning module, and delivers the train operation energy consumption information and the train timetable information to the evaluation mechanism learning module;
The locomotive operating environment learning module builds a locomotive operating environment model; locomotive operating environment learning comprises learning the basic parameter part and the disturbance parameter part of the train operating parameters, and the learning results constitute the specific operating environment of the locomotive, which the locomotive operating environment learning module delivers to the control strategy learning module;
The evaluation mechanism learning module combines the information obtained from the data source module with the evaluation mechanism to obtain the reward function required during locomotive operation; the evaluation mechanism learning module delivers the reward function, as the feedback data of the evaluation mechanism, to the control strategy learning module;
The control strategy learning module obtains the specific operating environment of the locomotive and the reward function from the locomotive operating environment learning module and the evaluation mechanism learning module, respectively, and trains an optimized train operation strategy using a deep reinforcement learning method: it interacts and learns continuously with the locomotive operating environment model, uses the reward function fed back by the evaluation mechanism learning module to guide the subsequent operation sequence of the train, and, through a policy update mechanism, obtains the final actual operation strategy of the locomotive.
Further, the evaluation mechanism comprises learning of the train operation score and design of the non-standard operation penalty scoring mechanism.
Further, the control strategy learning module performs deep reinforcement learning based on a DQN model, and the DQN model interacts and learns continuously with the locomotive operating environment model.
The present invention further includes an intelligent locomotive operation method based on deep reinforcement learning, characterized in that the method is carried out through the following steps:
S1: Preprocess the data sources;
Extract from the data sources the feature data for learning the locomotive operating environment model, namely the locomotive operation logs and the train route data, which form the sample data for the supervised learning of the locomotive operating environment. Extract from the data sources the train operation energy consumption information and the train timetable information as the parameters for evaluation mechanism learning;
S2: Learn and build the locomotive operating environment;
Train and build the locomotive operating environment model from the locomotive's operating environment information, using supervised learning and a dynamic time-series graph algorithm based on historical operating data; the locomotive operating environment model obtains the specific operating environment of the locomotive through learning, and this specific operating environment is used for control strategy learning;
S3: Learn the evaluation mechanism;
Combine the information obtained from the data sources with the evaluation mechanism and, for a given route and locomotive state, observe the targets over a short interval to obtain the reward function of locomotive operation; the reward function serves as the evaluation value of locomotive operation and is used for control strategy learning;
S4: Learn the control strategy;
Apply a deep reinforcement learning method to learn a control strategy for the specific operating environment of the locomotive, and use the obtained reward function to update and optimize the strategy with respect to the running state, thereby obtaining the optimized operation control strategy of the locomotive.
Further, the intelligent locomotive operation method also includes a policy update mechanism: the optimized control strategy can be updated in real time through this mechanism, that is, starting from the current control strategy as guidance, a better control strategy is learned adaptively in real time, achieving step-by-step optimization of the locomotive control strategy.
Further, in step S2, the operating environment information of the locomotive includes the train's own state information, formed from the locomotive operation logs and the train route data, and external environment parameter information. Most of these parameters fluctuate within a certain range, and their fluctuation can be observed and predicted from historical data; a small portion of the parameters are uncertain in real scenarios and may fluctuate unpredictably.
Further, the locomotive operating environment model learns the basic train operation model parameters with a supervised learning algorithm based on a mechanistic model, which covers general scenarios, and learns the train operating environment disturbance parameters with a dynamic graph model.
Further, the supervised learning algorithm is a decision tree algorithm or a neural network algorithm.
Further, in step S3, the evaluation mechanism includes a train operation score and a non-standard operation penalty score; the train operation score is formulated from historical operation logs, and the non-standard operation penalty scoring mechanism is formulated from non-standard operations.
Further, in step S4, control strategy learning is performed by a DQN model. Based on the deep reinforcement learning algorithm, the locomotive operating environment model uses the real-time evaluation of each locomotive control action as feedback; the evaluation mechanism rewards or punishes the current control action and feeds a reward evaluation value back to the DQN model, and the DQN model, combined with the running state, iteratively updates and optimizes the strategy.
The beneficial effects of the invention are:
(1) Optimized operation of railway locomotives is achieved through autonomous machine learning. The invention is based on a deep reinforcement learning algorithm; the locomotive operating environment and the reward function are both obtained through autonomous machine learning, and manual involvement is avoided as far as possible throughout algorithm design and implementation.
(2) The locomotive operating environment and the reward function for locomotive operation are trained and built with machine learning techniques, taking into account both the uncertainty of the environment model and the safety of locomotive operation. For the operating environment, the invention trains and builds the model using supervised learning and a dynamic time-series graph algorithm based on historical operating data; the dynamic time-series graph algorithm is applied, as an innovation, to learning the trends of environment parameters in order to build the locomotive operating environment model. For the reward function, the invention considers the safety of locomotive operation and obtains reward function values from two aspects, normal operation and non-standard operation; based on historical train operation information, it trains the evaluation mechanism for locomotive operation using supervised learning.
(3) A deep reinforcement learning algorithm oriented toward optimized locomotive operation, with a real-time policy update mechanism. The invention designs an optimization scheme suited to this problem on the basis of a deep reinforcement learning algorithm (the DQN model); in a specific implementation, the scheme can be combined with deep learning training to derive a real-time policy update mechanism.
Therefore, the invention achieves better intelligent optimized locomotive operation and considerably reduces manual involvement.
Brief description of the drawings
Fig. 1 is a structural diagram of the intelligent locomotive operation system based on deep reinforcement learning according to the invention;
Fig. 2 is a flowchart of the technical route of the intelligent locomotive operation method based on deep reinforcement learning according to the invention;
Fig. 3 is a flowchart of the basic deep reinforcement learning model in the invention;
Fig. 4 is an architecture diagram of the DQN model in the invention.
Detailed description of the embodiments
The technical solution is described in detail below with reference to the drawings and embodiments.
This embodiment provides an intelligent locomotive operation system based on deep reinforcement learning. As shown in Fig. 1, the system comprises four modules: a data source module, a locomotive operating environment learning module, an evaluation mechanism learning module, and a control strategy learning module.
The data source module preprocesses the acquired data sources, which include locomotive operation logs, train route data, train operation energy consumption information, and train timetable information. Preprocessing extracts the locomotive operation logs and the train route data from the data sources as the feature data of the locomotive operating environment and delivers them to the locomotive operating environment learning module, where they form the sample data for operating environment learning; it delivers the train operation energy consumption information and the train timetable information to the evaluation mechanism learning module, which uses them to evaluate locomotive operation in real time.
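For illustration only, the routing performed by the data source module might be sketched as follows. The dictionary keys and the function name `route_data_sources` are assumptions introduced here; the disclosure names the four data sources but does not specify their format or any programming interface.

```python
from typing import Any, Dict, List

# Hypothetical keys: the disclosure names the four data sources but not their format.
ENVIRONMENT_SOURCES = ("operation_log", "route_data")
EVALUATION_SOURCES = ("energy_consumption", "timetable")


def route_data_sources(raw: Dict[str, List[Any]]) -> Dict[str, Dict[str, List[Any]]]:
    """Split the raw data sources into the inputs of the two learning modules."""
    required = ENVIRONMENT_SOURCES + EVALUATION_SOURCES
    missing = [name for name in required if name not in raw]
    if missing:
        raise KeyError(f"missing data sources: {missing}")
    return {
        # Operation logs and route data feed operating environment learning.
        "environment_learning": {name: raw[name] for name in ENVIRONMENT_SOURCES},
        # Energy consumption and timetable data feed evaluation mechanism learning.
        "evaluation_learning": {name: raw[name] for name in EVALUATION_SOURCES},
    }
```

Failing early on a missing source is a design choice of this sketch, so that neither learning module is trained on an incomplete input set.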
The locomotive operating environment learning module builds the locomotive operating environment model. Operating environment learning covers two groups of parameters, the basic parameter part and the disturbance parameter part of the train operating parameters, and the learning results constitute the specific operating environment of the locomotive. Typically, a classical supervised learning algorithm learns the basic parameters and a dynamic time-series graph algorithm learns the disturbance parameters. The locomotive operating environment learning module delivers the resulting specific operating environment to the control strategy learning module.
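A minimal sketch of how such a two-part environment model could be organised is given below. It is not the disclosed model: the linear resistance form, the parameter names, and the use of zero-mean Gaussian noise as a stand-in for the learned disturbance part (which the disclosure obtains with a dynamic time-series graph algorithm) are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseParams:
    """Basic parameter part (hypothetical fields, assumed learned from logs)."""
    mass_t: float       # train mass in tonnes
    resist_a_kn: float  # constant running resistance in kN
    resist_b: float     # speed-proportional resistance in kN per (m/s)


class EnvironmentModel:
    """Base dynamics plus a disturbance term; Gaussian noise is a placeholder."""

    def __init__(self, base: BaseParams, disturbance_std_kn: float = 0.0,
                 seed: Optional[int] = None) -> None:
        self.base = base
        self.disturbance_std_kn = disturbance_std_kn
        self._rng = random.Random(seed)

    def step(self, speed_ms: float, force_kn: float, dt_s: float = 1.0) -> float:
        """Return the speed after dt_s seconds under the applied force."""
        resistance = self.base.resist_a_kn + self.base.resist_b * speed_ms
        disturbance = 0.0
        if self.disturbance_std_kn > 0.0:
            disturbance = self._rng.gauss(0.0, self.disturbance_std_kn)
        net_kn = force_kn - resistance + disturbance
        accel = net_kn / self.base.mass_t  # kN per tonne equals m/s^2
        return max(0.0, speed_ms + accel * dt_s)  # speed is clamped at zero
```

Keeping the disturbance term separate from the base dynamics mirrors the split described above, so either part could be replaced by a learned component without touching the other.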
The evaluation mechanism learning module combines the information obtained from the data source module with the evaluation mechanism to obtain the reward function required during locomotive operation. The evaluation mechanism comprises learning of the train operation score and design of the non-standard operation penalty scoring mechanism. The evaluation mechanism learning module delivers the reward function, as its feedback data, to the control strategy learning module.
The control strategy learning module obtains the specific operating environment and the reward function from the locomotive operating environment learning module and the evaluation mechanism learning module, and performs deep reinforcement learning based on a DQN model, that is, it trains an optimized train operation strategy using a deep reinforcement learning method. Specifically, the DQN model interacts and learns continuously with the locomotive operating environment model (see Fig. 3); the reward function fed back by the evaluation mechanism learning module guides the subsequent operation sequence of the train, and the policy update mechanism yields the final actual operation strategy of the locomotive.
The intelligent locomotive operation system above achieves intelligent locomotive operation based on deep reinforcement learning. As shown in Fig. 2, the method is as follows:
Step 1, preprocess the data sources
Extract from the data sources the feature data for learning the locomotive operating environment model, namely the locomotive operation logs and the train route data, which form the sample data for the supervised learning of the locomotive operating environment. Extract from the data sources the train operation energy consumption information and the train timetable information as the parameters for evaluation mechanism learning.
Step 2, learn and build the locomotive operating environment
The operating environment information of a locomotive generally includes not only the train's own state information, formed from the locomotive operation logs and the train route data, but also external environment parameter information. Most of these parameters fluctuate within a certain range, and their fluctuation can be observed and predicted from historical data; a small portion of the parameters are uncertain in real scenarios and may fluctuate unpredictably. The invention uses supervised learning and a dynamic time-series graph algorithm based on historical operating data to train and build a locomotive operating environment model that accounts for this uncertainty. Specifically, a supervised learning algorithm (a classical algorithm such as a decision tree or a neural network), based on a mechanistic model, learns the basic train operation model parameters to cover general scenarios, and a dynamic graph model learns the train operating environment disturbance parameters.
The locomotive operating environment model obtains the specific operating environment of the locomotive through learning, and this specific operating environment is used for control strategy learning.
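To make the idea of learning basic model parameters from historical data concrete, the sketch below fits two coefficients of an assumed linear resistance model by ordinary least squares. This is a deliberately simpler technique than the decision tree or neural network named above, and the model form, function name, and input format are hypothetical.

```python
from typing import Sequence, Tuple


def fit_linear_resistance(speeds: Sequence[float],
                          resistances: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of resistance = a + b * speed from logged samples.

    Ordinary least squares stands in for the supervised learner here;
    the linear model form is an assumption, not part of the disclosure.
    """
    n = len(speeds)
    if n < 2 or n != len(resistances):
        raise ValueError("need at least two paired samples")
    mean_v = sum(speeds) / n
    mean_r = sum(resistances) / n
    var_v = sum((v - mean_v) ** 2 for v in speeds)
    if var_v == 0.0:
        raise ValueError("speeds must not all be equal")
    cov = sum((v - mean_v) * (r - mean_r) for v, r in zip(speeds, resistances))
    b = cov / var_v          # slope: speed-proportional resistance
    a = mean_r - b * mean_v  # intercept: constant resistance
    return a, b
```

A fitted pair `(a, b)` would play the role of the basic parameter part; the disturbance part would still need its own model.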
Step 3, learn the evaluation mechanism
Evaluation mechanism learning combines the information obtained from the data sources with the evaluation mechanism to obtain the reward function of locomotive operation. The reward function value serves as the evaluation value of locomotive operation and is used for control strategy learning; it is the basis for policy selection in the reinforcement learning algorithm on which the invention builds. In common application scenarios (such as game playing or robot control), the reward function value is definite and objective; in game playing, for example, the evaluation value follows directly from the rules of the game. In the invention, by contrast, the reward function, as an evaluation of locomotive operation, cannot be determined directly from rules: the information obtained from the data sources must be combined with the evaluation mechanism, and the targets observed over a short interval for a given route and locomotive state, to determine its value. The invention formulates an operation evaluation mechanism aimed at the optimization goals of locomotive driving. The evaluation mechanism includes a train operation score formulated from historical operation logs and a non-standard operation penalty scoring mechanism formulated by analyzing non-standard operations. In particular, because the system demands high safety, the non-standard operation penalty scoring mechanism assigns the maximum penalty value to non-standard operations that may cause serious consequences (such as the risk of stopping on a grade, or overspeeding), so that such non-standard locomotive control actions are avoided and the safety of the generated strategy is effectively ensured.
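The two-part structure of the evaluation mechanism can be sketched as below. The hand-written weighted sum is only a placeholder for the operation score, which the disclosure learns from historical logs; the field names, weights, and the value of `MAX_PENALTY` are assumptions.

```python
from dataclasses import dataclass

MAX_PENALTY = -100.0  # hypothetical value of the maximum penalty


@dataclass
class StepObservation:
    """Short-interval observation for one control action (hypothetical fields)."""
    speed_ms: float
    speed_limit_ms: float
    energy_kwh: float
    delay_s: float
    stopped_on_grade: bool = False


def reward(obs: StepObservation, energy_weight: float = 1.0,
           delay_weight: float = 0.1) -> float:
    """Return the reward evaluation value for one control action."""
    # Non-standard operation: overspeeding or stopping on a grade
    # receives the maximum penalty regardless of the operation score.
    if obs.speed_ms > obs.speed_limit_ms or obs.stopped_on_grade:
        return MAX_PENALTY
    # Normal operation: lower energy use and smaller timetable deviation score higher.
    return -(energy_weight * obs.energy_kwh + delay_weight * abs(obs.delay_s))
```

Checking the safety conditions first means that no combination of energy and punctuality terms can outweigh a dangerous action, which is the property the penalty scoring mechanism is meant to secure.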
Step 4, learn the control strategy
The invention applies a deep reinforcement learning method to learn a control strategy for the specific operating environment of the locomotive, and uses the obtained reward function to update and optimize the strategy with respect to the running state, thereby obtaining the optimized operation control strategy of the locomotive. Deep reinforcement learning has a significant advantage in generating optimized operation strategies for complex systems. A reinforcement learning algorithm relies on very little external information: by training iteratively in the environment and learning on its own, it obtains an optimized operation control strategy. Deep learning algorithms have a significant advantage in processing complex multi-dimensional data. Deep reinforcement learning, which combines the two, can therefore solve the problem of generating optimized operation strategies for complex systems. As shown in Fig. 3, in any state and based on the deep reinforcement learning algorithm, the locomotive operating environment model uses the real-time evaluation of each locomotive control action as feedback; the evaluation mechanism rewards or punishes the current control action and feeds a reward function value back to the DQN model as the reward evaluation value, and the DQN model, combined with the running state, iteratively updates and optimizes the strategy.
The invention designs its deep reinforcement learning method on the basis of the DQN model. Specifically, the DQN model interacts and learns continuously with the locomotive operating environment model; the improvement made in the invention lies in using an uncertain locomotive operating environment and an evaluation mechanism. Each time the locomotive performs an operation (action) in any state, the evaluation mechanism feeds back a reward evaluation value that guides the subsequent operation sequence of the train, that is, it continually drives the DQN model to update and optimize the strategy in order to solve the locomotive operation optimization problem. After many iterations, the model eventually converges and yields an optimized train control strategy.
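The interaction loop described above can be illustrated with a tabular Q-learning sketch. A lookup table replaces the deep Q network purely to keep the example short, and the callables `env_step`, `reward_fn`, and `policy` are hypothetical stand-ins for the environment model, the evaluation mechanism, and the action selection rule.

```python
from typing import Callable, Dict, Sequence, Tuple

QTable = Dict[Tuple[int, int], float]


def q_update(q: QTable, state: int, action: int, reward: float, next_state: int,
             actions: Sequence[int], alpha: float = 0.5, gamma: float = 0.9) -> float:
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def run_episode(env_step: Callable[[int, int], Tuple[int, bool]],
                reward_fn: Callable[[int, int, int], float],
                policy: Callable[[QTable, int], int],
                q: QTable, start_state: int, actions: Sequence[int],
                max_steps: int = 100) -> float:
    """Act, receive a reward evaluation value, update, repeat; return total reward."""
    state, total = start_state, 0.0
    for _ in range(max_steps):
        action = policy(q, state)                       # choose an operation
        next_state, done = env_step(state, action)      # environment model responds
        reward = reward_fn(state, action, next_state)   # evaluation mechanism feedback
        q_update(q, state, action, reward, next_state, actions)
        total += reward
        state = next_state
        if done:
            break
    return total
```

Repeating `run_episode` over many episodes is what drives the value estimates, and with them the strategy, toward convergence.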
The detailed architecture of the DQN model is shown in Fig. 4, where the interactive environment is the uncertain train operating environment. In the specific implementation, the reinforcement learning algorithm is an optimized Q-learning algorithm. The optimization incorporates the idea of Experience Replay into Q-learning: a replay memory is set up during iteration, each learned experience is stored in it, and at the next training step an experience is selected at random for training. Compared with ordinary reinforcement learning, this idea has three main advantages: (1) it effectively breaks the correlation between state data and reduces the uncertainty of data updates; (2) it effectively avoids the harsh situation of converging to a local optimum; (3) it addresses the problem that the target of the reinforcement learning algorithm is not fixed. The model combines a deep learning algorithm (such as a deep neural network) with the optimized Q-learning algorithm to obtain approximate element values of the Q matrix (the Q value is the cumulative evaluation function of train operation described in Fig. 2); the Q network in Fig. 4 is the model of the Q matrix built from a deep neural network. In the specific algorithm, the target Q network parameters are updated from the Q network model once every N iterations, the DQN error of the DQN model is then updated, and finally a gradient descent algorithm drives the continued optimization of the Q network model. Applying deep learning effectively solves the problem of a large system state space. Finally, the DQN model selects locomotive operations (actions) with the conventional ε-greedy strategy: with a small probability the strategy selects an operation at random, and with a larger probability it selects the currently optimal operation, iteratively generating the optimized locomotive operation strategy.
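The three mechanisms named in this paragraph (experience replay, periodic target network updates, and ε-greedy selection) can each be sketched in a few lines. The class and function names are hypothetical, network parameters are represented by a plain dictionary rather than a real neural network, and sampling a batch instead of a single experience is a choice of this sketch.

```python
import random
from collections import deque
from typing import Any, Deque, Dict, List, Sequence, Tuple

Transition = Tuple[Any, int, float, Any, bool]


class ReplayBuffer:
    """Fixed-size replay memory; the oldest experiences are dropped first."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._items: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state: Any, action: int, reward: float,
             next_state: Any, done: bool) -> None:
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw experiences uniformly at random, without replacement."""
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Random action with probability epsilon, otherwise the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def sync_target_if_due(step: int, sync_every: int,
                       online_params: Dict[str, float],
                       target_params: Dict[str, float]) -> bool:
    """Copy the online parameters into the target every sync_every steps."""
    if step % sync_every == 0:
        target_params.clear()
        target_params.update(online_params)
        return True
    return False
```

Holding the target parameters fixed between synchronisations is what gives the learning target the stability mentioned in advantage (3) above.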
In addition, the intelligent locomotive operation method includes a policy update mechanism: the optimized control strategy can be updated in real time through this mechanism, that is, starting from the current control strategy as guidance, a better control strategy is learned adaptively in real time, achieving step-by-step optimization of the locomotive control strategy.
Although the principles of the invention have been described in detail above with reference to its preferred embodiments, those skilled in the art should understand that the above embodiments merely illustrate exemplary implementations of the invention and do not limit its scope. The details in the embodiments do not limit the scope of the invention; any obvious change based on the technical solution of the invention, such as an equivalent transformation or a simple substitution, made without departing from the spirit and scope of the invention, falls within the scope of protection of the invention.
Claims (10)
1. An intelligent locomotive operation system based on deep reinforcement learning, characterized in that the system comprises a data source module, a locomotive operating environment learning module, an evaluation mechanism learning module, and a control strategy learning module;
The data source module preprocesses the acquired data sources, which include locomotive operation logs, train route data, train operation energy consumption information, and train timetable information; the preprocessing delivers the locomotive operation logs and the train route data to the locomotive operating environment learning module, and delivers the train operation energy consumption information and the train timetable information to the evaluation mechanism learning module;
The locomotive operating environment learning module builds a locomotive operating environment model; locomotive operating environment learning comprises learning the basic parameter part and the disturbance parameter part of the train operating parameters, and the learning results constitute the specific operating environment of the locomotive, which the locomotive operating environment learning module delivers to the control strategy learning module;
The evaluation mechanism learning module combines the information obtained from the data source module with the evaluation mechanism to obtain the reward function required during locomotive operation; the evaluation mechanism learning module delivers the reward function, as the feedback data of the evaluation mechanism, to the control strategy learning module;
The control strategy learning module obtains the specific operating environment of the locomotive and the reward function from the locomotive operating environment learning module and the evaluation mechanism learning module, respectively, and trains an optimized train operation strategy using a deep reinforcement learning method: it interacts and learns continuously with the locomotive operating environment model, uses the reward function fed back by the evaluation mechanism learning module to guide the subsequent operation sequence of the train, and, through a policy update mechanism, obtains the final actual operation strategy of the locomotive.
2. The locomotive intelligent handling system based on deep reinforcement learning according to claim 1, characterized in that the evaluation mechanism comprises learning of train operation scoring and design of a non-standard operation penalty scoring mechanism.
3. The locomotive intelligent handling system based on deep reinforcement learning according to claim 1, characterized in that the control strategy learning module performs deep reinforcement learning based on a DQN model, and the DQN model interacts and learns continuously with the locomotive operating environment model.
4. A locomotive intelligent handling method based on deep reinforcement learning, characterized in that the method is implemented by the following steps:
S1: Data source preprocessing;
The feature data for learning the locomotive operating environment model, namely the locomotive operation logs and the train operating route data, are extracted from the data source to form the sample data for the supervised learning algorithm of the locomotive operating environment. The train operation energy consumption information and the train timetable information are extracted from the data source as the parameters for evaluation mechanism learning;
S2: Learning and construction of the locomotive operating environment;
Based on historical operating data, the locomotive operating environment model is trained and built from the operating environment information of the locomotive using supervised learning and a dynamic time-series graph algorithm; through learning, the model obtains the specific operating environment of the locomotive, which is then used for control strategy learning;
S3: Evaluation mechanism learning;
The information obtained from the data source is combined with the evaluation mechanism, and target observation within a short section is carried out for the given running route and locomotive state information to obtain the reward function of locomotive operation; the reward function serves as the evaluation value of locomotive handling and is used for control strategy learning;
S4: Control strategy learning;
A deep reinforcement learning method is used to learn the control strategy for the specific operating environment of the locomotive, and the strategy is updated and optimized according to the obtained reward function and the operating state, thereby obtaining the optimized handling control strategy of the locomotive.
5. The locomotive intelligent handling method based on deep reinforcement learning according to claim 4, characterized in that the method further comprises a policy update mechanism; the optimized control strategy can be updated in real time through the policy update mechanism, that is, starting from the current control strategy, a better control strategy is learned adaptively in real time, so that the locomotive control strategy is optimized step by step.
6. The locomotive intelligent handling method based on deep reinforcement learning according to claim 4, characterized in that, in step S2, the operating environment information of the locomotive comprises the train's own state information and external environment parameter information, both formed from the locomotive operation logs and the train operating route data; most of the parameters fluctuate within a certain range, and these fluctuations can be observed and predicted from historical data, while a small portion of the parameters is uncertain in actual scenarios and may fluctuate unpredictably.
7. The locomotive intelligent handling method based on deep reinforcement learning according to claim 6, characterized in that the locomotive operating environment model learns the parameters of the basic train operation model with a supervised learning algorithm based on a mechanism model, so as to cover general scenarios, and learns the disturbance parameters of the train operating environment with a dynamic graph model.
8. The locomotive intelligent handling method based on deep reinforcement learning according to claim 7, characterized in that the supervised learning algorithm is a decision tree algorithm or a neural network algorithm.
9. The locomotive intelligent handling method based on deep reinforcement learning according to claim 4, characterized in that, in step S3, the evaluation mechanism comprises a train operation scoring mechanism and a non-standard operation penalty scoring mechanism; the train operation scoring mechanism is formulated from historical logs, and the non-standard operation penalty scoring mechanism is formulated from non-standard operations.
10. The locomotive intelligent handling method based on deep reinforcement learning according to claim 4, characterized in that, in step S4, control strategy learning is performed by a DQN model; based on the deep reinforcement learning algorithm, the locomotive operating environment model takes the real-time evaluation of locomotive handling actions as feedback information, the evaluation mechanism rewards or penalizes the current handling action and feeds a reward evaluation value back to the DQN model, and the DQN model iteratively updates and optimizes the strategy in combination with the operating state.
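The evaluation mechanism in the claims above combines an operation score with a penalty for non-standard operations and returns a single reward value to the learner. The sketch below shows one possible shape of such a reward function. The patent does not disclose a formula, so the fields of `StepOutcome`, the weights `w_energy` and `w_time`, the `penalty` size and the reference energy are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Observed result of handling the locomotive over one short section."""
    energy_kwh: float   # energy consumed in the section (assumed unit)
    delay_s: float      # deviation from the timetable in seconds
    violations: int     # number of non-standard operations observed


def reward(outcome, ref_energy_kwh, w_energy=1.0, w_time=0.01, penalty=5.0):
    """Return a scalar reward: operation score minus non-standard penalty."""
    if ref_energy_kwh <= 0:
        raise ValueError("ref_energy_kwh must be positive")
    # Score rises when less energy is used than the historical reference.
    energy_term = w_energy * (ref_energy_kwh - outcome.energy_kwh) / ref_energy_kwh
    # Score falls with any deviation from the timetable, early or late.
    time_term = w_time * abs(outcome.delay_s)
    score = energy_term - time_term
    # Each non-standard operation subtracts a fixed penalty.
    return score - penalty * outcome.violations
```

In this sketch a section driven on time, without violations and with 10% less energy than the reference yields a small positive reward, while a single non-standard operation outweighs that gain, which mirrors the reward-or-penalize feedback described in claim 10.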
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710045758.0A CN106842925B (en) | 2017-01-20 | 2017-01-20 | A kind of locomotive smart steering method and system based on deeply study |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710045758.0A CN106842925B (en) | 2017-01-20 | 2017-01-20 | A kind of locomotive smart steering method and system based on deeply study |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106842925A true CN106842925A (en) | 2017-06-13 |
CN106842925B CN106842925B (en) | 2019-10-11 |
Family
ID=59119196
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710045758.0A Active CN106842925B (en) | 2017-01-20 | 2017-01-20 | A kind of locomotive smart steering method and system based on deeply study |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106842925B (en) |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107194612A (en) * | 2017-06-20 | 2017-09-22 | 清华大学 | A kind of train operation dispatching method learnt based on deeply and system |
CN107239628A (en) * | 2017-06-15 | 2017-10-10 | 清华大学 | A kind of uncertain locomotive simulation model system construction method based on dynamic time sequence figure |
CN107315572A (en) * | 2017-07-19 | 2017-11-03 | 北京上格云技术有限公司 | Build control method, storage medium and the terminal device of Mechatronic Systems |
CN107315573A (en) * | 2017-07-19 | 2017-11-03 | 北京上格云技术有限公司 | Build control method, storage medium and the terminal device of Mechatronic Systems |
CN107367929A (en) * | 2017-07-19 | 2017-11-21 | 北京上格云技术有限公司 | Update method, storage medium and the terminal device of Q value matrixs |
CN107450593A (en) * | 2017-08-30 | 2017-12-08 | 清华大学 | A kind of unmanned plane autonomous navigation method and system |
CN107544516A (en) * | 2017-10-11 | 2018-01-05 | 苏州大学 | Automated driving system and method based on relative entropy depth against intensified learning |
CN107563426A (en) * | 2017-08-25 | 2018-01-09 | 清华大学 | A kind of learning method of locomotive operation temporal aspect |
CN107832836A (en) * | 2017-11-27 | 2018-03-23 | 清华大学 | Model-free depth enhancing study heuristic approach and device |
CN108161934A (en) * | 2017-12-25 | 2018-06-15 | 清华大学 | A kind of method for learning to realize robot multi peg-in-hole using deeply |
CN108333959A (en) * | 2018-03-09 | 2018-07-27 | 清华大学 | A kind of energy saving method of operating of locomotive based on convolutional neural networks model |
CN108549237A (en) * | 2018-05-16 | 2018-09-18 | 华南理工大学 | Preview based on depth enhancing study controls humanoid robot gait's planing method |
CN108820157A (en) * | 2018-04-25 | 2018-11-16 | 武汉理工大学 | A kind of Ship Intelligent Collision Avoidance method based on intensified learning |
CN108984275A (en) * | 2018-08-27 | 2018-12-11 | 洛阳中科龙网创新科技有限公司 | The agricultural driver training method of Intelligent unattended based on Unity3D and depth enhancing study |
CN109204390A (en) * | 2018-09-29 | 2019-01-15 | 交控科技股份有限公司 | A kind of Train control method based on deep learning |
CN109225640A (en) * | 2018-10-15 | 2019-01-18 | 厦门邑通软件科技有限公司 | A kind of wisdom electric precipitation power-economizing method |
CN109243021A (en) * | 2018-08-28 | 2019-01-18 | 余利 | Deeply learning type intelligent door lock system and device based on user experience analysis |
CN109472984A (en) * | 2018-12-27 | 2019-03-15 | 苏州科技大学 | Signalized control method, system and storage medium based on deeply study |
CN109740839A (en) * | 2018-11-23 | 2019-05-10 | 北京交通大学 | Train Dynamic method of adjustment and system under a kind of emergency event |
CN109782600A (en) * | 2019-01-25 | 2019-05-21 | 东华大学 | A method of autonomous mobile robot navigation system is established by virtual environment |
CN109835375A (en) * | 2019-01-29 | 2019-06-04 | 中国铁道科学研究院集团有限公司通信信号研究所 | High Speed Railway Trains automated driving system based on artificial intelligence technology |
CN109919319A (en) * | 2018-12-31 | 2019-06-21 | 中国科学院软件研究所 | Deeply learning method and equipment based on multiple history best Q networks |
CN109919243A (en) * | 2019-03-15 | 2019-06-21 | 天津拾起卖科技有限公司 | A kind of scrap iron and steel type automatic identifying method and device based on CNN |
CN109977998A (en) * | 2019-02-14 | 2019-07-05 | 网易(杭州)网络有限公司 | Information processing method and device, storage medium and electronic device |
CN110147891A (en) * | 2019-05-23 | 2019-08-20 | 北京地平线机器人技术研发有限公司 | Method, apparatus and electronic equipment applied to intensified learning training process |
CN110194041A (en) * | 2019-05-19 | 2019-09-03 | 瑞立集团瑞安汽车零部件有限公司 | The adaptive bodywork height adjusting method of Multi-source Information Fusion |
EP3557489A1 (en) * | 2018-04-19 | 2019-10-23 | Siemens Mobility GmbH | Energy optimisation in operation of a rail vehicle |
CN110390398A (en) * | 2018-04-13 | 2019-10-29 | 北京智行者科技有限公司 | On-line study method |
CN110687802A (en) * | 2018-07-06 | 2020-01-14 | 珠海格力电器股份有限公司 | Intelligent household electrical appliance control method and intelligent household electrical appliance control device |
WO2020098226A1 (en) * | 2018-11-16 | 2020-05-22 | Huawei Technologies Co., Ltd. | System and methods of efficient, continuous, and safe learning using first principles and constraints |
CN111324099A (en) * | 2018-12-12 | 2020-06-23 | 上汽通用汽车有限公司 | Machine learning-based calibration method and machine learning-based calibration system |
CN111381511A (en) * | 2018-12-27 | 2020-07-07 | 松下知识产权经营株式会社 | Jet lag reduction system and jet lag reduction method |
CN111542836A (en) * | 2017-10-04 | 2020-08-14 | 华为技术有限公司 | Method for selecting action for object by using neural network |
CN111581178A (en) * | 2020-05-12 | 2020-08-25 | 国网安徽省电力有限公司信息通信分公司 | Ceph system performance tuning strategy and system based on deep reinforcement learning |
CN111670468A (en) * | 2017-12-18 | 2020-09-15 | 日立汽车系统株式会社 | Moving body behavior prediction device and moving body behavior prediction method |
CN111781940A (en) * | 2020-05-19 | 2020-10-16 | 中车工业研究院有限公司 | Train attitude control method based on DQN reinforcement learning |
US10831208B2 (en) | 2018-11-01 | 2020-11-10 | Ford Global Technologies, Llc | Vehicle neural network processing |
CN111965981A (en) * | 2020-09-07 | 2020-11-20 | 厦门大学 | Aeroengine reinforcement learning control method and system |
CN112193280A (en) * | 2020-12-04 | 2021-01-08 | 华东交通大学 | Heavy-load train reinforcement learning control method and system |
CN113525462A (en) * | 2021-08-06 | 2021-10-22 | 中国科学院自动化研究所 | Timetable adjusting method and device under delay condition and electronic equipment |
CN113537603A (en) * | 2021-07-21 | 2021-10-22 | 北京交通大学 | Intelligent scheduling control method and system for high-speed train |
CN114450131A (en) * | 2019-09-30 | 2022-05-06 | 三菱电机株式会社 | Non-derivative model learning system and design for robot system |
US11472452B2 (en) | 2019-10-11 | 2022-10-18 | Progress Rail Services Corporation | Machine learning based train handling evaluation |
CN115598985A (en) * | 2022-11-01 | 2023-01-13 | 南栖仙策(南京)科技有限公司(Cn) | Feedback controller training method and device, electronic equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102981408A (en) * | 2012-12-10 | 2013-03-20 | 华东交通大学 | Running process modeling and adaptive control method for motor train unit |
CN103019267A (en) * | 2012-12-10 | 2013-04-03 | 华东交通大学 | Predicative control method for modeling and running speed of adaptive network-based fuzzy inference system (ANFIS) of high-speed train |
CN103870892A (en) * | 2014-03-26 | 2014-06-18 | 北京清软英泰信息技术有限公司 | Method and system for achieving railway locomotive operation control from off-line mode to on-line mode |
CN103879414A (en) * | 2014-03-26 | 2014-06-25 | 北京清软英泰信息技术有限公司 | Locomotive optimal manipulation method based on self-adaption A-Star algorithm |
CN104951425A (en) * | 2015-07-20 | 2015-09-30 | 东北大学 | Cloud service performance adaptive action type selection method based on deep learning |
CN105427016A (en) * | 2015-10-28 | 2016-03-23 | 南车株洲电力机车研究所有限公司 | Locomotive vehicle data processing method and system |
Cited By (64)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107239628A (en) * | 2017-06-15 | 2017-10-10 | 清华大学 | A kind of uncertain locomotive simulation model system construction method based on dynamic time sequence figure |
CN107194612A (en) * | 2017-06-20 | 2017-09-22 | 清华大学 | A kind of train operation dispatching method learnt based on deeply and system |
CN107194612B (en) * | 2017-06-20 | 2020-10-13 | 清华大学 | Train operation scheduling method and system based on deep reinforcement learning |
CN107315572A (en) * | 2017-07-19 | 2017-11-03 | 北京上格云技术有限公司 | Build control method, storage medium and the terminal device of Mechatronic Systems |
CN107315573A (en) * | 2017-07-19 | 2017-11-03 | 北京上格云技术有限公司 | Build control method, storage medium and the terminal device of Mechatronic Systems |
CN107367929A (en) * | 2017-07-19 | 2017-11-21 | 北京上格云技术有限公司 | Update method, storage medium and the terminal device of Q value matrixs |
CN107315573B (en) * | 2017-07-19 | 2020-06-16 | 北京上格云技术有限公司 | Control method of building electromechanical system, storage medium and terminal equipment |
CN107315572B (en) * | 2017-07-19 | 2020-08-11 | 北京上格云技术有限公司 | Control method of building electromechanical system, storage medium and terminal equipment |
CN107563426A (en) * | 2017-08-25 | 2018-01-09 | 清华大学 | A kind of learning method of locomotive operation temporal aspect |
WO2019037557A1 (en) * | 2017-08-25 | 2019-02-28 | 清华大学 | Method for learning time sequence characteristics of locomotive operation |
CN107450593B (en) * | 2017-08-30 | 2020-06-12 | 清华大学 | Unmanned aerial vehicle autonomous navigation method and system |
CN107450593A (en) * | 2017-08-30 | 2017-12-08 | 清华大学 | A kind of unmanned plane autonomous navigation method and system |
CN111542836B (en) * | 2017-10-04 | 2024-05-17 | 华为技术有限公司 | Method for selecting action by using neural network as object |
CN111542836A (en) * | 2017-10-04 | 2020-08-14 | 华为技术有限公司 | Method for selecting action for object by using neural network |
CN107544516A (en) * | 2017-10-11 | 2018-01-05 | 苏州大学 | Automated driving system and method based on relative entropy depth against intensified learning |
CN107832836B (en) * | 2017-11-27 | 2020-04-21 | 清华大学 | Model-free deep reinforcement learning exploration method and device |
CN107832836A (en) * | 2017-11-27 | 2018-03-23 | 清华大学 | Model-free depth enhancing study heuristic approach and device |
CN111670468A (en) * | 2017-12-18 | 2020-09-15 | 日立汽车系统株式会社 | Moving body behavior prediction device and moving body behavior prediction method |
CN108161934A (en) * | 2017-12-25 | 2018-06-15 | 清华大学 | A kind of method for learning to realize robot multi peg-in-hole using deeply |
CN108161934B (en) * | 2017-12-25 | 2020-06-09 | 清华大学 | Method for realizing robot multi-axis hole assembly by utilizing deep reinforcement learning |
CN108333959A (en) * | 2018-03-09 | 2018-07-27 | 清华大学 | A kind of energy saving method of operating of locomotive based on convolutional neural networks model |
CN110390398B (en) * | 2018-04-13 | 2021-09-10 | 北京智行者科技有限公司 | Online learning method |
CN110390398A (en) * | 2018-04-13 | 2019-10-29 | 北京智行者科技有限公司 | On-line study method |
EP3557489A1 (en) * | 2018-04-19 | 2019-10-23 | Siemens Mobility GmbH | Energy optimisation in operation of a rail vehicle |
CN108820157A (en) * | 2018-04-25 | 2018-11-16 | 武汉理工大学 | A kind of Ship Intelligent Collision Avoidance method based on intensified learning |
CN108549237A (en) * | 2018-05-16 | 2018-09-18 | 华南理工大学 | Preview based on depth enhancing study controls humanoid robot gait's planing method |
CN108549237B (en) * | 2018-05-16 | 2020-04-28 | 华南理工大学 | Preset control humanoid robot gait planning method based on deep reinforcement learning |
CN110687802A (en) * | 2018-07-06 | 2020-01-14 | 珠海格力电器股份有限公司 | Intelligent household electrical appliance control method and intelligent household electrical appliance control device |
CN108984275A (en) * | 2018-08-27 | 2018-12-11 | 洛阳中科龙网创新科技有限公司 | The agricultural driver training method of Intelligent unattended based on Unity3D and depth enhancing study |
CN109243021A (en) * | 2018-08-28 | 2019-01-18 | 余利 | Deeply learning type intelligent door lock system and device based on user experience analysis |
CN109204390A (en) * | 2018-09-29 | 2019-01-15 | 交控科技股份有限公司 | A kind of Train control method based on deep learning |
CN109204390B (en) * | 2018-09-29 | 2021-03-12 | 交控科技股份有限公司 | Train control method based on deep learning |
CN109225640A (en) * | 2018-10-15 | 2019-01-18 | 厦门邑通软件科技有限公司 | A kind of wisdom electric precipitation power-economizing method |
US10831208B2 (en) | 2018-11-01 | 2020-11-10 | Ford Global Technologies, Llc | Vehicle neural network processing |
WO2020098226A1 (en) * | 2018-11-16 | 2020-05-22 | Huawei Technologies Co., Ltd. | System and methods of efficient, continuous, and safe learning using first principles and constraints |
CN109740839A (en) * | 2018-11-23 | 2019-05-10 | 北京交通大学 | Train Dynamic method of adjustment and system under a kind of emergency event |
CN109740839B (en) * | 2018-11-23 | 2021-06-18 | 北京交通大学 | Train dynamic adjustment method and system under emergency |
CN111324099A (en) * | 2018-12-12 | 2020-06-23 | 上汽通用汽车有限公司 | Machine learning-based calibration method and machine learning-based calibration system |
CN111381511B (en) * | 2018-12-27 | 2023-09-01 | 松下知识产权经营株式会社 | Time difference reaction reducing system and time difference reaction reducing method |
CN111381511A (en) * | 2018-12-27 | 2020-07-07 | 松下知识产权经营株式会社 | Jet lag reduction system and jet lag reduction method |
CN109472984A (en) * | 2018-12-27 | 2019-03-15 | 苏州科技大学 | Signalized control method, system and storage medium based on deeply study |
CN109919319A (en) * | 2018-12-31 | 2019-06-21 | 中国科学院软件研究所 | Deeply learning method and equipment based on multiple history best Q networks |
CN109782600A (en) * | 2019-01-25 | 2019-05-21 | 东华大学 | A method of autonomous mobile robot navigation system is established by virtual environment |
CN109835375A (en) * | 2019-01-29 | 2019-06-04 | 中国铁道科学研究院集团有限公司通信信号研究所 | High Speed Railway Trains automated driving system based on artificial intelligence technology |
CN109977998A (en) * | 2019-02-14 | 2019-07-05 | 网易(杭州)网络有限公司 | Information processing method and device, storage medium and electronic device |
CN109977998B (en) * | 2019-02-14 | 2022-05-03 | 网易(杭州)网络有限公司 | Information processing method and apparatus, storage medium, and electronic apparatus |
CN109919243A (en) * | 2019-03-15 | 2019-06-21 | 天津拾起卖科技有限公司 | A kind of scrap iron and steel type automatic identifying method and device based on CNN |
CN110194041A (en) * | 2019-05-19 | 2019-09-03 | 瑞立集团瑞安汽车零部件有限公司 | The adaptive bodywork height adjusting method of Multi-source Information Fusion |
CN110147891B (en) * | 2019-05-23 | 2021-06-01 | 北京地平线机器人技术研发有限公司 | Method and device applied to reinforcement learning training process and electronic equipment |
CN110147891A (en) * | 2019-05-23 | 2019-08-20 | 北京地平线机器人技术研发有限公司 | Method, apparatus and electronic equipment applied to intensified learning training process |
CN114450131A (en) * | 2019-09-30 | 2022-05-06 | 三菱电机株式会社 | Non-derivative model learning system and design for robot system |
US11472452B2 (en) | 2019-10-11 | 2022-10-18 | Progress Rail Services Corporation | Machine learning based train handling evaluation |
CN111581178A (en) * | 2020-05-12 | 2020-08-25 | 国网安徽省电力有限公司信息通信分公司 | Ceph system performance tuning strategy and system based on deep reinforcement learning |
CN111781940A (en) * | 2020-05-19 | 2020-10-16 | 中车工业研究院有限公司 | Train attitude control method based on DQN reinforcement learning |
CN111781940B (en) * | 2020-05-19 | 2022-12-20 | 中车工业研究院有限公司 | Train attitude control method based on DQN reinforcement learning |
CN111965981A (en) * | 2020-09-07 | 2020-11-20 | 厦门大学 | Aeroengine reinforcement learning control method and system |
CN111965981B (en) * | 2020-09-07 | 2022-02-22 | 厦门大学 | Aeroengine reinforcement learning control method and system |
CN112193280A (en) * | 2020-12-04 | 2021-01-08 | 华东交通大学 | Heavy-load train reinforcement learning control method and system |
US11205124B1 (en) | 2020-12-04 | 2021-12-21 | East China Jiaotong University | Method and system for controlling heavy-haul train based on reinforcement learning |
CN112193280B (en) * | 2020-12-04 | 2021-03-16 | 华东交通大学 | Heavy-load train reinforcement learning control method and system |
CN113537603A (en) * | 2021-07-21 | 2021-10-22 | 北京交通大学 | Intelligent scheduling control method and system for high-speed train |
CN113525462A (en) * | 2021-08-06 | 2021-10-22 | 中国科学院自动化研究所 | Timetable adjusting method and device under delay condition and electronic equipment |
CN115598985A (en) * | 2022-11-01 | 2023-01-13 | 南栖仙策(南京)科技有限公司(Cn) | Feedback controller training method and device, electronic equipment and medium |
CN115598985B (en) * | 2022-11-01 | 2024-02-02 | 南栖仙策(南京)高新技术有限公司 | Training method and device of feedback controller, electronic equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN106842925B (en) | 2019-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106842925B (en) | A kind of locomotive smart steering method and system based on deeply study | |
CN107194612A (en) | A kind of train operation dispatching method learnt based on deeply and system | |
CN105700526B (en) | Online limit of sequence learning machine method with independent learning ability | |
CN107943022A (en) | A kind of PID locomotive automatic Pilot optimal control methods based on intensified learning | |
CN107697070A (en) | Driving behavior Forecasting Methodology and device, unmanned vehicle | |
Palmroth | Performance monitoring and operator assistance systems in mobile machines | |
CN109635246A (en) | A kind of multiattribute data modeling method based on deep learning | |
CN108333959A (en) | A kind of energy saving method of operating of locomotive based on convolutional neural networks model | |
CN117719535A (en) | Human feedback automatic driving vehicle interactive self-adaptive decision control method | |
Buche et al. | An expert system manipulating knowledge to help human learners into virtual environment | |
Plebe et al. | Human-inspired autonomous driving: A survey | |
Guevarra et al. | Augmenting flight training with AI to efficiently train pilots | |
Forneris et al. | Implementing Deep Reinforcement Learning (DRL)-based Driving Styles for Non-Player Vehicles | |
Li et al. | Complementary learning-team machines to enlighten and exploit human expertise | |
CN106647279B (en) | A kind of locomotive smart steering optimized calculation method based on fuzzy rule | |
Stein et al. | Learning in context: enhancing machine learning with context-based reasoning | |
Knox et al. | Understanding human teaching modalities in reinforcement learning environments: A preliminary report | |
CN105279978B (en) | Intersection traffic signal control method and equipment | |
Yan | Research on path planning of robot based on artificial intelligence algorithm | |
Li | Introduction to Reinforcement Learning | |
Saxena et al. | Advancement of industrial automation in integration with robotics | |
Tervo et al. | A hierarchical fuzzy inference method for skill evaluation of machine operators | |
Garza-Coello et al. | AWS DeepRacer: A Way to Understand and Apply the Reinforcement Learning Methods | |
Weigand et al. | Reinforcement learning using guided observability | |
LeCun | A path to ai |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||