JP5750657B2 - Reinforcement learning apparatus, control apparatus, and reinforcement learning method - Google Patents

Reinforcement learning apparatus, control apparatus, and reinforcement learning method

Info

Publication number
JP5750657B2
JP5750657B2 JP2011074694A JP2011074694A JP5750657B2 JP 5750657 B2 JP5750657 B2 JP 5750657B2 JP 2011074694 A JP2011074694 A JP 2011074694A JP 2011074694 A JP2011074694 A JP 2011074694A JP 5750657 B2 JP5750657 B2 JP 5750657B2
Authority
JP
Japan
Prior art keywords
external force
virtual external
reinforcement learning
output
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2011074694A
Other languages
English (en)
Japanese (ja)
Other versions
JP2012208789A (ja)
JP2012208789A5
Inventor
徳和 杉本
雄悟 上田
忠明 長谷川
総司 射場
浩二 赤塚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honda Motor Co Ltd
ATR Advanced Telecommunications Research Institute International
Original Assignee
Honda Motor Co Ltd
ATR Advanced Telecommunications Research Institute International
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honda Motor Co Ltd, ATR Advanced Telecommunications Research Institute International
Priority to JP2011074694A
Priority to US13/432,094 (US8886357B2)
Publication of JP2012208789A
Publication of JP2012208789A5
Application granted
Publication of JP5750657B2
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S901/00 Robots
    • Y10S901/02 Arm motion controller
    • Y10S901/03 Teaching system

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)
  • Manipulator (AREA)
JP2011074694A 2011-03-30 2011-03-30 Reinforcement learning apparatus, control apparatus, and reinforcement learning method Active JP5750657B2 (ja)

Priority Applications (2)

Application Number Publication Priority Date Filing Date Title
JP2011074694A JP5750657B2 (ja) 2011-03-30 2011-03-30 Reinforcement learning apparatus, control apparatus, and reinforcement learning method
US13/432,094 US8886357B2 (en) 2011-03-30 2012-03-28 Reinforcement learning apparatus, control apparatus, and reinforcement learning method

Applications Claiming Priority (1)

Application Number Publication Priority Date Filing Date Title
JP2011074694A JP5750657B2 (ja) 2011-03-30 2011-03-30 Reinforcement learning apparatus, control apparatus, and reinforcement learning method

Publications (3)

Publication Number Publication Date
JP2012208789A (ja) 2012-10-25
JP2012208789A5 2014-05-22
JP5750657B2 (ja) 2015-07-22

Family

ID=46928279

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
JP2011074694A Active JP5750657B2 (ja) 2011-03-30 2011-03-30 Reinforcement learning apparatus, control apparatus, and reinforcement learning method

Country Status (2)

Country Link
US (1) US8886357B2
JP (1) JP5750657B2

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
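JP6522488B2 (ja) * 2015-07-31 2019-05-29 ファナック株式会社 Machine learning device, robot system, and machine learning method for learning a workpiece picking operation
DE102016015936B8 (de) 2015-07-31 2024-10-24 Fanuc Corporation Machine learning device, robot system, and machine learning system for learning a workpiece picking operation
JP6240689B2 (ja) 2015-07-31 2017-11-29 ファナック株式会社 Machine learning device, robot control device, robot system, and machine learning method for learning human behavior patterns
JP6106226B2 (ja) * 2015-07-31 2017-03-29 ファナック株式会社 Machine learning device that learns gain optimization, motor control device equipped with the machine learning device, and machine learning method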
US10839302B2 (en) 2015-11-24 2020-11-17 The Research Foundation For The State University Of New York Approximate value iteration with complex returns by bounding
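JP6733239B2 (ja) 2016-03-18 2020-07-29 セイコーエプソン株式会社 Control device and robot system
JP2017199077A (ja) * 2016-04-25 2017-11-02 ファナック株式会社 Cell controller that optimizes the operation of a production system having multiple industrial machines
CN106886451B (zh) * 2017-01-10 2020-10-27 广东石油化工学院 Multi-workflow task allocation method based on virtualized container technology
JP6453919B2 (ja) * 2017-01-26 2019-01-16 ファナック株式会社 Action information learning device, action information optimization system, and action information learning program
JP6706223B2 (ja) * 2017-05-25 2020-06-03 日本電信電話株式会社 Mobile body control method, mobile body control device, and program
JP6748135B2 (ja) * 2018-03-19 2020-08-26 ファナック株式会社 Machine learning device, servo control device, servo control system, and machine learning method
JP7131087B2 (ja) * 2018-05-31 2022-09-06 セイコーエプソン株式会社 Robot system control method and robot system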
US11403513B2 (en) * 2018-09-27 2022-08-02 Deepmind Technologies Limited Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy
CN109711040B (zh) * 2018-12-25 2023-06-02 南京天洑软件有限公司 Reinforcement learning algorithm for intelligent industrial design based on search direction learning
EP3920000A4 (en) * 2019-01-30 2022-01-26 NEC Corporation Control device, control method, and recording medium
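JP7225923B2 (ja) 2019-03-04 2023-02-21 富士通株式会社 Reinforcement learning method, reinforcement learning program, and reinforcement learning system
JP7379833B2 (ja) * 2019-03-04 2023-11-15 富士通株式会社 Reinforcement learning method, reinforcement learning program, and reinforcement learning system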
US12093001B2 (en) * 2019-05-22 2024-09-17 Nec Corporation Operation rule determination device, method, and recording medium using frequency of a cumulative reward calculated for series of operations
US11676064B2 (en) * 2019-08-16 2023-06-13 Mitsubishi Electric Research Laboratories, Inc. Constraint adaptor for reinforcement learning control
CN110496377B (zh) * 2019-08-19 2020-07-28 华南理工大学 Reinforcement-learning-based stroke training method for a virtual table tennis player
DE102019130040A1 (de) * 2019-11-07 2021-05-12 Bayerische Motoren Werke Aktiengesellschaft Method and system for testing an automated driving function by reinforcement learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4587738B2 (ja) * 2003-08-25 2010-11-24 ソニー株式会社 Robot apparatus and robot posture control method
JP4929449B2 (ja) * 2005-09-02 2012-05-09 国立大学法人横浜国立大学 Reinforcement learning device and reinforcement learning method
US8458715B1 (en) * 2007-02-23 2013-06-04 Hrl Laboratories, Llc System for allocating resources to optimize transition from a current state to a desired state

Also Published As

Publication number Publication date
JP2012208789A (ja) 2012-10-25
US20120253514A1 (en) 2012-10-04
US8886357B2 (en) 2014-11-11

Similar Documents

Publication Publication Date Title
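JP5750657B2 (ja) Reinforcement learning apparatus, control apparatus, and reinforcement learning method
CN112428278B (zh) Control method and device for a robotic arm, and training method for a human-machine collaboration model
JP2012208789A5
CN113677485B (zh) Efficient adaptation of robot control policies for new tasks using meta-learning based on meta-imitation learning and meta-reinforcement learning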
US11745355B2 (en) Control device, control method, and non-transitory computer-readable storage medium
JP7295421B2 (ja) Control device and control method
CN108873768B (zh) Task execution system and method, learning device and method, and recording medium
US9387589B2 (en) Visual debugging of robotic tasks
US9361590B2 (en) Information processing apparatus, information processing method, and program
CN114080304A (zh) Control device, control method, and control program
US11461589B1 (en) Mitigating reality gap through modification of simulated state data of robotic simulator
JP2019529135A (ja) Deep reinforcement learning for robotic manipulation
Xu et al. Visual-haptic aid teleoperation based on 3-D environment modeling and updating
JP6321905B2 (ja) Joint system control method, storage medium, and control system
RU2308762C2 (ru) Movement of a virtual object in a virtual environment without mutual interference between its articulated elements
US20240054393A1 (en) Learning Device, Learning Method, Recording Medium Storing Learning Program, Control Program, Control Device, Control Method, and Recording Medium Storing Control Program
CN114041828B (zh) Ultrasound scanning control method, robot, and storage medium
US20220193906A1 (en) User Interface for Supervised Autonomous Grasping
JP7180696B2 (ja) Control device, control method, and program
JPWO2020138436A1 (ja) Robot control device, robot system, and robot control method
JP7263987B2 (ja) Control device, control method, and control program
EP4175795A1 (en) Transfer between tasks in different domains
KR20240157375A (ko) Drone control method and apparatus performing nominal control augmentation based on deep reinforcement learning
WO2024051978A1 (en) Action abstraction controller for fully actuated robotic manipulators
JP2022183723A (ja) Processing device, servo system, processing method, and program

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20140314

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20140314

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20140314

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20150114

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20150127

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20150304

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20150401

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20150413

R150 Certificate of patent or registration of utility model

Ref document number: 5750657

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250