EP3884432A1 - Reinforcement learning model training through simulation - Google Patents

Reinforcement learning model training through simulation

Info

Publication number
EP3884432A1
Authority
EP
European Patent Office
Prior art keywords
simulation
reinforcement learning
application
computer
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP19829363.1A
Other languages
German (de)
English (en)
Inventor
Leo Parker Dirac
Eric Li Sun
Sunil Mallya Kasaragod
Sahika Genc
Bharathan BALAJI
Saurabh Gupta
Brian James TOWNSEND
Pramod Ravikumar KUMAR
Marthinus Coenraad De Clercq WENTZEL
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Amazon Technologies Inc
Original Assignee
Amazon Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-11-21
Filing date
2019-11-20
Publication date
2021-09-29
Priority claimed from US16/198,605 (US11455234B2)
Priority claimed from US16/198,698 (US20200156243A1)
Priority claimed from US16/201,830 (US11836577B2)
Priority claimed from US16/201,872 (US11429762B2)
Priority claimed from US16/201,864 (US20200167687A1)
Application filed by Amazon Technologies Inc
Publication of EP3884432A1
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Abstract

A simulation management service receives a request to perform reinforcement learning for a robotic device. The request can include computer-executable code defining a reinforcement function for training a reinforcement learning model for the robotic device. In response to the request, the simulation management service generates a simulation environment and injects the computer-executable code into a simulation application for the robotic device. Using the simulation application and the computer-executable code, the simulation management service performs the reinforcement learning within the simulation environment.
EP19829363.1A 2018-11-21 2019-11-20 Reinforcement learning model training through simulation Pending EP3884432A1 (fr)
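
The flow described in the abstract (a request carrying code, environment creation, code injection, training in simulation) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `TrainingRequest`, `LineWorldSimulation`, `SimulationManagementService`, `greedy_policy`, and `reward_function` are hypothetical names, the one-dimensional track stands in for a robot simulator, and tabular Q-learning is an assumed training algorithm that the abstract does not specify. The call to `exec` only illustrates injection; a real service would need to sandbox untrusted code.

```python
import random
from dataclasses import dataclass

# Hypothetical reward ("reinforcement") function, supplied as code text in the request.
REWARD_SOURCE = '''
def reward_function(params):
    # +1 on reaching the goal, small penalty for every other step
    return 1.0 if params["position"] == params["goal"] else -0.1
'''


@dataclass
class TrainingRequest:
    """Hypothetical request: which robot to train and the reward code to inject."""
    robot_id: str
    reward_source: str
    episodes: int = 200


class LineWorldSimulation:
    """Toy stand-in for a simulation application: a robot on a 1-D track."""

    def __init__(self, goal=4):
        self.goal = goal
        self.position = 0
        self.reward_fn = None

    def inject(self, reward_fn):
        self.reward_fn = reward_fn

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action 1 moves right, action 0 moves left; position is clamped to [0, goal]
        delta = 1 if action == 1 else -1
        self.position = max(0, min(self.goal, self.position + delta))
        reward = self.reward_fn({"position": self.position, "goal": self.goal})
        return self.position, reward, self.position == self.goal


class SimulationManagementService:
    """Hypothetical service: builds the environment, injects the code, trains."""

    def handle(self, request):
        sim = LineWorldSimulation()                 # generate the simulation environment
        namespace = {}
        exec(request.reward_source, namespace)      # illustration only: unsandboxed
        sim.inject(namespace["reward_function"])    # inject the requester's code
        return self._train(sim, request.episodes)

    def _train(self, sim, episodes, alpha=0.5, gamma=0.9, epsilon=0.2):
        # Tabular Q-learning (assumed algorithm) with epsilon-greedy exploration.
        rng = random.Random(0)
        q = {(s, a): 0.0 for s in range(sim.goal + 1) for a in (0, 1)}
        for _ in range(episodes):
            state = sim.reset()
            for _ in range(50):                     # cap on steps per episode
                if rng.random() < epsilon:
                    action = rng.choice((0, 1))
                else:
                    action = max((0, 1), key=lambda a: q[(state, a)])
                nxt, reward, done = sim.step(action)
                best_next = 0.0 if done else max(q[(nxt, 0)], q[(nxt, 1)])
                q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
                state = nxt
                if done:
                    break
        return q


def greedy_policy(q, goal=4):
    """Best action per non-goal state according to the learned Q-table."""
    return [max((0, 1), key=lambda a: q[(s, a)]) for s in range(goal)]


if __name__ == "__main__":
    request = TrainingRequest("robot-1", REWARD_SOURCE)
    q_table = SimulationManagementService().handle(request)
    print(greedy_policy(q_table))
```

In this sketch the reward function travels with the request as data, so a requester could change the training objective without modifying the simulation application itself; only the injected function differs between runs.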

Applications Claiming Priority (6)

Application Number | Publication | Priority Date | Filing Date | Title
US16/198,605 | US11455234B2 (en) | 2018-11-21 | 2018-11-21 | Robotics application development architecture
US16/198,698 | US20200156243A1 (en) | 2018-11-21 | 2018-11-21 | Robotics application simulation management
US16/201,830 | US11836577B2 (en) | 2018-11-27 | 2018-11-27 | Reinforcement learning model training through simulation
US16/201,872 | US11429762B2 (en) | 2018-11-27 | 2018-11-27 | Simulation orchestration for training reinforcement learning models
US16/201,864 | US20200167687A1 (en) | 2018-11-27 | 2018-11-27 | Simulation modeling exchange
PCT/US2019/062509 | WO2020106908A1 (fr) | 2018-11-21 | 2019-11-20 | Reinforcement learning model training through simulation

Publications (1)

Publication Number | Publication Date
EP3884432A1 (fr) | 2021-09-29

Family

ID=69061434

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
EP19829363.1A | Pending | EP3884432A1 (fr) | 2018-11-21 | 2019-11-20 | Reinforcement learning model training through simulation

Country Status (3)

Country | Link
EP (1) | EP3884432A1 (fr)
CN (1) | CN113272825B (fr)
WO (1) | WO2020106908A1 (fr)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
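Publication number | Priority date | Publication date | Assignee | Title
CN111882072B (zh) * | 2020-07-09 | 2023-11-14 | 北京华如科技股份有限公司 | Automated curriculum training method for an intelligent model that plays against rules
CN112862108B (zh) * | 2021-02-07 | 2024-05-07 | 超参数科技(深圳)有限公司 | Componentized reinforcement learning model processing method, system, device, and storage medium
CN112884066A (zh) * | 2021-03-15 | 2021-06-01 | 网易(杭州)网络有限公司 | Data processing method and apparatus
CN113205070B (zh) * | 2021-05-27 | 2024-02-20 | 三一专用汽车有限责任公司 | Visual perception algorithm optimization method and system
CN114609925B (zh) * | 2022-01-14 | 2022-12-06 | 中国科学院自动化研究所 | Training method for an underwater exploration policy model and underwater exploration method for a bionic robotic fish
CN114327916B (zh) * | 2022-03-10 | 2022-06-17 | 中国科学院自动化研究所 | Training method, apparatus, and device for a resource allocation system
CN114415737A (zh) * | 2022-04-01 | 2022-04-29 | 天津七一二通信广播股份有限公司 | Implementation method for an unmanned aerial vehicle reinforcement learning training system
CN115098998B (zh) * | 2022-05-25 | 2023-05-12 | 上海锡鼎智能科技有限公司 | Model training method and system based on simulation data
CN115330095B (zh) * | 2022-10-14 | 2023-07-07 | 青岛慧拓智能机器有限公司 | Mine truck dispatching model training method, apparatus, chip, terminal, device, and medium
CN116151137B (zh) * | 2023-04-24 | 2023-07-28 | 之江实验室 | Simulation system, method, and apparatus
CN116738867B (zh) * | 2023-08-14 | 2023-10-31 | 厦门安智达信息科技有限公司 | Machine-learning-based unmanned aerial vehicle defense simulation method and system
CN116911202B (zh) * | 2023-09-11 | 2023-11-17 | 北京航天晨信科技有限责任公司 | Agent training method and apparatus based on a multi-granularity simulation training environment
CN117593095B (zh) * | 2024-01-17 | 2024-03-22 | 苏州元脑智能科技有限公司 | Adaptive parameter tuning method, apparatus, computer device, and storage medium
CN117725985A (zh) * | 2024-02-06 | 2024-03-19 | 之江实验室 | Reinforcement learning model training and service execution method, apparatus, and electronic device
CN117809629A (zh) * | 2024-02-29 | 2024-04-02 | 青岛海尔科技有限公司 | Large-model-based interactive system update method, apparatus, and storage medium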

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7983996B2 (en) * | 2007-03-01 | 2011-07-19 | The Boeing Company | Method and apparatus for human behavior modeling in adaptive training
US9501694B2 (en) * | 2008-11-24 | 2016-11-22 | Qualcomm Incorporated | Pictorial methods for application selection and activation
US9715402B2 (en) * | 2014-09-30 | 2017-07-25 | Amazon Technologies, Inc. | Dynamic code deployment and versioning
JP6243385B2 (ja) * | 2015-10-19 | 2017-12-06 | ファナック株式会社 | Machine learning device and method for learning a correction value in motor current control, and correction value calculation device and motor drive device including the machine learning device
US10586173B2 (en) * | 2016-01-27 | 2020-03-10 | Bonsai AI, Inc. | Searchable database of trained artificial intelligence objects that can be reused, reconfigured, and recomposed, into one or more subsequent artificial intelligence models
US10058995B1 (en) * | 2016-07-08 | 2018-08-28 | X Development Llc | Operating multiple testing robots based on robot instructions and/or environmental parameters received in a request
US11314907B2 (en) * | 2016-08-26 | 2022-04-26 | Hitachi, Ltd. | Simulation including multiple simulators
US10423522B2 (en) * | 2017-04-12 | 2019-09-24 | Salesforce.Com, Inc. | System and method for detecting an error in software
CN108170529A (zh) * | 2017-12-26 | 2018-06-15 | 北京工业大学 | Cloud data center load prediction method based on a long short-term memory network

Also Published As

Publication number | Publication date
CN113272825A (zh) | 2021-08-17
WO2020106908A1 (fr) | 2020-05-28
CN113272825B (zh) | 2024-02-02

Similar Documents

Publication | Title
US11836577B2 (en) | Reinforcement learning model training through simulation
US20200167687A1 (en) | Simulation modeling exchange
US11455234B2 (en) | Robotics application development architecture
US11429762B2 (en) | Simulation orchestration for training reinforcement learning models
WO2020106908A1 (fr) | Reinforcement learning model training through simulation
US20200156243A1 (en) | Robotics application simulation management
US10977111B2 (en) | Constraint solver execution service and infrastructure therefor
US10289463B2 (en) | Flexible scripting platform for troubleshooting
US10540269B2 (en) | Inter-process communication automated testing framework
CA2919839C (fr) | Virtual computing instance migration
US10990516B1 (en) | Method, apparatus, and computer program product for predictive API test suite selection
US9237130B2 (en) | Hierarchical rule development and binding for web application server firewall
US11847480B2 (en) | System for detecting impairment issues of distributed hosts
US10673712B1 (en) | Parallel asynchronous stack operations
US11533330B2 (en) | Determining risk metrics for access requests in network environments using multivariate modeling
US9996381B1 (en) | Live application management workflow using metadata capture
US9229693B1 (en) | Build service for software development projects
US10592068B1 (en) | Graphic composer for service integration
US10164848B1 (en) | Web service fuzzy tester
US20180032384A1 (en) | Secure script execution using sandboxed environments
US10079738B1 (en) | Using a network crawler to test objects of a network document
US10747390B1 (en) | Graphical composer for policy management
US11360951B1 (en) | Database migration systems and methods
JP7252332B2 (ja) | Method and system for robotics application development
US10705945B1 (en) | Computing system testing service

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20210618

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)