JP7438336B2 - States simulator for reinforcement learning models - Google Patents
States simulator for reinforcement learning models
- Publication number
- JP7438336B2 (application JP2022515598A)
- Authority
- JP
- Japan
- Prior art keywords
- features
- state
- subsets
- actions
- determining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Probability & Statistics with Applications (AREA)
- Algebra (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/568,284 | 2019-09-12 | | |
| US16/568,284 US11574244B2 (en) | 2019-09-12 | 2019-09-12 | States simulator for reinforcement learning models |
| PCT/EP2020/072487 WO2021047842A1 (en) | 2019-09-12 | 2020-08-11 | States simulator for reinforcement learning models |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2022547529A (ja) | 2022-11-14 |
| JP2022547529A5 | 2022-12-13 |
| JP7438336B2 (ja) | 2024-02-26 |
Family ID: 72050874
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2022515598A (Active; published as JP7438336B2 (ja)) | States simulator for reinforcement learning models | 2019-09-12 | 2020-08-11 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11574244B2 |
| EP (1) | EP4028959A1 |
| JP (1) | JP7438336B2 |
| CN (1) | CN114365157A |
| WO (1) | WO2021047842A1 |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
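| KR102338304B1 (ko) * | 2020-10-20 | 2021-12-13 | 주식회사 뉴로코어 | Factory simulator-based scheduling system using reinforcement learning |
| CN115617796A (zh) * | 2022-10-12 | 2023-01-17 | 中电智元数据科技有限公司 | Index selection method for a distributed database |
| CN118837737B (zh) * | 2024-06-27 | 2025-09-09 | 西安交通大学 | Fault diagnosis method, apparatus, device, and storage medium for an underwater propulsion motor |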
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2013242761A (ja) | 2012-05-22 | 2013-12-05 | Internatl Business Mach Corp <Ibm> | Method for updating policy parameters in a Markov decision process system environment, and controller and control program therefor |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8918866B2 (en) * | 2009-06-29 | 2014-12-23 | International Business Machines Corporation | Adaptive rule loading and session control for securing network delivered services |
| US9128739B1 (en) * | 2012-12-31 | 2015-09-08 | Emc Corporation | Determining instances to maintain on at least one cloud responsive to an evaluation of performance characteristics |
| US20160260024A1 (en) * | 2015-03-04 | 2016-09-08 | Qualcomm Incorporated | System of distributed planning |
| US10540598B2 (en) | 2015-09-09 | 2020-01-21 | International Business Machines Corporation | Interpolation of transition probability values in Markov decision processes |
| CN108701252B (zh) | 2015-11-12 | 2024-02-02 | 渊慧科技有限公司 | Training neural networks using a prioritized experience memory |
| US10839302B2 (en) | 2015-11-24 | 2020-11-17 | The Research Foundation For The State University Of New York | Approximate value iteration with complex returns by bounding |
| CN108230057A (zh) * | 2016-12-09 | 2018-06-29 | 阿里巴巴集团控股有限公司 | Intelligent recommendation method and system |
| US20180342004A1 (en) * | 2017-05-25 | 2018-11-29 | Microsoft Technology Licensing, Llc | Cumulative success-based recommendations for repeat users |
| WO2020005240A1 (en) * | 2018-06-27 | 2020-01-02 | Google Llc | Adapting a sequence model for use in predicting future device interactions with a computing system |
| US10963313B2 (en) * | 2018-08-27 | 2021-03-30 | Vmware, Inc. | Automated reinforcement-learning-based application manager that learns and improves a reward function |
| US11468322B2 (en) * | 2018-12-04 | 2022-10-11 | Rutgers, The State University Of New Jersey | Method for selecting and presenting examples to explain decisions of algorithms |
| EP3776347B1 (en) * | 2019-06-17 | 2025-07-02 | Google LLC | Vehicle occupant engagement using three-dimensional eye gaze vectors |
2019
- 2019-09-12: US application US16/568,284, publication US11574244B2 (Active)

2020
- 2020-08-11: CN application CN202080063367.1A, publication CN114365157A (Withdrawn)
- 2020-08-11: WO application PCT/EP2020/072487, publication WO2021047842A1 (Ceased)
- 2020-08-11: EP application EP20754737.3A, publication EP4028959A1 (Withdrawn)
- 2020-08-11: JP application JP2022515598A, publication JP7438336B2 (Active)
Also Published As
| Publication number | Publication date |
|---|---|
| JP2022547529A (ja) | 2022-11-14 |
| CN114365157A (zh) | 2022-04-15 |
| EP4028959A1 (en) | 2022-07-20 |
| WO2021047842A1 (en) | 2021-03-18 |
| US11574244B2 (en) | 2023-02-07 |
| US20210081829A1 (en) | 2021-03-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
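| CN110366734B (zh) | | Optimizing neural network architectures |
| US11615302B2 (en) | | Effective user modeling with time-aware based binary hashing |
| CN109902706B (zh) | | Recommendation method and apparatus |
| JP7438336B2 (ja) | | States simulator for reinforcement learning models |
| KR102203253B1 (ko) | | Method and system for rating augmentation and item recommendation based on generative adversarial networks |
| US20180024989A1 (en) | | Automated building and sequencing of a storyline and scenes, or sections, included therein |
| US20160086498A1 (en) | | Recommending a Set of Learning Activities Based on Dynamic Learning Goal Adaptation |
| US12079289B2 (en) | | Recommending content to subscribers |
| KR20200046189A (ko) | | Method and system for collaborative filtering based on generative adversarial networks |
| Leike | | Nonparametric general reinforcement learning |
| JP2020087103A (ja) | | Learning method, computer program, classifier, and generator |
| CN114930317A (zh) | | Graph convolutional networks for video grounding |
| US10537801B2 (en) | | System and method for decision making in strategic environments |
| CN114119078A (zh) | | Target resource determination method, apparatus, electronic device, and medium |
| CN114138954B (zh) | | User consultation question recommendation method, system, computer device, and storage medium |
| KR102549937B1 (ko) | | Apparatus and method for providing a model for analyzing a user's interior style based on SNS text |
| CN114118411A (zh) | | Training method for an image recognition network, image recognition method, and apparatus |
| CN116402138A (zh) | | Temporal knowledge graph reasoning method and system with multi-granularity history aggregation |
| CN112699203B (zh) | | Method and apparatus for processing road network data |
| CN116955808B (zh) | | Game recommendation method, apparatus, electronic device, and medium |
| CN120821823A (zh) | | Language model-based task processing method and apparatus |
| CN119072696A (zh) | | Training a neural network system to perform multiple machine learning tasks |
| US12462200B1 (en) | | Accelerated training of a machine learning model |
| KR20210000181A (ko) | | Game data processing method |
| CN110347916A (zh) | | Cross-scenario item recommendation method, apparatus, electronic device, and storage medium |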
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2022-05-18 | RD04 | Notification of resignation of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7424 |
| 2022-12-02 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2023-01-20 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2023-12-13 | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007 |
| | TRDD | Decision of grant or rejection written | |
| 2024-01-30 | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
| 2024-02-13 | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7438336; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |