JP7347544B2 - Policy learning method, policy learning device, and program - Google Patents
Policy learning method, policy learning device, and program
- Publication number
- JP7347544B2
- Authority
- JP
- Japan
- Prior art keywords
- state
- learning
- value
- graph
- behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- User Interface Of Digital Computer (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/001500 WO2021144963A1 (ja) | 2020-01-17 | 2020-01-17 | Policy learning method, policy learning device, and program |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JPWO2021144963A1 | 2021-07-22 |
| JPWO2021144963A5 | 2022-08-23 |
| JP7347544B2 (ja) | 2023-09-20 |
Family
ID=76864131
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021570601A Active JP7347544B2 (ja) | 2020-01-17 | 2020-01-17 | Policy learning method, policy learning device, and program |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20230023899A1 |
| JP (1) | JP7347544B2 |
| WO (1) | WO2021144963A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114005014B (zh) * | 2021-12-23 | 2022-06-17 | 杭州华鲤智能科技有限公司 | Model training and social interaction strategy optimization method |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190357520A1 (en) | 2018-05-24 | 2019-11-28 | Blue River Technology Inc. | Boom sprayer including machine feedback control |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3696737B1 (en) * | 2016-11-03 | 2022-08-31 | Deepmind Technologies Limited | Training action selection neural networks |
| EP4273757A3 (en) * | 2017-06-05 | 2024-02-14 | DeepMind Technologies Limited | Selecting actions using multi-modal inputs |
| US10935982B2 (en) * | 2017-10-04 | 2021-03-02 | Huawei Technologies Co., Ltd. | Method of selection of an action for an object using a neural network |
- 2020-01-17: JP application JP2021570601A, patent JP7347544B2 (ja), status Active
- 2020-01-17: WO application PCT/JP2020/001500, publication WO2021144963A1 (ja), status Ceased
- 2020-01-17: US application US17/790,574, publication US20230023899A1 (en), status Pending
Non-Patent Citations (1)
| Title |
|---|
| SEGLER, Marwin H.S., World Programs for Model-Based Learning and Planning in Compositional State and Action Spaces, arXiv:1912.13007v1, version v1, [online], arXiv (Cornell University), 2019-12-30, pages 1-6, [retrieved on 2020-03-18], Retrieved from the Internet: <URL: https://arxiv.org/abs/1912.13007v1> |
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2021144963A1 | 2021-07-22 |
| US20230023899A1 (en) | 2023-01-26 |
| WO2021144963A1 (ja) | 2021-07-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12591415B2 (en) | | Intelligent and predictive modules for software development and coding using artificial intelligence and machine learning |
| US10528349B2 (en) | | Branch synthetic generation across multiple microarchitecture generations |
| CN114355793B (zh) | | Training method and device for an autonomous driving planning model for vehicle simulation evaluation |
| CN109491494A (zh) | | Power parameter adjustment method and device, and reinforcement learning model training method |
| JP7822521B2 (ja) | | Automated design of GaN distributed RF power amplifiers using deep reinforcement learning |
| Gao et al. | | A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence |
| Lorig | | Hypothesis-Driven Simulation Studies |
| CN120031099A (zh) | | Training method, device, and storage medium for a graphical interface agent |
| JP7347544B2 (ja) | | Policy learning method, policy learning device, and program |
| Kopsick et al. | | Formation and retrieval of cell assemblies in a biologically realistic spiking neural network model of area CA3 in the mouse hippocampus |
| CN118092764B (zh) | | Agent action control method and device guided by a large language model |
| WO2024180789A1 (ja) | | Information processing device, information processing method, and program |
| Chandrasiri et al. | | Multi-Objective Cloud Workflow Scheduling with Two-Stage Deep Reinforcement Learning and Graph Neural Networks |
| Xu et al. | | Beyond Token-Level Policy Gradients for Complex Reasoning with Large Language Models |
| Hage et al. | | BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning |
| CN119967614B (zh) | | Operating system configuration tuning method and device for multi-resource joint scheduling |
| Moreno-Palancas et al. | | Handling discrete decisions in bilevel optimization via neural network embeddings |
| Gartner | | Adaptive Reinforcement Learning and Microstructure-Aware Optimization in Foreign Exchange Markets: A Unified Framework for Algorithmic Trading and Risk-Aware Execution |
| CN120746653B (zh) | | Automatic advertisement bidding method and system based on predictive placement planning |
| KR20250083782A (ko) | | Electronic device for reinforcement learning related to medical data and operating method thereof |
| Sha et al. | | World Model Aided Parameter Adjustment Decision and Evaluation System for Radio Access Network |
| Rodrigues | | The Lean Startup Methodology: A Catalyst for Semiconductor Innovation |
| CN118297687A (zh) | | Intelligent decision model, method, device, and medium for on-chain governance of DeFi protocols |
| Moolawi | | Reinforcement learning for optimizing regression testing |
| Stanev | | Quantum Photonic Architectures and Machine Learning |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2022-06-23 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 2022-06-23 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 2023-05-30 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2023-07-24 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 2023-08-08 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 2023-08-21 |
| | R151 | Written notification of patent or utility model registration | Ref document number: 7347544; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R151 |