JPWO2021144963A5 - Google Patents
- Publication number
- JPWO2021144963A5 (application JP2021570601A)
- Authority
- JP
- Japan
- Prior art keywords
- state
- action element
- value
- graph
- learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
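| Concept ID | Concept | Category | Score | Section | Count |
|---|---|---|---|---|---|
| 238000000034 | method | Methods | 0.000 | claims | 11 |
| 230000003542 | behavioural effect | Effects | 0.000 | claims | 7 |
| 230000007704 | transition | Effects | 0.000 | claims | 4 |
| 230000010365 | information processing | Effects | 0.000 | claims | 1 |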
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2020/001500 WO2021144963A1 (ja) | 2020-01-17 | 2020-01-17 | Policy learning method, policy learning device, and program |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JPWO2021144963A1 | 2021-07-22 |
| JPWO2021144963A5 | 2022-08-23 |
| JP7347544B2 (ja) | 2023-09-20 |
Family
ID=76864131
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2021570601A Active JP7347544B2 (ja) | Policy learning method, policy learning device, and program | 2020-01-17 | 2020-01-17 |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20230023899A1 |
| JP (1) | JP7347544B2 |
| WO (1) | WO2021144963A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114005014B (zh) * | 2021-12-23 | 2022-06-17 | 杭州华鲤智能科技有限公司 | Model training and social interaction strategy optimization method |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3696737B1 (en) * | 2016-11-03 | 2022-08-31 | Deepmind Technologies Limited | Training action selection neural networks |
| EP4273757A3 (en) * | 2017-06-05 | 2024-02-14 | DeepMind Technologies Limited | Selecting actions using multi-modal inputs |
| US10935982B2 (en) * | 2017-10-04 | 2021-03-02 | Huawei Technologies Co., Ltd. | Method of selection of an action for an object using a neural network |
| AU2019272876B2 (en) * | 2018-05-24 | 2021-12-16 | Blue River Technology Inc. | Boom sprayer including machine feedback control |
2020
- 2020-01-17: JP application JP2021570601A, patent JP7347544B2 (ja), status: Active
- 2020-01-17: WO application PCT/JP2020/001500, publication WO2021144963A1 (ja), status: Not active (Ceased)
- 2020-01-17: US application US17/790,574, publication US20230023899A1 (en), status: Pending
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
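| CN108431832B (zh) | Augmenting neural networks using external memory | |
| KR101932835B1 (ko) | Action determination device and method, and computer-readable storage medium | |
| JP6669897B2 (ja) | Reinforcement learning using advantage estimates | |
| JP6724870B2 (ja) | Training method, training program, and training device for an artificial neural network circuit | |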
| Knox et al. | Reinforcement learning from human reward: Discounting in episodic tasks | |
| AU2014101627A4 (en) | Computer-implemented frameworks and methodologies for generating, delivering and managing adaptive tutorials | |
| Kusmierczyk et al. | On the causal effect of badges | |
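| CN111079895A (zh) | Enhancing neural networks that have external memory | |
| CN114730306A (zh) | Data-augmented training of reinforcement learning software agents | |
| CN111047482A (zh) | Knowledge tracing system and method based on a hierarchical memory network | |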
| US20150018060A1 (en) | System and method for decision making in strategic environments | |
| CN114981820A (zh) | System and method for evaluating and selectively distilling machine learning models on edge devices | |
| Kumaraswamy et al. | Context-dependent upper-confidence bounds for directed exploration | |
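| JP2019164753A5 | | |
| JPWO2021144963A5 | | |
| CN115204387B (zh) | Learning method, apparatus, and electronic device under hierarchical goal conditions | |
| JPWO2021064766A5 | | |
| CN107977909A (zh) | Learning planning method and learning planning system with a mechanism for automatically generating personalized learning paths | |
| JP2021082014A5 | | |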
| Wang | An I-POMDP based multi-agent architecture for dialogue tutoring | |
| US10832585B2 (en) | Reading progress indicator | |
| Khanmirza et al. | Identification of linear and non-linear physical parameters of multistory shear buildings using artificial neural network | |
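| JPWO2024069871A5 (ja) | Evaluation system, information processing system, evaluation method, and program | |
| KR102590791B1 (ko) | Uncertainty-conditioned deep reinforcement learning method and processing device therefor | |
| TWI655604B (zh) | Learning planning method and learning planning system | |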