JPWO2021144963A5 - Policy learning method, policy learning device, and program - Google Patents

Publication number
JPWO2021144963A5
Authority
JP
Japan
Prior art keywords
state
action element
value
graph
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2021570601A
Other languages
English (en)
Japanese (ja)
Other versions
JPWO2021144963A1
JP7347544B2 (ja)
Filing date
Publication date
Application filed
Priority claimed from PCT/JP2020/001500 (WO2021144963A1)
Publication of JPWO2021144963A1
Publication of JPWO2021144963A5
Application granted
Publication of JP7347544B2
Current legal status: Active
Anticipated expiration

JP2021570601A 2020-01-17 2020-01-17 Policy learning method, policy learning device, and program Active JP7347544B2 (ja)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/001500 WO2021144963A1 (ja) 2020-01-17 2020-01-17 Policy learning method, policy learning device, and program

Publications (3)

Publication Number Publication Date
JPWO2021144963A1 2021-07-22
JPWO2021144963A5 2022-08-23
JP7347544B2 2023-09-20

Family

ID=76864131

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2021570601A Active JP7347544B2 (ja) 2020-01-17 2020-01-17 Policy learning method, policy learning device, and program

Country Status (3)

Country Link
US (1) US20230023899A1
JP (1) JP7347544B2
WO (1) WO2021144963A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005014B (zh) * 2021-12-23 2022-06-17 杭州华鲤智能科技有限公司 Model training and social interaction strategy optimization method

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
EP3696737B1 (en) * 2016-11-03 2022-08-31 Deepmind Technologies Limited Training action selection neural networks
EP4273757A3 (en) * 2017-06-05 2024-02-14 DeepMind Technologies Limited Selecting actions using multi-modal inputs
US10935982B2 (en) * 2017-10-04 2021-03-02 Huawei Technologies Co., Ltd. Method of selection of an action for an object using a neural network
AU2019272876B2 (en) * 2018-05-24 2021-12-16 Blue River Technology Inc. Boom sprayer including machine feedback control

Similar Documents

Publication Publication Date Title
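CN108431832B (zh) Augmenting neural networks with external memory
KR101932835B1 (ko) Action determination apparatus and method, and computer-readable storage medium
JP6669897B2 (ja) Reinforcement learning using advantage estimates
JP6724870B2 (ja) Training method, training program, and training device for an artificial neural network circuit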
Knox et al. Reinforcement learning from human reward: Discounting in episodic tasks
AU2014101627A4 (en) Computer-implemented frameworks and methodologies for generating, delivering and managing adaptive tutorials
Kusmierczyk et al. On the causal effect of badges
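CN111079895A (zh) Augmenting neural networks with external memory
CN114730306A (zh) Data-augmented training of reinforcement learning software agents
CN111047482A (zh) Knowledge tracing system and method based on a hierarchical memory network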
US20150018060A1 (en) System and method for decision making in strategic environments
CN114981820A (zh) Systems and methods for evaluating and selectively distilling machine learning models on edge devices
Kumaraswamy et al. Context-dependent upper-confidence bounds for directed exploration
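JP2019164753A5
JPWO2021144963A5
CN115204387B (zh) Learning method and apparatus under hierarchical goal conditions, and electronic device
JPWO2021064766A5
CN107977909A (zh) Learning planning method and learning planning system with a mechanism for automatically generating personalized learning paths
JP2021082014A5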
Wang An I-POMDP based multi-agent architecture for dialogue tutoring
US10832585B2 (en) Reading progress indicator
Khanmirza et al. Identification of linear and non-linear physical parameters of multistory shear buildings using artificial neural network
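JPWO2024069871A5 (ja) Evaluation system, information processing system, evaluation method, and program
KR102590791B1 (ko) Uncertainty-conditioned deep reinforcement learning method and processing device therefor
TWI655604B (zh) Learning planning method and learning planning system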