JP7529144B2 - Learning device, learning method, and learning program - Google Patents
Learning device, learning method, and learning program
- Publication number
- JP7529144B2 (application JP2023516874A)
- Authority
- JP
- Japan
- Prior art keywords
- log
- likelihood
- lower limit
- reward function
- trajectory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Probability & Statistics with Applications (AREA)
- Algebra (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/016630 WO2022230019A1 (ja) | 2021-04-26 | 2021-04-26 | Learning device, learning method, and learning program |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JPWO2022230019A1 | 2022-11-03 |
| JPWO2022230019A5 | 2024-01-04 |
| JP7529144B2 (ja) | 2024-08-06 |
Family
ID=83846792
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023516874A, Active, JP7529144B2 (ja) | 2021-04-26 | 2021-04-26 | Learning device, learning method, and learning program |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240211767A1 |
| EP (1) | EP4332845A4 |
| JP (1) | JP7529144B2 |
| WO (1) | WO2022230019A1 |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
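| JP7815840B2 (ja) * | 2022-02-22 | 2026-02-18 | Fujitsu Limited | Function generation program, function generation device, control device, and function generation method |
| CN119045292A (zh) * | 2024-10-31 | 2024-11-29 | Zhejiang University | An inverse lithography method based on a multilayer perceptron |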
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP7315007B2 (ja) * | 2019-08-29 | 2023-07-26 | NEC Corporation | Learning device, learning method, and learning program |
-
2021
- 2021-04-26 JP JP2023516874A patent/JP7529144B2/ja active Active
- 2021-04-26 EP EP21939182.8A patent/EP4332845A4/en not_active Withdrawn
- 2021-04-26 WO PCT/JP2021/016630 patent/WO2022230019A1/ja not_active Ceased
- 2021-04-26 US US18/287,546 patent/US20240211767A1/en active Pending
Non-Patent Citations (5)
| Title |
|---|
| GANGWANI, Tanmay et al., STATE-ONLY IMITATION WITH TRANSITION DYNAMICS MISMATCH, [online], arXiv, 2020-02-27, pp. 1-17, [retrieved 2021-06-24], Internet: <URL:https://arxiv.org/pdf/2002.11879.pdf> |
| XIAO, Huang et al., Wasserstein Adversarial Imitation Learning, [online], arXiv, 2019-06-19, pp. 1-18, [retrieved 2021-06-24], Internet: <URL:https://arxiv.org/pdf/1906.08113.pdf> |
| ZHANG, Ming et al., Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration, [online], arXiv, 2020-12-08, [retrieved 2021-06-24], Internet: <URL:https://arxiv.org/pdf/2006.03503.pdf> |
| 中口悠輝 and 2 others, Construction of an inverse reinforcement dynamics learning framework based on the maximum entropy principle, 2019 Annual Conference of the Japanese Society for Artificial Intelligence (33rd), 2019-06-07, pp. 1-4 |
| 今井拓司, Imitation learning that reveals experts' intentions through inverse reinforcement learning: NEC moves from one-shot decision-making problems to practical use, NIKKEI Robotics, Japan, Nikkei BP, 2019-09-10, No. 51, pp. 22-26 |
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2022230019A1 | 2022-11-03 |
| EP4332845A1 (en) | 2024-03-06 |
| EP4332845A4 (en) | 2024-06-12 |
| WO2022230019A1 (ja) | 2022-11-03 |
| US20240211767A1 (en) | 2024-06-27 |
Similar Documents
| Publication | Title |
|---|---|
| US20190324822A1 (en) | Deep Reinforcement Learning for Workflow Optimization Using Provenance-Based Simulation |
| TWI620075B (zh) | Server for a cloud big data computing architecture and cloud computing resource optimization method thereof |
| US9785468B2 (en) | Finding resource bottlenecks with low-frequency sampled data |
| JP7402482B2 (ja) | Pre-stage co-simulation method, device, computer-readable medium, and program |
| JP7529144B2 (ja) | Learning device, learning method, and learning program |
| JP5845812B2 (ja) | Policy scheduling for efficient parallelization of software analysis in a distributed computing environment |
| CN108664378A (zh) | A method for optimizing the shortest execution time of microservices |
| Hellwig et al. | Evolution under strong noise: A self-adaptive evolution strategy can reach the lower performance bound - the pcCMSA-ES |
| Zhai et al. | Deep Q-learning with prioritized sampling |
| JP7687041B2 (ja) | Satellite observation planning system, satellite observation planning method, and satellite observation planning program |
| KR20230059508A (ko) | Job scheduler monitoring method, and device and system for performing the same |
| JP2020126511A (ja) | Optimization device, method, and program |
| CN117157625A (zh) | Intelligent identification of an execution environment |
| JP7537517B2 (ja) | Learning device, learning method, and learning program |
| Khebbeb et al. | Formalizing and simulating cross-layer elasticity strategies in Cloud systems |
| Chen et al. | Self‐triggered control for linear systems based on hierarchical reinforcement learning |
| Giortamis et al. | Qonductor: A Cloud Orchestrator for Quantum Computing |
| JP2005049922A (ja) | Evaluation system for job execution plans |
| KR101383225B1 (ko) | Performance analysis method for one or more execution units, performance analysis device, and computer-readable recording medium storing a program for performing the performance analysis method |
| CN117839220A (zh) | Game engine optimization method and device based on reinforcement learning |
| CN117873718A (zh) | A DAG high-performance computing method |
| US11874836B2 (en) | Configuring graph query parallelism for high system throughput |
| US7505886B1 (en) | Technique for programmatically obtaining experimental measurements for model construction |
| Xie et al. | Task scheduling in heterogeneous computing systems based on machine learning approach |
| KR101621280B1 (ko) | Worst-case response time analysis method and computer program |
Legal Events
| Code | Title | Description |
|---|---|---|
| A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20230928 |
| A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20230928 |
| TRDD | Decision of grant or rejection written | |
| A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20240625 |
| A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20240708 |
| R150 | Certificate of patent or registration of utility model | Ref document number: 7529144; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |