CN114365157A - States simulator for reinforcement learning models - Google Patents

States simulator for reinforcement learning models

Info

Publication number
CN114365157A
CN114365157A CN202080063367.1A CN202080063367A CN114365157A CN 114365157 A CN114365157 A CN 114365157A CN 202080063367 A CN202080063367 A CN 202080063367A CN 114365157 A CN114365157 A CN 114365157A
Authority
CN
China
Prior art keywords
state
feature
actions
subsets
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202080063367.1A
Other languages
English (en)
Chinese (zh)
Inventor
M·马西恩
A·扎多罗杰尼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CN114365157A
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)
CN202080063367.1A 2019-09-12 2020-08-11 States simulator for reinforcement learning models Withdrawn CN114365157A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/568,284 2019-09-12
US16/568,284 US11574244B2 (en) 2019-09-12 2019-09-12 States simulator for reinforcement learning models
PCT/EP2020/072487 WO2021047842A1 (en) 2019-09-12 2020-08-11 States simulator for reinforcement learning models

Publications (1)

Publication Number Publication Date
CN114365157A (zh) 2022-04-15

Family

ID=72050874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080063367.1A Withdrawn CN114365157A (zh) States simulator for reinforcement learning models

Country Status (5)

Country Link
US (1) US11574244B2
EP (1) EP4028959A1
JP (1) JP7438336B2
CN (1) CN114365157A
WO (1) WO2021047842A1

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
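KR102338304B1 (ko) * 2020-10-20 2021-12-13 주식회사 뉴로코어 Factory simulator-based scheduling system using reinforcement learning
CN115617796A (zh) * 2022-10-12 2023-01-17 中电智元数据科技有限公司 Distributed database index selection method
CN118837737B (zh) * 2024-06-27 2025-09-09 西安交通大学 Underwater propulsion motor fault diagnosis method, apparatus, device, and storage medium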

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8918866B2 (en) * 2009-06-29 2014-12-23 International Business Machines Corporation Adaptive rule loading and session control for securing network delivered services
JP2013242761A (ja) * 2012-05-22 2013-12-05 Internatl Business Mach Corp <Ibm> Method for updating policy parameters in a Markov decision process system environment, and controller and control program therefor
US9128739B1 (en) * 2012-12-31 2015-09-08 Emc Corporation Determining instances to maintain on at least one cloud responsive to an evaluation of performance characteristics
US20160260024A1 (en) * 2015-03-04 2016-09-08 Qualcomm Incorporated System of distributed planning
US10540598B2 (en) 2015-09-09 2020-01-21 International Business Machines Corporation Interpolation of transition probability values in Markov decision processes
CN108701252B (zh) 2015-11-12 2024-02-02 渊慧科技有限公司 Training neural networks using a prioritized experience memory
US10839302B2 (en) 2015-11-24 2020-11-17 The Research Foundation For The State University Of New York Approximate value iteration with complex returns by bounding
CN108230057A (zh) * 2016-12-09 2018-06-29 阿里巴巴集团控股有限公司 Intelligent recommendation method and system
US20180342004A1 (en) * 2017-05-25 2018-11-29 Microsoft Technology Licensing, Llc Cumulative success-based recommendations for repeat users
WO2020005240A1 (en) * 2018-06-27 2020-01-02 Google Llc Adapting a sequence model for use in predicting future device interactions with a computing system
US10963313B2 (en) * 2018-08-27 2021-03-30 Vmware, Inc. Automated reinforcement-learning-based application manager that learns and improves a reward function
US11468322B2 (en) * 2018-12-04 2022-10-11 Rutgers, The State University Of New Jersey Method for selecting and presenting examples to explain decisions of algorithms
EP3776347B1 (en) * 2019-06-17 2025-07-02 Google LLC Vehicle occupant engagement using three-dimensional eye gaze vectors

Also Published As

Publication number Publication date
JP7438336B2 (ja) 2024-02-26
JP2022547529A (ja) 2022-11-14
EP4028959A1 (en) 2022-07-20
WO2021047842A1 (en) 2021-03-18
US11574244B2 (en) 2023-02-07
US20210081829A1 (en) 2021-03-18

Similar Documents

Publication Publication Date Title
US11235248B1 (en) Online behavior using predictive analytics
Wirsansky Hands-On Genetic Algorithms with Python: Applying genetic algorithms to solve real-world deep learning and artificial intelligence problems
CN109902706B (zh) Recommendation method and apparatus
US20180024989A1 (en) Automated building and sequencing of a storyline and scenes, or sections, included therein
US20230394328A1 (en) Prompting Machine-Learned Models Using Chains of Thought
US9908052B2 (en) Creating dynamic game activities for games
KR102203253B1 (ko) Method and system for rating augmentation and item recommendation based on generative adversarial networks
US11238339B2 (en) Predictive neural network with sentiment data
US20160086498A1 (en) Recommending a Set of Learning Activities Based on Dynamic Learning Goal Adaptation
US12346828B2 (en) Image analysis by prompting of machine-learned models using chain of thought
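CN114365157A (zh) States simulator for reinforcement learning models
CN113457167A (zh) Training method for a user classification network, user classification method, and apparatus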
Coelho et al. Building Machine Learning Systems with Python: Explore machine learning and deep learning techniques for building intelligent systems using scikit-learn and TensorFlow
US10213689B2 (en) Method and system modeling social identity in digital media with dynamic group membership
US11275994B2 (en) Unstructured key definitions for optimal performance
US10537801B2 (en) System and method for decision making in strategic environments
US11458397B1 (en) Automated real-time engagement in an interactive environment
Jagtap et al. Uncertainty-based decision support system for gaming applications
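CN112699203A (zh) Method and apparatus for processing road network data
KR102527019B1 (ko) Method and apparatus for providing influencer marketing by identifying fake accounts on an online integrated repair platform
CN117372691A (zh) Instance segmentation method, apparatus, device, and medium
CN119072696A (zh) Training a neural network system to perform multiple machine learning tasks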
US20250242261A1 (en) Systems and methods for video game design and engagement tracking
KR102885596B1 (ko) Method and system for generating training data for large language models
US20250242260A1 (en) System and method for monitoring and optimizing player engagement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220415