RU2019119314A - METHOD AND SYSTEM FOR MACHINE LEARNING OF HIERARCHICALLY ORGANIZED PURPOSEFUL BEHAVIOR


Info

Publication number
RU2019119314A
Authority
RU
Russia
Prior art keywords
level
hierarchy
control signals
external environment
behavior
Prior art date
Application number
RU2019119314A
Other languages
Russian (ru)
Other versions
RU2755935C2 (en)
RU2019119314A3 (en)
Inventor
Сергей Александрович Шумский
Original Assignee
Сергей Александрович Шумский
Priority date
Filing date
Publication date
Application filed by Сергей Александрович Шумский
Priority to RU2019119314A (RU2755935C2)
Priority to PCT/RU2020/050123 (WO2020256593A1)
Publication of RU2019119314A
Publication of RU2019119314A3
Application granted
Publication of RU2755935C2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Machine Translation (AREA)
  • Computer And Data Communications (AREA)

Claims (12)

1. A computer-implemented method of machine learning of purposeful behavior, comprising the following steps: receiving sensory information from the external environment, including reinforcement signals; and generating control signals so as to maximize the sum of reinforcement signals expected in the future, wherein the control signals are generated in accordance with a hierarchy of mutually consistent nested plans that are created automatically during learning and continually adapted to changing external circumstances.

2. The method according to claim 1, characterized in that the external reinforcement signals are supplemented with internal reinforcements whenever the course of events predicted by the system is realized.

3. The method according to claim 1 or 2, characterized in that the number of hierarchy levels is increased gradually as information about interaction with the external environment accumulates.

4. The method according to any one of claims 1-3, characterized in that the control signals at each level of the hierarchy are chains of elementary discrete actions, that is, behavior patterns of that level, which have the greatest expected total reinforcement, taking into account the statistical uncertainty determined by Thompson sampling of data from the memory of that level.

5. The method according to any one of claims 1-4, characterized in that, at each level of the hierarchy, new behavior patterns are created by adding to memory the most advantageous combinations of already known patterns.

6. A system for learning hierarchical purposeful behavior, comprising at least one processor, computer memory, network infrastructure, and information storage means, configured to perform hierarchical layer-by-layer processing of input sensory information from the lower level (including the external environment as the zero level) and of control signals from the higher level (except at the top level of the hierarchy), to generate control signals for the lower level, and to accumulate experience of interaction with the external environment.

7. The system according to claim 6, characterized in that the number of levels in the information processing hierarchy increases gradually as experience of interaction with the external environment accumulates.

8. The system according to claim 6 and/or 7, characterized in that information processing at each hierarchical level is performed by a set of hardware and software modules operating in parallel and independently of one another.

9. The system according to any one of claims 6-8, characterized in that the system or its individual components are implemented in hardware as specialized integrated circuits of a corresponding architecture.

10. The system according to any one of claims 6-9, characterized in that the system is implemented in a client-server architecture and all units are interconnected by standardized communication channels.
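
The selection and pattern-creation steps described in the method claims (choosing a behavior pattern by Thompson sampling from a level's memory, and adding combinations of known patterns to that memory) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the patent: the names `PatternStats`, `Level`, `choose`, `record`, `grow` and `top_k`, the Gaussian form of the sampled estimate, and the rule that new patterns are concatenations of the patterns with the highest mean reinforcement.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PatternStats:
    """Running reinforcement statistics for one pattern (hypothetical structure)."""
    count: int = 0
    mean: float = 0.0

    def update(self, reward: float) -> None:
        # Incremental mean of the total reinforcement observed for this pattern.
        self.count += 1
        self.mean += (reward - self.mean) / self.count

    def sample(self, rng: random.Random) -> float:
        # Assumed Gaussian estimate; its spread shrinks as observations accumulate.
        return rng.gauss(self.mean, 1.0 / (self.count + 1) ** 0.5)


@dataclass
class Level:
    """One hierarchy level: memory maps a pattern (tuple of actions) to its statistics."""
    memory: dict = field(default_factory=dict)

    def record(self, pattern: tuple, reward: float) -> None:
        self.memory.setdefault(pattern, PatternStats()).update(reward)

    def choose(self, rng: random.Random) -> tuple:
        # Thompson sampling: draw one value per pattern and act on the largest draw.
        return max(self.memory, key=lambda p: self.memory[p].sample(rng))

    def grow(self, top_k: int = 2) -> list:
        # Assumed creation rule: concatenate the top_k patterns by mean reinforcement.
        best = sorted(self.memory, key=lambda p: self.memory[p].mean, reverse=True)[:top_k]
        candidates = dict.fromkeys(a + b for a in best for b in best)
        new = [p for p in candidates if p not in self.memory]
        for p in new:
            self.memory[p] = PatternStats()
        return new
```

In this sketch, a caller would alternate `choose`, execution of the chosen pattern in the environment, and `record` with the observed reinforcement, calling `grow` occasionally to extend the memory. Patterns with few observations have a wide sampled estimate and are therefore still tried from time to time, which is the exploration effect Thompson sampling is generally used for.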
RU2019119314A 2019-06-20 2019-06-20 Method and system for machine learning of hierarchically organized purposeful behavior RU2755935C2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
RU2019119314A RU2755935C2 (en) 2019-06-20 2019-06-20 Method and system for machine learning of hierarchically organized purposeful behavior
PCT/RU2020/050123 WO2020256593A1 (en) 2019-06-20 2020-06-16 Method and system for the machine learning of a hierarchically organized target behaviour

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
RU2019119314A RU2755935C2 (en) 2019-06-20 2019-06-20 Method and system for machine learning of hierarchically organized purposeful behavior

Publications (3)

Publication Number Publication Date
RU2019119314A (en) 2020-12-21
RU2019119314A3 (en) 2020-12-21
RU2755935C2 (en) 2021-09-23

Family

ID=74040629

Family Applications (1)

Application Number Title Priority Date Filing Date
RU2019119314A RU2755935C2 (en) 2019-06-20 2019-06-20 Method and system for machine learning of hierarchically organized purposeful behavior

Country Status (2)

Country Link
RU (1) RU2755935C2 (en)
WO (1) WO2020256593A1 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7346009B2 (en) * 2002-09-30 2008-03-18 Mosaid Technologies, Inc. Dense mode coding scheme
US8856138B1 (en) * 2012-08-09 2014-10-07 Google Inc. Faster substring searching using hybrid range query data structures
US10796335B2 (en) * 2015-10-08 2020-10-06 Samsung Sds America, Inc. Device, method, and computer readable medium of generating recommendations via ensemble multi-arm bandit with an LPBoost
US11010664B2 (en) * 2016-02-05 2021-05-18 Deepmind Technologies Limited Augmenting neural networks with hierarchical external memory
RU2670781C9 (en) * 2017-03-23 2018-11-23 Илья Николаевич Логинов System and method for data storage and processing
US10387298B2 (en) * 2017-04-04 2019-08-20 Hailo Technologies Ltd Artificial neural network incorporating emphasis and focus techniques
WO2020013726A1 (en) * 2018-07-13 2020-01-16 Публичное Акционерное Общество "Сбербанк России" Method for interpreting artificial neural networks

Also Published As

Publication number Publication date
RU2755935C2 (en) 2021-09-23
WO2020256593A1 (en) 2020-12-24
RU2019119314A3 (en) 2020-12-21
