RU2019119314A

RU2019119314A - METHOD AND SYSTEM FOR MACHINE LEARNING OF HIERARCHICALLY ORGANIZED PURPOSEFUL BEHAVIOR

Info

Publication number: RU2019119314A
Application number: RU2019119314A
Authority: RU
Inventors: Сергей Александрович Шумский
Original assignee: Сергей Александрович Шумский
Priority date: 2019-06-20
Filing date: 2019-06-20
Publication date: 2020-12-21
Also published as: RU2755935C2; WO2020256593A1; RU2019119314A3

Claims

1. A computer-implemented method for machine learning of purposeful behavior, comprising the following steps:

receiving sensory information from the external environment, including reinforcement signals, and

generating control signals so as to maximize the sum of reinforcement signals expected in the future, wherein the control signals are generated in accordance with a hierarchy of coordinated nested plans that are created automatically during learning and continually adapted to changing external circumstances.
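Illustrative sketch (not part of the claims): the quantity that claim 1 seeks to maximize, the sum of reinforcement signals received from a given time step onward, can be computed for a recorded sequence as shown below. The function name and the `discount` parameter are assumptions introduced for illustration; the claim itself mentions no discounting, so the default of 1.0 gives a plain sum.

```python
from typing import List


def future_reinforcement_sums(reinforcements: List[float],
                              discount: float = 1.0) -> List[float]:
    """For each time step t, return the (optionally discounted) sum of
    reinforcement signals received from step t to the end of the sequence.

    `discount` is a hypothetical parameter; 1.0 yields an undiscounted sum.
    """
    sums = [0.0] * len(reinforcements)
    running = 0.0
    # Walk backwards so each step reuses the sum already computed for t + 1.
    for t in range(len(reinforcements) - 1, -1, -1):
        running = reinforcements[t] + discount * running
        sums[t] = running
    return sums
```

In a learning system these sums would be estimated in expectation rather than read off a finished sequence; the sketch only fixes what is being summed.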

2. The method according to claim 1, characterized in that the external reinforcement signals are supplemented with internal reinforcements whenever the course of events predicted by the system is realized.
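Illustrative sketch (not part of the claims): one simple reading of claim 2 is that a fixed internal bonus is added to the external reinforcement whenever the observed event matches the predicted one. The function name, the string encoding of events, and the `internal_bonus` value are all assumptions made for this example.

```python
def total_reinforcement(external: float,
                        predicted_event: str,
                        observed_event: str,
                        internal_bonus: float = 1.0) -> float:
    """Combine external reinforcement with an internal one.

    The internal reinforcement is granted only when the prediction came true.
    `internal_bonus` is a hypothetical constant, not taken from the patent.
    """
    internal = internal_bonus if predicted_event == observed_event else 0.0
    return external + internal
```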

3. The method according to claim 1 or 2, characterized in that the number of levels of the hierarchy increases gradually as information about interaction with the external environment accumulates.
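Illustrative sketch (not part of the claims): claim 3 does not state the criterion for adding a level. The example below assumes, purely for illustration, that a new empty level is appended once the memory of the current top level holds at least `min_records` entries. Both the function and the threshold are hypothetical.

```python
from typing import List


def maybe_add_level(level_memories: List[list],
                    min_records: int = 100) -> List[list]:
    """Return the list of per-level memories, with a new empty top level
    appended if the current top level has accumulated enough records.

    The threshold rule is an assumption; the claim only says that the
    number of levels grows as interaction data accumulates.
    """
    if level_memories and len(level_memories[-1]) >= min_records:
        return level_memories + [[]]
    return level_memories
```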

4. The method according to any one of claims 1-3, characterized in that the control signals at each level of the hierarchy are chains of elementary discrete actions, that is, behavior patterns of the given level, selected as those with the greatest expected total reinforcement, with the statistical uncertainty taken into account by Thompson sampling of data from the memory of that level.
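Illustrative sketch (not part of the claims): Thompson sampling chooses among options by drawing one random value per option from a distribution that reflects how uncertain its estimate is, then taking the option with the largest draw. The example below assumes that a level's memory maps each behavior pattern (a tuple of action names) to the list of total reinforcements observed after using it, and that uncertainty is modeled by a Gaussian whose standard deviation shrinks as `prior_std / sqrt(n + 1)`. The memory layout, the Gaussian model, and all names are assumptions; the patent does not specify them.

```python
import math
import random
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def thompson_select(memory: Dict[Pattern, List[float]],
                    rng: random.Random,
                    prior_std: float = 1.0) -> Pattern:
    """Pick the pattern whose sampled expected reinforcement is largest."""
    if not memory:
        raise ValueError("memory must contain at least one pattern")
    best_pattern, best_sample = None, -math.inf
    for pattern, returns in memory.items():
        n = len(returns)
        mean = sum(returns) / n if n else 0.0
        # Fewer observations -> wider distribution -> more exploration.
        std = prior_std / math.sqrt(n + 1)
        sample = rng.gauss(mean, std)
        if sample > best_sample:
            best_pattern, best_sample = pattern, sample
    return best_pattern
```

Under these assumptions a well-tested pattern with a high average is chosen almost always, while a pattern with little data is still tried from time to time.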

5. The method according to any one of claims 1-4, characterized in that, at each level of the hierarchy, new behavior patterns are created by adding to the memory the most advantageous combinations of already known patterns.
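Illustrative sketch (not part of the claims): one way to read "most advantageous combinations" is to concatenate pairs of known patterns that occurred back to back and keep the pairs with the highest summed reinforcement. The episode format, the scoring rule, and the `top_k` parameter are assumptions introduced for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Pattern = Tuple[str, ...]


def add_best_combinations(known: Set[Pattern],
                          episodes: Sequence[Tuple[List[Pattern], float]],
                          top_k: int = 1) -> Set[Pattern]:
    """Return `known` extended with the `top_k` best pairwise combinations.

    Each episode is (sequence of patterns used, total reinforcement obtained).
    A combination's score is the summed reinforcement of the episodes in
    which its two parts appeared consecutively (a hypothetical criterion).
    """
    totals: Dict[Pattern, float] = defaultdict(float)
    for sequence, reinforcement in episodes:
        for first, second in zip(sequence, sequence[1:]):
            if first in known and second in known:
                totals[first + second] += reinforcement
    candidates = [(p, score) for p, score in totals.items() if p not in known]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return known | {p for p, _ in candidates[:top_k]}
```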

6. A system for learning hierarchically organized purposeful behavior, comprising at least one processor, computer memory, network infrastructure, and information storage means, and configured to perform hierarchical layer-by-layer processing of input sensory information arriving from the level below (the external environment being level zero) and of control signals arriving from the level above (except at the top level of the hierarchy), to generate control signals for the level below, and to accumulate experience of interacting with the external environment.
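Illustrative sketch (not part of the claims): the layered data flow of claim 6 can be pictured as a bottom-up pass, in which each level receives sensory input from the level below, followed by a top-down pass, in which each level receives a control signal from the level above (none at the top) and issues one to the level below. The `Level` class, the string encoding of signals, and the "explore" default are all hypothetical placeholders; real levels would learn their mappings.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Level:
    index: int
    # Accumulated experience: (sensory input, control received from above).
    experience: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def abstract(self, sensory: str) -> str:
        """Produce the sensory input passed to the next level up."""
        return f"L{self.index}({sensory})"

    def control(self, sensory: str, from_above: Optional[str]) -> str:
        """Record experience and emit a control signal for the level below."""
        self.experience.append((sensory, from_above))
        goal = from_above if from_above is not None else "explore"
        return f"L{self.index}:{goal}"


def run_hierarchy(levels: List[Level], observation: str) -> str:
    """Run one cycle; `levels[0]` sits directly above the environment."""
    if not levels:
        raise ValueError("at least one level is required")
    inputs, signal = [], observation
    for level in levels:                      # bottom-up pass
        inputs.append(signal)
        signal = level.abstract(signal)
    command: Optional[str] = None             # the top level has no superior
    for level, sensory in zip(reversed(levels), reversed(inputs)):
        command = level.control(sensory, command)   # top-down pass
    return command                            # control signal for level zero
```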

7. The system according to claim 6, characterized in that the number of levels of the information processing hierarchy increases gradually as the experience of interaction with the external environment accumulates.

8. The system according to claim 6 or 7, characterized in that the information processing at each hierarchical level is performed by a set of software and hardware modules operating in parallel and independently of one another.

9. The system according to any one of claims 6-8, characterized in that the system or its individual components are implemented in hardware in the form of specialized integrated circuits of a corresponding architecture.

10. The system according to any one of claims 6-9, characterized in that the system is implemented in a client-server architecture and all units are interconnected by standardized communication channels.