US20200230813A1 - Methods for establishing and utilizing sensorimotor programs - Google Patents

Methods for establishing and utilizing sensorimotor programs

Info

Publication number
US20200230813A1
US20200230813A1 US16/840,210 US202016840210A
Authority
US
United States
Prior art keywords
sensorimotor
concept
program
programs
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/840,210
Inventor
David Scott Phoenix
Michael Stark
Nicholas Hay
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vicarious FPC Inc
Intrinsic Innovation LLC
Intrinsic I LLC
Original Assignee
Vicarious FPC Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vicarious FPC Inc filed Critical Vicarious FPC Inc
Priority to US16/840,210 priority Critical patent/US20200230813A1/en
Assigned to VICARIOUS FPC, INC. reassignment VICARIOUS FPC, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: STARK, MICHAEL, Hay, Nicholas, PHOENIX, DAVID SCOTT
Publication of US20200230813A1 publication Critical patent/US20200230813A1/en
Assigned to LLC, INTRINSIC I reassignment LLC, INTRINSIC I ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BOSTON POLARIMETRICS, INC., VICARIOUS FPC, INC
Assigned to INTRINSIC INNOVATION LLC reassignment INTRINSIC INNOVATION LLC CORRECTIVE ASSIGNMENT TO CORRECT THE THE RECEIVING PARTY NAME PREVIOUSLY RECORDED AT REEL: 060389 FRAME: 0682. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT. Assignors: BOSTON POLARIMETRICS, INC., VICARIOUS FPC, INC.
Abandoned legal-status Critical Current

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1628Programme controls characterised by the control loop
    • B25J9/163Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1602Programme controls characterised by the control system, structure, architecture
    • B25J9/161Hardware, e.g. neural networks, fuzzy logic, interfaces, processor
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • G05B13/027Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/2193Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/29Graphical models, e.g. Bayesian networks
    • G06F18/295Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G06K9/6265
    • G06K9/6268
    • G06K9/6297
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/008Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0454
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N5/003
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/01Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/84Arrangements for image or video recognition or understanding using pattern recognition or machine learning using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
    • G06V10/85Markov-related models; Markov random fields

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Robotics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Automation & Control Theory (AREA)
  • Mechanical Engineering (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Toys (AREA)

Abstract

A method for establishing sensorimotor programs includes specifying a concept relationship that relates a first concept to a second concept and establishes the second concept as higher-order than the first concept; training a first sensorimotor program to accomplish the first concept using a set of primitive actions; and training a second sensorimotor program to accomplish the second concept using the first sensorimotor program and the set of primitive actions.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. patent application Ser. No. 16/043,146, filed on 23 Jul. 2018, which claims the benefit of U.S. Provisional Application Ser. No. 62/535,703, filed on 21 Jul. 2017, which is incorporated in its entirety by this reference.
  • TECHNICAL FIELD
  • This invention relates generally to the artificial intelligence field, and more specifically to new and useful methods for establishing and utilizing sensorimotor programs.
  • BACKGROUND
  • While computer vision remains a complex problem in artificial intelligence, recent achievements such as the recursive cortical network (RCN) have enabled computers to identify objects from visual data efficiently and with high accuracy. However, just as with human vision, object recognition is only a part of the skillset needed to effectively interact with an environment. Humans may observe how objects interact with each other to infer properties of those objects; for example, by observing how a sphere reacts when dropped onto a hard surface, a human may be able to infer whether a ball is made of rubber, cork, or steel. Often, this observation occurs as a result of direct interaction with the environment; e.g., a human intentionally drops a ball onto a hard surface (or squeezes the ball, etc.) as an alternative to passively waiting for the environment to produce such a situation naturally. This knowledge makes it easier to accurately interpret past events, and likewise, to predict future events. Unfortunately, traditional approaches to computer vision more often embody the approach of the passive observer, which restricts their ability to achieve comprehension of an environment in a complete and generalizable sense. Thus, there is a need in the artificial intelligence field to create new and useful methods for establishing and utilizing sensorimotor programs. This invention provides such new and useful methods.
  • BRIEF DESCRIPTION OF THE FIGURES
  • FIG. 1 is a chart representation of a method of an invention embodiment; and
  • FIG. 2 is a chart representation of a concept hierarchy of a method of an invention embodiment.
  • DESCRIPTION OF THE INVENTION EMBODIMENTS
  • The following description of the invention embodiments is not intended to limit the invention to these embodiments, but rather to enable any person skilled in the art to make and use this invention.
  • 1. Method for Establishing Sensorimotor Programs
  • A method 100 for establishing sensorimotor programs includes specifying a concept relationship S120, training a first sensorimotor program S130, and training a second sensorimotor program using the first sensorimotor program S140, as shown in FIG. 1. The method 100 may additionally or alternatively include generating a sensorimotor training curriculum S110 and/or executing the first and second sensorimotor programs S150.
  • As discussed in the background section, traditional approaches to computer vision often focus on systems and methods that derive information from their environments in a passive manner. For example, a computer vision system for an autonomous vehicle might be trained to distinguish various objects based on their visual characteristics as observed by a camera. While this approach to computer vision is straightforward, it often suffers from two disadvantages. The first is that sensory input is useful for distinguishing objects or environmental states only to the extent that the sensory input differs substantially between those objects and environmental states. For example, a computer that identifies objects based on similarity of appearance may have trouble distinguishing objects that appear to be similar; e.g., such a computer may not be able to distinguish a person from a statue. Some computer vision approaches attempt to solve this issue by collecting more data (e.g., from different sensor types) or by attempting to infer indirectly sensed information (e.g., inferring physical properties of an object from its movement), but these approaches have drawbacks as well. The second issue is that this approach results in poor generalizability; e.g., it can be difficult to determine how to treat a newly detected object based solely on its similarity to trained objects. For example, if a robot is trained to interact with a pen in a certain way, that training may not accurately inform the robot how to interact with a laser pointer (even though pens and laser pointers may look quite similar).
  • To address these problems, some researchers have turned toward models of perception used to describe the behavior exhibited by natural consciousnesses (e.g., those of animals and humans). One such model is the sensorimotor theory of perceptual consciousness. This theory attempts to explain the perception of “feel” as arising from an agent engaging in a particular sensorimotor skill and attending to the fact that they are engaged in exercising that skill. It follows from this theory that the quality of a sensation is based on the way an agent interacts with its environment and not solely based on passive observation.
  • This reflects the real-world behavior of many animals when placed into new environments. Often, animals interact with their environments in what at first may appear to be random ways. The result of their behavior leads to environmental feedback that either rewards or punishes the behavior; the link established between the feedback (e.g., sensory cues) and the behaviors (e.g., exploratory motor actions) may be referred to as a sensorimotor contingency. Over time, animals refine their behavior to attain rewards while avoiding punishment (reinforcement learning). The process of establishing sensorimotor contingencies is part of “exploration”, and the use of established contingencies is known as “exploitation”.
  • Animals may utilize exploration and exploitation efficiently: when encountering a new environment, animals explore the environment until finding a rewarding set of behaviors. Once a rewarding set of behaviors is established, animals may continue to exploit the behaviors as long as they continue being rewarding. If a set of behaviors ceases to be rewarding, animals may then resume the process of exploration.
  • While reinforcement learning in general has been studied extensively in the context of machine learning, applications utilizing sensorimotor contingencies are far less common. Further, most of these applications utilize relatively simple reinforcement learning strategies (e.g., learning solely via random exploration). This can limit the efficiency and/or generalizability of these applications.
  • In contrast, the method 100 focuses on the establishment of sensorimotor programs that build upon previously established sensorimotor programs to learn new behavior. By utilizing existing sensorimotor programs in the training process, the method 100 may more quickly learn complex behaviors than would otherwise be possible. Further, to the extent that the method 100 includes environment generation (a la S110), the method 100 may additionally exploit this advantage by intentionally generating environments in a manner reflecting the role of particular simple concepts in representing more complex concepts. Further, the sensorimotor programs generated by the method 100 may feature enhanced generalizability compared to traditional approaches thanks to hierarchical relationships between concepts (this may also be thought of as an advantage for S150).
  • The method 100 is preferably implemented by a partially observable Markov decision process (POMDP) operating on a neural network. Neural networks and related systems, including recursive cortical networks (RCNs), convolutional neural networks (CNNs), hierarchical compositional networks (HCNs), HMAX models, Slow Feature Analysis (SFA) systems, and Hierarchical Temporal Memory (HTM) systems may be used for a wide variety of tasks that are difficult to complete using standard rule-based programming. These tasks include many in the important fields of computer vision and speech recognition.
  • Neural networks and related systems can be represented as distributed processing elements that implement summation, multiplication, exponentiation or other functions on the elements' incoming messages/signals. Such networks can be enabled and implemented through a variety of implementations. For example, a system operating the method 100 may be implemented as a network of electronically coupled functional node components. The functional node components can be logical gates arranged or configured in a processor to perform a specified function. As a second example, the system may be implemented as a network model programmed or configured to be operative on a processor. The network model is preferably electronically stored software that encodes the operation of and communication between nodes of the network. Neural networks and related systems may be used in a wide variety of applications and can use a wide variety of data types as input such as images, video, audio, natural language text, analytics data, widely distributed sensor data, or other suitable forms of data.
  • As described previously, the method 100 enables both more efficient learning and execution of machine learning tasks related to environmental perception, thus serving as a specific improvement to computer-related technology. The method 100 may enable more memory-efficient, faster, more generalizable, and more compact representation of any automated computer-controlled system that interacts with its environment. The method 100 is not intended in any form to cover an abstract idea and may not be performed without a computing system.
  • The sensorimotor programs (SMPs) of the method 100 (also referred to as sensorimotor contingencies) embody behaviors that can be used to represent an agent's knowledge of the environment. Each sensorimotor program jointly represents one or more behaviors and an outcome. Each sensorimotor program is additionally capable of signaling its outcome (enabling a high-level sensorimotor program to execute and act based on the output of lower-level sensorimotor programs). The ability of SMPs to generate outcome signals enables the outcome signals to be compared with ground truth during training, and enables rewards to be based not only on whether an SMP achieves a desired outcome but also on whether the SMP signals that outcome; e.g., if the SMP achieves an outcome but does not signal properly, the reward can be structured differently than if it achieves the outcome and signals properly. This is not possible in traditional reinforcement learning systems.
  • Two examples of sensorimotor programs include classification SMPs and bring-about SMPs. Classification SMPs perform actions in the environment to determine whether a concept is present in the environment or not. For example, a classification SMP may address the concept of “containment” (i.e., is the agent located within a bounded container, such as a fenced-in yard?) and may signal “yes” or “no”. Bring-about SMPs perform actions in the environment to bring about a particular state. For example, a bring-about SMP may attempt to bring about containment (e.g., if not already within a bounded container, attempt to get into a bounded container). If the bring-about SMP is able to bring about containment, the SMP may signal “yes”. SMPs may additionally or alternatively signal outcomes in any number of ways and in any manner.
  • SMPs may additionally or alternatively be constrained in any manner; for example, SMPs may terminate after a threshold number of processing steps is achieved or after a threshold time has elapsed.
  • In one implementation of an invention embodiment, SMPs may signal outcomes using trinary logic. In this implementation, a classification SMP may, for instance, signal an outcome of “1” if a concept is found to be true and a “−1” if a concept is found not to be true. Although only two outcomes are possible, a third value is useful: in some cases, it may be desirable to maintain a vector that stores, for each SMP, a record of the result returned by the SMP on its last execution. In these cases, it may be further desirable to initialize the SMPs at a value that does not correspond to either of the two outcomes (e.g., “0”) so that the method 100 may effectively determine whether a given SMP has been executed to completion since initialization.
  • SMPs of the method 100 may additionally or alternatively signal outcomes in any manner. Further, the systems executing the method 100 may maintain memory of SMP outcomes in any manner (or not at all).
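  • As a non-limiting illustration of the trinary signaling, step-limit constraint, and outcome-memory scheme described above, consider the following sketch. The names (SensorimotorProgram, OutcomeMemory, run, etc.) and structure are assumptions for exposition only; the method 100 does not prescribe a concrete API.

```python
# Illustrative sketch only: trinary outcome signaling with a step-limit
# constraint and a per-SMP outcome memory vector, as described above.

UNSET, CONCEPT_ABSENT, CONCEPT_PRESENT = 0, -1, 1

class SensorimotorProgram:
    """Base class: performs behaviors, then signals a trinary outcome."""

    def __init__(self, max_steps=100):
        self.max_steps = max_steps  # termination constraint (step threshold)

    def act(self, observation):
        """Return the next action; overridden per SMP."""
        raise NotImplementedError

    def outcome(self, observation):
        """Return CONCEPT_PRESENT, CONCEPT_ABSENT, or UNSET (undecided)."""
        raise NotImplementedError

    def run(self, env):
        obs = env.reset()
        for _ in range(self.max_steps):
            signal = self.outcome(obs)
            if signal != UNSET:
                return signal       # terminated with a decision
            obs = env.step(self.act(obs))
        return UNSET                # timed out without signaling

class OutcomeMemory:
    """Stores each SMP's last-returned outcome, initialized to UNSET so
    that un-executed SMPs are distinguishable from completed ones."""

    def __init__(self, smp_names):
        self.last = {name: UNSET for name in smp_names}

    def record(self, name, signal):
        self.last[name] = signal
```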
  • S110 includes generating a sensorimotor training curriculum. S110 functions to generate a set of environments where each environment is associated with one or more concepts. These environments are then used to train sensorimotor programs (e.g., to classify based on or bring about the concepts). For each concept, S110 preferably generates a plurality of environments that represent the concept (additionally or alternatively, S110 may map concepts to environments in any manner).
  • S110 may generate the sensorimotor training curriculum in any manner. In one implementation of an invention embodiment, S110 generates the sensorimotor training curriculum for a set of concepts automatically using a rejection sampler working in tandem with a general-purpose constraint satisfaction problem (CSP) solver. Environment distributions may be specified in a fragment of first-order logic, using a pre-defined vocabulary of unary and binary predicates that can be combined using conjunction and negation. To generate environments, generators (e.g., conjunctions of first-order logic expressions that specify random samples) may be sampled uniformly; then the generator itself is invoked. For classification concepts, a concept filter is then used to sort generated environments into those that satisfy a given concept and those that do not. These filtered environments are then assigned a reward function. For example, for an environment with “Concept A” present, the reward function may reward +1 for SMPs that output a “1” signal (corresponding to concept present), a −1 for SMPs that output a “−1” signal (corresponding to concept not present), and 0 otherwise (e.g., if an SMP times out). Likewise, for an environment with “Concept A” not present, the reward function may reward +1 for SMPs that output a “−1” signal (corresponding to concept not present), a −1 for SMPs that output a “1” signal (corresponding to concept present), and 0 otherwise (e.g., if an SMP times out). For bring-about concepts, the concept filter evaluation may be performed dynamically (e.g., at each step of SMP execution, rewarding +1 if and only if the concept is true AND the SMP has signaled appropriately, and 0 otherwise).
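  • A minimal sketch of this sampling-and-reward scheme follows. The generator and concept-filter callables stand in for the first-order-logic machinery (predicate vocabulary, CSP solver), and all names here are assumptions for exposition:

```python
import random

def sample_environments(generators, concept_filter, n):
    """Rejection sampling: draw a generator uniformly at random, invoke
    it, and label the resulting environment with the concept filter."""
    positives, negatives = [], []
    while len(positives) + len(negatives) < n:
        env = random.choice(generators)()   # uniform over generators
        (positives if concept_filter(env) else negatives).append(env)
    return positives, negatives

def classification_reward(concept_present, signal):
    """+1 for the correct trinary signal, -1 for the incorrect one, and
    0 otherwise (e.g., the SMP timed out and returned the UNSET value)."""
    if signal == 0:                         # timed out / undecided
        return 0
    correct = 1 if concept_present else -1
    return 1 if signal == correct else -1
```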
  • Note that in general, reward functions for SMPs may be implemented in any manner. For example, a bring-about concept SMP may receive a reward if the concept is made true even if the SMP has not signaled correctly (e.g., at each step of SMP execution, rewarding +1 if the concept is true but the SMP has not properly signaled, +2 if the concept is true and the SMP has properly signaled, and 0 otherwise).
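  • The shaped variant described above might look like the following sketch (the specific values mirror the example; the function name is assumed):

```python
def bring_about_reward(concept_true, signaled_correctly):
    """Per-step reward for a bring-about SMP with a shaping term: the
    shaping reward (+1) for achieving the concept without signaling is
    smaller than the full reward (+2) for achieving it and signaling."""
    if concept_true and signaled_correctly:
        return 2
    if concept_true:
        return 1    # shaping reward: concept achieved, signal missing
    return 0
```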
  • SMP training environments are preferably simulations of an environment for which utilization of sensorimotor programs is desired, but may additionally or alternatively be representative environments. For example, a set of SMPs intended to function only in virtual environments may utilize simpler variations of these virtual environments or (if possible) actual representative environments. A set of SMPs intended to function in real-world environments may utilize virtual simulations of those real-world environments (e.g., a set of SMPs intended to operate a robot arm may be trained on simulated visual data and physics); additionally or alternatively, such SMPs may be trained in a real-world environment (e.g., a physical environment that is reconfigured to represent various concepts).
  • Data used for generating or simulating environments may include images, video, audio, speech, medical sensor data, natural language data, financial data, application data, physical data, traffic data, environmental data, etc.
  • S120 includes specifying a concept relationship. S120 functions to establish relationships between concepts that can be exploited during training (and may aid in increasing generalizability even after training). As previously discussed, SMPs may be reused (i.e., SMPs may call each other) during and after training.
  • From a training perspective, it may be more efficient for SMPs to have an existing hierarchy (e.g., based on complexity) that determines what other SMPs a given SMP may call. Additionally or alternatively, a hierarchy may be used in specifying how SMPs are trained.
  • For example, as shown in FIG. 2, a concept that classifies an agent as “contained” in one dimension or not may call a first SMP that determines whether the agent is bounded in a first direction and a second SMP that determines whether the agent is bounded in the other direction (if both of these SMPs signal “1”, then so does the “contained” SMP). From a training perspective, it may be preferable to train SMPs in reverse order of such a hierarchy (e.g., first train SMPs that may not call other SMPs, then train SMPs that may call those SMPs but no others, etc.). Alternatively stated, if the concept relationship for SMPs is top-down in terms of complexity (e.g., low-complexity SMPs may call no other SMPs; medium-complexity SMPs may call low-complexity SMPs; high-complexity SMPs may call low- and medium-complexity SMPs), it may be preferable for training to occur bottom-up (e.g., train low-complexity SMPs, then medium-complexity, then high-complexity).
  • If there exists a hierarchy or other concept relationship that limits the SMPs that a given SMP may call, it may be based on complexity (as subjectively determined by a human) as in the previous example, but may additionally or alternatively be determined in any manner. However, the concept relationship established in S120 may simply be a flat relationship (e.g., there is no restriction on which SMPs an SMP may call: all SMPs may call each other). Note also that the concept relationships used for training need not be the same as the concept relationships used in executing a fully trained sensorimotor network (e.g., by S150), and that concept relationships may change over time. Likewise, while a concept relationship may be useful for directing training as in the above example, training need not be performed according to the concept relationship (e.g., it may still be that SMPs may only call less-complex SMPs during training, but instead of training the less-complex SMPs first and then the more-complex SMPs, it may be desirable to train all SMPs at the same time, or to train more complex SMPs first).
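  • To make the hierarchy-directed training order concrete, the following sketch encodes a call graph in the style of FIG. 2 and derives a bottom-up training order from it. The particular concepts and names are illustrative assumptions:

```python
# Assumed example call graph: which lower-level SMPs each SMP may call.
CALL_GRAPH = {
    "bounded_left": [],                               # calls primitives only
    "bounded_right": [],                              # calls primitives only
    "contained_1d": ["bounded_left", "bounded_right"],
    "bring_about_containment": ["contained_1d"],
}

def training_order(call_graph):
    """Topological order: train callee SMPs before the SMPs that call them."""
    order, visited = [], set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        for callee in call_graph[name]:
            visit(callee)
        order.append(name)

    for name in call_graph:
        visit(name)
    return order

print(training_order(CALL_GRAPH))
# ['bounded_left', 'bounded_right', 'contained_1d', 'bring_about_containment']
```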
  • Concept relationships may be specified manually, but they may additionally or alternatively be specified in any manner. For example, concept relationships may be determined automatically or partially automatically by the results of training on similar networks. Likewise, a concept relationship initially determined may be updated during SMP training based on the results of SMP training.
  • Note that as used in this document, a statement that a first concept is “higher-order” than a second concept is to be interpreted as specifying that the first concept may call the second concept, but the second concept may not call the first.
  • S130 includes training a first sensorimotor program and S140 includes training a second sensorimotor program using the first sensorimotor program. While S130 and S140 are substantially similar in function, the explicit mention of S140 highlights that training of SMPs to call other SMPs is an essential part of sensorimotor program training.
  • As previously mentioned, examples of SMPs include classification and bring-about SMPs. Classification SMPs are preferably trained using reward functions that reward the SMP when the SMP terminates within a set time or number of steps and correctly returns a value consistent with the presence or non-presence of a given concept in an environment. Bring-about SMPs are preferably trained using reward functions that reward the SMP when the SMP successfully brings about a given environmental state, terminates within a set time or number of steps, and correctly returns a value indicating that the SMP has successfully brought about the given environmental state. SMPs may additionally or alternatively be rewarded in any manner. For example, bring-about SMPs may receive shaping rewards when an SMP successfully brings about a concept but does not appropriately signal. The shaping reward may, for instance, provide a smaller reward than the primary reward function. Note that bring-about SMPs may call classification SMPs and vice-versa.
  • SMPs may be trained in any manner. In one implementation of an invention embodiment, sets of SMPs may be represented by a neural network (e.g., a gated recurrent unit (GRU) network) and trained using natural policy optimization (NPO). Networks may likewise be initialized in any manner, and training may occur over any number of iterations. For example, a given SMP may be trained and evaluated for five different random seeds. When an SMP reuses another SMP, that other SMP may be selected in any manner (e.g., the best performing seed may be selected, one of the seeds may be selected at random, etc.).
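  • One possible shape for this multi-seed procedure is sketched below. The helper names (train_smp, evaluate_smp) are placeholders for the GRU/NPO training machinery, and their stub bodies merely simulate it:

```python
import random

def train_smp(concept, seed):
    """Placeholder: in a real system this would fit a GRU policy for
    `concept` with natural policy optimization, starting from `seed`."""
    rng = random.Random(seed)
    return {"concept": concept, "seed": seed, "quality": rng.random()}

def evaluate_smp(policy):
    """Placeholder: mean reward of the policy on held-out environments."""
    return policy["quality"]

def best_of_n_seeds(concept, n_seeds=5):
    """Train one SMP per random seed; when another SMP later reuses this
    one, the best-performing seed is selected (one option among several,
    e.g., random selection, mentioned above)."""
    policies = [train_smp(concept, seed) for seed in range(n_seeds)]
    return max(policies, key=evaluate_smp)

best = best_of_n_seeds("containment")
```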
  • SMPs preferably may call other SMPs as options (e.g., reusing an SMP to perform some set of actions) or as observations (e.g., where the calling SMP makes use of the output of the SMPs it calls, not just the set of actions they perform).
  • In addition to calling other SMPs, an SMP may perform any one of a set of primitive actions. For example, for an SMP controlling a robot arm, primitive actions may include simple movements of the robot arm (e.g., move up, move down, rotate hand, etc.). Other examples of primitive actions may be those related to the control of sensors, actuators, or processing modules; for example, setting the orientation of a camera, setting the focus of the camera, reading values recorded by touch sensors, etc. In general, primitive actions are preferably pure motor or sensory actions (rather than the concepts that are derived from motor and sensory interaction), but primitive actions may be any “base-level” action (i.e., an action that may not call an SMP but rather serves as a building block for an SMP).
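  • As a sketch, the action space for such a robot-arm SMP might combine these primitives with calls to lower-level SMPs permitted by the concept relationship; the specific primitives and names here are assumed for illustration:

```python
from enum import Enum, auto

class Primitive(Enum):
    """Assumed base-level actions for a robot-arm SMP."""
    MOVE_UP = auto()
    MOVE_DOWN = auto()
    ROTATE_HAND = auto()
    ORIENT_CAMERA = auto()
    FOCUS_CAMERA = auto()
    READ_TOUCH_SENSORS = auto()

def action_space(primitives, callable_smps):
    """An SMP chooses among base-level primitives and, per the concept
    relationship of S120, invocations of lower-level SMPs."""
    return list(primitives) + [("call", name) for name in callable_smps]

actions = action_space(Primitive, ["bounded_left", "bounded_right"])
```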
  • SMPs are trained to accomplish a concept. For example, a classification concept example is the “containment” classification (i.e., determine if the agent is contained or not) and a bring-about concept example is bringing-about containment (i.e., bring the agent to a contained state if possible).
  • S150 includes executing the first and second sensorimotor programs. S150 functions to allow the use of the sensorimotor programs trained in S110-S140 to accomplish a task or determine an environmental state. S150 preferably includes executing the second SMP and, in the process of executing the second SMP, executing the first SMP (e.g., reusing the first SMP as previously described in the method 100).
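  • A sketch of this nested execution follows, building on the SensorimotorProgram and OutcomeMemory sketches above (again, all names are illustrative assumptions, not the method's prescribed implementation):

```python
def execute(smp, env, obs, smp_table, memory):
    """Run `smp` from observation `obs`. When it emits a ("call", name)
    action, recursively run that lower-level SMP (e.g., the second SMP
    calling the first) and record its outcome in the shared memory, so
    the caller can use the result as an observation."""
    for _ in range(smp.max_steps):
        action = smp.act((obs, memory.last))
        if isinstance(action, tuple) and action[0] == "call":
            sub = smp_table[action[1]]
            memory.record(action[1], execute(sub, env, obs, smp_table, memory))
            # a fuller implementation would also refresh `obs` here,
            # since the called SMP may have changed the environment
        else:
            obs = env.step(action)
        signal = smp.outcome((obs, memory.last))
        if signal != UNSET:
            return signal
    return UNSET

# Usage sketch:
# result = execute(second_smp, env, env.reset(), {"first": first_smp}, memory)
```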
  • Note that if the first and second sensorimotor programs are trained using a simulation of a real world environment, execution may occur using physical sensors and actuators (e.g., on a robot).
  • The methods of the preferred embodiment and variations thereof can be embodied and/or implemented at least in part as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions are preferably executed by computer-executable components preferably integrated with a neural network. The instructions can be stored on any suitable computer-readable media such as RAMs, ROMs, flash memory, EEPROMs, optical devices (CD or DVD), hard drives, floppy drives, or any suitable device. The computer-executable component is preferably a general-purpose or application-specific processor, but any suitable dedicated hardware or hardware/firmware combination device can alternatively or additionally execute the instructions.
  • As a person skilled in the art will recognize from the previous detailed description and from the figures and claims, modifications and changes can be made to the preferred embodiments of the invention without departing from the scope of this invention defined in the following claims.

Claims (20)

We claim:
1. A method for establishing sensorimotor programs, comprising:
determining a first environment that represents a first concept; and
training a first sensorimotor program to accomplish the first concept by interacting with the first environment, comprising training the first sensorimotor program using a first reward function that rewards the first sensorimotor program when the first sensorimotor program successfully accomplishes the first concept and correctly returns a value indicating that the first sensorimotor program has successfully accomplished the first concept.
2. The method of claim 1, wherein the first concept is a bring-about concept; wherein the first reward function rewards when the first sensorimotor program successfully brings about the first concept and correctly returns a value indicating that the first sensorimotor program has successfully brought about the first concept.
3. The method of claim 1, wherein the first concept is a classification concept; wherein the first reward function rewards when the first sensorimotor program correctly returns a value consistent with the presence or non-presence of the first concept in the first environment.
4. The method of claim 1, wherein training the first sensorimotor program to accomplish the first concept further comprises training the first sensorimotor program using a second reward function different from the first reward function that rewards when the first sensorimotor program successfully accomplishes the first concept but fails to return a value indicating that the first sensorimotor program has successfully accomplished the first concept.
5. The method of claim 4, wherein the second reward function is a shaping reward function.
6. The method of claim 1, wherein training the first sensorimotor program to accomplish the first concept comprises using a set of primitive actions to interact with the first environment.
7. The method of claim 6, wherein using the set of primitive actions to interact with the first environment comprises pushing an object of the first environment.
8. The method of claim 7, further comprising executing the trained first sensorimotor program on a robotic arm system comprising a robotic arm actuator, wherein an action of the set of primitive actions actuates the robotic arm actuator.
9. A method for establishing sensorimotor programs, comprising:
generating a first plurality of environments that represents a first concept, wherein the first concept is a bring-about concept; and
training a first sensorimotor program to accomplish the first concept using a reward function in each environment of the first plurality, wherein the first sensorimotor program executes actions of a set of primitive actions to accomplish the first concept.
10. The method of claim 9, wherein the first plurality of environments is generated based on recurring content that enables re-use of learned concepts.
11. The method of claim 9, wherein each environment of the first plurality of environments is associated with a dynamics model, wherein the dynamics models collectively simulate the actions executed by the first sensorimotor program.
12. The method of claim 9, wherein generating the first plurality of environments comprises generating a superset of environments and filtering the superset of environments to determine the first plurality of environments.
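One way to realize claim 12, again on the ToyEnv sketch; the generator and the particular filter (discard environments where the concept already holds, so a bring-about program has work to do) are assumptions:

```python
import random


def generate_superset(n, seed=0):
    """Hypothetical generator: random object and goal placements."""
    rng = random.Random(seed)
    return [ToyEnv(obj_pos=rng.randint(-5, 5), goal_pos=rng.randint(-5, 5))
            for _ in range(n)]


def filter_for_concept(envs):
    """Claim-12 filter (one possibility): keep only environments in which the
    concept is not yet satisfied."""
    return [env for env in envs if not env.concept_satisfied()]


first_plurality = filter_for_concept(generate_superset(100))
```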
13. The method of claim 9, further comprising:
generating a second plurality of environments that represents a second concept of higher order than the first concept; and
training, using the second plurality of environments, a second sensorimotor program to accomplish the second concept using the first sensorimotor program.
14. The method of claim 13, wherein the second sensorimotor program is trained using the set of primitive actions, wherein the second sensorimotor program calls the first sensorimotor program as an additional action.
15. The method of claim 13, wherein the second sensorimotor program calls the first sensorimotor program as an observation.
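The two call modes of claims 14 and 15 differ only in what the caller does with the result; a sketch using the invented program interface from the loop above:

```python
def call_as_action(env, bring_about_program):
    """Claim-14 sketch: the second program invokes the first as one more
    action, letting it run to completion before control returns."""
    bring_about_program.act(env)  # advances the environment as a side effect
    return env


def call_as_observation(env, classifier_program):
    """Claim-15 sketch: the second program folds the first program's returned
    value (e.g. 'is the concept present?') into its own observation."""
    return classifier_program.act(env)  # boolean consumed as a policy input
```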
16. The method of claim 9, wherein the first sensorimotor program is trained based on a set of actionable lower-level sensorimotor programs associated with different respective bring-about concepts and a set of conceptual lower-level sensorimotor programs associated with different respective classification concepts.
17. The method of claim 16, wherein the set of actionable lower-level sensorimotor programs and the set of conceptual lower-level sensorimotor programs are trained before the first sensorimotor program is trained.
18. The method of claim 16, wherein training the first sensorimotor program comprises automatically determining which programs of the actionable lower-level sensorimotor programs and the conceptual lower-level sensorimotor programs enable the first sensorimotor program to accomplish the first concept.
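Claims 16-18 amount to training over an augmented action space; a sketch in which the two lower-level program families are just labeled lists, with the automatic selection of claim 18 left to the (unshown) training algorithm:

```python
def augmented_action_space(primitives, actionable, conceptual):
    """Claims 16-18 sketch: offer primitives plus both families of
    lower-level programs; training then settles on whichever members
    actually help accomplish the first concept."""
    return list(primitives) + list(actionable) + list(conceptual)


actions = augmented_action_space(
    list(Primitive),
    actionable=["push_to_goal"],    # hypothetical bring-about programs
    conceptual=["object_at_goal"],  # hypothetical classification programs
)
```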
19. The method of claim 9, wherein the first sensorimotor program is represented by a recurrent neural network.
20. The method of claim 9, wherein the first sensorimotor program is trained using natural policy optimization.
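Finally, a sketch of the claim-19 representation as a small recurrent policy in PyTorch; all sizes are illustrative, and the natural-policy-optimization step of claim 20 (gradient steps preconditioned by an approximate Fisher information matrix) is omitted as beyond a short example:

```python
import torch
import torch.nn as nn


class RecurrentPolicy(nn.Module):
    """Claim-19 sketch: a sensorimotor program represented by a recurrent
    network mapping observation sequences to per-step action logits."""

    def __init__(self, obs_dim=8, hidden_dim=32, n_actions=3):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, hidden=None):
        out, hidden = self.gru(obs_seq, hidden)  # state carried across steps
        return self.head(out), hidden


policy = RecurrentPolicy()
logits, h = policy(torch.zeros(1, 5, 8))  # batch of 1, episode of 5 steps
```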
US16/840,210 2017-07-21 2020-04-03 Methods for establishing and utilizing sensorimotor programs Abandoned US20200230813A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/840,210 US20200230813A1 (en) 2017-07-21 2020-04-03 Methods for establishing and utilizing sensorimotor programs

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201762535703P 2017-07-21 2017-07-21
US16/043,146 US10646996B2 (en) 2017-07-21 2018-07-23 Methods for establishing and utilizing sensorimotor programs
US16/840,210 US20200230813A1 (en) 2017-07-21 2020-04-03 Methods for establishing and utilizing sensorimotor programs

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/043,146 Continuation US10646996B2 (en) 2017-07-21 2018-07-23 Methods for establishing and utilizing sensorimotor programs

Publications (1)

Publication Number Publication Date
US20200230813A1 true US20200230813A1 (en) 2020-07-23

Family

ID=65016112

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/043,146 Active US10646996B2 (en) 2017-07-21 2018-07-23 Methods for establishing and utilizing sensorimotor programs
US16/840,210 Abandoned US20200230813A1 (en) 2017-07-21 2020-04-03 Methods for establishing and utilizing sensorimotor programs

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US16/043,146 Active US10646996B2 (en) 2017-07-21 2018-07-23 Methods for establishing and utilizing sensorimotor programs

Country Status (2)

Country Link
US (2) US10646996B2 (en)
WO (1) WO2019018860A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11119483B2 (en) * 2018-02-22 2021-09-14 Alan M. Kadin System and method for conscious machines

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200043610A1 (en) * 2017-02-03 2020-02-06 Koninklijke Philips N.V. Extracted concept normalization using external evidence
US20200167633A1 (en) * 2017-05-19 2020-05-28 Deepmind Technologies Limited Programmable reinforcement learning systems
US20210406774A1 (en) * 2016-01-27 2021-12-30 Microsoft Technology Licensing, Llc Artificial intelligence engine for mixing and enhancing features from one or more trained pre-existing machine-learning models

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7309315B2 (en) * 2002-09-06 2007-12-18 Epoch Innovations, Ltd. Apparatus, method and computer program product to facilitate ordinary visual perception via an early perceptual-motor extraction of relational information from a light stimuli array to trigger an overall visual-sensory motor integration in a subject
US7331007B2 (en) * 2005-07-07 2008-02-12 International Business Machines Corporation Harnessing machine learning to improve the success rate of stimuli generation
RU2331105C1 (en) * 2007-05-10 2008-08-10 Виктор Викторович Олексенко Universal bridge inverting adder
US8996177B2 (en) * 2013-03-15 2015-03-31 Brain Corporation Robotic training apparatus and methods
US9630318B2 (en) 2014-10-02 2017-04-25 Brain Corporation Feature detection apparatus and methods for training of robotic navigation

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210406774A1 (en) * 2016-01-27 2021-12-30 Microsoft Technology Licensing, Llc Artificial intelligence engine for mixing and enhancing features from one or more trained pre-existing machine-learning models
US20200043610A1 (en) * 2017-02-03 2020-02-06 Koninklijke Philips N.V. Extracted concept normalization using external evidence
US20200167633A1 (en) * 2017-05-19 2020-05-28 Deepmind Technologies Limited Programmable reinforcement learning systems

Also Published As

Publication number Publication date
WO2019018860A1 (en) 2019-01-24
US10646996B2 (en) 2020-05-12
US20190039239A1 (en) 2019-02-07

Similar Documents

Publication Publication Date Title
US10963785B2 (en) Methods and systems for artificial cognition
Asai et al. Classical planning in deep latent space: Bridging the subsymbolic-symbolic boundary
US11113585B1 (en) Artificially intelligent systems, devices, and methods for learning and/or using visual surrounding for autonomous object operation
US11699295B1 (en) Machine learning for computing enabled systems and/or devices
KR102532749B1 (en) Method and apparatus for hierarchical learning of neural networks based on weak supervised learning
Ring Continual learning in reinforcement environments
Sheh "Why Did You Do That?" Explainable Intelligent Robots
US10102449B1 (en) Devices, systems, and methods for use in automation
CN111144580B (en) Hierarchical reinforcement learning training method and device based on imitation learning
US11568246B2 (en) Synthetic training examples from advice for training autonomous agents
Yu et al. Continuous timescale long-short term memory neural network for human intent understanding
Crowder et al. Artificial cognition architectures
Kaiser et al. Obtaining good performance from a bad teacher
Das et al. Probing emergent semantics in predictive agents via question answering
US20200230813A1 (en) Methods for establishing and utilizing sensorimotor programs
Sheldon et al. PSchema: A developmental schema learning framework for embodied agents
CA2798529C (en) Methods and systems for artificial cognition
Akula Gaining Justified Human Trust by Improving Explainability in Vision and Language Reasoning Models
Cederborg et al. A social learning formalism for learners trying to figure out what a teacher wants them to do
Ge et al. Deep reinforcement learning navigation via decision transformer in autonomous driving
CN116361138A (en) Test method and test equipment
Davies et al. A Database for Learning Numbers by Visual Finger Recognition in Developmental Neuro-Robotics
KR20220034149A (en) Memory in Embedded Agents
EP4035079A1 (en) Upside-down reinforcement learning
Mangin et al. Learning the combinatorial structure of demonstrated behaviors with inverse feedback control

Legal Events

Date Code Title Description
AS Assignment

Owner name: VICARIOUS FPC, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PHOENIX, DAVID SCOTT;STARK, MICHAEL;HAY, NICHOLAS;SIGNING DATES FROM 20181001 TO 20200325;REEL/FRAME:052310/0764

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

AS Assignment

Owner name: LLC, INTRINSIC I, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VICARIOUS FPC, INC;BOSTON POLARIMETRICS, INC.;REEL/FRAME:060389/0682

Effective date: 20220520

AS Assignment

Owner name: INTRINSIC INNOVATION LLC, CALIFORNIA

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE THE RECEIVING PARTY NAME PREVIOUSLY RECORDED AT REEL: 060389 FRAME: 0682. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNORS:VICARIOUS FPC, INC.;BOSTON POLARIMETRICS, INC.;REEL/FRAME:060614/0104

Effective date: 20220520

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION