CN109643389B - System and method for generating data interpretations for neural networks and related systems - Google Patents

System and method for generating data interpretations for neural networks and related systems

Info

Publication number
CN109643389B
Authority
CN
China
Prior art keywords
data
network
layer
node
nodes
Prior art date
Legal status
Active
Application number
CN201680088615.1A
Other languages
Chinese (zh)
Other versions
CN109643389A (en)
Inventor
迪利普·乔治
肯尼思·艾伦·坎斯基
克里斯多佛·雷默特·拉恩
沃尔夫冈·勒拉奇
巴斯卡拉·马纳尔·马西
大卫·斯科特·菲尼克斯
埃里克·珀迪
Current Assignee
Insi Innovation Co ltd
Original Assignee
Insi Innovation Co ltd
Priority date
Filing date
Publication date
Application filed by Insi Innovation Co ltd filed Critical Insi Innovation Co ltd
Publication of CN109643389A
Application granted
Publication of CN109643389B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks

Abstract

A method for generating data interpretations in a recursive cortical network, comprising: receiving a set of evidence data at a sub-feature node of a first layer of the recursive cortical network; setting a conversion configuration that directs the passing of evidence data and converted data between layers of the network; performing a series of conversions on the evidence data according to the conversion configuration, the series including at least one forward conversion and at least one reverse conversion; and outputting the converted evidence data.

Description

System and method for generating data interpretations for neural networks and related systems
Technical Field
The present invention relates generally to the field of artificial intelligence, and more particularly to a new and useful system and method for generating data interpretations for neural networks and related systems in the field of artificial intelligence.
Background
Despite advances in computer vision, image processing, and machine learning, recognizing visual objects remains a task at which computers fall short of human capability. Recognizing objects in images requires recognizing not only objects in one particular scene, but objects in various locations, in different settings, and with slight variations. For example, to identify a chair, it is necessary to know the inherent properties that make a chair a chair. This is a simple task for humans; computers struggle to handle the many types of chairs and the situations in which chairs may appear. The problem becomes more challenging when one considers detecting multiple objects in a scene. A model capable of visual object recognition must be able to provide interpretations of visual data sets in order to identify the objects present in those data sets. Visual object recognition is a specific case of a more general problem in artificial intelligence: pattern recognition (and its inverse, pattern generation). Pattern recognition is a problem in fields and media other than image processing (e.g., speech recognition, natural language processing, and other fields). Thus, there is a need in the field of artificial intelligence to create new and useful methods for generating data interpretations for neural networks and related systems. The present invention provides such a new and useful method.
Brief Description of Drawings
FIG. 1 is a flow chart representation of a layer-based bi-directional data-conversion system (LBD);
FIG. 2 is a schematic representation of a Recursive Cortical Network (RCN);
FIG. 3 is a schematic representation of a Recursive Cortical Network (RCN);
FIG. 4 is a schematic representation of a subnet of an RCN;
FIG. 5 is a schematic representation of a subnet of an RCN;
FIG. 6 is a diagrammatic representation of a method of a preferred embodiment;
FIG. 7 is a schematic representation of reasoning using forward conversion;
FIG. 8 is a schematic representation of reasoning using combined forward and reverse transitions;
FIG. 9 is a flow chart representation of the forward conversion of the method of the preferred embodiment;
FIG. 10 is an exemplary representation of an LBD;
FIG. 11 is an exemplary implementation of the forward conversion of the method of the preferred embodiment;
FIG. 12 is an exemplary implementation of the forward conversion of the method of the preferred embodiment;
FIG. 13 is an exemplary implementation of the reverse switch of the method of the preferred embodiment;
FIG. 14 is a flow chart implementation of the reverse conversion of the method of the preferred embodiment; and
FIG. 15 is an exemplary implementation of the reverse conversion of the method of the preferred embodiment.
Description of the preferred embodiments
The following description of the preferred embodiments of the invention is not intended to limit the invention to those embodiments, but to enable any person skilled in the art to make and use the invention.
The system and method for generating data interpretations of the preferred embodiment are used to improve the generation and/or inference tasks of neural networks and related systems. The system and method preferably apply bi-directional conversion through the various layers of a data-conversion system (e.g., a neural network). To address the challenges of pattern recognition, the systems and methods of the preferred embodiments may be applied to generate data interpretations of pattern data. Data interpretation generation is important for many pattern recognition models, including convolutional neural networks, recursive cortical networks, and other models that consist of a series of layers, each of which applies transformations to the layer below. In a first target task of the neural network, an inference output can be generated. A process and system for generating such an inference output is described herein. In particular, a variation may apply a reverse, generative conversion to preferably refine the inference output. In a second target task of the neural network, a generation (or imagination) output may be produced. The systems and methods described herein may additionally or alternatively be applied to produce such a generated output. Variations of the generation process preferably use inference conversions to at least partially refine the generated output. The systems and methods are preferably used with neural networks, and more particularly with recursive cortical networks, but may additionally be used with any suitable hierarchical data-conversion system.
1. Neural network and related systems
Neural networks and related systems, including Recursive Cortical Networks (RCNs), Convolutional Neural Networks (CNNs), HMAX models, Slow Feature Analysis (SFA) systems, and Hierarchical Temporal Memory (HTM) systems, can be used for a variety of tasks that are difficult to accomplish using standard rule-based programming. These tasks include many in the important fields of computer vision and speech recognition.
Neural networks and related systems may be represented as distributed processing elements that perform summation, multiplication, exponentiation, or other functions on elements of an input message/signal. Such a network may be enabled and implemented through a variety of embodiments. For example, the system may be implemented as a network of electronically coupled functional node components. The functional node components may be logic gates arranged or configured in a processor to perform a specified function. As a second example, the system may be implemented as a network model programmed and/or configured to operate on a processor. The network model is preferably electronically stored software that encodes the operation of, and communication between, the nodes of the network. Neural networks and related systems may be used for a wide variety of applications and may use a wide variety of data types as input, such as images, video, audio, natural language text, analytical data, widely distributed sensor data, or other suitable forms of data.
Neural networks and related systems may be described as systems comprising a number of conversion layers. Input data to these systems (typically at a low level of abstraction) may be converted by a first layer to create first intermediate data; the first intermediate data may be converted by a second layer to create second intermediate data, and so on. This process may continue until the system reaches a final layer, where output data (typically at a higher level of abstraction) is created from the intermediate data. The process may be used to generate a data interpretation (e.g., an inference) for a data set by identifying local features of the data set, identifying more complex features based on those local features, and so on, each layer adding a level of abstraction to the data interpretation.
Note that a layer may be capable of generating both intermediate data and output data; that is, the output of an intermediate layer may be used both as an input to a higher layer and as a data interpretation output (e.g., output to another process).
Some neural networks and related systems may be used in a complementary manner; in this manner, a system similar to the previously described system may be initialized from the final layer (or an intermediate layer) to convert data from a higher level of abstraction to a lower level of abstraction. This process is often referred to as generation (and as imagination where the generation is not conditioned on data derived from input data).
In particular, for artificial intelligence applications (e.g., computer vision), as shown in FIG. 1, neural networks and related systems are capable of performing reasoning and generation by changing the direction of data propagation. In such a system, hereinafter referred to as a layer-based bi-directional data-conversion system (LBD), reasoning can be performed by passing "bottom-up messages" (e.g., BU1, BU2 …) from the lower layers of the system to the higher layers, while generating can be performed by passing "top-down messages" (e.g., TD1, TD2 …) from the higher layers of the system to the lower layers of the system. In LBD, the number of layers may be arbitrarily large to allow for any gradual change in the input/output conversion process.
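To make the message-passing structure above concrete, the following is a minimal sketch of an LBD in Python. The Layer/LBD classes and the toy transforms are illustrative assumptions for exposition only; they are not the patent's implementation, and the only property they are meant to capture is that each layer exposes a bottom-up (inference) transform and a top-down (generation) transform.

```python
# Minimal LBD sketch (illustrative assumption, not the patented implementation):
# each layer has a bottom-up transform for inference and a top-down transform
# for generation, and messages BU1, BU2, ... / TD1, TD2, ... flow through them.

class Layer:
    def __init__(self, name, forward_fn, reverse_fn):
        self.name = name
        self.forward_fn = forward_fn   # bottom-up: lower abstraction -> higher
        self.reverse_fn = reverse_fn   # top-down: higher abstraction -> lower

class LBD:
    def __init__(self, layers):
        self.layers = layers           # ordered from Layer 1 (lowest) upward

    def infer(self, bu1):
        """Pass bottom-up messages through every layer, lowest to highest."""
        msg = bu1
        for layer in self.layers:
            msg = layer.forward_fn(msg)
        return msg                     # highest-abstraction interpretation

    def generate(self, td1):
        """Pass top-down messages through every layer, highest to lowest."""
        msg = td1
        for layer in reversed(self.layers):
            msg = layer.reverse_fn(msg)
        return msg                     # predicted low-level (evidence-like) data

# Toy two-layer example: Layer 1 thresholds raw values, Layer 2 counts them.
lbd = LBD([
    Layer("edges", forward_fn=lambda xs: [x > 0.5 for x in xs],
                   reverse_fn=lambda bits: [0.9 if b else 0.1 for b in bits]),
    Layer("count", forward_fn=lambda bits: sum(bits),
                   reverse_fn=lambda n: [True] * n),
])
print(lbd.infer([0.2, 0.7, 0.9]))      # 2
print(lbd.generate(3))                 # [0.9, 0.9, 0.9]
```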
As an example of inference, the input data to the LBD may comprise an image. The image data can be introduced directly into the LBD (as the BU1 message) or may first be preprocessed (e.g., by increasing the contrast of the input image to prepare it for transmission as the BU1 message). The BU1 message, composed of the image data (which may or may not be preprocessed), is converted by Layer 1 to create a converted dataset (e.g., a map of detected features such as edges) denoted as the BU2 message. The BU2 message is further converted by Layers 2 and 3 to successively create the BU3 and BU4 messages, respectively. The BU4 message in this case may, for example, represent a dataset corresponding to the objects detected in the image data. The BU4 message can in turn be processed by a post-processing layer (e.g., processing BU4 into a written description of the objects in the image).
As a similar example of generation, the LBD may be provided with generation constraints passed through a post-processing layer (e.g., a written description of the objects desired in the generated image) or, more directly, with a TD1 message (e.g., a dataset corresponding to the objects desired to be contained within the generated image data). The TD1 message is converted by Layer 3 into a TD2 message, then by Layer 2 into a TD3 message, and so on. These layer conversions effectively predict plausible lower-abstraction-level data based on the input to the layer (e.g., the TD1 message in the case of Layer 3). Finally, the LBD may output generated data representing image data predicted by the system based on the generation constraints or other initialization inputs. Additionally or alternatively, the LBD may not be provided with input data for generation, in which case default values (or alternatively, randomly generated values) stored within the system are assumed. This special case of LBD output generation is referred to as imagination.
While the first two examples describe inference on input data and/or BU1 messages and generation based on generation constraints and/or TD1 messages, such inference or generation may originate from any layer of the LBD; for example, the LBD may perform inference on data that is provided directly to Layer 3. Neural networks and related systems preferably apply complementary generation and/or inference during the inference and/or generation process.
2. Recursive cortical network
While the systems and methods of the preferred embodiments described within this application are preferably applicable to any neural network and related systems (i.e., LBD) consistent with the description above, implementation details and examples will particularly relate to Recursive Cortical Networks (RCNs).
As shown in FIG. 2, a Recursive Cortical Network (RCN) may include a plurality of subnetworks. A subnetwork preferably comprises at least one parent feature node, a pool node, a parent-specific child feature node (PSCF node for short), and at least one constraint node. The RCN may be configured for different modes of operation, including a first mode of operation, a generation mode, and a second mode, an inference mode. As shown in FIG. 3, the RCN is preferably a hierarchically organized network of interconnected subnetworks in various parent-child relationships. The RCN may alternatively be a single layer or a single subnetwork of a collection of subnetworks.
As shown in FIG. 4, various instances and instantiations of the RCN subnetworks are preferably built, connected, and recursively used in the hierarchy of the RCN. The architecture of the hierarchical network may be built algorithmically or through at least some user selection and configuration. An RCN can be described as alternating layers of feature nodes and pool nodes in a neural network. A subnetwork has feature input nodes and feature output nodes, and the feature nodes are used to bridge or connect subnetworks. As shown in FIG. 4, feature nodes may be constrained to various invariants by using constraint nodes that bridge constraints across pools and subnetworks that are spatially or temporally distinct. Each node of the hierarchical network preferably has parent node connections and child node connections. In general, a parent node connection is preferably an input during generation and an output during inference. Conversely, the child node connections are outputs during generation and inputs during inference. In a single-layer (or non-hierarchical) variation, the subnetworks are arranged as siblings in the same level. The subnetworks described below may interact through various forms of constraint nodes.
The subnetworks may be arranged in a variety of different configurations within the network. Many configurations are determined by constraint nodes that define node choices within a subnetwork, between subnetworks, or even between networks. In addition, the subnetworks may be configured to have different or shared child features. The subnetworks are additionally arranged in hierarchical layers. In other words, a first subnetwork may be a parent network of a second subnetwork. Similarly, the second subnetwork may additionally be a parent network of a third subnetwork. The layers of subnetworks are preferably connected by shared parent and child feature nodes. Preferably, a child feature node of an upper-level subnetwork is the parent feature node of a lower subnetwork. Conversely, a parent feature node of a subnetwork may participate as a child feature node of a higher subnetwork. The parent feature nodes of the top-level subnetworks are preferably the inputs to the system. The child feature nodes of the bottom/lowest subnetworks are preferably the outputs of the system. Connecting multiple subnetworks may introduce multiple-parent interactions at several nodes in the network. These interactions can be modeled using different probabilistic models in the nodes.
Connecting subnetworks in a hierarchy may be used to facilitate compact and compressed representations through subnetwork reuse. A parent feature node of one subnetwork may participate as a child feature node in multiple parent subnetworks. A related benefit is that the invariant representation of a child subnetwork can be reused across multiple parent subnetworks. One example of a suitable use is the case of an RCN representing visual objects. Lower-level subnetworks may correspond to portions of an object, and higher-level subnetworks may represent how the portions are clustered together to form the object. For example, the lower-layer subnetworks may correspond to representations of the body parts of an image of a cow. Each body part is represented invariantly and can tolerate transformations such as translation, scaling, and deformation. The higher-layer subnetworks then specify how the body parts are clustered together to represent the cow. Some of the lower-level body parts of the cow may be reused at the higher layers to represent a goat. For example, the legs of both animals move similarly, and thus these parts may be reused. This means that the invariant leg representation learned for cows can be automatically reused for representing goats.
RCNs can be used to generate both data interpretations (e.g., classifying objects in an image) and data predictions (e.g., an image that contains some set of objects). During data interpretation generation, the nodes of the RCN preferably operate on the input data features and propagate node selection/processing through the hierarchy of the RCN until output is obtained from the parent features of the top level subnetwork. The output may be accomplished using a combination of propagating information up (to higher parent layers) and propagating information down (to final child features) in the hierarchy. During data prediction generation, the RCN preferably begins with a generic generation request that is directed, fed, or passed to the parent feature node of the top level subnet. The nodes preferably operate on the information and propagate node selection/processing down the hierarchy of RCNs until output is obtained from the sub-feature nodes of the underlying subnetwork.
As shown in FIG. 5, a subnetwork is used to provide node selection operations between parent and child features. The subnetwork is the basic building block of the RCN. In the case of generation, the subnetwork preferably maps from higher-layer features to a set of lower-layer features, such that lower-layer feature activity (e.g., visual features of an image) is determined by the activity of higher-layer features (e.g., object names). In the case of inference, the subnetwork preferably maps from lower-level features to higher-level features, such that higher-level feature activity (e.g., object names) is determined by the activity of lower-level features (e.g., visual features of the image). The general architecture of a subnetwork preferably includes a single top-level node as the parent feature node. The parent feature node (PF1) preferably includes connections to at least two pool nodes (P1 and P2). Each pool node preferably includes connections to a plurality of PSCF nodes (X1, X2, X3, X4, X5, X6). Constraint nodes (C1, C2, C3) may additionally be within the subnetwork. The constraint nodes are preferably connected to other PSCF nodes. A constraint node defines constraints, rules, and restrictions between at least two PSCF nodes. The PSCF nodes are preferably connected to child feature nodes 150 (CF1, CF2, CF3, CF4, CF5, CF6). Instances of subnetworks within the RCN may or may not share commonalities with other subnetworks. The functional operation of each node may vary in the number and configuration of connections, connection weighting, and/or any other aspect. In some edge cases, a subnetwork may not include more than one node selection option. In one exemplary edge case, a subnetwork may be defined that has no selection options, such that activation of the parent feature results in activation of the child feature. For example, a parent feature node may be connected to a pool, which is then connected to a PSCF node.
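A possible in-memory representation of the FIG. 5 subnetwork is sketched below. The node names (PF1, P1, P2, X1-X6, CF1-CF6) come from the figure; the dictionary layout and the particular constraint pairs are assumptions made for illustration, not structures defined by the patent.

```python
# Illustrative representation of the FIG. 5 subnetwork (layout and constraint
# pairs are assumptions for exposition, not the patent's data structures).
subnet = {
    "parent_feature": "PF1",
    "pools": {
        "P1": ["X1", "X2", "X3"],   # PSCF nodes: translations of a vertical line
        "P2": ["X4", "X5", "X6"],   # PSCF nodes: translations of a horizontal line
    },
    "pscf_to_child_feature": {
        "X1": "CF1", "X2": "CF2", "X3": "CF3",
        "X4": "CF4", "X5": "CF5", "X6": "CF6",
    },
    # Constraint nodes (C1, C2, C3) couple PSCF choices across the two pools so
    # that the selected line segments remain mutually consistent.
    "constraints": {"C1": ("X1", "X4"), "C2": ("X2", "X5"), "C3": ("X3", "X6")},
}
```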
The nodes of the network are preferably configured to operate with probabilistic interactions that determine node activation, selection, ON/OFF, or other suitable states. When activated by a parent node, a node will preferably trigger activation of connected child nodes according to the node's selection function. The nodes preferably represent binary random variables or multinomial random variables, as in a Bayesian network, although other suitable node models may alternatively be used. The feature nodes (e.g., parent feature nodes, child feature nodes) are preferably binary random variable nodes that may have multiple parent nodes and multiple child nodes. When multiple parent nodes are involved (i.e., multiple nodes connected by parent/input connections), the interactions between the parent connections are preferably treated as a superposition of the connections. Additionally or alternatively, multi-parent interactions may be modeled in any manner. The multi-parent interactions may be probabilistically modeled in the nodes using canonical models such as noisy-OR gates and noisy-MAX gates. The child connections of a feature node preferably encode the probabilistic relationships between the feature and its pools. In some RCNs, if a feature is active, all pools of that feature are active, but such activation may be modified according to probability tables or any suitable mechanism. Each link from a feature node to a pool node encodes a probability table of the class P(pool | feature), as shown in the table below.
Feature \ Pool     Pool = False    Pool = True
Feature = False    1-q             q
Feature = True     p               1-p
With p and q set to zero, the pool node is ON if and only if the feature is ON. However, other values of p and q may alternatively be used. Pool nodes are preferably treated as binary nodes. The pool node preferably has a parent connection that represents the probability table shown above. The pool node may have multiple connections to child nodes. In one variation, the child node connections represent transient, instance-by-instance connections. A transient connection is preferably implemented as an OR selection function over the pool members with associated probabilities. In other words, the transient connection represents a multinomial random variable connection. The pool members (modeled as a set of possible activations of PSCF nodes) are preferably configured to act as binary random variables, at least one of which is selected when a pool is selected, according to a distribution P(m | pool). Pool members represent functional combinations of the child features. For example, pool member 1 may be child feature 1 ANDed with child feature 2. A constraint node is preferably treated as a binary node whose observation is instantiated to 1. The probability tables used in these constraint nodes implement the constraint types that are enforced between the parent nodes connected to the constraint node. The constraint is typically an AND or an OR constraint, but may be any suitable selection function. Constraint nodes may additionally be nodes with larger than pairwise connectivity.
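The conditional table above can be written out directly; the small helper below is a hedged sketch of that table (the function name and API are illustrative, not from the patent), showing that with p = q = 0 the pool state simply copies the feature state.

```python
# Sketch of the P(pool | feature) table above; pool_given_feature is an
# illustrative helper, not part of the patented system.
def pool_given_feature(feature_on: bool, p: float = 0.0, q: float = 0.0) -> dict:
    """Return the conditional distribution over the pool node's state."""
    if feature_on:
        return {"pool_on": 1 - p, "pool_off": p}
    return {"pool_on": q, "pool_off": 1 - q}

print(pool_given_feature(True))    # {'pool_on': 1.0, 'pool_off': 0.0}
print(pool_given_feature(False))   # {'pool_on': 0.0, 'pool_off': 1.0}
```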
The parent feature node serves as a high-level feature node. In the generation mode of operation, the parent feature node is an input of the subnetwork. In the inference mode of operation, the parent feature node is an output of the subnetwork. The parent feature node is configured to implement a selection function when activated. The selection function is preferably a logical function, such as a Boolean selection function over AND, OR, NOT, or XOR operations on node selection. For example, if P1 and P2 are pool nodes of PF1, and PF1 is configured with an AND selection function, then activation of PF1 will activate both the P1 and P2 pools. The selection function may include a randomized selection mechanism for determining the selection between different options (e.g., if the operator is an XOR, only one connected node can be selected). In addition, the randomized selection may be biased or weighted according to the node connection weights of the connections between the parent feature node and the pool nodes. The selection function may alternatively be a probabilistic selection function or any suitable function for selecting among connection options.
The pool node serves as a node for selecting from a set of child features. The child features associated with a pool node preferably share a relationship, have dependencies, or are variants of one another. For example, a pool may be used for different variations in the location of a pixel pattern. In other words, the PSCF nodes preferably constitute an invariant representation of variants of the feature. In FIG. 5, P1 is an invariant representation over three different translations of a vertical line, and P2 is an invariant representation over three different translations of a horizontal line. The term pool may be used herein to refer to the possible set of PSCF nodes for a particular pool node. The possible set of PSCF nodes is preferably any PSCF node having a connection to the pool node. The pool may be constrained. For example, the members of a pool may be the set {(a), (b AND c), (d), (e)}, where a, b, c, d, and e are child features. Similar to the parent feature node, the pool node is configured to implement a selection function when activated. The selection function may be any suitable function, but is preferably a logical operator as described above for the parent feature node. The selection function may similarly be randomized, biased, and/or weighted. The selection function of the pool node preferably selects, triggers, activates, or otherwise transmits a signal to the corresponding PSCF node. Further, the selection function may be restricted or overridden based on an activated constraint node. An activated constraint node may define which node within the pool is selected, based on the selection of the PSCF nodes connected by that constraint node. Similarly, it may determine the set of possible PSCF nodes for the pool node and/or determine a weighting or preference for the pool node's selection. Pool nodes within a subnetwork may be evaluated sequentially so that constraint nodes may be applied to other pools as appropriate.
A PSCF node serves as one option of an invariant feature representation. The PSCF node maps to a child feature, and the PSCF node has only one parent pool node. The PSCF node may additionally be connected or coupled with a constraint node. The constraint node preferably defines a relationship between a plurality of PSCF nodes. The constraint nodes are preferably connected to other PSCF nodes of different pools, different times, and/or different subnetworks. PSCF nodes are preferably not shared between subnetworks. However, a child feature node (which may be a parent node of a lower subnetwork) may share connections to multiple subnetworks.
The constraint node is used to limit the kinds of patterns allowed in the subnetwork. The constraint node is preferably connected to at least two PSCF nodes. More than two PSCF nodes may alternatively be connected by a constraint node. Constraint nodes may additionally be between nodes of any suitable type. Constraint nodes may be between pool nodes. Constraint nodes may additionally be between two types of nodes; for example, a constraint node may connect a PSCF node and a pool node. Variations in which constraint nodes connect with PSCF nodes are shown herein as preferred embodiments, but constraint nodes may be used to enforce constraints between any set of nodes (of any type) in an RCN. The constraint nodes may be between pool nodes, between pool nodes and PSCF nodes, or between any suitable nodes of the network. The connected PSCF nodes preferably do not share the same pool and, in some cases, are not in the same subnetwork. Constraint nodes are preferably connected to PSCF nodes of the same layer, but they may alternatively be connected to subnetworks in different layers. In addition, any suitable PSCF node may have connected constraint nodes and may have any suitable number of connected constraint nodes. Constraint nodes may enforce limits, rules, and restrictions in other pools, in other subnetworks, and/or within the selection of nodes at different times. Preferably, the network is evaluated in an ordered fashion, such that PSCF nodes connected by constraint nodes are preferably not evaluated simultaneously. When a first PSCF node is activated or selected, a constraint node connected to the first PSCF node may be activated. Subsequently, the constraint node's restrictions are activated/enforced on the connected PSCF nodes. Similar to other nodes, the constraint node may have a selection function that determines how it activates PSCF nodes. Constraint nodes preferably affect how pool nodes select PSCF nodes. In one variation, the constraint node's selection function may be an AND logical operator, such that if one of the PSCF nodes is active, the constraint node enforces the selection of the connected PSCF node. In another variation, the constraint node's selection function may be an OR logical operator, such that it modifies the set of possible PSCF nodes within the pool. Any suitable selection function may be used. Some constraint nodes may implement basic or simple constraints, where activation of one node corresponds to selection of a second node. These can be represented as direct connections without a constraint node, since the selection logic is a direct correspondence between nodes.
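One way an AND-type lateral constraint could be enforced during pool selection is sketched below. This is a hedged illustration under the assumption that pools are evaluated in order and that a constraint simply overrides the second pool's free choice; the actual enforcement mechanism in an RCN may differ.

```python
# Illustrative AND-style constraint enforcement between two pools (an
# assumption about one possible realization, not the patent's mechanism).
import random

def select_with_constraint(pool_a, pool_b, and_constraints):
    """and_constraints maps a PSCF node in pool_a to the node it forces in pool_b."""
    choice_a = random.choice(pool_a)                 # pool A selects freely
    if choice_a in and_constraints:
        choice_b = and_constraints[choice_a]         # constraint overrides pool B
    else:
        choice_b = random.choice(pool_b)
    return choice_a, choice_b

# Corner example: each translation of the vertical line forces the matching
# translation of the horizontal line.
print(select_with_constraint(["X1", "X2", "X3"], ["X4", "X5", "X6"],
                             {"X1": "X4", "X2": "X5", "X3": "X6"}))
```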
The constraint nodes may include lateral constraint nodes, external constraint nodes, and temporal constraint nodes. Lateral constraint nodes are used to limit the kinds of patterns of the subnetwork based on interactions between the pool nodes of the subnetwork. External constraint nodes are used to enforce invariant patterns across different subnetworks. Similar to how lateral constraint nodes can ensure that representations in different pools agree with each other by imposing constraints on which PSCF nodes in one pool are allowed together with which PSCF nodes in another pool, external constraint nodes can maintain compatibility across the hierarchy. Temporal constraint nodes are used to enforce relationships between the RCN and subnetworks operating at other time instances. At the base layer, members of a pool (e.g., PSCF nodes with a shared parent pool node) may have relationships that specify the order in which they occur over time. The temporal constraint nodes are preferably simple direct-connection constraints, wherein the activation/selection of one node enforces the selection of the designated node in the second time instance. In an alternative description, temporal constraint nodes may function similarly to the transition specifications of a Markov chain.
The PSCF node may have more than one type of constraint node implemented thereon. The lateral constraint nodes impose coordination between PSCF nodes in different pools of the same network and the external constraint nodes impose coordination between PSCF nodes in different subnets. Constraint nodes are preferably set so as not to cause conflicts (e.g., one constraint activates a node and another constraint specifies that it should not be activated). Ordering of constraint nodes, or heuristics for the order in which constraint nodes are applied, or other suitable rules, may be used to resolve conflicts and contentions between constraint nodes.
3. Method for generating data interpretation
As shown in fig. 6, a method 100 for generating data interpretation in a layer-based bi-directional data-conversion system (LBD) includes receiving evidence data S110, setting a conversion configuration S120, performing a forward conversion S130, performing a reverse conversion S140, and outputting converted evidence data S150.
The method 100 is for generating an interpretation of evidence data received by a layer-based bi-directional data-conversion system (LBD). The method 100 may be used to infer patterns in a wide variety of data types (e.g., image, video, audio, voice, medical sensor data, natural language data, financial data, application data, traffic data, environmental data, etc.). In one embodiment, the method 100 may be used for image detection to detect the presence of an object in an image or video; the method 100 may additionally or alternatively be used to classify the detected object.
The method 100 generates an interpretation of the evidence data (received in step S110) through a series of forward and reverse conversions (steps S130 and S140, respectively), ultimately outputting converted evidence data (step S150), which may be understood or used as an interpretation of the received evidence data. The forward and reverse conversions may be performed on an entire set of evidence data or on a subset of the evidence data; the set of evidence data used for a conversion may differ between conversions. Further, the forward and reverse conversions may be performed in any order and at any time, including simultaneously. Details about conversion dataset selection, order, and timing are preferably managed by the conversion configuration (set in step S120).
In general, a forward conversion may be thought of as providing an explanation for evidence data, and a reverse conversion may be thought of as predicting evidence data given a particular explanation. As mentioned previously, forward and reverse conversions may also be viewed as increasing or decreasing the level of abstraction of given data. While the forward and reverse conversions operate in opposite directions through the layers of the LBD, the method of the preferred embodiment preferably applies both forms of conversion to enhance the output. These descriptions are intended as guidelines for understanding forward and reverse conversions, and are not meant to limit or define them (they are described in greater detail herein).
While a reverse conversion operates substantially opposite to a forward conversion in the direction of abstraction (i.e., forward and reverse conversions act as abstraction-level incrementers and decrementers, respectively), the method may preferably apply reverse conversions to help increase the data abstraction level (e.g., to create an interpretation of evidence data). As shown in FIG. 7, consider a reference unidirectional neural network designed to recognize characters in an input image. The reference neural network instance is composed of a plurality of subnetworks (Sa, Sb, Sc) designed to identify characters (a, b, c) in different portions of the image; at a higher-layer node, the network then outputs the characters detected by the subnetworks in an order (i.e., a->b->c) corresponding to the order of the image portions. The neural network is designed such that the subnetworks Sa, Sb, and Sc convert input data into intermediate data containing likelihood distributions over the character and position space. For example, given input data, one member of the intermediate data output by Sa may be the likelihood that the character "a" appears at the position vector (x1, y1). The intermediate data is processed by the subnetwork O to create a likelihood distribution over strings and string positions, from which the most probable string can be output. In the case of the example image of FIG. 7, the most probable string is "cko", which is obviously not the string shown in the image ("do"). In this example, the neural network of FIG. 7 returns an incorrect result because it cannot take into account the surrounding context of each image portion. This might be addressed by trying to build context knowledge into Layer 2, e.g., storing prior probabilities of character combinations given the characters in adjacent image portions. For example, there may be a probability that a "c"-"l" pair is actually a "d", or there may be probabilities over various character strings (e.g., the probability that "cko" occurs in a particular language is low). This approach may suffer in cases where various character pairs are equally likely or the knowledge stored in the neural network is insufficient (e.g., if the "c"-"l" pair and "d" are equally likely, or if the relative probabilities of "cko" and "do" appearing in the training set do not represent real-world data). Another approach is to include more data from Layer 1; for example, instead of transmitting only character and position data, Layer 1 may also transmit character-feature and position data. This approach is similar in some respects to performing feature detection on the whole image (rather than on image portions), and it may suffer because it requires a substantial increase in the data transferred from Layer 1 to Layer 2.
The bi-directional processing applied by this method preferably solves the problems of the reference neural network example described above and serves as an example of using generation to improve inference. The neural network here may be similar to the reference network in every respect, except that the processing is bi-directional, applying both forward and reverse conversions. As shown in FIG. 8, the neural network may use recursion to allow context (e.g., higher-layer data) to affect the output of lower-layer subnetworks. During the first step of the recursion, Layer 1 outputs the same likelihood distributions as in the unidirectional network instance. Rather than outputting the string "cko", Layer 2 during the second step passes information about each region's neighbors to each subnetwork (e.g., information about the content of Sb is passed to Sa). The information may be a complete likelihood distribution, some subset of it (e.g., the top five likelihoods), or any other suitable contextual information. The subnetworks process this information, and Sb computes a substantially changed likelihood distribution (in particular, note that the likelihoods of "d" and "o" are now higher than that of "k", the previous maximum choice). As a result, Sb sends this updated likelihood distribution to O in step 3. Then, in step 4, O sends updated likelihood information to Sa and Sc, which in turn recalculate their likelihoods based on the updated likelihood information and pass them back to O in step 5. This process may be repeated as often as necessary until a certain threshold is reached at step N (e.g., a number of recursion steps, a threshold maximum likelihood, etc.), at which point O outputs the final string. The recursion steps of this example neural network are analogous to the possible forward and reverse conversions of the method 100. Note that in more complex neural networks (or other LBDs), a recursion step may include multiple forward conversions prior to a reverse conversion, and vice versa; the order and direction of the conversions are preferably determined by the method 100. The bi-directional processing may be applied toward the final goal of forming an inference output or toward the goal of forming a generated output. The method 100 is preferably implemented by a recursive cortical network as previously described, and more preferably by a recursive cortical network as described in U.S. patent application No. 13/895,225, which is incorporated by reference in its entirety. Additionally or alternatively, the method 100 may be implemented by any suitable neural network or related system (e.g., LBD).
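The recursion in FIG. 8 can be summarized by the loop below. It is a hedged sketch: the belief representation (a per-position dictionary of character likelihoods), the rescore_with_context callback, and the stopping rule are assumptions chosen to mirror the description, not the patented message schedule.

```python
# Sketch of the recursive bidirectional refinement of FIG. 8 (representation,
# rescoring callback, and stopping rule are illustrative assumptions).
def refine(initial_beliefs, rescore_with_context, max_iters=10, threshold=0.8):
    """initial_beliefs: one dict of character -> likelihood per image portion."""
    beliefs = initial_beliefs
    for _ in range(max_iters):
        updated = []
        for i, belief in enumerate(beliefs):
            neighbors = beliefs[:i] + beliefs[i + 1:]       # top-down context
            updated.append(rescore_with_context(belief, neighbors))
        beliefs = updated
        if all(max(b.values()) >= threshold for b in beliefs):
            break                                            # confident enough
    return "".join(max(b, key=b.get) for b in beliefs)       # final string
```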
Step S110, receiving evidence data, functions to provide the LBD with data for which a data interpretation is desired. Receiving evidence data preferably includes receiving data that has been preprocessed, and more likely reduced, converted, or extracted into data features (e.g., specifications of attributes and associated values), but may additionally or alternatively include receiving data that has not been preprocessed. For example, an image may be subdivided into a plurality of image blocks, and the pixel patterns in those blocks extracted as features. As another example, step S110 may include receiving detected edges of an image (preprocessed data) or an unprocessed image. Step S110 may additionally or alternatively include performing evidence data processing (potentially even if the data has already been preprocessed). Evidence data processing preferably includes any type of data processing that converts data into a form suitable for processing by the LBD. Some examples of evidence data processing for images include edge detection, resolution reduction, and contrast enhancement; some examples of evidence data processing for audio include pitch detection, frequency analysis, or mel-frequency cepstral coefficient generation. Evidence data may be received from any suitable source; in some cases, the evidence data may include output data from an LBD. An example of such evidence data may be a dataset containing character information contained within an image (e.g., the output of the neural network of FIG. 7).
When evidence data has been received (and if need be processed), the evidence data is preferably sent, fed or directed to the input of the LBD. In the case of RCN, the evidence data is preferably directed to the sub-feature nodes of the lowest subnet layer. Additionally or alternatively, evidence data may be directed to any LBD node or connection (e.g., data that has been previously processed through layer 1 and layer 2 of LBD may be inserted directly into layer 3).
Step S120, setting a conversion configuration, functions to provide the LBD with instructions on how to perform forward and reverse conversions. Step S120 preferably determines when and where forward and reverse conversions (i.e., steps S130 and S140, respectively) are performed in the LBD and when an output is generated. Step S120 may include setting a static configuration; for example, given an input to the first layer (L1) of a five-layer system, step S120 may direct the LBD to perform conversions according to:
input → L1 → L2 → L3 → L2 → L3 → L4 → L3 → L4 → L5 → output
The static configuration preferably applies both forward and reverse conversions, as can be seen in the example above. The static configuration may fully define how conversions are applied from input to output. Alternatively, the static configuration may be a conversion pattern covering a subset of the layers, which may be triggered or executed in response to some trigger (e.g., in the dynamic conversion configuration described below). A reverse conversion may be shallow (e.g., advancing only one or a few abstraction layers in the reverse direction), but a reverse conversion may alternatively be deep (e.g., advancing many abstraction layers, even back to the starting layer). A mixed pattern may additionally be defined by a sequence of conversions, such as three reverse, two forward, one reverse, and two forward, as a pattern with a net forward progression through the layers. Any suitable conversion pattern may alternatively be used, and in any suitable combination.
Additionally or alternatively, step S120 may include setting a dynamic conversion configuration; for example, performing recursion based on a probability threshold (e.g., recursing until the highest probability of the distribution is above 0.8 or a certain maximum number of recursion cycles is reached). As another example, step S120 may configure recursion based on convergence; for example, recursing until the difference between successive higher-layer outputs across recursion cycles is below a certain threshold or a maximum number of recursion cycles is reached.
Although the examples mentioned herein describe recursion between layers, step S120 may additionally or alternatively include setting recursion between nodes or any other LBD components. In the case where the LBD is an RCN, step S120 may additionally or alternatively include instructions for the propagation of lateral constraints.
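A conversion configuration of the kind described in step S120 could be represented as a simple schedule plus an optional stopping rule, as in the sketch below. It reuses the Layer/LBD sketch from Section 1; the list-of-layer-indices encoding, the direction rule, and the bookkeeping of abstraction levels are simplifying assumptions, not the patent's configuration format.

```python
# Sketch of a static schedule with an optional dynamic stopping rule
# (the encoding is an illustrative assumption; data-level bookkeeping is
# simplified so the example focuses purely on conversion ordering).
STATIC_SCHEDULE = [1, 2, 3, 2, 3, 4, 3, 4, 5]   # layers visited, in order

def run_schedule(lbd, data, schedule=STATIC_SCHEDULE, converged=None, max_cycles=1):
    msg, prev = data, 0                          # 0 = pseudo-layer for the raw input
    for _ in range(max_cycles):
        for idx in schedule:
            layer = lbd.layers[idx - 1]
            # Moving to a higher layer is a forward conversion; to a lower one, reverse.
            msg = layer.forward_fn(msg) if idx > prev else layer.reverse_fn(msg)
            prev = idx
        if converged is not None and converged(msg):
            break                                # e.g., highest probability > 0.8
    return msg
```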
Step S130, performing a forward conversion, functions to infer an interpretation or classification from the evidence data. Inference can include pattern detection, classification, prediction, system control, decision-making, and other applications that involve reasoning about information from data. The forward conversion preferably occurs layer by layer (i.e., simultaneously across each layer) in the LBD, but may additionally or alternatively occur subnetwork by subnetwork, node by node, or in any other suitable manner. In the case of the example shown in FIG. 7, each subnetwork of Layer 1 performs a forward conversion by receiving image data for its associated portion, calculating a likelihood distribution over the character and position space, and outputting that likelihood distribution. Likewise, the subnetwork O (i.e., Layer 2) performs a forward conversion by receiving the likelihood distributions as evidence, calculating a likelihood distribution over the string space (i.e., the space of possible character strings), and outputting the string with the maximum likelihood. The forward conversion may be performed by a subnetwork as shown in FIG. 7, by a node, or by another suitable LBD component or infrastructure.
Step S130 preferably includes receiving evidence at an input of an LBD unit (e.g., a node, subnetwork, or layer), performing a mathematical transformation (e.g., calculating a probability distribution or ratio) on the input evidence, and outputting the converted evidence at an output of the LBD unit. The mathematical transformation performed by step S130 preferably calculates a posterior probability distribution of the LBD unit based on received likelihood update data, but any suitable mathematical function may additionally or alternatively be calculated as part of the conversion. Step S130 preferably utilizes a belief propagation technique to communicate information, but other probabilistic inference methods may alternatively be implemented. Belief propagation involves passing messages between nodes and performing calculations in the nodes under different assumptions.
Step S130 preferably includes performing forward conversion based on the conversion configuration set in step S120.
An example network is shown in FIG. 9. Encoded in the network are the prior probabilities P1(S1), P1(R1), and P1(Q1), and the likelihood relations P(e|S1), P(S1|R1), and P(R1|Q1). When evidence e is introduced into the system, it first enters at S1. Given the general form of evidence e, it can be expressed as,
This sum is valid for a discrete probability distribution over e, but one skilled in the art will recognize that it can be generalized to a continuous probability distribution. In a simplified example where e takes a particular value,
After computing the posterior probability of S1 (P2(S1)), this posterior probability is sent from S1 to R1, where it is used to update the posterior probability of R1 (essentially, the ratio of the posterior probability of S1 to the prior probability of S1 acts as a correction or weighting of the likelihood P(S1|R1)). The following is a derivation of the relationship between the posterior probability of R1 and the posterior probability of S1,
From this derivation, it is clear that the posterior probability at R1 can be calculated given only the ratio of the posterior and prior probabilities of S1. Likewise, it can be shown that this relationship applies to Q1 (only the ratio of the posterior and prior for R1 need be transmitted; no S1 prior/posterior or evidence is needed).
In the case of Q1,
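The display equations referenced in this passage are not reproduced in the text above. A plausible reconstruction, assuming standard Bayesian updating over the chain Q1 → R1 → S1 → e described by the stated priors and likelihoods, is:

```latex
% Hedged reconstruction of the omitted equations (inferred from the
% surrounding derivation; not necessarily the patent's exact formulas).
\begin{align*}
P_2(S_1) &= \sum_{e} P(S_1 \mid e)\,P(e)
  && \text{general (soft) evidence over } e \\
P_2(S_1) &= P(S_1 \mid e) = \frac{P(e \mid S_1)\,P_1(S_1)}{P_1(e)}
  && \text{evidence taking a particular value} \\
P_2(R_1) &= \sum_{S_1} P(R_1 \mid S_1)\,P_2(S_1)
          = P_1(R_1)\sum_{S_1} P(S_1 \mid R_1)\,\frac{P_2(S_1)}{P_1(S_1)}
  && \text{update at } R_1 \\
P_2(Q_1) &= P_1(Q_1)\sum_{R_1} P(R_1 \mid Q_1)\,\frac{P_2(R_1)}{P_1(R_1)}
  && \text{update at } Q_1
\end{align*}
```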
this example network demonstrates one particular type of forward conversion in order to emphasize the fact that the computation at any layer or node of the network preferably depends only directly on the values output by neighboring LBD units (e.g., subnets, nodes, layers). The forward conversion of S130 preferably outputs a function that depends directly only on the cell at the location where the conversion occurred, but any suitable function may additionally or alternatively be output. Direct reliance preferably reduces recalculation of the LBD units, allowing the unit structure to be more easily reused (e.g., using many identical sub-networks connected together to form the LBD).
The previous example involves the explicit passing of likelihood update messages (ratios of posterior to prior probabilities, or mathematically related terms), but step S130 may also perform forward conversions in networks where likelihoods or related concepts are not directly related to the passed messages (e.g., binary outputs based on a threshold function). As shown in FIG. 10, a neural network with binary nodes calculates whether a given four-bit number is prime. As shown in FIG. 11, the neural network takes the input 0b1011 (11 in base 10). The first-layer nodes then calculate their responses and propagate them through the network (i.e., they perform a forward conversion at Layer 1). The second-layer nodes then receive this input and calculate their responses; the third-layer nodes follow. Finally, the system outputs 1, indicating that 11 is indeed prime.
This example is slightly limited by the nodes' output capability; as mentioned, it can only output whether the number is prime. In many cases, it may instead be useful to know a posterior probability (e.g., the probability that a number is prime given some evidence). For example, it may not be obvious how the example system could calculate the probability that a four-bit binary number whose least significant bit is 1 is prime. One way to calculate this probability is to perform multiple forward passes over the network over time; to calculate the probability that a four-bit binary number with a least significant bit of 1 is prime, a "1" may simply be provided as the system input for the least significant bit, and a random binary variable with p(x=1)=0.5 may be provided for each of the remaining bits. The probability distribution may then be estimated from the outputs of the system after a number of forward passes.
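The Monte Carlo estimate just described can be sketched as follows. The explicit primality check stands in for the layered binary nodes of FIG. 10 (whose weights are not given here), so the code illustrates only the sampling procedure, under the assumption that a forward pass behaves as an end-to-end primality test on four bits.

```python
# Sketch of the repeated-forward-pass estimate described above (the explicit
# primality test is a stand-in for the FIG. 10 network, which is not
# reproduced here).
import random

def forward_pass(bits):                          # bits = (b3, b2, b1, b0)
    n = sum(b << i for i, b in enumerate(reversed(bits)))
    return int(n in {2, 3, 5, 7, 11, 13})        # the primes expressible in 4 bits

def estimate_p_prime_given_lsb1(trials=100_000):
    hits = 0
    for _ in range(trials):
        bits = (random.randint(0, 1), random.randint(0, 1), random.randint(0, 1), 1)
        hits += forward_pass(bits)               # LSB clamped to 1, other bits random
    return hits / trials

print(forward_pass((1, 0, 1, 1)))                # 0b1011 = 11 -> 1 (prime)
print(estimate_p_prime_given_lsb1())             # ~0.625: 5 of the 8 odd 4-bit values are prime
```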
In the example shown in FIG. 12, the method 100 performs a forward conversion on an RCN to infer the pattern in an image. The messages in this example represent the likelihood of the evidence given that the node corresponding to the message's source is ON. For example, node CF2 has a higher likelihood than node CF1 because the representation of node CF2 is better aligned with the input evidence. The likelihood of a pool (represented by a connection originating from a pool node) is the maximum of the likelihoods of the pool members. When propagating beliefs in a network having a sequence of inputs corresponding to successive time instances, the network may propagate messages in time and make temporal inferences. In such a scenario, the values calculated at the different nodes represent the probability of the given evidence sequence.
The propagation is preferably initiated when data feature inputs are received at the final sub-feature nodes of the network. The final sub-feature nodes are the lowest-level sub-feature nodes in the hierarchy. The data is preferably processed, transformed, or partitioned into a set of features. The data features are then used to select or activate the final sub-feature nodes. In a simple scenario, the presence of a feature is used to activate or deactivate a sub-feature node. Alternatively, a likelihood parameter of the feature node may be an input. The likelihood may be a convolutional similarity measure or any suitable measure of the likelihood of the feature being evident in the data. Belief propagation then continues to propagate the input up the hierarchy of the network. Within a subnetwork, propagating node activation includes: the sub-feature nodes sending likelihood scores to the connected PSCF nodes; generating likelihood scores at the pool nodes of the subnetwork from the posterior distribution components and the likelihood scores of the connected PSCF nodes; and generating likelihood scores at the parent feature node of the subnetwork from the posterior distribution components and the likelihood scores of the pool nodes connected to the parent feature node. The belief propagation then preferably continues to higher subnetworks and proceeds until the network propagation is exhausted or a certain threshold is met (these constraints are preferably set in step S120).
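The upward pass through a single subnetwork, as just described, can be sketched as below. The maximum over pool members follows the description of FIG. 12; combining pools at the parent feature by a product is a simplifying assumption (the patent describes combining posterior distribution components and likelihood scores more generally).

```python
# Sketch of the forward (upward) pass through one subnetwork: pools take the
# max over their members' likelihoods; the parent combines pools (product is
# a simplifying assumption).
def pool_likelihood(member_likelihoods):
    return max(member_likelihoods)

def parent_feature_likelihood(pool_likelihoods):
    result = 1.0
    for l in pool_likelihoods:
        result *= l
    return result

child = {"CF1": 0.2, "CF2": 0.9, "CF3": 0.1, "CF4": 0.7, "CF5": 0.3, "CF6": 0.1}
p1 = pool_likelihood([child["CF1"], child["CF2"], child["CF3"]])   # 0.9
p2 = pool_likelihood([child["CF4"], child["CF5"], child["CF6"]])   # 0.7
print(parent_feature_likelihood([p1, p2]))                          # 0.63
```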
If used on an RCN, step S130 may include enforcing a selection constraint on at least a second node, which functions to allow the invariant relationships between pools and subnetworks to be defined and used during inference. When a node is activated, other nodes connected to it through constraint nodes preferably have constraints imposed on them. An external constraint node is preferably between at least two PSCF nodes, but may alternatively be between any set of nodes. In one variation, the constraint may instead increase or change the probability metric of one connected PSCF node and/or of multiple PSCF nodes of the same pool.
Step S130 preferably outputs the converted evidence at the output of the LBD unit; this output is preferably used to process or assimilate the activated nodes of the network into an inference result. Preferably, the parent feature nodes are used as indicators of patterns. In constructing the network, the different layers preferably detect patterns at different levels of granularity. At lower layers this may include detecting specific patterns of pixels, such as corners, lines, or dots. At higher layers, this may be the detection of higher-level patterns, such as whether a person is detected in the image or whether the expressed emotion is happiness. In addition, each subnetwork is preferably customized for a particular pattern recognition. In the above example, the subnetwork may be used for invariant corner detection. If the parent node of that particular subnetwork is activated, the inference can be made that a corner exists. A mapping may exist such that activation of the parent node of a subnetwork is paired with a different pattern label. Inferences can come from the top layer, but can alternatively be obtained through multiple layers of the network. For example, if the method outputs the inference that "a man is smiling", inferences about the presence of a person, the person being male, and the facial expression being a smile can be obtained through multiple layers and/or subnetworks. In addition, the selection of which layers and/or subnetworks to use for the output inference can adjust the scope of the inference. For example, when inferences are generated from an image, an inference from a higher layer may detect that the image is a scene of a coffee shop. Lower layers may be used to detect the presence of three tables, a man, a woman, and various other coffee shop objects in the image.
In step S140, a reverse transformation is performed, which functions to predict evidence data from the knowledge encoded in the LBD. Additionally or alternatively, step S140 may include predicting evidence data based on constraints applied during the reverse transformation. The reverse transformation may be referred to as generation; in the special case where the LBD is not provided with external evidence, it may be called imagination. Generating may include generating static graphics, video graphics, audio media, text content, selected actions or responses, or any suitable media synthesized from high-level input.
While performing the reverse transformation of S140, the nodes preferably operate on the information and propagate node selection/processing down the LBD hierarchy until an output is obtained from the lowest-layer subnetworks. More specifically, the top-layer subnetworks generate samples simultaneously; these output samples determine which subnetworks in the next lower layer are active, and samples are then generated simultaneously from those subnetworks. This pattern continues down the layers of the LBD until samples are finally generated from the lowest-layer subnetworks. When generating, the output is preferably a simulated output. For example, if the LBD is used for image generation and the input is the name of an object, the output is preferably an image representing the named object.
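A minimal sketch of this top-down pass is shown below, assuming a toy network description in which each feature either owns a subnetwork of pools or is a lowest-layer sub-feature. The feature names and pool contents are hypothetical, and the uniform random choice stands in for whatever selection function the pool actually uses.

```python
import random

# Hypothetical network description: each entry maps a feature to the pools of
# its subnetwork; features with no entry are lowest-layer sub-features.
network = {
    "face":  {"eyes": ["eyes_open", "eyes_closed"], "mouth": ["smile", "frown"]},
    "smile": {"curve": ["arc_up"]},
    "frown": {"curve": ["arc_down"]},
}

def generate(feature, depth=0):
    """Sample top-down: each active subnetwork draws one member per pool, and the
    drawn features name the subnetworks that become active one layer below."""
    pools = network.get(feature)
    if pools is None:                        # lowest layer reached: emit the sub-feature
        return [feature]
    sample = []
    for pool, members in pools.items():
        choice = random.choice(members)      # XOR-style selection within the pool
        print(" " * depth + f"{feature}/{pool} -> {choice}")
        sample += generate(choice, depth + 2)
    return sample

random.seed(0)
print("generated sub-features:", generate("face"))
```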
As with step S130, the reverse conversion of S140 preferably occurs layer by layer (i.e., simultaneously across a layer) in the LBD, but may additionally or alternatively occur subnetwork by subnetwork, node by node, or in any other suitable manner. In the example shown in FIG. 13, an input string is supplied to the output of the LBD. Given the input string, subnetwork O performs a reverse conversion by computing probability distributions over the various features of layer 2. Layer 1 then performs a reverse conversion in turn, converting the intermediate data back into predicted image data. The predicted image data may be considered the output of a set of random variables whose probability distributions are defined by the LBD.
Step S140 preferably includes receiving constraints at the output of the LBD unit, performing mathematical transformations on the information stored within the LBD given the constraints, and outputting the generated data at the input of the LBD unit. The mathematical transformation performed by step S140 preferably calculates the updated likelihood distribution of the LBD unit based on constraints, but any suitable mathematical function may additionally or alternatively be calculated as part of the transformation.
Step S140 preferably utilizes a belief propagation technique to communicate information, but other probabilistic inference methods may alternatively be implemented. Belief propagation involves passing messages between nodes and performing computations in the nodes under different assumptions.
Step S140 preferably includes performing reverse conversion based on the conversion configuration set in step S120.
An example network is shown in FIG. 14. Encoded in the network are the prior probabilities P1(S1), P1(R1), P1(Q1) and the likelihood relations P(e|S1), P(S1|R1), P(R1|Q1). When the constraint Q1 = q is introduced into the system, it can be inserted directly into the known likelihoods and used as an updated probability for R1:

P2(R1) = P(R1 | Q1 = q)

Likewise, for S1:

P2(S1) = Σ_R1 P(S1 | R1) P2(R1)

In addition, a probability distribution describing e may be generated:

P2(e) = Σ_S1 P(e | S1) P2(S1)

This probability distribution describes the distribution of evidence that the LBD predicts given the constraint Q1 = q. By summing over Q1, every possible output of a layer can be calculated as a function of that layer's input, regardless of where the layer sits in the larger LBD. For an RCN, each possible object (potentially represented by Q1 = {q1, q2, …}) can be expanded into a graph of per-layer computations, which allows the pool-selection problem to be formulated as a factor graph. The parameters of the factors in the factor graph depend on the input but not on the larger structure of the RCN. Pre-computing the factor graph allows updated max-product orderings and assignments to be stored for any desired object, which enables fast object recognition. This is called a static reverse conversion, which may be included in step S140.
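The update above can be reproduced with a few lines of linear algebra; the sketch below assumes small, invented conditional probability tables for the chain in FIG. 14 and simply applies the three equations in order.

```python
import numpy as np

# Conditional probability tables for the chain Q1 -> R1 -> S1 -> e; the sizes
# and numbers are invented for illustration.  Rows index the child variable,
# columns index the conditioning variable.
P_R1_given_Q1 = np.array([[0.7, 0.2],
                          [0.3, 0.8]])
P_S1_given_R1 = np.array([[0.9, 0.1],
                          [0.1, 0.9]])
P_e_given_S1  = np.array([[0.6, 0.3],
                          [0.4, 0.7]])

q = 1                                # index of the constrained value Q1 = q
P2_R1 = P_R1_given_Q1[:, q]          # P2(R1) = P(R1 | Q1 = q)
P2_S1 = P_S1_given_R1 @ P2_R1        # P2(S1) = sum over R1 of P(S1|R1) P2(R1)
P2_e  = P_e_given_S1  @ P2_S1        # P2(e)  = sum over S1 of P(e|S1)  P2(S1)

print("P2(R1) =", P2_R1)
print("P2(S1) =", P2_S1)
print("P2(e)  =", P2_e)
```

Pre-computing such per-layer products for each candidate object is, in spirit, what the factor-graph caching described above enables.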
Step S140 may additionally or alternatively include performing a dynamic reverse conversion. Unlike the static reverse conversion, in which the output is a probability distribution over predicted activations given some constraint, in the dynamic reverse conversion the given constraint directly activates the LBD to produce an example output (or, if repeated, a set of example outputs). This preferably enables detection of the behavior of new objects and/or generalized object parts.
This example network demonstrates a particular type of reverse transformation in order to emphasize that the computation at any layer or node of the network preferably depends directly only on the values output by neighboring LBD units (e.g., subnetworks, nodes, layers). The reverse transformation of S140 preferably outputs a function that depends directly only on the units at the location where the transformation occurs, but any suitable function may additionally or alternatively be output. This direct dependence preferably reduces recomputation of LBD units, allowing unit structures to be reused more easily (e.g., many identical subnetworks connected together to form the LBD).
The previous example relates to explicit delivery of probability update messages (likelihood calculation or mathematically related terms), but step S140 may also perform reverse conversion in networks where the probability or related concepts are not directly related to the delivered message (e.g. binary output based on a threshold function).
In the example shown in FIG. 15, the method 100 performs a reverse transformation on an RCN to generate patterns. Pattern generation can be applied to various media and fields such as computer graphics, speech synthesis, physical modeling, data modeling, natural language processing/translation, and the like. Initially, a pattern parent-feature input is received. The parent feature is preferably a high-level feature, classification, or other input that forms the basis for generating the pattern. The input is preferably passed to a subnetwork in the top layer of the network. Propagation then proceeds through the network: the top-layer subnetworks are processed; then the subnetworks of the next layer are processed; and processing continues step by step (i.e., sequentially) through each hierarchical layer of the network. In some instances, an external constraint may define a relationship between two subnetworks, so one subnetwork is processed first and the other subnetwork is then processed with the external constraint taken into account. The order may be predefined or configured. Alternatively, processing may be a race condition between the different subnetworks, with the first subnetwork to complete processing determining how the constraint is enforced. Alternatively, they may be processed or managed simultaneously in any suitable manner. Similarly, there may be a processing order for nodes within a subnetwork. The pools in a subnetwork are preferably also ordered. In some examples, a lateral constraint may define a relationship between the PSCF nodes of two pools, so one pool is processed first and the other pool is then processed with the lateral constraint taken into account. The order may be predefined or configured. Alternatively, processing may be a race condition between the pools, with the first pool to complete processing determining how the constraint is enforced on the other pool. Alternatively, they may be processed or managed simultaneously in any suitable manner. Within each subnetwork, the selection of nodes starts at the parent feature node, then pool nodes are activated, and then PSCF nodes are selected. The selection of PSCF nodes may be affected or determined, at least in part, by selection constraints enforced by constraint nodes. Pool nodes consistent with the selection function of the parent feature node are selected so that the appropriate pools of the subnetwork are activated. As previously mentioned, a pool is preferably a grouping of PSCF nodes corresponding to an invariant feature. The selection preferably occurs within the parent feature node, which has been configured with a selection function. The selection function is preferably an AND relationship, such that each connected pool node is activated, but any suitable selection function may alternatively be used.
Selecting a PSCF node from within the pool-member set of a pool node functions to select at least a first PSCF node corresponding to a sub-feature of the subnetwork's feature. A selection is made for each of the selected pool nodes. The pool nodes within a subnetwork may be evaluated in a random, non-simultaneous order; alternatively, the pools may be evaluated simultaneously. The selection of a PSCF node is preferably performed according to the selection function of the selected pool node. In one embodiment, the selection function is an exclusive-or (XOR) function, in which only one PSCF node is selected; any suitable selection function may alternatively be used. A PSCF node is preferably connected or otherwise directly associated with at least one sub-feature node, and when the PSCF node is selected, the connected sub-feature node is selected. In some variations, a PSCF node may be associated with multiple sub-feature nodes, each of which is preferably selected when the corresponding PSCF node is selected. In yet another variation, a sub-feature node may additionally be associated with other PSCF nodes in the network or subnetwork; the sub-feature node is then preferably selected/activated based on the superposition of the connections to that sub-feature node.
Enforcing a selection constraint functions to allow invariant relationships between pools and subnetworks to be defined. Constraints are preferably created to define the logic between feature pairs and patterns. As a general example, if a subnetwork stitches image components together to form an image of an automobile and one pool selects the body of the automobile, a constraint may restrict the other pool that selects the wheels of the automobile so that the wheels and body remain consistent. A selection constraint may be defined by a constraint node through a connection between at least two PSCF nodes. A constraint node may connect any suitable number of PSCF nodes and may implement any suitable selection function. In some cases, a selection constraint may be defined by a connection between two pool nodes or any suitable type of node. Similarly, a constraint node may sit between any two or more types of nodes, such as between a PSCF node and a pool node. When enforced, a constraint node preferably has some form of directionality, the selection of a first node having a selective impact on a second node. The directionality can go in either direction between the two types of nodes: a PSCF node may cause a constraint node to affect a pool node, and a pool node may cause a constraint node to affect a PSCF node. One preferred selection constraint is that if one of the PSCF nodes connected to a constraint node is activated, the selection of the other connected PSCF nodes is enforced; in other words, the constraint node's selection constraint function is an AND operation. Preferably, the selection constraint is enforced in response to the selection of at least a first PSCF node with connected constraint nodes. As described above, nodes are preferably evaluated or propagated in a certain order. The selection constraint is preferably enforced not on PSCF nodes that have already been selected, but on selections yet to be made by pool nodes. In some scenarios, after the selection constraint is enforced and sent to pool members through the constraint node, a pool node may reduce its set of possible PSCF nodes to a single node. In other scenarios, the pool node may reduce the number of possible PSCF nodes or even change the probability weights used for the selection. Constraint nodes are shown as connections between two PSCF nodes, but constraints may alternatively be implemented through a messaging mechanism between pool members and/or subnetworks; such a message preferably modifies the operation of the selection function to effectively implement the constraint node. Constraint nodes may be lateral constraints, external constraints, temporal constraints, and/or any suitable type of constraint. A lateral constraint is preferably enforced between two different pools; an external constraint is preferably enforced between two different subnetworks. Lateral and external constraints are preferably used for spatial constraints, but may be used to define any suitable invariant. A temporal constraint applies to network evaluations performed for different time instances; it may define invariance across different time ranges, and a temporal selection constraint determines which features can, may, or cannot occur within a sequence of features. Compiling the final sub-features of the network into a generated output functions to assemble the features into a generated product, representation, analysis, simulation, or any other suitable output.
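The interaction between pool ordering, XOR selection within a pool, and an AND-style lateral constraint can be sketched as follows. The pools, member names, and constraint pairs are hypothetical; the sketch only illustrates that once one pool has selected a PSCF node, the constraint node narrows the members still selectable by the pool that has not yet chosen.

```python
import random

# Hypothetical pools of a single subnetwork and an AND-style lateral constraint
# between their PSCF nodes (selecting a body forces the matching wheels).
pools = {
    "car_body":   ["sedan_body", "truck_body"],
    "car_wheels": ["sedan_wheels", "truck_wheels"],
}
constraints = {
    "sedan_body": {"car_wheels": ["sedan_wheels"]},
    "truck_body": {"car_wheels": ["truck_wheels"]},
}

def generate_subnetwork(rng):
    """Evaluate pools in a fixed order, enforcing constraints only on pools
    that have not yet made their selection."""
    allowed = {pool: list(members) for pool, members in pools.items()}
    selection = {}
    for pool in pools:
        choice = rng.choice(allowed[pool])        # XOR selection within the pool
        selection[pool] = choice
        for other_pool, members in constraints.get(choice, {}).items():
            if other_pool not in selection:       # constrain only pending selections
                allowed[other_pool] = [m for m in allowed[other_pool] if m in members]
    return selection

rng = random.Random(3)
print(generate_subnetwork(rng))   # body and wheels always remain consistent
```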
The final sub-features are preferably the lowest-level sub-feature nodes of the hierarchical network. The sub-feature nodes preferably represent binary variables indicating the presence of a particular data feature. A database or map may be maintained that maps sub-feature nodes to particular data features. Compiling the final sub-features preferably includes mapping the selected sub-feature nodes to data features and then compiling the data features into the generated output. The activated sub-feature nodes are preferably components that, when combined, form a media representation. For example, if the network is trained or created for image generation, the output is preferably a substantially complete simulated image. If the network is trained on audio features, the final sub-features may be assembled to output an audio file or signal. When multiple network evaluations are used for a time signal, the final sub-features from the multiple evaluations may be compiled into a final generated output.
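A minimal sketch of this compilation step, assuming a hypothetical map from sub-feature nodes to small pixel patches, is shown below; the feature names, patch contents, and canvas size are invented.

```python
import numpy as np

# Hypothetical map from sub-feature nodes to data features: each entry places a
# small pixel patch at a fixed location on an 8x8 canvas.
feature_map = {
    "corner_tl": (slice(0, 2), slice(0, 2), np.array([[1, 1], [1, 0]])),
    "corner_br": (slice(6, 8), slice(6, 8), np.array([[0, 1], [1, 1]])),
    "edge_top":  (slice(0, 1), slice(2, 6), np.ones((1, 4))),
}

def compile_output(selected_subfeatures, shape=(8, 8)):
    """Assemble the data features of the selected sub-feature nodes into an image."""
    canvas = np.zeros(shape)
    for name in selected_subfeatures:
        rows, cols, patch = feature_map[name]
        canvas[rows, cols] = np.maximum(canvas[rows, cols], patch)
    return canvas

print(compile_output(["corner_tl", "edge_top", "corner_br"]))
```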
The method 100 may include performing steps S130 and S140 at any time and at any location of the LBD, preferably according to the conversion configuration of step S120. For example, as shown in FIG. 8, the LBD may perform a series of partial forward and reverse conversions. As another example, the LBD may receive image input for half of an image: the half-image input is fed into the sub-feature nodes, and the LBD is then prompted to generate likelihoods for the other half.
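The half-image prompt can be sketched at a very small scale by treating the stored knowledge as a joint distribution over a left-half feature and a right-half feature: clamping the observed half and conditioning yields the likelihoods for the missing half. The two-by-two table and feature names below are invented and stand in for the far richer distributions an actual LBD would encode.

```python
import numpy as np

# A two-variable joint distribution standing in for the knowledge stored in
# the LBD: rows index the left-half feature, columns the right-half feature.
P_joint = np.array([[0.40, 0.05],
                    [0.05, 0.50]])
left_features  = ["left_face_half", "left_blank"]
right_features = ["right_face_half", "right_blank"]

observed_left = "left_face_half"     # evidence clamped at the sub-feature nodes
i = left_features.index(observed_left)

# Reverse conversion: condition on the observed half to get a normalized
# distribution over the unobserved half.
P_right_given_left = P_joint[i] / P_joint[i].sum()
for name, p in zip(right_features, P_right_given_left):
    print(f"P({name} | {observed_left}) = {p:.3f}")
```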
Step S150 includes outputting the transformed evidence data, which functions to output the data interpretations generated by the LBD. The output transformed evidence data preferably comprises post-processed output, but may additionally or alternatively comprise unprocessed output data. For example, the output data may include a set of class labels for an image, post-processed from probability distributions over the class labels. As another example, S150 may include outputting a natural language description of the objects within a photograph.
The methods of the preferred embodiments and variations thereof may be at least partially embodied and/or implemented as a machine configured to receive a computer-readable medium storing computer-readable instructions. The instructions are preferably executed by computer-executable components, which are preferably integrated with the recursive cortical network. The instructions may be stored on any suitable computer-readable medium, such as RAM, ROM, flash memory, EEPROM, an optical device (CD or DVD), a hard disk drive, a floppy disk drive, or any suitable device. The computer-executable components are preferably general-purpose or special-purpose processors, but any suitable dedicated hardware or hardware/firmware combination device may alternatively or additionally execute the instructions.
As will be recognized by those skilled in the art from the foregoing detailed description and from the accompanying drawings and claims, modifications and changes may be made to the preferred embodiments of the invention without departing from the scope of the invention as defined in the appended claims.

Claims (22)

1. A method for generating a description of a dataset in a bi-directional layer-based network, wherein the network comprises a first layer having a plurality of subnets and a second layer receiving outputs of the plurality of subnets, the method comprising:
receiving the dataset, wherein the dataset has a detectable characteristic;
setting a conversion configuration, the conversion configuration directing data messaging of the data sets and converted data between layers of the network;
at a first layer of the network, performing, by each of the plurality of subnets, a respective first forward conversion of the data set to a first set of converted data, the first forward conversion comprising: calculating a first posterior probability distribution based on the data set and a corresponding set of prior probabilities and likelihood relationships of the detectable features encoded in a first layer of the network, and generating the first set of transformed data from the first posterior probability distribution;
Receiving, at a second layer of the network, the first set of transformed data from the first layer of the network;
performing, at a second layer of the network, a first inverse transformation on the first set of transformed data and providing data to each respective subnet, the data representing output generated by one or more adjacent subnets of the subnet;
generating, at a first layer of the network, a respective set of updated likelihoods by each subnet of the first layer based on received data representing one or more outputs generated by neighboring subnets of the subnet and a respective set of prior probabilities and likelihood relationships of the detectable feature encoded in the first layer of the network;
performing, at a first layer of the network, a second forward transform on the dataset according to the updated likelihood to generate a second set of transformed data; and
at the second layer of the network, a third forward conversion of the second set of converted data is performed to generate output values comprising one or more descriptions of the detectable features of the dataset.
2. The method of claim 1, wherein the dataset comprises at least one of image data, video data, audio data, natural language text data, or sensor data.
3. The method of claim 2, the bidirectional layer-based network being a recursive cortical network.
4. The method of claim 1, wherein performing the third forward conversion comprises:
receiving the second set of transformed data at one or more inputs of the second layer;
calculating a second posterior probability distribution based on the first set of transformed data, a set of updated likelihoods of the second set of transformed data, and a set of prior probabilities and likelihood relationships encoded in the second layer; and
a fourth set of transformed data is generated from the second posterior probability distribution at the output of the second layer.
5. The method of claim 1, wherein the bi-directional layer-based network is a recursive cortical network comprising a set of evidence data.
6. The method of claim 5, wherein the recursive cortical network is implemented by a distributed computing system.
7. The method of claim 6, wherein the recursive cortical network comprises:
a recursive architecture network of subnetworks, said subnetworks being organized into a plurality of hierarchical layers;
the subnetwork comprising at least a parent feature node, a pool node, a parent-specific child feature (PSCF) node, and a sub-feature node;
the parent feature node of at least one subnetwork is configured with a selection function operable on at least two pool nodes connected to the parent feature node of the at least one subnetwork;
the pool node of the at least one subnet is configured with a selection function operable on at least two PSCF nodes connected to the pool node of the at least one subnet;
the PSCF node of the at least one subnet is configured to activate a connected sub-feature node;
the sub-feature node being connectable to at least a parent feature node of a second subnetwork at a lower hierarchical level; and
constraint nodes having at least two connections from at least two PSCF nodes and a selection function to enforce selection of the pool nodes.
8. The method of claim 5, wherein the set of evidence data comprises image data.
9. The method of claim 8, wherein the image data comprises image data processed by an edge detection filter.
10. The method of claim 9, wherein the image data is captured by a camera; and wherein outputting the description of the dataset comprises outputting image description data.
11. The method of claim 1, further comprising performing additional forward and reverse conversions based on a static conversion configuration.
12. The method of claim 1, further comprising performing additional forward and reverse translations based on a dynamic translation configuration, wherein the dynamic translation configuration directs message delivery based on a layer output probability threshold of the dynamic translation configuration.
13. The method of claim 1, further comprising performing additional forward and reverse translations based on a dynamic translation configuration, wherein the dynamic translation configuration directs messaging based on a recursion level threshold of the dynamic translation configuration.
14. The method of claim 1, further comprising performing forward conversion based on lateral constraints encoded in the network.
15. The method of claim 14, wherein performing forward conversion based on lateral constraints comprises enforcing activation constraints between at least two nodes of the network.
16. The method of claim 1, wherein outputting the description of the dataset comprises outputting a set of data classifiers.
17. A system comprising one or more computers and one or more storage devices storing operational instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising:
obtaining a dataset for a bi-directional layer-based network, wherein the network comprises a first layer having a plurality of subnets and a second layer receiving outputs of the plurality of subnets, wherein the dataset has a detectable characteristic;
setting a conversion configuration, the conversion configuration directing data messaging of the data sets and converted data between layers of the network;
at a first layer of the network, performing, by each of the plurality of subnets, a respective first forward conversion of the data set to a first set of converted data, the first forward conversion comprising: calculating a first posterior probability distribution based on the data set and a corresponding set of prior probabilities and likelihood relationships of the detectable features encoded in a first layer of the network, and generating the first set of transformed data from the first posterior probability distribution;
receiving, at a second layer of the network, the first set of transformed data from the first layer of the network;
Performing, at a second layer of the network, a first inverse transformation on the first set of transformed data and providing data to each respective subnet, the data representing output generated by one or more adjacent subnets of the subnet;
generating, at a first layer of the network, a respective set of updated likelihoods by each subnet of the first layer based on received data representing one or more outputs generated by neighboring subnets of the subnet and a respective set of prior probabilities and likelihood relationships of the detectable feature encoded in the first layer of the network;
performing, at a first layer of the network, a second forward transform on the dataset according to the updated likelihood to generate a second set of transformed data; and
at the second layer of the network, a third forward conversion of the second set of converted data is performed to generate output values comprising one or more descriptions of the detectable features of the dataset.
18. The system of claim 17, wherein the dataset comprises at least one of image data, video data, audio data, natural language text data, or sensor data.
19. The system of claim 18, the bidirectional layer-based network being a recursive cortical network.
20. One or more non-transitory computer storage media having computer program instructions encoded thereon that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
obtaining a dataset for a bi-directional layer-based network, wherein the network comprises a first layer having a plurality of subnets and a second layer receiving outputs of the plurality of subnets, wherein the dataset has a detectable characteristic;
setting a conversion configuration, the conversion configuration directing data messaging of the data sets and converted data between layers of the network;
at a first layer of the network, performing, by each of the plurality of subnets, a respective first forward conversion of the data set to a first set of converted data, the first forward conversion comprising: calculating a first posterior probability distribution based on the data set and a corresponding set of prior probabilities and likelihood relationships of the detectable features encoded in a first layer of the network, and generating the first set of transformed data from the first posterior probability distribution;
Receiving, at a second layer of the network, the first set of transformed data from the first layer of the network;
performing, at a second layer of the network, a first inverse transformation on the first set of transformed data and providing data to each respective subnet, the data representing output generated by one or more adjacent subnets of the subnet;
updating, at a first layer of the network, a respective set of updated likelihoods by each subnet of the first layer based on received data representing one or more outputs generated by neighboring subnets of the subnet and a respective set of prior probabilities and likelihood relationships of the detectable feature encoded in the first layer of the network;
performing, at a first layer of the network, a second forward transform on the dataset according to the updated likelihoods to generate a second set of transformed data; and
at the second layer of the network, a third forward conversion of the second set of converted data is performed to generate output values comprising one or more descriptions of the detectable features of the dataset.
21. The one or more non-transitory computer storage media of claim 20, wherein the dataset comprises at least one of image data, video data, audio data, natural language text data, or sensor data.
22. The one or more non-transitory computer storage media of claim 21, the bidirectional layer-based network being a recursive cortical network.
CN201680088615.1A 2016-06-21 2016-06-21 System and method for generating data interpretations for neural networks and related systems Active CN109643389B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2016/038516 WO2017222505A1 (en) 2016-06-21 2016-06-21 Systems and methods for generating data explanations for neural networks and related systems

Publications (2)

Publication Number Publication Date
CN109643389A CN109643389A (en) 2019-04-16
CN109643389B true CN109643389B (en) 2023-08-22

Family

ID=60783511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680088615.1A Active CN109643389B (en) 2016-06-21 2016-06-21 System and method for generating data interpretations for neural networks and related systems

Country Status (6)

Country Link
EP (1) EP3472713A4 (en)
JP (1) JP6761055B2 (en)
CN (1) CN109643389B (en)
AU (2) AU2016410565A1 (en)
CA (1) CA3028919C (en)
WO (1) WO2017222505A1 (en)


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5065040A (en) * 1990-08-03 1991-11-12 Motorola Inc. Reverse flow neuron
US6976012B1 (en) * 2000-01-24 2005-12-13 Sony Corporation Method and apparatus of using a neural network to train a neural network
US7519488B2 (en) * 2004-05-28 2009-04-14 Lawrence Livermore National Security, Llc Signal processing method and system for noise removal and signal extraction
US8219507B2 (en) * 2007-06-29 2012-07-10 Numenta, Inc. Hierarchical temporal memory system with enhanced inference capability
WO2009097552A1 (en) * 2008-02-01 2009-08-06 Omnivision Cdm Optics, Inc. Image data fusion systems and methods

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6728690B1 (en) * 1999-11-23 2004-04-27 Microsoft Corporation Classification system trainer employing maximum margin back-propagation with probabilistic outputs
US7720779B1 (en) * 2006-01-23 2010-05-18 Quantum Leap Research, Inc. Extensible bayesian network editor with inferencing capabilities
US8775341B1 (en) * 2010-10-26 2014-07-08 Michael Lamport Commons Intelligent control with hierarchical stacked neural networks
CN102509105A (en) * 2011-09-30 2012-06-20 北京航空航天大学 Hierarchical processing method of image scene based on Bayesian inference
US9262698B1 (en) * 2012-05-15 2016-02-16 Vicarious Fpc, Inc. Method and apparatus for recognizing objects visually using a recursive cortical network
CN102867327A (en) * 2012-09-05 2013-01-09 浙江理工大学 Textile flexible movement reestablishing method based on neural network system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A network intrusion detection method based on a two-layer Bayesian network; Cheng Chuanhui; Zheng Qiuhua; Journal of Wuhan University of Technology (Transportation Science & Engineering) (No. 01); full text *

Also Published As

Publication number Publication date
CA3028919A1 (en) 2017-12-28
AU2022202492A1 (en) 2022-05-12
CA3028919C (en) 2022-04-26
JP6761055B2 (en) 2020-09-23
EP3472713A1 (en) 2019-04-24
AU2016410565A1 (en) 2019-01-24
CN109643389A (en) 2019-04-16
EP3472713A4 (en) 2020-02-26
JP2019520655A (en) 2019-07-18
WO2017222505A1 (en) 2017-12-28

Similar Documents

Publication Publication Date Title
US11551057B2 (en) Systems and methods for generating data explanations for neural networks and related systems
US9607262B2 (en) System and method for a recursive cortical network
Buxton Learning and understanding dynamic scene activity: a review
Rimey et al. Controlling eye movements with hidden Markov models
US20180082179A1 (en) Systems and methods for deep learning with small training sets
US20220083863A1 (en) System and method for teaching compositionality to convolutional neural networks
Lee et al. Context-prediction performance by a dynamic bayesian network: Emphasis on location prediction in ubiquitous decision support environment
KR20190126857A (en) Detect and Represent Objects in Images
JP6828065B2 (en) Systems and methods for recursive cortical networks
CN109643389B (en) System and method for generating data interpretations for neural networks and related systems
US11526757B2 (en) Systems and methods for deep learning with small training sets
Kumar et al. Unified granular neural networks for pattern classification
KR20200092453A (en) Method and apparatus for generating images based on keyword
Botzheim et al. Growing neural gas for information extraction in gesture recognition and reproduction of robot partners
Chakrabarty et al. Q-learning elicited disaster management system using intelligent mapping approach
KR102507892B1 (en) Object state recognition method, apparatus and computer program
Liu et al. Recognizing constrained 3D human motion: An inference approach
Leopold et al. Belief revision with reinforcement learning for interactive object recognition.
Rodriguez-Criado Deep Learning in Graph Domains for Sensorised Environments
Kasprzak Integration of different computational models in a computer vision framework
Gupta Representations for Visually Guided Actions
JP2023027688A (en) Information processing apparatus, inference device, information processing method, and inference method
Novo et al. Emergent Image Segmentation by Means of Evolved Connectionist Models and Using a Topological Active Net Model
Niemann Recognition and Interpretation

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right (effective date of registration: 20221019; applicant after: Insi Innovation Co.,Ltd., California, USA; applicant before: Vicarious FPC, Inc., California, USA)
GR01 Patent grant