WO2019018533A1 - Neuro-bayesian architecture for implementing artificial general intelligence - Google Patents


Info

Publication number
WO2019018533A1
Authority
WO
WIPO (PCT)
Prior art keywords
neuro
bayesian network
vector
hierarchical
module
Prior art date
Application number
PCT/US2018/042701
Other languages
French (fr)
Inventor
Rajesh Perampalli Nekkar RAO
Satish Kathirisetti
Ramesh Durairaj
Original Assignee
Neubay Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neubay Inc filed Critical Neubay Inc
Publication of WO2019018533A1 publication Critical patent/WO2019018533A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/086Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks

Definitions

  • the present disclosure relates to artificial intelligence.
  • the present disclosure relates to a processor specifically configured for implementing artificial general intelligence operations. More particularly, the present disclosure relates to a processor that combines neural probabilistic inference, multi-scale prediction, planning, and imagination with unsupervised learning, supervised learning, and reinforcement learning, while allowing a distributed implementation across edge computing and cloud computing frameworks.
  • AI Artificial Intelligence
  • a computer-controlled robot, for instance, achieves AI through learning, reasoning, perception, problem-solving and linguistic intelligence.
  • Advancements in AI in recent years have led to remarkable demonstrations of machines reaching near-human capabilities in solving the challenges/problems associated with a variety of technical as well as non-technical domains.
  • An object of the present disclosure is to provide a Neuro-Bayesian Architecture suited for Artificial General Intelligence (AGI) operations.
  • Another object of the present disclosure is to provide a processor architecture suited for implementation of probabilistic inference and supervised as well as unsupervised learning principles.
  • Yet another object of the present disclosure is to provide a processor architecture optimized for implementing reinforcement-based action learning operations using a value NBN and a policy NBN.
  • One more object of the present disclosure is to provide a processor architecture that optimizes reinforcement learning and implements actions derived from optimized reinforcement learning.
  • Still a further object of the present disclosure is to provide a processor architecture that efficiently addresses the issue of optimizing expected costs and rewards by efficiently and effectively planning future actions.
  • One more object of the present disclosure is to provide a hierarchical architecture that enables probabilistic predictions of future inputs, and facilitates planning, simulation and imagination of multiple possible input trajectories at multiple spatial and temporal scales.
  • Still a further object of the present disclosure is to provide a processor architecture that enables discovery of hierarchical hidden states from input data.
  • the present disclosure envisages a processor architecture suited for implementing Artificial General Intelligence operations.
  • the processor architecture is built on Neuro Bayesian Networks (NBNs) and is also suited for neural probabilistic inference, reasoning, planning, simulation, imagination, and various forms of learning.
  • the processor architecture is designed to perform unsupervised learning, supervised learning, reinforcement learning, and planning amongst others.
  • the processor architecture includes a Neuro-Bayesian learning engine (eN-BLe).
  • the engine for Neuro-Bayesian learning (referred to as the 'engine' hereafter) further includes at least one hierarchical Neuro Bayesian Network module for unsupervised learning, Bayesian inference and memory.
  • conglomeration of a plurality of Hierarchical Neuro-Bayesian Network modules results in the formation of a layered, hierarchical Neuro-Bayesian Network; with each constituent Neuro-Bayesian Network module being considered as a layer of the hierarchical Neuro-Bayesian Network.
  • the engine envisaged by the present disclosure further includes a reinforcement learning module (preferably used for Bayesian decision making), a supervised learning module, and a planning and simulation module.
  • the engine for Neuro-Bayesian learning (eN-BLe) is communicably coupled to a user application, from which it (the engine) receives input training data.
  • the hierarchical Neuro-Bayesian Network acts as a probabilistic internal model of an application or an unknown environment through unsupervised learning.
  • the H-NBN is endowed with the ability to perform probabilistic inferences and hierarchical predictions.
  • the outputs from the H-NBN are provided to the supervised learning module which uses supervised NBNs to classify input states.
  • the outputs of the supervised learning module are provided to the reinforcement learning module comprising Value-NBNs (V-NBNs) and Policy-NBNs (P-NBNs), to facilitate computation of expected rewards and further to facilitate selection of optimal actions preferably leading to maximized rewards despite the presence of uncertainty.
  • the processor architecture based on Neuro-Bayesian Networks therefore implements artificial general intelligence operations, allowing a machine or application to: (a) learn a hierarchical probabilistic model of an unknown dynamic environment or application in an unsupervised manner based on time series data from possibly multimodal sensors; (b) make probabilistic predictions of future sensor inputs and multiple possible input trajectories at multiple spatial and temporal scales; (c) detect anomalies at multiple spatial/temporal scales; (d) perform classification of sensor states using supervised learning; (e) perform hierarchical planning, simulation, and imagination of future scenarios at multiple temporal scales; and (f) select optimal actions that maximize total expected reward and minimize expected costs.
  • the Neuro-Bayesian Network Module is configured to receive input data at the current time 't' as a sequence of input vectors from time duration 't-T' to 't' from a pre-determined user application. These input vectors are merged into a single Spatio-Temporal Vector (STV).
  • the NBNM transforms each input STV (I_STV) at the current time 't' into a latent hidden vector (LHV) through multiple layers of nonlinear transformations.
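As an illustrative sketch only (the disclosure does not fix a particular merge rule), the window of input vectors from 't-T' to 't' can be merged into a single I_STV by simple concatenation:

```python
import numpy as np

def make_stv(inputs, T):
    """Merge the input vectors from time t-T .. t into one
    Spatio-Temporal Vector. Concatenation is an assumed merge
    rule used here purely for illustration."""
    window = inputs[-(T + 1):]          # last T+1 input vectors
    return np.concatenate(window)

# three 2-D sensor readings over time
frames = [np.array([0.1, 0.2]), np.array([0.3, 0.4]), np.array([0.5, 0.6])]
stv = make_stv(frames, T=2)             # 6-dimensional I_STV
```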
  • a plurality of NBNMs are recursively connected over space and time to form a spatio-temporal hierarchical NBN or spatio-temporal H-NBN.
  • the upper layers of the H-NBN are preferably designed to retain extensive memories of events and predict longer future input sequences as compared to lower layers of the H-NBN.
  • the engine for neuro-Bayesian learning (eN-BLe) is designed to perform unsupervised learning in circumstances where an output label is unavailable for each corresponding input.
  • the eN-BLe includes an unsupervised learning module that is configured to perform learning of a hierarchical predictive model for any given sequence of input vectors.
  • the eN-BLe includes a supervised learning module that has been configured to map an input vector to an output vector along with the associated probability; the probability being a measure of a given input vector resulting in inception of a given output vector, if a user action or an output label is available for each input.
  • the input vectors are initially utilized to learn a hierarchical model of the inputs using the unsupervised learning module (unsupervised H-NBN). Thereafter, the supervised learning module (supervised S-NBN) is trained to map a pooled Latent Hidden Vector (P_LHV) from the unsupervised H-NBN to an appropriate output label to minimize the prediction error between the output of the S-NBN and the training output label.
  • the engine for neuro-Bayesian learning (eN-BLe) is also designed to perform reinforcement-based action learning, execution, simulation, and planning.
  • actions are performed in response to an input
  • rewards, costs, and penalties associated with each of the input states and actions are generated, in addition to receiving user feedback.
  • the eN-BLe is subsequently used to learn and simulate the user application/system to plan future actions that optimize the expected costs and rewards.
  • FIG. 1 is a block diagram illustrating a Neuro Bayesian Network based processor architecture, in accordance with the present disclosure.
  • FIG.2A and FIG.2B in combination illustrate, in the form of a flowchart, the steps involved in a computer-implemented method for implementing a Neuro-Bayesian Learning engine (eN-BLe), in accordance with the present disclosure.
  • the present disclosure envisages a processor architecture specifically designed to implement Artificial General Intelligence (AGI) operations.
  • the processor architecture includes an engine for Neuro-Bayesian learning (eN-BLe).
  • the engine for Neuro-Bayesian learning (eN-BLe) has the following components: (1) a Hierarchical Neuro Bayesian Network module designed to implement probabilistic inference, prediction, memory, and unsupervised learning, inter alia, (2) a reinforcement learning module for Bayesian decision making and action selection, (3) a supervised learning module, and (4) a planning, imagination, and simulation module.
  • the Hierarchical Neuro-Bayesian Network acts as a hierarchical probabilistic internal model of an application or unknown environment whose inputs are the inputs from a user application and whose outputs are described in more detail below.
  • the H-NBN can perform probabilistic inference of hidden states of the user application, predict future states, and learn hierarchical representations of the user application. Thereafter, the outputs of the H-NBN are provided to the supervised learning module which uses supervised NBNs to classify input states. Alternately, the outputs of the NBNs are provided to the reinforcement learning module comprising Value-NBNs (V-NBNs) and Policy-NBNs (P-NBNs) to compute expected reward/costs and select optimal actions under uncertainty.
  • the processor architecture envisaged by the present disclosure is used to perform Artificial General Intelligence operations including but not restricted to (a) learning a hierarchical probabilistic model of an unknown dynamic environment or application based on time series data from multimodal sensors (b) deriving probabilistic predictions of future sensor inputs and multiple possible input trajectories at multiple spatial and temporal scales (c) detecting anomalies at multiple spatial/temporal scales (d) performing classification of sensor states using supervised learning (e) performing hierarchical planning, simulation, and imagination of future scenarios at multiple temporal scales and (f) selecting optimal actions that maximize total expected reward and minimize expected costs.
  • the processor architecture described therein includes an engine for Neuro-Bayesian learning (eN-BLe) 100.
  • the engine 100 for Neuro-Bayesian learning (eN-BLe) further includes a hierarchical Neuro Bayesian Network Module 104 (H-NBNM), a reinforcement learning module 106, a supervised learning module 108, and a planning, imagination and simulation module 110 (referred to as 'simulation module 110' hereafter, for the sake of brevity).
  • the engine 100 is communicably coupled to a user application 102 from which the 'input training data' and 'application data' are received as inputs, preferably during a test implementation.
  • the input training data received by the engine 100 can be time-series data comprising multivariate input values.
  • Examples of input data include but are not restricted to sensor measurements from Internet of Things (IoT) applications, video streams from cameras and video recorders, speech related information, audio information from microphones and audio recorders, textual data, time-series data, and unstructured/semi-structured data.
  • the input data preferably includes user feedback (at any point in time) as well as any application-specific costs, rewards or penalties incurred as a result of the input states and actions taken by the user application 102 communicably coupled to the engine 100.
  • Other inputs to the engine 100 include an output label for each input or a user action for each input.
  • the hierarchical Neuro-Bayesian Network Module (NBNM) 104, during implementation of a training phase, is configured to receive the input training data either in the form of a time-series or as a sequence of input vectors.
  • the input vectors are merged into a single Input Spatio-Temporal Vector (I_STV), which is fed as an input to the NBNM 104.
  • the NBNM 104 preferably transforms the single Input Spatio-Temporal Vector (I_STV) at time 't' into a Latent Hidden Vector (LHV) through one or more layers of nonlinear transformations represented by the equation: LHV = f_N(f_{N-1}(...f_1(I_STV)...))
  • each nonlinear transformation f_i may be defined by a nonlinear neural network, i.e., f_i(X) = h(W·X), wherein:
  • X is an input vector
  • W is a weight matrix of positive or negative values
  • h is an activation function such as the sigmoid or rectified linear unit (ReLU) function.
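A minimal numerical sketch of this layered transformation, assuming randomly initialized weight matrices and a sigmoid nonlinearity (illustrative values only, not the disclosed implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nbnm_forward(i_stv, weights, h=sigmoid):
    """Compute LHV = f_N(...f_1(I_STV)...), with each f_i(X) = h(W_i @ X).
    `weights` is a list of weight matrices (here randomly chosen)."""
    x = i_stv
    for W in weights:
        x = h(W @ x)
    return x

rng = np.random.default_rng(0)
Ws = [rng.standard_normal((4, 6)), rng.standard_normal((3, 4))]
lhv = nbnm_forward(np.ones(6), Ws)      # 3-dimensional latent hidden vector
```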
  • the Latent Hidden Vector (LHV) obtained at time 't' is fed as an input to a preset number ('K') of Recurrent Networks (RNs).
  • An example of an RN that may be used for this purpose is the Long Short Term Memory (LSTM) network but other RNs or time-series prediction models may also be used.
  • the weights of the 'K' RNs are trained on the Latent Hidden Vector in order to generate 'K' output vectors, given by LHV_1, LHV_2, ..., LHV_K, which are predictions of the possible LHVs for the next (future) time step 't+1', along with the corresponding probabilities p_1, p_2, ..., p_K.
  • the NBNM 104 selects one of the LHVs amongst LHV_1, LHV_2, ..., LHV_K by sampling them based on their probabilities.
  • the selected LHV is regarded as the Predicted LHV (Pr_LHV) for the future time step 't+1'.
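The probability-weighted selection of one candidate LHV can be sketched as a categorical sampling step (the helper below is hypothetical; any categorical sampler would do):

```python
import random

def sample_predicted_lhv(candidates, probs, rng=random.Random(42)):
    """Pick one of LHV_1..LHV_K with probability p_1..p_K by
    walking the cumulative distribution."""
    r, acc = rng.random(), 0.0
    for lhv, p in zip(candidates, probs):
        acc += p
        if r <= acc:
            return lhv
    return candidates[-1]   # guard against rounding of the probabilities

# three candidate LHVs with probabilities 0.2, 0.7, 0.1
pr_lhv = sample_predicted_lhv([[0.1], [0.9], [0.5]], [0.2, 0.7, 0.1])
```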
  • the Pr_LHV is further transformed via multiple layers of nonlinear transformations, similar to the nonlinear transformations applied to the LHV above, to generate a Predicted Spatio-Temporal Vector (Pr_STV) for a future time step t+1: Pr_STV = g_M(g_{M-1}(...g_1(Pr_LHV)...))
  • the g functions can be implemented using nonlinear neural networks similar to the LHV case described above.
  • the Predicted Spatio-Temporal Vector is subsequently unrolled to produce a sequence of predicted input vectors for time step 't+1' and beyond.
  • the NBNM 104 can also be trained as described above to predict input vectors (Pr_V) for time steps further into the future than time step 't+1'.
  • the Neuro-Bayesian Network Module (NBNM) 104 is configured to generate a prediction error vector which is a function of the difference between the actual input vector and the predicted input vector.
  • the prediction error vector is typically used to determine the occurrence of anomalies. Preferably, an anomaly is notified if the function of the prediction error vector satisfies a predetermined criterion, for example, if its magnitude exceeds a threshold.
  • the current prediction error vector can be given as input to the NBNM 104 rather than the current input vector to train the NBNM (104) and generate the next estimate for the LHV.
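The error-based anomaly check described above can be sketched as follows; the L2 norm with a fixed threshold is an assumed choice of criterion, not mandated by the disclosure:

```python
import math

def prediction_error(i_stv, pr_stv):
    """Elementwise difference between actual and predicted vectors."""
    return [a - p for a, p in zip(i_stv, pr_stv)]

def is_anomaly(err, threshold=1.0):
    """Flag an anomaly when the error magnitude (L2 norm, an assumed
    function of the prediction error vector) exceeds a threshold."""
    return math.sqrt(sum(e * e for e in err)) > threshold

err = prediction_error([1.0, 2.0], [0.9, 1.8])
assert not is_anomaly(err)   # small error: no anomaly notified
```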
  • the engine 100 for Neuro-Bayesian learning (eN-BLe) further includes a plurality of NBNMs (104) recursively connected over space and time to form a spatio-temporal hierarchical NBN or H-NBN (not shown in figures).
  • the LHVs from multiple spatial windows (e.g., in an image), each for a time window spanning the current time step 't' and the 'T' past time steps (t-1, t-2, ..., t-T), are merged to form a single I_STV which is further fed as an input to a higher level NBNM.
  • the LHV of the higher level NBNM learns properties of the I_STV ranging across multiple spatial windows and time periods.
  • the process of feeding the I_STV to higher level NBNMs is repeated depending on the desired levels of NBNMs.
  • a spatio-temporal hierarchical NBN is obtained whose LHVs incorporate spatial properties over progressively larger spatial windows and temporal properties over progressively longer durations of time.
  • the upper layers of Neuro Bayesian networks in the H-NBN retain extensive memories of events to predict the future as compared to lower layers of the neuro Bayesian networks in the H-NBN.
  • each NBNM is implemented using a computational model of neuronal networks.
  • computational models of neuronal networks include but are not limited to leaky integrator neurons, Hodgkin-Huxley neurons, sigma-pi neurons, compartmental neurons as well as the more commonly used traditional neuronal units (computing weighted sum of inputs with a nonlinear output).
  • the engine 100 for Neuro-Bayesian Learning is designed to perform unsupervised learning in instances when an output label is unavailable for each input.
  • the engine 100 is configured to perform unsupervised learning of a hierarchical predictive model for series of input vectors.
  • the NBNM 104 is designed from traditional neuronal units performing a weighted sum of the input vectors (values in each I_STV) followed by a nonlinearity (e.g., sigmoid).
  • the weights of the NBNM 104 are learned in an unsupervised manner to minimize the error in the prediction of the input vectors.
  • the minimization of error is performed using known optimization procedures, for example gradient descent of the prediction error function to adjust the weights of the NBNM 104.
  • a known gradient descent algorithm such as the back propagation algorithm is used for minimizing the prediction error function by adjusting the weights.
  • an evolutionary method is used for optimization to learn NBNMs: The evolutionarily-inspired operations of selection, crossover and mutation are used to evolve NBNMs in order to maximize a "fitness" function, which is defined to be inversely related to the overall prediction error.
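The selection/crossover/mutation loop can be sketched on a toy single-weight model; the fitness function 1/(1 + error) and all numeric settings below are assumptions made for illustration:

```python
import random

rng = random.Random(0)

def fitness(w, data):
    # fitness inversely related to the overall (squared) prediction error
    err = sum((w * x - y) ** 2 for x, y in data)
    return 1.0 / (1.0 + err)

def evolve(data, pop_size=20, gens=50):
    pop = [rng.uniform(-2, 2) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda w: fitness(w, data), reverse=True)
        parents = pop[: pop_size // 2]                      # selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0                           # crossover
            child += rng.gauss(0, 0.1)                      # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, data))

data = [(x, 1.5 * x) for x in (1.0, 2.0, 3.0)]              # target weight 1.5
best = evolve(data)   # should approach 1.5 as fitness is maximized
```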
  • the supervised learning module 108 is configured to perform classification or regression, i.e., map an input vector to an output vector along with its associated probability if a user action or output label is available for each input.
  • the supervised learning module 108 is preferably a Supervised Neuro Bayesian Network (S-NBN) that receives as input from the H-NBN, a pooled LHV (P_LHV) which is obtained by concatenating all the LHVs of the H-NBN into a single vector. Subsequently, the S-NBN is trained to predict the output label based at least on the labeled input training data.
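The pooling step, together with a toy classification readout, can be sketched as follows; the softmax readout is an assumed stand-in for the S-NBN's output layer, not the disclosed network:

```python
import numpy as np

def pool_lhv(lhvs):
    """Concatenate all LHVs of the H-NBN into a single pooled vector (P_LHV)."""
    return np.concatenate(lhvs)

def s_nbn_readout(p_lhv, W):
    """Illustrative supervised readout: softmax over class scores."""
    z = W @ p_lhv
    e = np.exp(z - z.max())
    return e / e.sum()

# pool three layer-wise LHVs of sizes 2, 1, and 3 into one 6-entry P_LHV
p_lhv = pool_lhv([np.array([0.2, 0.8]), np.array([0.5]), np.array([0.1, 0.9, 0.3])])
probs = s_nbn_readout(p_lhv, np.ones((2, p_lhv.size)))   # two-class toy readout
```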
  • the input vectors are initially utilized to learn a hierarchical model of the inputs (input training data) using the unsupervised learning module (unsupervised H-NBN).
  • the input training data are merged into a single Input Spatio-Temporal Vector (I_STV).
  • the Input Spatio-Temporal Vector (I_STV) is provided as an input to a plurality of Neuro-Bayesian Network modules constituting the hierarchical Neuro-Bayesian network.
  • each of the Neuro-Bayesian Network modules produces a Latent Hidden Vector on the basis of the I_STV, by implementing the layered nonlinear transformations described above. Further, the Latent Hidden Vectors (LHVs) obtained by each of the Neuro-Bayesian Network modules are pooled together (combined) to form a pooled Latent Hidden Vector (P_LHV).
  • Thereafter, the supervised learning module (S-NBN) is trained to map the pooled Latent Hidden Vector from the H-NBN to an appropriate output label using a known optimization method, for example back propagation, so as to minimize the errors between the output of the S-NBN and the output label. In this case, the back propagation technique is used to adjust the weights of all the layers of the S-NBN to minimize the output error.
  • back propagation technique is used to not only train the S-NBN but also adjust the weights of the H-NBN in order to further decrease the output error and increase the accuracy of the overall Neuro-Bayesian network.
  • the engine 100 for Neuro-Bayesian Learning is designed to perform hierarchical planning, simulation (imagination), and reinforcement-based action learning.
  • actions are performed in response to an input.
  • rewards, costs, and penalties associated with each of the input states and actions are generated, in addition to receiving user feedback.
  • the engine 100 is subsequently used to learn and simulate the user application/system to plan future actions that optimize the expected costs and rewards.
  • the process of planning and optimization preferably involves at least the following steps:
  • an H-NBN (104) is first learnt from the 'input training data' obtained from a user application 102, which has been used to perform a variety of actions; with the resultant input vector and rewards/costs for each action over a predetermined period of time, being recorded for further analysis.
  • the trained H-NBN is used as the engine's (100) internal model of the user application (102), to simulate different "what-if" scenarios for planning:
  • the procedure is as follows: i. Start the H-NBN from a given application state; ii. Apply the first "what-if" action to obtain a resulting sensory state and associated reward/cost, based on sampling the next states according to their probabilities in the H-NBNM; iii. Repeat step (ii) for the next "what-if" action from the current sampled state to get a next state and associated reward/cost; iv. The steps described above can also be used for planning a sequence of actions.
  • the procedure explained herein can be used to generate a hierarchical plan that spans multiple spatial/temporal scales.
  • the trained H-NBN (104) is used by the reinforcement learning module (106) to learn optimal actions that maximize the total expected future reward. This is done by concatenating all the LHVs of the hierarchical NBN into a single pooled LHV vector (P_LHV) and feeding this P_LHV as an input to another NBN termed the Value-NBN (V-NBN), which is communicably coupled to the reinforcement learning module (106). Given any P_LHV as input, the V-NBN outputs the total expected future reward (also known as "value") for the current application state as represented by the current P_LHV.
  • the V-NBN is trained using the well-known Temporal Difference (TD) learning algorithm, which is typically used for learning value functions. Further, the well-known optimization algorithm, i.e. back propagation, is used to update the V-NBN weights. Training data for the V-NBN is preferably obtained in two ways:
  • the input training data obtained from the application, which includes sensory inputs, actions, as well as rewards/costs, can be used to update the weights of the V-NBN.
  • an instance of the trained H-NBN is used to predict a sensory input and reward/cost for each action recommended by the V-NBN, and the reward/cost combination thus received is used to update the weights of the V-NBN.
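A tabular TD(0) update illustrates the value-learning rule; this is a stand-in for the V-NBN weight update (which the disclosure performs with back propagation over a network), with learning rate and discount chosen arbitrarily:

```python
def td_update(values, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: V(s) += alpha * (r + gamma * V(s') - V(s))."""
    td_error = r + gamma * values[s_next] - values[s]
    values[s] += alpha * td_error
    return values

V = {"s0": 0.0, "s1": 0.0}
for _ in range(100):
    td_update(V, "s0", 1.0, "s1")   # repeatedly observe reward 1 from s0
# V["s0"] converges toward 1.0 (the expected reward), V["s1"] stays 0
```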
  • Optimal actions are also learnt using another type of NBN termed the policy-NBN (P-NBN), communicably coupled to the reinforcement learning module (106).
  • the P-NBN takes as its input the P_LHV for the current user application state and outputs the best action for that state, along with the corresponding probabilities.
  • the P-NBN learns the appropriate actions and probabilities by: i. Generating a current set of possible actions and their probabilities;
  • planning is achieved by the simulation module (110) by starting the H-NBN from a given user application state and applying an action either randomly, or as given by the P-NBN, or a combination of these two strategies. The resulting sensory state is then evaluated according to the value generated by the V-NBN, and the process is repeated.
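The "what-if" planning loop can be sketched with a toy deterministic internal model (the real H-NBN would sample successor states by probability; the state/action/reward table below is entirely hypothetical):

```python
from itertools import product

# toy internal model: state -> action -> (next_state, reward)
MODEL = {
    "s0": {"left": ("s1", 0.0), "right": ("s2", 1.0)},
    "s1": {"left": ("s0", 0.0), "right": ("s0", 0.0)},
    "s2": {"left": ("s0", 1.0), "right": ("s0", 0.0)},
}

def rollout(state, actions):
    """Simulate a 'what-if' action sequence and accumulate reward."""
    total = 0.0
    for a in actions:
        state, r = MODEL[state][a]
        total += r
    return total

def plan(state, horizon=2):
    """Pick the action sequence with maximal simulated reward."""
    best, best_r = None, float("-inf")
    for seq in product(("left", "right"), repeat=horizon):
        r = rollout(state, seq)
        if r > best_r:
            best, best_r = seq, r
    return best, best_r

seq, reward = plan("s0")   # best plan: go right (+1), then left back (+1)
```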
  • the H-NBN is implemented as a Distributed AI (DAI) system in applications such as IoT to achieve a trade-off between edge computing and cloud computing.
  • the first K levels of the H-NBN are implemented on local devices as part of the application's edge computing, while the NBNMs in levels K+1 and beyond are implemented via cloud computing.
  • the prediction P_STV from the level K+1 NBNM is conveyed from the cloud to local devices across the edge.
  • the local devices compute the error (I_STV - P_STV) between this prediction and the output LHV of the NBNM at level K (which forms the I_STV for level K+1).
  • a criterion can be selected such that the edge devices only need to communicate the error (I_STV - P_STV) to the K+1 level NBNM on the cloud when the criterion is met, for example, when the magnitude of the error becomes larger than a threshold.
  • the distributed AI system enables fast prediction at short time scales on the edge devices while prediction at a longer time scale occurs in the cloud.
  • the distributed AI system enables "Hazy Edge Computing" in which the edge level K referred to above can be changed on-the-fly depending on currently available communication bandwidth.
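The edge-side gating criterion can be sketched as follows; the L2 magnitude and the threshold value are the example criterion from the disclosure, while the function itself is a hypothetical helper:

```python
import math

def maybe_send_to_cloud(i_stv, pr_stv, threshold=0.5):
    """Edge-side check: return the error (I_STV - Pr_STV) to transmit
    to the level-K+1 NBNM in the cloud only when its magnitude exceeds
    the threshold; otherwise return None (nothing is sent)."""
    err = [a - p for a, p in zip(i_stv, pr_stv)]
    magnitude = math.sqrt(sum(e * e for e in err))
    return err if magnitude > threshold else None

assert maybe_send_to_cloud([1.0, 1.0], [0.9, 1.1]) is None      # good prediction
assert maybe_send_to_cloud([1.0, 1.0], [0.0, 0.0]) is not None  # surprise: send
```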
  • the planning, imagination, and simulation module 110 employs the H-NBN to explore the consequences of actions by predicting multiple future input trajectories, outcomes, and expected rewards/costs.
  • FIG.2A and FIG. 2B there is shown a flowchart illustrating the steps involved in the computer-implemented method for implementing a Neuro-Bayesian Learning engine (eN-BLe).
  • the execution of the (computer-implemented) method begins at step 200 wherein a hierarchical Neuro-Bayesian Network module (H-NBNM) receives 'input training data' from a predetermined user application.
  • the input training data is a time-sensitive sequence of input vectors.
  • Examples of input training data include but are not restricted to sensor measurements from Internet of Things (IoT) applications, video streams and images from cameras and video recorders, speech related information, audio information from microphones and audio recorders, textual data, time-series data, and unstructured/semi-structured data.
  • the input data preferably includes user feedback (at any point in time) as well as any application-specific costs, rewards or penalties incurred as a result of the input states and actions taken by the user application.
  • Other inputs to the Neuro-Bayesian Learning engine include an output label for each input or a user action for each input.
  • the hierarchical Neuro-Bayesian Network module merges the input vectors (received from the user application as input training data) into a single Input Spatio-Temporal Vector.
  • the hierarchical Neuro-Bayesian Network module transforms the single Input Spatio-Temporal Vector (I_STV) into a Latent Hidden Vector (LHV).
  • the time-sequence/time-frame corresponding to both the I_STV and the LHV is 't', which is also termed the 'current time frame'.
  • the hierarchical Neuro-Bayesian Network module transforms the I_STV into the LHV by using one or more layers of nonlinear transformations represented by the equation: LHV = f_N(f_{N-1}(...f_1(I_STV)...)). As an example, each nonlinear transformation f_i may be defined by a nonlinear neural network, f_i(X) = h(W·X), wherein X is an input vector, W is a weight matrix of positive or negative values, and
  • h is an activation function such as the sigmoid or rectified linear unit (ReLU) function.
  • the Latent Hidden Vector (LHV) obtained at time 't' is fed as an input to a preset number ('K') of Recurrent Networks (RNs).
  • Each of the recurrent networks incorporates a plurality of layers, with each layer being assigned a learnable weight.
  • An example of a RN is the Long Short Term Memory (LSTM) network, but it is possible that other RNs or time-series prediction models are used.
  • the weights of the 'K' RNs are trained on the Latent Hidden Vector (LHV) in order to generate 'K' output vectors, given by LHV_1, LHV_2, ..., LHV_K, which are predictions of the possible LHVs for the next (future) time step 't+1', and the corresponding probabilities p_1, p_2, ..., p_K.
  • the NBNM selects one of the LHVs amongst LHV_1, LHV_2, ..., LHV_K by sampling them based on their probabilities, and codifies the selected LHV as the Predicted LHV (Pr_LHV) for the future time step 't+1'.
  • Subsequently, the Pr_LHV is further transformed by the Neuro-Bayesian Network module via multiple layers of nonlinear transformations, similar to the nonlinear transformations applied to the LHV above, to generate a Predicted Spatio-Temporal Vector for the future time step t+1.
  • the Latent Hidden Vectors (LHVs) corresponding to the future time frame (t+1) are processed by a supervised learning module, and each of the Latent Hidden Vectors are mapped to respective output labels relevant to the future time frame.
  • the Latent Hidden Vectors (LHVs) are processed by a reinforcement learning module which in turn maps the Latent Hidden Vectors to rewards and optimal actions expected at the future time frame (t+1).
  • a simulation module receives predetermined states of the user application as inputs, and subsequently performs a 'what-if' analysis on the said predetermined states, by applying a plurality of sequences of actions (action sequences) onto the Neuro-Bayesian Network module and the reinforcement learning module.
  • the result of the 'what-if' analysis is preferably the identification of 'expected application states' and corresponding expected rewards.
  • the simulation module selects at least one sequence of actions determined to generate maximal expected rewards, and subsequently creates a plan to achieve an application state which is either determined to be a goal state or is determined to be associated with maximal expected rewards.
  • the computer-implemented method further includes the following steps:
  • the technical advantages envisaged by the present disclosure include the realization of a general-purpose processor architecture suited for artificial general intelligence operations and distributed implementation.
  • the present disclosure provides a Neuro Bayesian Network (NBN) based processor designed to perform neural probabilistic and Bayesian inference, prediction, memory, reasoning, unsupervised learning, supervised learning, reinforcement learning, planning and decision making, imagination and simulation.
  • the present disclosure discloses an engine for Neuro Bayesian Learning (eN-BLe) that can be used in any application where a time-series of input vectors is generated or where a sequence of inputs (e.g., text, images and the like) is generated but can be converted to a vector of real values.
  • the application may also offer possible actions the user can take to influence the application and include rewards/costs/penalties associated with inputs or actions.
  • the eN-BLe framework can be utilized for any such application that can benefit from automation and AI, including but not limited to:
  • Internet of Things (IoT) applications, where the eN-BLe framework can be used for predictive maintenance, anomaly detection, adaptive security, operations optimization, and the like;
  • Streaming video or other types of image data, audio/speech, and text processing for prediction, anomaly detection, classification/interpretation, robotics applications, driverless cars, and the like;
  • Online user behavior data for user modeling, prediction, optimized advertising, and the like; Learning predictive models of sales data and other business related data for customized marketing campaigns, advertising, inventory management, and the like;
  • the present disclosure provides a flexible and reconfigurable Distributed AI (DAI) system that enables fast prediction and probabilistic inference at comparatively shorter time scales on edge devices and longer time-scale predictions and inference in the cloud.


Abstract

The present disclosure envisages a processor architecture designed for artificial general intelligence operations. The engine for Neuro-Bayesian learning (eN-BLe) includes a hierarchical Neuro Bayesian Network module, a reinforcement learning module, a supervised learning module, and a planning, imagination, and simulation module, for planning, imagination, and decision making under uncertainty. The engine for Neuro-Bayesian learning is communicably coupled to a user application and receives input data from the user application. The hierarchical Neuro-Bayesian Network (H-NBN) acts as a probabilistic internal model of an application or unknown environment. The H-NBN is capable of probabilistic and Bayesian inference, prediction, and unsupervised learning. Thereafter, the outputs of the H-NBN are provided to supervised NBNs for classification or regression of input states. Additionally, the output of the H-NBN is provided to the reinforcement learning module, which in turn comprises Value-NBNs (V-NBNs) and Policy-NBNs (P-NBNs), to compute expected reward and select optimal actions under uncertainty.

Description

NEURO-BAYESIAN ARCHITECTURE FOR IMPLEMENTING ARTIFICIAL GENERAL INTELLIGENCE
CROSS-REFERENCE TO RELATED APPLICATIONS
[001] The claims disclosed herein benefit from the priority associated with US Provisional Patent Application No. 62/534,040, filed on July 18, 2017 with the title "NEURO-BAYESIAN ARCHITECTURE FOR IMPLEMENTING ARTIFICIAL GENERAL INTELLIGENCE", the contents of which are incorporated herein by way of reference.
BACKGROUND
Technical field
[002] The present disclosure relates to artificial intelligence. Particularly, the present disclosure relates to a processor specifically configured for implementing artificial general intelligence operations. More particularly, the present disclosure relates to a processor that combines neural probabilistic inference, multi-scale prediction, planning, and imagination with unsupervised learning, supervised learning, and reinforcement learning, while allowing a distributed implementation across edge computing and cloud computing frameworks.
Description of Related Art
[003] Artificial Intelligence (AI) is used in various computer-implemented applications including but not restricted to gaming, natural language processing, self-driven cars, face detection, speech recognition in personal assistants, handwriting recognition, and robotics. A computer-controlled robot, for instance, achieves AI through learning, reasoning, perception, problem-solving and linguistic intelligence. Advancements in AI in recent years have led to remarkable demonstrations of machines reaching near-human capabilities in solving the challenges/problems associated with a variety of technical as well as non-technical domains.
[004] Most of the existing AI based systems employ "deep" neural networks and back propagation learning techniques to determine an output in response to an input. These systems are geared towards supervised learning and may therefore be rendered ineffective when there arises a need to deduce inferences backed by appropriate reasoning despite the presence of uncertainty. Furthermore, existing AI based systems do not incorporate processor architectures which are specifically designed to implement Artificial General Intelligence (AGI) programs that in turn can simultaneously perform neural probabilistic inference, prediction, reasoning, planning, imagination, and multiple types of learning for noisy time-varying inputs, amongst other phenomena.
[005] Therefore, in order to overcome the drawbacks discussed hitherto, there was felt a need for a processor architecture that has been specifically designed to effectively implement Artificial General Intelligence (AGI) operations, even in uncertain and noisy real-world applications. Further, there was also felt a need for a processor architecture designed to perform multiple forms of AGI operations combining neural and Bayesian inference and learning, including unsupervised learning, supervised learning, and reinforcement-based action learning, with the capability of being implemented seamlessly in a distributed manner across edge computing devices and cloud computing infrastructures.
OBJECTS
[006] An object of the present disclosure is to provide a Neuro-Bayesian Architecture suited for Artificial General Intelligence (AGI) operations.
[007] Another object of the present disclosure is to provide a processor architecture suited for implementation of probabilistic inference and supervised as well as unsupervised learning principles.
[008] Yet another object of the present disclosure is to provide a processor architecture optimized for implementing reinforcement-based action learning operations using a value NBN and a policy NBN.
[009] One more object of the present disclosure is to provide a processor architecture that optimizes reinforcement learning and implements actions derived from optimized reinforcement learning.
[0010] Still a further object of the present disclosure is to provide a processor architecture that efficiently addresses the issue of optimizing expected costs and rewards by efficiently and effectively planning future actions.
[0011] One more object of the present disclosure is to provide a hierarchical architecture that enables probabilistic predictions of future inputs, and facilitates planning, simulation and imagination of multiple possible input trajectories at multiple spatial and temporal scales.
[0012] Still a further object of the present disclosure is to provide a processor architecture that enables discovery of hierarchical hidden states from input data.
SUMMARY
[0013] The present disclosure envisages a processor architecture suited for implementing Artificial General Intelligence operations. The processor architecture is built on Neuro Bayesian Networks (NBNs) and is also suited for neural probabilistic inference, reasoning, planning, simulation, imagination, and various forms of learning. The processor architecture is designed to perform unsupervised learning, supervised learning, reinforcement learning, and planning amongst others.
[0014] The processor architecture includes a Neuro-Bayesian learning engine (eN-BLe). The engine for Neuro-Bayesian learning (referred to as the 'engine' hereafter) further includes at least one hierarchical Neuro Bayesian Network module for unsupervised learning, Bayesian inference and memory. Preferably, the conglomeration of a plurality of Hierarchical Neuro-Bayesian Network modules results in the formation of a layered, hierarchical Neuro-Bayesian Network, with each constituent Neuro-Bayesian Network module being considered as a layer of the hierarchical Neuro-Bayesian Network. The engine envisaged by the present disclosure further includes a reinforcement learning module (preferably used for Bayesian decision making), a supervised learning module, and a planning and simulation module. The engine for Neuro-Bayesian learning (eN-BLe) is communicably coupled to a user application, from which it (engine) receives input training data.
[0015] The hierarchical Neuro-Bayesian Network (H-NBN) acts as a probabilistic internal model of an application or an unknown environment through unsupervised learning. The H-NBN is endowed with the ability to perform probabilistic inferences and hierarchical predictions. Typically, the outputs from the H-NBN are provided to the supervised learning module which uses supervised NBNs to classify input states. Subsequently, the outputs of the supervised learning module are provided to the reinforcement learning module comprising Value-NBNs (V-NBNs) and Policy-NBNs (P-NBNs), to facilitate computation of expected rewards and further to facilitate selection of optimal actions preferably leading to maximized rewards despite the presence of uncertainty.
[0016] The processor architecture based on Neuro-Bayesian Networks, as disclosed by the present disclosure, therefore implements artificial general intelligence operations, allowing a machine or application to: (a) learn a hierarchical probabilistic model of an unknown dynamic environment or application in an unsupervised manner based on time series data from possibly multimodal sensors; (b) make probabilistic predictions of future sensor inputs and multiple possible input trajectories at multiple spatial and temporal scales; (c) detect anomalies at multiple spatial/temporal scales; (d) perform classification of sensor states using supervised learning; (e) perform hierarchical planning, simulation, and imagination of future scenarios at multiple temporal scales; and (f) select optimal actions that maximize total expected reward and minimize expected costs.
[0017] In accordance with the present disclosure, the Neuro-Bayesian Network Module (NBNM) is configured to receive input data at the current time 't' as a sequence of input vectors from the time duration 't-T' to 't', from a pre-determined user application. These input vectors are merged into a single Spatio-Temporal Vector (STV). The NBNM transforms each input STV (I_STV) at the current time 't' into a Latent Hidden Vector (LHV) through multiple layers of nonlinear transformations. A plurality of NBNMs are recursively connected over space and time to form a spatio-temporal hierarchical NBN or spatio-temporal H-NBN. The upper layers of the H-NBN are preferably designed to retain extensive memories of events and predict longer future input sequences as compared to lower layers of the H-NBN.
[0018] In accordance with the present disclosure, the engine for neuro-Bayesian learning (eN-BLe) is designed to perform unsupervised learning in circumstances where an output label is unavailable for each corresponding input. The eN-BLe includes an unsupervised learning module that is configured to perform learning of a hierarchical predictive model for any given sequence of input vectors. Further, the eN-BLe includes a supervised learning module that has been configured to map an input vector to an output vector along with the associated probability; the probability being a measure of the likelihood that a given input vector results in a given output vector, if a user action or an output label is available for each input.
[0019] In accordance with the present disclosure, the input vectors are initially utilized to learn a hierarchical model of the inputs using the unsupervised learning module (unsupervised H-NBN). Thereafter, the supervised learning module (supervised S-NBN) is trained to map a pooled Latent Hidden Vector (P_LHV) from the unsupervised H-NBN to an appropriate output label to minimize the prediction error between the output of the S-NBN and the training output label.
[0020] In accordance with the present disclosure, the engine for neuro-Bayesian learning (eN-BLe) is also designed to perform reinforcement-based action learning, execution, simulation, and planning. In typical user application scenarios, actions are performed in response to an input. Thereafter, rewards, costs, and penalties associated with each of the input states and actions are generated, in addition to receiving user feedback. The eN-BLe is subsequently used to learn and simulate the user application/system to plan future actions that optimize the expected costs and rewards.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
[0021] The other objects, features, and advantages will be apparent to those skilled in the art from the following detailed description and the accompanying drawings in which:
[0022] FIG. 1 is a block diagram illustrating a Neuro Bayesian Network based processor architecture, in accordance with the present disclosure; and
[0023] FIG. 2A and FIG. 2B in combination illustrate, in the form of a flowchart, the steps involved in a computer-implemented method for implementing a Neuro-Bayesian Learning engine (eN-BLe), in accordance with the present disclosure.
[0024] Certain features of the present disclosure are shown in some drawings and not in others. This has been done only for convenience as each feature may be combined with any or all of the other features, in accordance with the present disclosure.
DETAILED DESCRIPTION
[0025] In order to overcome the drawbacks discussed in the 'background' section, the present disclosure envisages a processor architecture specifically designed to implement Artificial General Intelligence (AGI) operations. The processor architecture, as envisaged by the present disclosure, includes an engine for Neuro-Bayesian learning (eN-BLe). The engine for Neuro-Bayesian learning (eN-BLe) has the following components: (1) a Hierarchical Neuro Bayesian Network module designed to implement probabilistic inference, prediction, memory, and unsupervised learning inter alia, (2) a reinforcement learning module for Bayesian decision making and action selection, (3) a supervised learning module, and (4) a planning, imagination, and simulation module.
[0026] The Hierarchical Neuro-Bayesian Network (H-NBN) acts as a hierarchical probabilistic internal model of an application or unknown environment whose inputs are the inputs from a user application and whose outputs are described in more detail below. The H-NBN can perform probabilistic inference of hidden states of the user application, predict future states, and learn hierarchical representations of the user application. Thereafter, the outputs of the H-NBN are provided to the supervised learning module which uses supervised NBNs to classify input states. Alternately, the outputs of the NBNs are provided to the reinforcement learning module comprising Value-NBNs (V-NBNs) and Policy-NBNs (P-NBNs) to compute expected rewards/costs and select optimal actions under uncertainty.
[0027] The processor architecture envisaged by the present disclosure is used to perform Artificial General Intelligence operations including but not restricted to (a) learning a hierarchical probabilistic model of an unknown dynamic environment or application based on time series data from multimodal sensors, (b) deriving probabilistic predictions of future sensor inputs and multiple possible input trajectories at multiple spatial and temporal scales, (c) detecting anomalies at multiple spatial/temporal scales, (d) performing classification of sensor states using supervised learning, (e) performing hierarchical planning, simulation, and imagination of future scenarios at multiple temporal scales, and (f) selecting optimal actions that maximize total expected reward and minimize expected costs.
[0028] Referring to FIG. 1, the processor architecture described therein includes an engine for Neuro-Bayesian learning (eN-BLe) 100. The engine 100 for Neuro-Bayesian learning (eN-BLe) further includes a hierarchical Neuro Bayesian Network Module 104 (H-NBNM), a reinforcement learning module 106, a supervised learning module 108, and a planning, imagination and simulation module 110 (referred to as 'simulation module 110' hereafter, for the sake of brevity). Preferably, the engine 100 is communicably coupled to a user application 102 from which the 'input training data' and 'application data' are received as inputs, preferably during a test implementation. Preferably, the input training data received by the engine 100 can be time-series data comprising multivariate input values.
[0029] Examples of input data include but are not restricted to sensor measurements from Internet of Things (IoT) applications, video streams from cameras and video recorders, speech related information, audio information from microphones and audio recorders, textual data, time-series data, and unstructured/semi-structured data. The input data preferably includes user feedback (at any point in time) as well as any application-specific costs, rewards or penalties incurred as a result of the input states and actions taken by the user application 102 communicably coupled to the engine 100. Other inputs to the engine 100 include an output label for each input or a user action for each input.
[0030] In accordance with the present disclosure, the hierarchical Neuro-Bayesian Network Module (NBNM) 104, during implementation of a training phase, is configured to receive the input training data either in the form of a time-series or as a sequence of input vectors. Preferably, the sequence of input vectors V(t-T), V(t-T+1), ..., V(t) for the time duration 't-T' to the current time 't' is merged into a single Input Spatio-Temporal Vector for the time 't', I_STV(t), using the below mentioned equation:

I_STV(t) = [V(t-T), V(t-T+1), ..., V(t)]
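As an illustrative sketch of this merging step (the function and variable names here are ours, not the patent's), a window of T+1 input vectors can be concatenated into a single I_STV:

```python
import numpy as np

def merge_to_stv(vectors):
    """Concatenate a time-ordered list of input vectors V(t-T)..V(t)
    into a single Input Spatio-Temporal Vector I_STV(t)."""
    return np.concatenate(vectors)

# Three 4-dimensional input vectors covering time steps t-2, t-1, t.
frames = [np.arange(4, dtype=float) + k for k in range(3)]
i_stv = merge_to_stv(frames)   # 12-dimensional I_STV
```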
Preferably, the Input Spatio-Temporal Vector (I_STV) thus calculated is fed as an input to the NBNM 104. The NBNM 104 preferably transforms the single Input Spatio-Temporal Vector (I_STV) at time 't' into a Latent Hidden Vector (LHV) through one or more layers of nonlinear transformations represented by the below mentioned equation:

LHV = f_N(... f_2(f_1(I_STV)) ...)

As an example, each nonlinear transformation f_i may be defined by a nonlinear neural network, i.e.,

f_i(X) = h(W X)

wherein X is an input vector, W is a weight matrix of positive or negative values, and h is an activation function such as the sigmoid or rectified linear unit (ReLU) function.
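A minimal sketch of the layered transformation just described, assuming ReLU as the activation h and randomly initialized weight matrices (all names and sizes are illustrative, not the patent's implementation):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def nonlinear_layer(x, w):
    # One transformation f_i(X) = h(W X) with h = ReLU.
    return relu(w @ x)

def encode_lhv(i_stv, weight_mats):
    # Compose the layers: LHV = f_N(... f_2(f_1(I_STV)) ...).
    z = i_stv
    for w in weight_mats:
        z = nonlinear_layer(z, w)
    return z

rng = np.random.default_rng(0)
i_stv = rng.normal(size=12)
weights = [rng.normal(size=(8, 12)), rng.normal(size=(5, 8))]
lhv = encode_lhv(i_stv, weights)   # 5-dimensional latent hidden vector
```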
Further, the Latent Hidden Vector (LHV) obtained at time 't' is fed as an input to a preset number ('K') of Recurrent Networks (RNs). An example of an RN that may be used for this purpose is the Long Short Term Memory (LSTM) network, but other RNs or time-series prediction models may also be used. The weights of the 'K' RNs are trained on the Latent Hidden Vector LHV(t) in order to generate K output vectors, given by LHV1, LHV2, ..., LHVK, which are predictions of the possible LHVs for the next (future) time step 't+1', and the corresponding probabilities p1, p2, ..., pK.

During the training phase, only the RN whose prediction - the combination of the Latent Hidden Vector (LHV) and the corresponding probability (p) - is closest to the input training data is selected; its weights are adjusted to better predict the input and its probability is increased. Preferably, the NBNM 104 selects one of the LHVs amongst LHV1, LHV2, ..., LHVK by sampling them based on their probabilities. The selected LHV is regarded as the Predicted LHV (Pr_LHV) for the future time step 't+1'. The Pr_LHV is further transformed via multiple layers of nonlinear transformations, similar to the nonlinear transformations applied to the LHV above, to generate a Predicted Spatio-Temporal Vector (Pr_STV) for the future time step 't+1':

Pr_STV(t+1) = g_N(... g_2(g_1(Pr_LHV)) ...)
As an example, the g functions can be implemented using nonlinear neural networks similar to the LHV case described above. The Predicted Spatio-Temporal Vector Pr_STV(t+1) is subsequently unrolled to produce a sequence of predicted input vectors Pr_V for the time steps 't+1-T' to 't+1', using the below mentioned formula:

[Pr_V(t+1-T), Pr_V(t+2-T), ..., Pr_V(t+1)] = Pr_STV(t+1)

It is pertinent to note that the NBNM 104 can also be trained as described above to predict input vectors (Pr_V) for time steps further into the future than time step 't+1'.
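The prediction step described above, in which one of the K candidate LHVs is sampled according to its probability and kept as Pr_LHV, can be sketched as follows (the candidate values and names are toy stand-ins, not trained RN outputs):

```python
import numpy as np

def sample_pr_lhv(candidates, probs, rng):
    """Select one predicted LHV among LHV_1..LHV_K by sampling
    according to the probabilities p_1..p_K."""
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()          # normalise to a distribution
    k = rng.choice(len(candidates), p=probs)
    return candidates[k]

rng = np.random.default_rng(42)
# Stand-ins for the outputs of K = 3 trained recurrent networks.
lhv_candidates = [np.full(5, 1.0), np.full(5, 2.0), np.full(5, 3.0)]
pr_lhv = sample_pr_lhv(lhv_candidates, [0.2, 0.5, 0.3], rng)
```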
[0031] In accordance with the present disclosure, the Neuro-Bayesian Network Module (NBNM) 104 is configured to generate a prediction error vector Pr_E(t), which is a function of the Input Spatio-Temporal Vector and the Predicted Spatio-Temporal Vector, for example:

Pr_E(t) = I_STV(t) - Pr_STV(t)

The prediction error vector is typically used to determine the occurrence of anomalies. Preferably, an anomaly is notified if a function of the prediction error vector Pr_E(t) satisfies a predetermined condition, for instance, if the length of the error vector exceeds a threshold. Depending on the spatial and temporal extent of the vectors I_STV and Pr_STV, this procedure allows the detection of anomalies at multiple spatial and temporal scales. Preferably, in one version of the NBNM 104, termed the predictive coding NBNM, the current prediction error vector can be given as input to the NBNM 104 rather than the current input vector, to train the NBNM 104 and generate the next estimate for the LHV.
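A hedged sketch of the threshold-based anomaly test described above (the threshold value and all names are illustrative assumptions):

```python
import numpy as np

def is_anomaly(i_stv, pr_stv, threshold):
    """Flag an anomaly when the length (L2 norm) of the prediction
    error vector Pr_E = I_STV - Pr_STV exceeds a preset threshold."""
    pr_e = i_stv - pr_stv
    return bool(np.linalg.norm(pr_e) > threshold)

observed  = np.array([1.0, 2.0, 3.0])
predicted = np.array([1.1, 2.0, 2.9])   # close prediction
```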
[0032] In accordance with the present disclosure, the engine 100 for Neuro-Bayesian learning (eN-BLe) further includes a plurality of NBNMs 104 recursively connected over space and time to form a spatio-temporal hierarchical NBN or H-NBN (not shown in figures). In order to obtain a spatio-temporal hierarchy, the LHVs from multiple spatial windows (e.g., in an image), each for a time window spanning the current time step 't' and 'T' past time steps (t-1, t-2, ..., t-T), are merged to form a single I_STV which is further fed as an input to a higher level NBNM. The LHV of the higher level NBNM learns properties of the I_STV ranging across multiple spatial windows and time periods. The process of feeding the I_STV to higher level NBNMs is repeated depending on the desired levels of NBNMs. Subsequently, a spatio-temporal hierarchical NBN is obtained whose LHVs incorporate spatial properties over progressively larger spatial windows and temporal properties over progressively longer durations of time. In this manner, the upper layers of Neuro Bayesian networks in the H-NBN retain more extensive memories of events to predict the future as compared to lower layers of the Neuro Bayesian networks in the H-NBN.
[0033] In accordance with the present disclosure, each NBNM is implemented using a computational model of neuronal networks. Examples of computational models of neuronal networks include but are not limited to leaky integrator neurons, Hodgkin-Huxley neurons, sigma-pi neurons, compartmental neurons as well as the more commonly used traditional neuronal units (computing weighted sum of inputs with a nonlinear output).
[0034] In accordance with the present disclosure, the engine 100 for Neuro-Bayesian Learning (eN-BLe) is designed to perform unsupervised learning in instances when an output label is unavailable for each input. The engine 100 is configured to perform unsupervised learning of a hierarchical predictive model for a series of input vectors. In an exemplary scenario, when the NBNM 104 is designed from traditional neuronal units performing a weighted sum of the input vectors (values in each I_STV) followed by a nonlinearity (e.g., sigmoid), the weights of the NBNM 104 are learned in an unsupervised manner to minimize the error in the prediction of the input vectors. The minimization of error is performed using known optimization procedures, for example gradient descent of the prediction error function to adjust the weights of the NBNM 104. When stacked NBNMs are used to form an H-NBN as described earlier, a known gradient descent algorithm such as the back propagation algorithm is used for minimizing the prediction error function by adjusting the weights. In another exemplary scenario, an evolutionary method is used for optimization to learn NBNMs: the evolutionarily-inspired operations of selection, crossover and mutation are used to evolve NBNMs in order to maximize a "fitness" function, which is defined to be inversely related to the overall prediction error.
[0035] In accordance with the present disclosure, the supervised learning module 108 is configured to perform classification or regression, i.e., map an input vector to an output vector along with its associated probability, if a user action or output label is available for each input. The supervised learning module 108 is preferably a Supervised Neuro Bayesian Network (S-NBN) that receives as input from the H-NBN a pooled LHV (P_LHV), which is obtained by concatenating all the LHVs of the H-NBN into a single vector. Subsequently, the S-NBN is trained to predict the output label based at least on the labeled input training data.
[0036] In accordance with the present disclosure, the input vectors are initially utilized to learn a hierarchical model of the inputs (input training data) using the unsupervised learning module (unsupervised H-NBN). In this case, the input training data are merged into a single Input Spatio-Temporal Vector (I_STV). Further, the Input Spatio-Temporal Vector (I_STV) is provided as an input to a plurality of Neuro-Bayesian Network modules constituting the hierarchical Neuro-Bayesian network. Subsequently, each of the Neuro-Bayesian Network modules produces a Latent Hidden Vector on the basis of the I_STV, by implementing the equation LHV = f_N(... f_2(f_1(I_STV)) ...). Further, the Latent Hidden Vectors (LHVs) obtained by each of the Neuro-Bayesian Network modules are pooled together (combined) to form a pooled Latent Hidden Vector (P_LHV).
[0037] The supervised learning module 108 is trained to map the pooled P_LHV from the H-NBN to an appropriate output label using a known optimization method, for example back propagation, so as to minimize the errors between the output of the S-NBN and the output label. In this case, the back propagation technique is used to adjust the weights of all the layers of the S-NBN to minimize the output error, using the P_LHV as the standard input. In another possible implementation, the back propagation technique is used to not only train the S-NBN but also adjust the weights of the H-NBN in order to further decrease the output error and increase the accuracy of the overall Neuro-Bayesian network.
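The pooling and supervised mapping described above can be sketched as follows, with a toy softmax classifier standing in for the S-NBN (the weight matrix and all names are illustrative, not a trained network):

```python
import numpy as np

def pool_lhvs(lhvs):
    # Concatenate the LHVs of all H-NBN layers into one pooled vector.
    return np.concatenate(lhvs)

def softmax(z):
    z = z - z.max()            # numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(p_lhv, w):
    # S-NBN stand-in: a probability distribution over output labels.
    return softmax(w @ p_lhv)

lhvs = [np.array([0.5, -0.2]), np.array([1.0, 0.0, 0.3])]
p_lhv = pool_lhvs(lhvs)        # 5-dimensional P_LHV
w = np.eye(3, 5)               # toy weight matrix for 3 labels
label_probs = classify(p_lhv, w)
```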
[0038] In accordance with the present disclosure, the engine 100 for Neuro-Bayesian Learning (eN-BLe) is designed to perform hierarchical planning, simulation (imagination), and reinforcement-based action learning. In typical user application scenarios, actions are performed in response to an input. Thereafter, rewards, costs, and penalties associated with each of the input states and actions are generated, in addition to receiving user feedback. The engine 100 is subsequently used to learn and simulate the user application/system to plan future actions that optimize the expected costs and rewards. The process of planning and optimization preferably involves at least the following steps:
1. Using the principles of unsupervised learning as described above, an H-NBN 104 is first learnt from the 'input training data' obtained from a user application 102 which has been used to perform a variety of actions, with the resultant input vectors and rewards/costs for each action over a predetermined period of time being recorded for further analysis.
2. The trained H-NBN is used as the engine's 100 internal model of the user application 102, to simulate different "what-if" scenarios for planning. The procedure is as follows:
i. Start the NBN from a given application state;
ii. Apply the first "what-if" action to obtain a resulting sensory state and associated reward/cost, based on sampling the next states according to their probabilities in the H-NBNM;
iii. Repeat step (ii) for the next "what-if" action from the current sampled state to get a next state and associated reward/cost;
iv. The steps described above can also be used for planning a sequence of actions by starting from the current state, choosing an action that is likely to lead to a state with high expected reward as given by the H-NBN, simulating the action and getting the next state using the H-NBN, and repeating this procedure until a desired goal state is reached. Since the H-NBN is hierarchical in nature, the procedure explained herein can be used to generate a hierarchical plan that spans multiple spatial/temporal scales.
3. The trained H-NBN 104 is used by the reinforcement learning module 106 to learn optimal actions that maximize the total expected future reward. This is done by concatenating all the LHVs of the hierarchical NBN into a single pooled LHV vector (P_LHV) and feeding this P_LHV as an input to another NBN termed the Value-NBN (V-NBN), which is communicably coupled to the reinforcement learning module 106. Given any P_LHV as input, the V-NBN outputs the total expected future reward (also known as the "value") for the current application state as represented by the current P_LHV. The V-NBN is trained using the well-known Temporal Difference (TD) Learning algorithm, which is typically used for learning the value function. Further, the well-known optimization algorithm, i.e. back propagation, is used to update the V-NBN weights. Training data for the V-NBN is preferably obtained in two ways:
i. When the user application (102) is executed, the input training data obtained from the application, which includes sensory inputs, actions, as well as rewards/costs, can be used to update the weights of the V-NBN.
ii. When the user application (102) is not under execution, or alternately, in parallel to executing the user application (102), an instance of the trained H-NBN is used to predict a sensory input and reward/cost for each action recommended by the V-NBN, and the reward/cost combination thus received is used to update the weights of the V-NBN.
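The temporal-difference target used to train the V-NBN can be sketched in tabular form as follows; in the V-NBN itself, the same TD error would instead drive a backpropagation update of the network weights (the function name and dictionary-based state representation are illustrative assumptions):

```python
def td_update(value, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) update: move V(state) toward the bootstrap target
    reward + gamma * V(next_state).

    `value` is a dict mapping states to estimated values; `alpha` is the
    learning rate and `gamma` the discount factor. Returns the TD error,
    which is the quantity a network-based V-NBN would backpropagate."""
    target = reward + gamma * value.get(next_state, 0.0)
    td_error = target - value.get(state, 0.0)
    value[state] = value.get(state, 0.0) + alpha * td_error
    return td_error
```

Both data sources described above (real execution traces and H-NBN-simulated transitions) supply the same `(state, reward, next_state)` triples to this update rule.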
4. Optimal actions are also learnt using another type of NBN termed the policy-NBN (P-NBN), communicably coupled to the reinforcement learning module (106). The P-NBN takes as its input the P_LHV for the current user application state and outputs the best action for that state, along with the corresponding probabilities. Starting initially with all the possible actions (associated with the user application 102) being accorded equal probabilities for all the associated states, the P-NBN learns the appropriate actions and probabilities by: i. Generating a current set of possible actions and their probabilities;
ii. Either picking the action that will lead to a next state (based on the H-NBN) with the highest value (based on the output of the V-NBN), using that action as the output label for the P-NBN, and updating its probability; or executing a random action for exploration purposes. 5. Preferably, planning is achieved by the simulation module (110) by starting the H-NBN from a given user application state and applying an action either randomly, or as given by the P-NBN, or a combination of these two strategies. The resulting sensory state is then evaluated according to the value generated by the V-NBN, and the process is repeated.
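A minimal sketch of the action-selection rule in step (ii) above, with `next_state_fn` standing in for the H-NBN's next-state prediction and `value_fn` for the V-NBN's value output (all names and the epsilon-greedy exploration form are illustrative assumptions):

```python
import random

def select_action(state, actions, next_state_fn, value_fn,
                  epsilon=0.1, rng=random):
    """Pick the action whose predicted next state has the highest value,
    or explore with a random action with probability epsilon. The chosen
    action would then serve as the output label for a supervised update
    of the P-NBN."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # exploration branch
    return max(actions, key=lambda a: value_fn(next_state_fn(state, a)))
```

With `epsilon=0`, the rule is purely greedy with respect to the value estimate, matching the "pick the highest-value next state" branch above.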
[0039] In accordance with the present disclosure, the H-NBN is implemented as a Distributed AI (DAI) system in applications such as IoT to achieve a trade-off between edge computing and cloud computing. In the Distributed AI system, for an application-dependent value K, the first K levels of the H-NBN are implemented on local devices as part of the application's edge computing, while the NBNMs in levels K+1 and beyond are implemented via cloud computing. The prediction P_STV from the level K+1 NBNM is conveyed from the cloud to the local devices across the edge. The local devices compute the error (I_STV - P_STV) between this prediction and the output LHV of the NBNM at level K (which forms the I_STV for level K+1). Based on the communication bandwidth and internet connectivity of the edge devices, a criterion can be selected such that the edge devices only need to communicate the error (I_STV - P_STV) to the level K+1 NBNM on the cloud when the criterion is met, for example, when the magnitude of the error becomes larger than a threshold. The distributed AI system enables fast prediction at short time scales on the edge devices, while prediction at longer time scales occurs in the cloud. The distributed AI system also enables "Hazy Edge Computing", in which the edge level K referred to above can be changed on-the-fly depending on the currently available communication bandwidth. In accordance with the present disclosure, the planning, imagination, and simulation module 110 employs the H-NBN to explore the consequences of actions by predicting multiple future input trajectories, outcomes, and expected rewards/costs. [0040] Referring to FIG. 2A and FIG. 2B in combination, there is shown a flowchart illustrating the steps involved in the computer-implemented method for implementing a Neuro-Bayesian Learning engine (eN-BLe).
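The error-gated edge-to-cloud communication criterion described in paragraph [0039] can be sketched as follows; the function name and the choice of the Euclidean norm as the magnitude criterion are illustrative assumptions:

```python
import math

def maybe_send_error(i_stv, p_stv, threshold):
    """Edge-side gating: compute the prediction error (I_STV - P_STV)
    and return it only when its magnitude exceeds the threshold,
    i.e. when it must be transmitted to the level K+1 NBNM in the
    cloud. Returns None when transmission can be suppressed to save
    bandwidth."""
    error = [i - p for i, p in zip(i_stv, p_stv)]
    if math.sqrt(sum(e * e for e in error)) > threshold:
        return error   # would be sent to the cloud
    return None        # prediction good enough; stay silent
```

Raising the threshold (or the edge level K) when bandwidth is scarce is one way to realize the "Hazy Edge Computing" behavior described above.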
In accordance with the present disclosure, the execution of the (computer-implemented) method begins at step 200, wherein a hierarchical Neuro-Bayesian Network module (H-NBNM) receives 'input training data' from a predetermined user application. Preferably, the input training data is a time-sensitive sequence of input vectors. Examples of input training data include but are not restricted to sensor measurements from Internet of Things (IoT) applications, video streams and images from cameras and video recorders, speech-related information, audio information from microphones and audio recorders, textual data, time-series data, and unstructured/semi-structured data. Further, the input data preferably includes user feedback (at any point in time) as well as any application-specific costs, rewards or penalties incurred as a result of the input states and actions taken by the user application. Other inputs to the Neuro-Bayesian Learning engine (and in turn the H-NBNM) include an output label for each input or a user action for each input.
[0041] Further, at step 202, the hierarchical Neuro-Bayesian Network module merges the input vectors (received from the user application as input training data) into a single Input Spatio-Temporal Vector. Preferably, the sequence of input vectors V(t-T), V(t-T+1), ..., V(t) for the time duration 't-T' to the current time 't' are merged into a single Input Spatio-Temporal Vector (I_STV) for the time 't', by the hierarchical Neuro-Bayesian Network module, based on the equation: I_STV = [V(t-T) V(t-T+1) ... V(t)].
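The merging of step 202 amounts to concatenating the T+1 input vectors into one long vector; a minimal sketch (the function name is illustrative):

```python
def merge_to_i_stv(vectors):
    """Concatenate the input vectors V(t-T), ..., V(t) into a single
    Input Spatio-Temporal Vector: I_STV = [V(t-T) V(t-T+1) ... V(t)].
    Each element of `vectors` is one time step's input vector."""
    i_stv = []
    for v in vectors:
        i_stv.extend(v)
    return i_stv
```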
[0042] Further, at step 204, the hierarchical Neuro-Bayesian Network module transforms the single Input Spatio-Temporal Vector (I_STV) into a Latent Hidden Vector (LHV). Preferably, the time-sequence/time-frame corresponding to both the I_STV and the LHV is 't', which is also termed the 'current time frame'. Preferably, the hierarchical Neuro-Bayesian Network module transforms the I_STV into the LHV by using one or more layers of nonlinear transformations, represented by the equation: LHV = f(I_STV). As an example, each nonlinear transformation f may be defined by a nonlinear neural network f(X) = h(WX), wherein X is an input vector, W is a weight matrix of positive or negative values, and h is an activation function such as the sigmoid or rectified linear unit (ReLU) function.
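A single layer of the nonlinear transformation f(X) = h(WX) can be sketched as follows, with the sigmoid chosen as the activation h (an illustrative choice; the disclosure equally permits ReLU):

```python
import math

def nonlinear_transform(x, weights):
    """One layer of LHV = f(I_STV): compute h(W X) where `weights` is a
    matrix W given as a list of rows (positive or negative values), `x`
    is the input vector, and h is the sigmoid activation."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)))
            for row in weights]
```

Stacking several such calls, each feeding its output into the next, yields the "one or more layers" of nonlinear transformations described above.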
[0043] Subsequently, at step 206, the Latent Hidden Vector (LHV) obtained at time 't' is fed as an input to a preset number ('K') of Recurrent Networks (RNs). Each of the recurrent networks incorporates a plurality of layers, with each layer being assigned a learnable weight. An example of an RN is the Long Short-Term Memory (LSTM) network, but it is possible that other RNs or time-series prediction models are used. The weights of the 'K' RNs are trained on the Latent Hidden Vector (LHV) in order to generate 'K' output vectors, given by LHV1, LHV2, ..., LHVK, which are predictions of the possible LHVs for the next (future) time step 't+1', and the corresponding probabilities p1, p2, ..., pK.
Further, preferably, only the RN whose prediction - the combination of the Latent Hidden Vector (LHV) and the corresponding probability (p) - is closest to the input training data is selected, and its weights are adjusted to better predict the input and to bring about an increase in the probability of the predicted input. At step 208, the NBNM selects one of the LHVs amongst LHV1, LHV2, ..., LHVK by sampling them based on their probabilities, and codifies the selected LHV as the Predicted LHV (Pr_LHV) for the future time step 't+1'. Subsequently, the Pr_LHV is further transformed by the Neuro-Bayesian Network module via multiple layers of nonlinear transformations, similar to the nonlinear transformations applied to the LHV above, to generate a Predicted Spatio-Temporal Vector (Pr_STV) for the future time step 't+1'. The equation used for calculating the Predicted Spatio-Temporal Vector is: Pr_STV = g(Pr_LHV). Further, the Predicted Spatio-Temporal Vector (Pr_STV) is unrolled to produce a sequence of predicted input vectors (Pr_V) for time steps 't-T+1', ..., 't+1', using the formula: Pr_STV = [Pr_V(t-T+1) Pr_V(t-T+2) ... Pr_V(t+1)].
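The probability-weighted selection of step 208 can be sketched as ordinary categorical sampling over the K candidate LHVs (assuming the probabilities p1..pK are normalized; function and parameter names are illustrative):

```python
import random

def sample_predicted_lhv(lhv_candidates, probs, rng=random):
    """Choose Pr_LHV among the K recurrent networks' predictions
    LHV1..LHVK by sampling according to their probabilities p1..pK
    (assumed to sum to 1). Standard inverse-CDF categorical sampling."""
    r, cumulative = rng.random(), 0.0
    for lhv, p in zip(lhv_candidates, probs):
        cumulative += p
        if r < cumulative:
            return lhv
    return lhv_candidates[-1]  # guard against floating-point round-off
```

The sampled vector is then passed through the nonlinear layers to obtain Pr_STV as described above.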
[0044] At step 210, the Latent Hidden Vectors (LHVs) corresponding to the future time frame (t+1) are processed by a supervised learning module, and each of the Latent Hidden Vectors is mapped to a respective output label relevant to the future time frame. Alternatively, at step 212, the Latent Hidden Vectors (LHVs) are processed by a reinforcement learning module, which in turn maps the Latent Hidden Vectors to rewards and optimal actions expected at the future time frame (t+1). Further, at step 214, a simulation module receives predetermined states of the user application as inputs, and subsequently performs a 'what-if' analysis on the said predetermined states, by applying a plurality of sequences of actions (action sequences) onto the Neuro-Bayesian Network module and the reinforcement learning module. The result of the 'what-if' analysis is preferably the identification of 'expected application states' and corresponding expected rewards. Further, at step 216, the simulation module selects at least one sequence of actions determined to generate maximal expected rewards, and subsequently creates a plan to achieve an application state which is either determined to be a goal state or is determined to be associated with maximal expected rewards.
[0045] In accordance with the present disclosure, the computer-implemented method further includes the following steps:
generating a prediction error vector, the prediction error vector represented as a function of the input Spatio-temporal Vector and the predicted Spatio-temporal Vector; utilizing the prediction error vector for detecting anomalies across the input Spatio-temporal Vector and the predicted Spatio-temporal Vector, based on a predetermined detection criteria; utilizing the prediction error vector instead of the time-sensitive sequence of input vectors, to calculate the Latent Hidden Vectors (LHVs) for the at least one future time frame;
recursively connecting a plurality of the hierarchical Neuro-Bayesian Network modules over at least time and space to form a Spatio-temporal hierarchical Neuro-Bayesian Network; combining Latent Hidden Vectors corresponding to a current time frame and Latent Hidden Vectors corresponding to a previous time frame, to form the input Spatio-temporal Vector; feeding the input Spatio-temporal Vector to a higher level hierarchical Neuro-Bayesian Network module present within the Spatio-temporal hierarchical Neuro-Bayesian Network; iteratively feeding the input Spatio-temporal Vector to higher levels of hierarchical Neuro-Bayesian Network modules present within the Spatio-temporal hierarchical Neuro-Bayesian Network, until a Hierarchical Neuro-Bayesian network constituting a predetermined number of hierarchical Neuro-Bayesian Network modules is achieved, and further until Latent Hidden Vectors corresponding to the Hierarchical Neuro-Bayesian network are determined to incorporate predetermined spatial properties and temporal properties over pre-determined spatial windows and pre-determined time frames respectively; concatenating, by the hierarchical Neuro-Bayesian Network module, the Latent Hidden Vectors derived from the plurality of output vectors, into a single vector to form a pooled Latent Hidden Vector; feeding the pooled Latent Hidden Vector to the supervised learning module for triggering prediction of the output label; feeding the pooled Latent Hidden Vector to the reinforcement learning module for triggering prediction of optimal actions and rewards expected to be associated with the optimal actions; receiving, at the simulation engine, an application state and a corresponding action as inputs, and performing the what-if analysis on the application state and the corresponding action, to generate a probabilistic prediction of future states and expected rewards; selecting, by the simulation engine, a sequence of actions determined to be leading to a maximum
expected reward; integrating a plurality of the Neuro-Bayesian Network modules in a hierarchical manner to form a hierarchical Neuro-Bayesian network; and configuring the Neuro-Bayesian Network to be implemented as a Distributed Artificial Intelligence (DAI) system, with at least some of the plurality of the Neuro-Bayesian Network modules implemented on a cloud based network, and at least some of the plurality of the Neuro-Bayesian Network modules implemented on local computer based devices.

TECHNICAL ADVANTAGES
[0046] The technical advantages envisaged by the present disclosure include the realization of a general-purpose processor architecture suited for artificial general intelligence operations and distributed implementation. The present disclosure provides a Neuro Bayesian Network (NBN) based processor designed to perform neural probabilistic and Bayesian inference, prediction, memory, reasoning, unsupervised learning, supervised learning, reinforcement learning, planning and decision making, imagination and simulation. The present disclosure discloses an engine for Neuro Bayesian Learning (eN-BLe) that can be used in any application where a time-series of input vectors is generated, or where a sequence of inputs (e.g., text, images and the like) is generated that can be converted to a vector of real values. Optionally, the application may also offer possible actions the user can take to influence the application, and include rewards/costs/penalties associated with inputs or actions. The eN-BLe framework can be utilized for any such application that can benefit from automation and AI, including but not limited to:
Internet of Things (IoT) applications generating real-time streaming sensor data, with possible actions available to the user for influencing the application and hence the sensor values. The eN-BLe framework can be used for predictive maintenance, anomaly detection, adaptive security, operations optimization, and the like;
Streaming video or other types of image data, audio/speech, and text processing for prediction, anomaly detection, classification/interpretation, robotics applications, driverless cars, and the like;
Online user behavior data for user modeling, prediction, optimized advertising, and the like; Learning predictive models of sales data and other business related data for customized marketing campaigns, advertising, inventory management, and the like; and
Learning personalized predictive models of healthcare data for anomaly detection and recommendation of prescriptive actions customized for the patient.
The present disclosure provides a flexible and reconfigurable Distributed AI (DAI) system that enables fast prediction and probabilistic inference at comparatively shorter time scales on edge devices, and longer time-scale predictions and inference in the cloud.

Claims

What is claimed is:
1. A computer implemented engine for Neuro-Bayesian Learning (eN-BLe), said engine comprising:
at least one hierarchical Neuro-Bayesian Network module (H-NBNM), said hierarchical Neuro-Bayesian Network module configured to receive inputs from a predetermined user application, in the form of a time-sensitive sequence of input vectors, said Neuro-Bayesian learning model further configured to merge said input vectors into a single Input Spatio-Temporal vector, said Neuro-Bayesian learning model further configured to transform said single Input Spatio-Temporal vector into a Latent Hidden Vector (LHV) corresponding to a current time frame, said Neuro-Bayesian Network module further comprising: at least one recurrent network, said recurrent network having learnable weights assigned to layers thereof, said recurrent network configured to forecast a plurality of output vectors based on said sequence of input vectors and said weights, said output vectors forming predictions of Latent Hidden Vectors for at least one future time frame, said output vectors representative of a plurality of possible future inputs and probabilities associated with each of said possible future inputs; said Neuro-Bayesian Network module further configured to select at least one of said Latent Hidden Vectors as a predicted Latent Hidden Vector (Pr_LHV), based on a pre-determined criteria, and further transform said predicted Latent Hidden Vector into a predicted Spatio-Temporal
Vector (Pr_STV) corresponding to said at least one future time frame; and a supervised learning module cooperating with said Neuro-Bayesian Network module, said supervised learning module configured to receive said Latent Hidden Vectors corresponding to said future time frame, said supervised learning module further configured to map said Latent Hidden Vectors to an output label relevant to said at least one future time frame; a reinforcement learning module cooperating with said Neuro-Bayesian Network module to receive said Latent Hidden Vectors for said at least one future time frame, said reinforcement learning module configured to map said Latent Hidden Vectors to rewards and optimal actions expected at said at least one future time frame; and
a simulation module cooperating with said reinforcement learning module and said Neuro-Bayesian Network module, said simulation module configured to receive predetermined states of said user application as inputs, said simulation module further configured to perform a what-if analysis by applying a plurality of sequences of actions onto said Neuro-Bayesian Network module and said reinforcement learning module, to determine expected application states and expected rewards, said simulation module further configured to select at least one sequence of actions determined to generate maximal expected rewards, and further generate a plan to achieve an application state determined to be associated with said maximal expected rewards.
2. The engine for Neuro-Bayesian Learning as claimed in claim 1, wherein said hierarchical
Neuro-Bayesian Network module is further configured to generate a prediction error vector, said prediction error vector represented as a function of said input Spatio-temporal Vector and said predicted Spatio-temporal Vector, said prediction error vector usable in detecting anomalies across said input Spatio-temporal Vector and said predicted Spatio-temporal Vector, based on a predetermined detection criteria, said hierarchical Neuro-Bayesian Network module still further configured to utilize said prediction error vector instead of said time-sensitive sequence of input vectors, to calculate said Latent Hidden Vectors (LHVs) for said at least one future time frame.
3. The engine for Neuro-Bayesian Learning as claimed in claim 1, said engine for Neuro-
Bayesian Learning comprising a plurality of said hierarchical Neuro-Bayesian Network modules recursively connected over at least time and space to form a Spatio-temporal hierarchical Neuro-Bayesian Network.
4. The engine for Neuro-Bayesian Learning as claimed in claim 1 or 3, wherein Latent Hidden
Vectors corresponding to a current time frame and Latent Hidden vectors corresponding to a previous time frame are combined to form said input Spatio-temporal Vector, said input Spatio-temporal Vector configured to be fed to a higher level hierarchical Neuro-Bayesian Network module present within said Spatio-temporal hierarchical Neuro-Bayesian NetWork.
5. The engine for Neuro-Bayesian Learning as claimed in claim 4, wherein said input Spatio- temporal Vector is iteratively fed to higher levels of hierarchical Neuro-Bayesian Network modules present within said Spatio-temporal hierarchical Neuro-Bayesian Network, until a Hierarchical Neuro-Bayesian network constituting a pre-determined number of hierarchical Neuro-Bayesian Network modules is achieved, and further until Latent Hidden Vectors corresponding to said Hierarchical Neuro-Bayesian network are determined to incorporate pre-determined spatial properties and temporal properties over pre-determined spatial windows and pre-determined time frames respectively.
6. The engine for Neuro-Bayesian Learning as claimed in claim 1, wherein said hierarchical
Neuro-Bayesian Network module is further configured to determine weights to be assigned to layers thereof, using unsupervised learning principles, only when each input directed to said hierarchical Neuro-Bayesian Network module is not associated with a corresponding output label.
7. The engine for Neuro-Bayesian Learning as claimed in claim 1, wherein said Latent Hidden
Vectors derived from said plurality of output vectors are concatenated by said hierarchical Neuro-Bayesian Network module into a single vector to form a pooled Latent Hidden Vector, and wherein said pooled Latent Hidden Vector is fed to said supervised learning module for triggering prediction of said output label.
8. The engine for Neuro-Bayesian Learning as claimed in claim 1 or 7, wherein said pooled
Latent Hidden Vector is fed to said reinforcement learning module for triggering prediction of optimal actions and rewards expected to be associated with said optimal actions.
9. The engine for Neuro-Bayesian Learning as claimed in claim 1, wherein said simulation module receives an application state and a corresponding action, as an input, said simulation module further configured to perform said what-if analysis on said application state and said corresponding action, to generate a probabilistic prediction of future states and expected rewards, said simulation module still further configured to select a sequence of actions determined to be leading to a maximum expected reward.
10. The engine for Neuro-Bayesian Learning as claimed in claim 1, wherein said engine comprises a plurality of said Neuro-Bayesian Network modules arranged in a hierarchical manner to form a hierarchical Neuro-Bayesian network, said Neuro-Bayesian network configured to be implemented as a distributed Artificial Intelligence (AI) system, with at least some of said plurality of said Neuro-Bayesian Network modules implemented on a cloud based network, and at least some of said plurality of said Neuro-Bayesian Network modules implemented on local computer based devices.
11. A computer implemented method for implementing a Neuro-Bayesian Learning engine (eN-BLe), said method comprising the following computer-implemented steps: receiving, at a hierarchical Neuro-Bayesian Network module (H-NBNM), inputs from a predetermined user application, in the form of a time-sensitive sequence of input vectors; merging, by said hierarchical Neuro-Bayesian Network module, said input vectors into a single Input Spatio-Temporal vector; transforming, by said hierarchical Neuro-Bayesian Network module, said single Input Spatio-Temporal vector into a Latent Hidden Vector (LHV) corresponding to a current time frame;
forecasting, by a recurrent network having learnable weights assigned to layers thereof, a plurality of output vectors based on said sequence of input vectors and said weights, said output vectors forming predictions of Latent Hidden Vectors for at least one future time frame;
selecting, by said Neuro-Bayesian Network module, at least one of said Latent Hidden Vectors as a predicted Latent Hidden Vector, based on a pre-determined criteria, and further transforming said predicted Latent Hidden Vector into a predicted Spatio-temporal Vector (Pr_STV) corresponding to said at least one future time frame; receiving said Latent Hidden Vectors corresponding to said future time frame, at a supervised learning module, and mapping, by said supervised learning module, said Latent Hidden Vectors to an output label relevant to said at least one future time frame; receiving said Latent Hidden Vectors corresponding to said at least one future time frame, at a reinforcement learning module, and mapping, by said reinforcement learning module, said Latent Hidden Vectors to rewards and optimal actions expected at said at least one future time frame; receiving predetermined states of said user application as inputs at a simulation module, and performing a what-if analysis on said predetermined states, by applying a plurality of sequences of actions onto said Neuro-Bayesian Network module and said reinforcement learning module, to determine expected application states and expected rewards; and selecting, by said simulation module, at least one sequence of actions determined to generate maximal expected rewards, and further generating, by said simulation module, a plan to achieve an application state determined to be associated with said maximal expected rewards.
12. The method as claimed in claim 11, wherein the method further includes the following steps: generating a prediction error vector, said prediction error vector represented as a function of said input Spatio-temporal Vector and said predicted Spatio-temporal Vector; utilizing said prediction error vector for detecting anomalies across said input Spatio-temporal Vector and said predicted Spatio-temporal Vector, based on a predetermined detection criteria; and
utilizing said prediction error vector instead of said time-sensitive sequence of input vectors, to calculate said Latent Hidden Vectors (LHVs) for said at least one future time frame.
13. The method as claimed in claim 11 , wherein the method further includes the following steps: recursively connecting a plurality of said hierarchical Neuro-Bayesian Network modules over at least time and space to form a Spatio-temporal hierarchical Neuro-Bayesian Network; combining Latent Hidden Vectors corresponding to a current time frame and Latent Hidden vectors corresponding to a previous time frame, to form said input Spatio- temporal Vector;
feeding said input Spatio-temporal Vector to a higher level hierarchical Neuro-Bayesian Network module present within said Spatio-temporal hierarchical Neuro-Bayesian Network; iteratively feeding said input Spatio-temporal Vector to higher levels of hierarchical Neuro-Bayesian Network modules present within said Spatio-temporal hierarchical Neuro-Bayesian Network, until a Hierarchical Neuro-Bayesian network constituting a pre-determined number of hierarchical Neuro-Bayesian Network modules is achieved, and further until Latent Hidden Vectors corresponding to said Hierarchical Neuro-Bayesian network are determined to incorporate pre-determined spatial properties and temporal properties over pre-determined spatial windows and pre-determined time frames respectively; concatenating, by said hierarchical Neuro-Bayesian Network module, said Latent Hidden Vectors derived from said plurality of output vectors, into a single vector to form a pooled Latent Hidden Vector; feeding said pooled Latent Hidden Vector to said supervised learning module for triggering prediction of said output label; feeding said pooled Latent Hidden Vector to said reinforcement learning module for triggering prediction of optimal actions and rewards expected to be associated with said optimal actions; receiving, at said simulation engine, an application state and a corresponding action as inputs, and performing said what-if analysis on said application state and said corresponding action, to generate a probabilistic prediction of future states and expected rewards; and selecting, by said simulation engine, a sequence of actions determined to be leading to a maximum expected reward.
14. The method as claimed in claim 11, wherein the method further includes the following steps: integrating a plurality of said Neuro-Bayesian Network modules in a hierarchical manner to form a hierarchical Neuro-Bayesian network; and configuring said Neuro-Bayesian Network to be implemented as a Distributed Artificial Intelligence (DAI) system, with at least some of said plurality of said Neuro-Bayesian Network modules implemented on a cloud based network, and at least some of said plurality of said Neuro-Bayesian Network modules implemented on local computer based devices.
15. A non-transitory computer readable storage medium having computer executable instructions stored thereupon, said computer executable instructions when executed by a processor, cause the processor to: create a Neuro-Bayesian Learning engine (eN-BLe) that: receives inputs from a predetermined user application, in the form of a time-sensitive sequence of input vectors; merges said input vectors into a single Input Spatio-Temporal vector; transforms said single Input Spatio-Temporal vector into a Latent Hidden Vector (LHV) corresponding to a current time frame; triggers a recurrent network having learnable weights assigned to layers thereof, to forecast a plurality of output vectors based on a combination of said sequence of input vectors and said weights, said output vectors forming predictions of Latent Hidden Vectors for at least one future time frame; selects at least one of said Latent Hidden Vectors as a predicted Latent Hidden Vector, based on a pre-determined criteria, and further transforms said predicted Latent Hidden Vector into a predicted Spatio-temporal Vector (Pr_STV) corresponding to said at least one future time frame; propagates said Latent Hidden Vectors corresponding to said future time frame, to a supervised learning module, and triggers said supervised learning model to map said Latent Hidden Vectors to an output label relevant to said at least one future time frame; propagates said Latent Hidden Vectors corresponding to said at least one future time frame, to a reinforcement learning module, and triggers said reinforcement learning module to map said Latent Hidden Vectors to rewards and optimal actions expected at said at least one future time frame;
transmits predetermined states of said user application as inputs to a simulation module, and triggers said simulation module to perform a what-if analysis on said predetermined states by applying a plurality of sequences of actions onto said Neuro-Bayesian Network module and said reinforcement learning module, and determine expected application states and expected rewards; and triggers said simulation module to select at least one sequence of actions determined to generate maximal expected rewards, and further triggers said simulation module to generate a plan to achieve an application state determined to be associated with said maximal expected rewards.
16. The non-transitory computer readable storage medium as claimed in claim 15, wherein said computer executable instructions, when executed by the computer processor, trigger creation of said Neuro-Bayesian Learning engine (eN-BLe) that further: calculates a prediction error, said prediction error represented as a function of said input Spatio-temporal Vector and said predicted Spatio-temporal Vector; utilizes said prediction error vector for detecting anomalies across said input Spatio-temporal Vector and said predicted Spatio-temporal Vector, based on a predetermined detection criteria; and utilizes said prediction error vector instead of said time-sensitive sequence of input vectors, to calculate said Latent Hidden Vectors (LHVs) for said at least one future time frame; recursively connects a plurality of hierarchical Neuro-Bayesian Network modules over at least time and space to form a Spatio-temporal hierarchical Neuro-Bayesian Network; combines Latent Hidden Vectors corresponding to a current time frame and Latent Hidden Vectors corresponding to a previous time frame, to form said input Spatio-temporal Vector; feeds said input Spatio-temporal Vector to a higher level hierarchical Neuro-Bayesian Network module present within said Spatio-temporal hierarchical Neuro-Bayesian Network;
iteratively feeds said input Spatio-temporal Vector to higher levels of hierarchical Neuro-Bayesian Network modules present within said Spatio-temporal hierarchical Neuro-Bayesian Network, until a Hierarchical Neuro-Bayesian network constituting a pre-determined number of hierarchical Neuro-Bayesian Network modules is achieved, and further until Latent Hidden Vectors corresponding to said Hierarchical Neuro-Bayesian network are determined to incorporate pre-determined spatial properties and temporal properties over pre-determined spatial windows and pre-determined time frames respectively; concatenates said Latent Hidden Vectors derived from said plurality of output vectors, into a single vector to form a pooled Latent Hidden Vector;
feeds said pooled Latent Hidden Vector to a supervised learning module for facilitating prediction of said output label;
feeds said pooled Latent Hidden Vector to a reinforcement learning module for triggering prediction of optimal actions and rewards expected to be associated with said optimal actions; and transmits an application state and a corresponding action as inputs to a simulation engine, and triggers said simulation engine to perform said what-if analysis on said application state and said corresponding action, to generate a probabilistic prediction of future states and expected rewards; and triggering said simulation engine to select a sequence of actions determined to be leading to a maximum expected reward.
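The pipeline recited in claim 16 can be illustrated with a minimal sketch: compute a prediction error, flag an anomaly against a preset criterion, combine Latent Hidden Vectors from consecutive frames into a spatio-temporal input, and feed a pooled LHV to a supervised head. All encoders, weight shapes, and the threshold below are illustrative assumptions, not the patent's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Toy module: map an input vector to a Latent Hidden Vector (LHV)."""
    return np.tanh(W @ x)

def prediction_error(x, x_pred):
    """Prediction error vector: input minus predicted spatio-temporal vector."""
    return x - x_pred

def is_anomaly(err, threshold=1.0):
    """Flag an anomaly when the error magnitude exceeds a preset criterion."""
    return bool(np.linalg.norm(err) > threshold)

# Two consecutive frames; combine their LHVs into one spatio-temporal input.
W = rng.normal(size=(4, 8))
x_prev, x_curr = rng.normal(size=8), rng.normal(size=8)
lhv_prev, lhv_curr = encode(x_prev, W), encode(x_curr, W)
spatio_temporal = np.concatenate([lhv_prev, lhv_curr])  # fed to the next level

# Pooled LHV fed to a (toy) supervised head predicting an output label.
W2 = rng.normal(size=(3, 8))
pooled = encode(spatio_temporal, W2)
logits = rng.normal(size=(2, 3)) @ pooled
label = int(np.argmax(logits))

# Error between the current frame and a stand-in predicted vector.
err = prediction_error(x_curr, x_prev)
print(label, is_anomaly(err, threshold=0.5))
```

In a deeper hierarchy, the `spatio_temporal` vector would itself be re-encoded by higher-level modules until the desired number of levels is reached, per the iterative-feeding step of the claim.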
17. The non-transitory computer readable storage medium as claimed in claim 15, wherein said computer executable instructions, when executed by the computer processor:
trigger integration of a plurality of Neuro-Bayesian Network modules into a hierarchical Neuro-Bayesian Network; and
configure said hierarchical Neuro-Bayesian Network to be implemented as a Distributed Artificial Intelligence (DAI) system, with at least some of said plurality of Neuro-Bayesian Network modules implemented on a cloud-based network, and at least some of said plurality of Neuro-Bayesian Network modules implemented on local computer-based devices.
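The split deployment in claim 17 amounts to assigning each hierarchy level a placement target. A minimal sketch, under the assumption (not stated in the patent) that lower levels run on local devices for low-latency sensing while higher levels aggregate in the cloud; the names `Module`, `build_hierarchy`, "local", and "cloud" are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Module:
    level: int      # position in the hierarchy (0 = lowest)
    placement: str  # "local" device or "cloud" network

def build_hierarchy(n_levels, local_levels):
    """Place the lowest `local_levels` modules on local devices and the
    remaining higher-level modules on a cloud-based network."""
    return [Module(level=i, placement="local" if i < local_levels else "cloud")
            for i in range(n_levels)]

hierarchy = build_hierarchy(n_levels=4, local_levels=2)
print([m.placement for m in hierarchy])  # ['local', 'local', 'cloud', 'cloud']
```

The same partitioning could run the other way (cloud-side sensing, local decision heads); the claim only requires that some modules live on each side.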
PCT/US2018/042701 2017-07-18 2018-07-18 Neuro-bayesian architecture for implementing artificial general intelligence WO2019018533A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762534040P 2017-07-18 2017-07-18
US62/534,040 2017-07-18

Publications (1)

Publication Number Publication Date
WO2019018533A1 true WO2019018533A1 (en) 2019-01-24

Family

ID=65015751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/042701 WO2019018533A1 (en) 2017-07-18 2018-07-18 Neuro-bayesian architecture for implementing artificial general intelligence

Country Status (1)

Country Link
WO (1) WO2019018533A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829583A (en) * 2019-01-31 2019-05-31 成都思晗科技股份有限公司 Mountain fire Risk Forecast Method based on probability programming technique
CN111680107A (en) * 2020-08-11 2020-09-18 南昌木本医疗科技有限公司 Financial prediction system based on artificial intelligence and block chain
CN112437086A (en) * 2020-11-23 2021-03-02 中国联合网络通信集团有限公司 Method, device and system for acquiring monitoring data
US20210103768A1 (en) * 2019-10-08 2021-04-08 Nec Laboratories America, Inc. Sensor contribution ranking
CN112819523A (en) * 2021-01-29 2021-05-18 上海数鸣人工智能科技有限公司 Marketing prediction method combining inner/outer product feature interaction and Bayesian neural network
CN113743459A (en) * 2021-07-29 2021-12-03 深圳云天励飞技术股份有限公司 Target detection method and device, electronic equipment and storage medium

Citations (7)

Publication number Priority date Publication date Assignee Title
WO1990016038A1 (en) * 1989-06-16 1990-12-27 Lawrence, Malcolm, Graham Continuous bayesian estimation with a neural network architecture
US20030191608A1 (en) * 2001-04-30 2003-10-09 Anderson Mark Stephen Data processing and observation system
US20100137734A1 (en) * 2007-05-02 2010-06-03 Digiovanna John F System and method for brain machine interface (bmi) control using reinforcement learning
US20110202486A1 (en) * 2009-07-21 2011-08-18 Glenn Fung Healthcare Information Technology System for Predicting Development of Cardiovascular Conditions
US8521669B2 (en) * 2009-06-04 2013-08-27 Honda Research Institute Europe Gmbh Neural associative memories based on optimal bayesian learning
US20150278735A1 (en) * 2014-03-27 2015-10-01 International Business Machines Corporation Information processing apparatus, information processing method and program
US20170016734A1 (en) * 2015-07-17 2017-01-19 Honda Motor Co., Ltd. Turn predictions


Non-Patent Citations (1)

Title
SAJDA ET AL.: "Handbook of Neural Engineering", 2007, article "Chapter 36: BAYESIAN NETWORKS FOR MODELING CORTICAL INTEGRATION", pages: 585 - 599, XP055563540 *

Cited By (12)

Publication number Priority date Publication date Assignee Title
CN109829583A (en) * 2019-01-31 2019-05-31 成都思晗科技股份有限公司 Mountain fire Risk Forecast Method based on probability programming technique
CN109829583B (en) * 2019-01-31 2022-10-11 成都思晗科技股份有限公司 Mountain fire risk prediction method based on probability programming technology
US20210103768A1 (en) * 2019-10-08 2021-04-08 Nec Laboratories America, Inc. Sensor contribution ranking
US11763198B2 (en) * 2019-10-08 2023-09-19 Nec Corporation Sensor contribution ranking
CN111680107A (en) * 2020-08-11 2020-09-18 南昌木本医疗科技有限公司 Financial prediction system based on artificial intelligence and block chain
CN111680107B (en) * 2020-08-11 2020-12-08 上海竞动科技有限公司 Financial prediction system based on artificial intelligence and block chain
CN112437086A (en) * 2020-11-23 2021-03-02 中国联合网络通信集团有限公司 Method, device and system for acquiring monitoring data
CN112437086B (en) * 2020-11-23 2022-07-29 中国联合网络通信集团有限公司 Method, device and system for acquiring monitoring data
CN112819523A (en) * 2021-01-29 2021-05-18 上海数鸣人工智能科技有限公司 Marketing prediction method combining inner/outer product feature interaction and Bayesian neural network
CN112819523B (en) * 2021-01-29 2024-03-26 上海数鸣人工智能科技有限公司 Marketing prediction method combining inner/outer product feature interaction and Bayesian neural network
CN113743459A (en) * 2021-07-29 2021-12-03 深圳云天励飞技术股份有限公司 Target detection method and device, electronic equipment and storage medium
CN113743459B (en) * 2021-07-29 2024-04-02 深圳云天励飞技术股份有限公司 Target detection method, target detection device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
WO2019018533A1 (en) Neuro-bayesian architecture for implementing artificial general intelligence
JP6901633B2 (en) Capsule neural network
US11562236B2 (en) Automatically labeling capability for training and validation data for machine learning
Gao et al. Deep gate recurrent neural network
US7672920B2 (en) Apparatus and method for embedding recurrent neural networks into the nodes of a self-organizing map
KR102548732B1 (en) Apparatus and Method for learning a neural network
CN112567388A (en) Characterizing activity in a recurrent artificial neural network and encoding and decoding information
CN116635866A (en) Method and system for mining minority class data samples to train a neural network
US11776269B2 (en) Action classification in video clips using attention-based neural networks
US20200327450A1 (en) Addressing a loss-metric mismatch with adaptive loss alignment
Meedeniya Deep learning: A beginners' guide
US20210089867A1 (en) Dual recurrent neural network architecture for modeling long-term dependencies in sequential data
KR20190004429A (en) Method and apparatus for determining training of unknown data related to neural networks
JP7474446B2 (en) Projection Layer of Neural Network Suitable for Multi-Label Prediction
US20230316720A1 (en) Anomaly detection apparatus, anomaly detection method, and program
US20230196406A1 (en) Siamese neural network model
EP3698284A1 (en) Training an unsupervised memory-based prediction system to learn compressed representations of an environment
Stojkovic et al. Distance Based Modeling of Interactions in Structured Regression.
US20230419075A1 (en) Automated Variational Inference using Stochastic Models with Irregular Beliefs
Nayak et al. An artificial neural network model for weather forecasting in Bhopal
KR101963556B1 (en) Apparatus for posture analysis of time series using artificial intelligence
KR20190035635A (en) Apparatus for posture analysis of time series using artificial intelligence
US20210089966A1 (en) Upside-down reinforcement learning
JP2023090591A (en) Apparatus and method for artificial intelligence neural network based on co-evolving neural ordinary differential equations
KR20210115250A (en) System and method for hybrid deep learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18836060

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18836060

Country of ref document: EP

Kind code of ref document: A1