WO2022245134A1 - A system and method for context resolution and purpose driven clustering in autonomous systems - Google Patents
- Publication number: WO2022245134A1 (PCT/KR2022/007123)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- representation
- users
- clusters
- user identity
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Definitions
- the present invention generally relates to an autonomous system.
- the present invention relates to a system and method for context resolution and purpose-driven clustering in autonomous systems that generates different predictions from neural network models based on multi-user input.
- Figure 1 shows an example scenario as per the state-of-the-art solution.
- a user 1 wants to make sure that people sitting inside the self-driving autonomous cars are "not drunk". Earlier he had to stop each car and manually check the alcohol level of each driver. But now all the cars are fully autonomous: they make no mistakes while running on roads even if the users inside them are drunk.
- User 1 has no way of knowing which of the self-driving cars he should stop, since all of them follow the traffic rules. A possible solution would therefore be a system that advertises the data of each user sitting inside the car.
- a self-driving car carries two people. User 1 sees the car.
- The alcohol levels of both occupants of the car are shown on the dashboard 2.
- User 1 is standing on the road while this car passes him. He does not get time to read the alcohol levels of both occupants, since the car is moving very fast.
- User 1 thinks of simpler days, when the dashboard contained the alcohol level of only "one" of the car occupants. Thus, it is required that the information of "only one user" be advertised by the autonomous system. Otherwise, an outside observer faces all sorts of problems related to data accuracy and interpretation.
- a method of context-resolution in a computing-environment comprising: receiving (102) input data from a plurality of users; clustering (104) the users into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users; computing (106) a prototypical-representation for each cluster based on the received input data to obtain a plurality of prototypical-representations; and processing (108), by a convolution network, the plurality of prototypical-representations to perform a context-resolution with respect to the plurality of users based on at least one of: computing (110) a shared representation of the plurality of prototypical-representations based on learning a contribution parameter of each of the plurality of users, and determining (112) a data contribution of each of the plurality of users within the shared representation based on the learned contribution parameter.
- the convolution network operates based on learning the input from the plurality of users in at least one single phase.
- the clustering corresponds to one or more of unsupervised or supervised machine learning and is defined by clustering of correlated or uncorrelated users within the plurality of users.
- the correlated users correspond to users exhibiting similar data and located near one another in the embedding space; and the uncorrelated users correspond to users exhibiting dissimilar data and located far apart in the embedding space.
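As a minimal illustrative sketch (not part of the claims; the distance threshold and function name are assumptions), the near/far criterion in embedding space could be checked as:

```python
import numpy as np

def correlated(u, v, threshold=1.0):
    # Users are "correlated" when their embeddings lie near each other in
    # embedding space; the distance threshold is a hypothetical value.
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.linalg.norm(diff)) < threshold
```

Users with similar data would pass this test and be grouped into the same cluster; dissimilar users would fail it.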
- the clustering based on an unlabelled input data is defined by, receiving unlabelled input data as input points at the convolution network, examining a contribution-parameter associated with each of the input points by the cross-stitch unit and clustering the input points in a same cluster based on a degree of equivalence of the contribution parameter.
- the clusters are classified as similar or different based on one or more of: receiving the clusters as labelled input points at the convolution network defining a cross stitch unit, estimating the difference of contribution amongst the clusters based on a contribution parameter associated with each of the clusters, and deciding at least two clusters from amongst the clusters as different based on the estimated difference in contribution between the at least two clusters exceeding a threshold.
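The threshold test on the contribution difference can be sketched as follows (the function name and threshold value are illustrative assumptions, not from the claims):

```python
def clusters_differ(contrib_a, contrib_b, threshold=0.5):
    # Two clusters are deemed "different" when the estimated difference in
    # their contribution parameters exceeds a threshold (assumed value).
    return abs(contrib_a - contrib_b) > threshold
```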
- the method further comprising receiving the shared representation from the cross stitch model corresponding to the plurality of clusters, predicting, from an artificial neural network (ANN), at least one user identity for associating with the shared representation based on the steps of a) executing a hard conditioning criteria defined by learning a plurality of default templates for each cluster, selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation, and b) executing a soft conditioning criteria, communicating the at least one user identity from the ANN for triggering dynamic clustering based on said user identity corresponding to a single entity, prohibiting the communicating of at least one user identity based on said user identity corresponding to multiple entities.
- ANN: artificial neural network
- the execution of the hard conditioning criteria comprises, converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters, feeding the plurality of discrete integers from the classifier to an embedding unit to project a plurality of cluster identities, forwarding the plurality of cluster identities from the embedding unit to the ANN to enable selection of the user identity for associating with the shared representation, said selected user identity corresponding to the maximum contributing user.
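The hard-conditioning pipeline (shared representation to classifier, to a discrete integer, to an embedding that conditions the ANN) can be sketched as below; all dimensions and the untrained random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 clusters, 8-d shared representation, 6-d embeddings.
N_CLUSTERS, REP_DIM, EMB_DIM = 4, 8, 6
W_cls = rng.normal(size=(REP_DIM, N_CLUSTERS))            # classifier unit (untrained)
embedding_table = rng.normal(size=(N_CLUSTERS, EMB_DIM))  # embedding unit

def hard_condition(shared_rep):
    # The classifier converts the shared representation to a discrete
    # integer (one per cluster or cluster combination); the embedding unit
    # projects that integer to a cluster identity fed to the ANN.
    cluster_id = int(np.argmax(shared_rep @ W_cls))
    return cluster_id, embedding_table[cluster_id]
```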
- the execution of the soft conditioning criteria comprises, computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation, computing a plurality of constraints associated with the clusters, computing a loss-function representing soft conditioning based on the computed constraints, re-initiating soft conditioning based on iteratively aggregating the shared representation with prototype estimate from an observer module.
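A sketch of the soft-conditioning loss described above; the function name and the cluster-derived target matrix `target_corr` are assumptions, and the squared-error penalty is one plausible choice of loss:

```python
import numpy as np

def soft_conditioning_loss(feature_stream, target_corr):
    # Cross-correlation among the channels of the feature stream
    # (channels x time input -> channels x channels matrix), penalised
    # against constraints derived from the clusters.
    corr = np.corrcoef(feature_stream)
    return float(np.mean((corr - target_corr) ** 2))
```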
- the method further comprising receiving, by an observer module, a sample drawn from a predicted distribution p(A|B).
- the method further comprises receiving, by an observer module, one or more samples drawn from a predicted distribution p(A|B).
- the training of the context resolution comprises computing, based on a probabilistic graphical model, a posterior representation p̂(B|A).
- the training of the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution.
- the training of the hard conditioning is based on, providing the cluster class identifier by the observer module to the classifier associated with the hard conditioning, observing a difference between the cluster class identifier of the observer module and a class identifier associated with the user identity predicted at the ANN, said user identity associated with user who contributed maximum to the shared representation computed by the context resolution.
- the computing of the posterior representation p̂(B|A) comprises conducting an initial training of the ANN to shortlist a subset of neurons required to compute a hypothesis space p(A) of the ANN, decrementing weights of the subset of neurons if an activation value associated with the ANN reaches a pre-defined threshold to compute the hypothesis space p(A), and computing the posterior representation p̂(B|A).
- the context resolution, the conditioning and the ANN correspond to a client configuration and wherein the observer module corresponds to a global server configuration.
- a system for context-resolution in a computing-environment comprising a receiving module for receiving (102) input data from a plurality of users; a clustering module (302) for clustering (104) the users into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users, and computing (106) a prototypical-representation for each cluster based on the received input data to obtain a plurality of prototypical-representations; and a convolution network (302) for processing (108) the plurality of prototypical-representations to perform a context-resolution with respect to the plurality of users based on at least one of computing (110) a shared representation of the plurality of prototypical representations based on learning a contribution parameter of each of the plurality of users and determining (112) a data contribution of each of the plurality of users within the shared representation based on the learned contribution parameter.
- the convolution network operates based on learning the input from the plurality of users in at least one single phase.
- the clustering module is configured for clustering in accordance with one or more of an unsupervised or supervised machine learning and defined by clustering of correlated or uncorrelated users within the plurality of users.
- the correlated users correspond to users exhibiting similar data and located near in an embedding space and the uncorrelated users correspond to users exhibiting dissimilar data and located far in the embedding space.
- the clustering module is configured for clustering based on an unlabelled input data defined by the steps of receiving unlabelled input data as input points at the convolution network, examining a contribution-parameter associated with each of the input points by the cross-stitch unit; and clustering the input points in a same cluster based on a degree of equivalence of the contribution parameter.
- the clustering module is configured for determining the clusters as similar or different based on one or more of: receiving the clusters as labelled input points at the convolution network defining a cross stitch unit, estimating the difference of contribution amongst the clusters based on a contribution parameter associated with each of the clusters, and deciding at least two clusters from amongst the clusters as different based on the estimated difference in contribution between the at least two clusters exceeding a threshold.
- the system further comprises a conditioning-module configured for, receiving the shared representation from the cross stitch model corresponding to the plurality of clusters, predicting, through an artificial neural network (ANN), at least one user identity for associating with the shared representation based on the steps of a) executing a hard conditioning criteria defined by learning a plurality of default templates for each cluster, selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation, and b) executing a soft conditioning criteria, communicating the at least one user identity from the ANN for triggering dynamic clustering based on said user identity corresponding to a single entity, prohibiting the communicating of the at least one user identity based on said user identity corresponding to multiple entities.
- ANN: artificial neural network
- the execution of the hard conditioning criteria by the conditioning module comprises, converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters, feeding the plurality of discrete integers from the classifier to an embedding unit to project a plurality of cluster identities, forwarding the plurality of cluster identities from the embedding unit to the ANN to enable selection of the user identity for associating with the shared representation, said selected user identity corresponding to the maximum contributing user.
- the execution of the soft conditioning criteria by the conditioning module comprises, computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation, computing a plurality of constraints associated with the clusters, computing a loss-function representing soft conditioning based on the computed constraints, re-initiating soft conditioning based on iteratively aggregating the shared representation with prototype estimate from an observer module.
- the system further comprises an observer module configured for receiving a sample drawn from a predicted distribution p(A|B).
- the system further comprises an observer module configured for receiving one or more samples drawn from a predicted distribution p(A|B).
- the training of the convolution network for the context resolution is defined by the steps of computing, based on a probabilistic graphical model, a posterior representation p̂(B|A).
- the training of the conditioning module comprising the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution.
- the training of the conditioning module comprising the hard conditioning is based on the steps of, providing the cluster class identifier by the observer module to the classifier associated with the hard conditioning, observing a difference between the cluster class identifier of the observer module and a class identifier associated with the user identity predicted at the ANN, said user identity associated with the user who contributed maximum to the shared representation computed by the context resolution.
- the computing of the posterior representation p̂(B|A) comprises conducting an initial training of the ANN to shortlist a subset of neurons required to compute a hypothesis space p(A) of the ANN.
- Figures 1A, 1B and 1C show an example scenario as per the state-of-the-art solution
- Figures 2A and 2B illustrate a method of context-resolution in a computing-environment
- Figure 3A illustrates a high-level block diagram, according to an embodiment of the present disclosure
- Figure 3B illustrates a diagram showing the input/output black box-view, according to an embodiment of the present disclosure
- Figure 4 illustrates a broader flow of the context resolution block, according to an embodiment of the present disclosure
- Figure 5 illustrates a multi-task learning mechanism, according to the state-of-the-art techniques
- Figures 6A and 6B illustrate a Multi-Task Learning Architecture according to the state of the art.
- Figure 7 illustrates a new "augmented" cross stitch unit, according to an embodiment of the present disclosure.
- Figures 8A, 8B and 8C illustrate a mathematical mechanism of the existing machine learning (ML) based training methods as per the state of the art.
- Figures 9A, 9B and 9C illustrate another step of injecting remaining knowledge into the machine as per the state of the art.
- Figures 10A, 10B and 10C illustrate a mathematical mechanism of training-based context resolution methods, according to the embodiment of the present disclosure.
- Figures 11A and 11B illustrate a loss function mechanism, according to the embodiment of the present disclosure.
- Figure 12 illustrates a working of a conditioning unit as a part of the broader system flow, according to the embodiment of the present disclosure.
- Figure 13 illustrates a neural network model for implementation conditioning unit of figure 12, according to the embodiment of the present disclosure.
- Figure 14 illustrates internal contents of the conditioning unit, according to the embodiment of the present disclosure.
- Figure 15 illustrates a mechanism by which data from the autonomous plane gets advertised to the observer plane, according to the embodiment of the present disclosure.
- Figure 16A, 16B and 16C illustrates a working of the dynamic clustering block, according to an embodiment of the present disclosure.
- Figure 17A and 17B illustrates a mechanism to calculate prototype Estimation, according to an embodiment of the present disclosure.
- Figure 18A and 18B illustrates unsupervised clustering, according to an embodiment of the present disclosure.
- Figure 19A and 19B illustrates an improved supervised clustering to achieve clustering at a 'coarser' level, according to an embodiment of the present disclosure.
- Figure 20 illustrates a practical application of Figure 21, according to an embodiment of the present disclosure.
- Figure 21 illustrates a swapping of dynamic clustering and context resolution blocks, according to an embodiment of the present disclosure.
- Figure 22 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to the state of the art.
- Figure 23 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to the present embodiment of the present disclosure.
- Figure 24 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to the state of the art.
- Figure 25 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to the present embodiment of the present disclosure.
- Figure 26 illustrates a use case 3 for combining "different" user clusters together in context resolution, according to the state of the art.
- Figure 27 illustrates the use case 3 for combining "different" user clusters together in context resolution, according to the present embodiment of the present disclosure.
- Figure 28 illustrates a use case 4 for fluctuations of the heart during a match in dynamic clustering, according to the state of the art.
- Figure 29 illustrates the use case 4 for fluctuations of the heart during a match in dynamic clustering, according to the present embodiment of the present disclosure.
- Figure 30A, 30B and 30C illustrates a solution to the scenarios of Figure 1, according to the present embodiment of the present disclosure.
- Figure 31 illustrates another system architecture implementing various modules and sub-modules in accordance with the implementation, according to the present embodiment of the present disclosure.
- Figure 2A illustrates a method of context-resolution in a computing-environment.
- Figure 2B illustrates a flowchart representing a system flow within the user plane in line with Fig. 2A, according to an embodiment of the present disclosure.
- the user plane is the plane where multiple users are present. Data from multiple such users is processed simultaneously.
- the method comprises receiving (102) input data from a plurality of users. Collective data from multiple users is sent as an input.
- the users are clustered (104) into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users.
- the clustering corresponds to one or more of an unsupervised or supervised machine learning and is defined by clustering of correlated or uncorrelated users within the plurality of users.
- the correlated users correspond to users exhibiting similar data and located near in embedding space.
- the uncorrelated users correspond to users exhibiting dissimilar data and located far in embedding space.
- a prototypical-representation R1 for each cluster is computed (106) based on the received input data to obtain a plurality of prototypical-representations.
- a convolution network processes (108) the plurality of prototypical-representations to perform a context-resolution with respect to the plurality of the users.
- a shared representation is computed (110) of the plurality of prototypical representations based on learning a contribution parameter or a contribution ratio of each of the plurality of users.
- the shared representation corresponds to a single new user out of multiple users.
- at step 112, the data contribution of each of the plurality of users is determined within the shared representation based on the learned contribution parameter.
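The mixing in steps 110 and 112 can be sketched as follows. The prototype values and contribution weights below are illustrative; in the disclosure the contribution parameters are learned rather than fixed:

```python
import numpy as np

def shared_representation(prototypes, contributions):
    # Combine per-cluster prototypical representations into one shared
    # representation, weighted by contribution parameters. Normalising
    # gives each user's data contribution as a ratio of the whole.
    w = np.asarray(contributions, dtype=float)
    w = w / w.sum()                      # contribution ratios sum to 1
    return w @ np.asarray(prototypes, dtype=float), w
```

With contributions in the ratio 7:2:1, the first user accounts for 70% of the resulting shared representation, matching the running example of Fig. 3B.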
- Figure 3A illustrates a high-level block diagram, according to an embodiment of the present disclosure.
- Figure 3B illustrates a diagram showing the input/output black box-view, according to an embodiment of the present disclosure.
- Phase 1 is a Context Resolution and corresponds to steps 102-112 of Figs. 2A and 2B.
- the mechanism of Phase 1 of the context resolution phase is as follows with respect to Fig. 3B:
- a single new user 304 is generated out of, for example, three input users, as shown in Fig. 3B.
- the user 304 is now a "mixture" of several users, as can be seen in figure 3B.
- the generated user 304 is formed by maximal contribution (70%) of the user 1.
- user 1's identity, i.e. the integer 1, is shared with an autonomous plane 306, as can be seen in figure 3B. User privacy is preserved since only the integer 1 is shared, and not the private representation of user 1.
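A minimal sketch of this privacy-preserving advertisement; the function name and the 1-based integer ids are assumptions for illustration:

```python
def advertise_identity(contribution_ratios):
    # Only the integer identity of the maximally contributing user is
    # advertised; the users' private representations never leave the
    # user plane.
    return max(range(len(contribution_ratios)),
               key=lambda i: contribution_ratios[i]) + 1  # 1-based user id
```

With contributions of 70%, 20% and 10%, only the integer 1 is shared downstream.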
- Phase 2 of Fig. 3A and Fig. 3B refers to the Conditioning Unit 305 and the Neural Network 306.
- the mechanism of the conditioning unit 305 phase is as follows:
- the decision-making step typically performed by the autonomous plane's neural network 306 takes as input the shared representation generated by the context resolution block 302, as can be seen in figure 3B.
- the behavior of the said network 306 is adjusted in two ways.
- the three users can have different behaviours.
- the user who contributes maximum (70% of user 1 in Fig. 3B) is used to condition the network 306 via an embedding unit.
- Phase 3 of Fig. 3A and Fig. 3B refers to Dynamic Clustering.
- the mechanism of the dynamic clustering phase is as follows:
- An external observer sees the generated decision, along with the identity of the user who contributed maximally to that decision, which is formed based on the shared representation provided by context resolution in the user plane.
- the identity of the user may be directly fetched from the shared representation based on a contribution ratio and the decision from the autonomous network 306 may not be required.
- a clustering logic 308 operates as follows:
- Table 1 illustrates a comparison of the state of the art with the proposed mechanism. Reference is made with respect to the figure 3B.
- the shared representation is a single representation formed by mixing different users of Fig. 3B.
- the context resolution block 302 is shown in the figures 3A and 3B.
- Output of the context resolution 302 is the shared representation.
- the shared representation is advertised as a "single user" to an observer in the observer plane.
- The proposed Context Resolution block 302 allows us to generate a single shared representation from multi-user input.
- This shared representation is shared with an external observer user.
- Identity of a single user must be mapped with the shared representation obtained from context resolution.
- Figure 4 illustrates a broader flow of the context resolution block 302, according to an embodiment of the present disclosure.
- the present disclosure assumes an existence of multiple user models, whose data is given as input simultaneously to the context resolution block 302.
- There might be many such users, as depicted in Fig. 3B; thus, it becomes impractical to process all of their data at the same time. Therefore, such users are clustered together into several groups. For each such group, a single representation called a prototypical representation is computed. This compresses the total number of actual inputs sent to the context resolution block [1].
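A sketch of the prototype computation; using the cluster mean as the aggregator is an assumption for illustration, since the text does not fix a specific aggregation rule:

```python
import numpy as np

def cluster_prototypes(user_data, cluster_labels):
    # One prototypical representation per cluster, here the mean of the
    # member users' data; this compresses many inputs into a few
    # representatives for the context resolution block.
    X = np.asarray(user_data, dtype=float)
    labels = np.asarray(cluster_labels)
    return {c: X[labels == c].mean(axis=0) for c in set(cluster_labels)}
```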
- multiple such inputs [1.1] make a forward pass through a novel cross stitch-unit.
- This model allows the system to generate a single shared representation [1.2].
- the context resolution concept can be better understood with the help of the existing state-of-the-art techniques as provided in Fig. 5 and Figs. 6A and 6B.
- Figure 5 illustrates a multi-task learning mechanism, according to the state-of-the-art techniques. It shows that detecting a horse consists of two parts: first, a person can search an image for an animal which has four legs; finally, he can draw a bounding box around it. According to the figure 5, the notion of combining many things [5.1] to generate a shared representation [output of 5.2] is already known as multi-task learning. The sign [5.1] refers to the task of drawing a bounding box around a horse. The system could perform this in two ways. Firstly, it could directly detect the box around the horse [5.3]; this is a single task. Secondly, it could consider the detection task to be a sequence of steps.
- Figure 6A and 6B illustrates Multi-Task Learning Architecture according to the state of the art.
- Figure 6A illustrates a pre-known setting for multi-task learning with two separate tasks [Task 1 and Task 2]. The idea is to learn a single model that inputs features from both tasks 1 & 2 to perform a particular goal (detecting a horse as shown in figure 5). This is made possible by learning the parameters of a cross-stitch unit as shown in the figure 6B.
- the final feature vector [6.5] is formed by a linear combination of feature vectors from [6.1] and [6.3]. This is shown by a solid line.
- figure 6B shows how different tasks get combined by cross stitch unit.
- Multi-Task learning is made possible by learning the parameters of a cross stitch unit.
- the system learns to perform a shared task 2, using the linear combinations [6.6] of the feature vectors from both [6.2] and [6.4]. This is shown as a dotted line.
- Mathematically, such a feature combination can be represented by the following equations:
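Assuming the standard cross-stitch formulation, which matches the linear combinations described above, the feature combination can be written compactly as:

```latex
\begin{bmatrix} \tilde{x}_A \\ \tilde{x}_B \end{bmatrix}
=
\begin{bmatrix} \alpha_{AA} & \alpha_{AB} \\ \alpha_{BA} & \alpha_{BB} \end{bmatrix}
\begin{bmatrix} x_A \\ x_B \end{bmatrix}
```

Here $x_A$ and $x_B$ denote the feature vectors of the two tasks ([6.1]-[6.4]), $\tilde{x}_A$ and $\tilde{x}_B$ the combined features ([6.5], [6.6]), and the $\alpha$'s the learnable cross-stitch parameters.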
- a major problem with existing cross-stitch unit is that the networks of both Task 1 and Task 2 get trained.
- the problem here is a bit different. Many users are given whose data have to be combined to form a shared representation. Accordingly, it is not needed to train the user models separately. Instead, the data from each user model contributes feature representations to a COMMON shared task.
- Figure 7 illustrates a new "augmented" cross stitch unit, according to an embodiment of the present disclosure.
- the operation of the new "augmented" cross stitch unit is provided in detail below:
- the user whose contribution parameters α and β are maximum is said to be the user who contributes most to the global shared representation. Mathematically, this maximal contribution is found by calculating the norm of each user's contribution parameters (α and β).
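The norm-based search for the maximal contributor can be sketched as follows; the parameter names (alpha, beta) are assumptions, since the original symbols are garbled in this text:

```python
import numpy as np

def max_contributor(alphas, betas):
    # Norm of each user's pair of contribution parameters; the user with
    # the largest norm contributes most to the global shared representation.
    norms = [np.linalg.norm([a, b]) for a, b in zip(alphas, betas)]
    return int(np.argmax(norms))
```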
- Figure 8A, 8B and 8C illustrates a mathematical mechanism of the existing machine learning (ML) based training methods as per the state of the art.
- Figure 8A represents a probabilistic-formulation of a learning system.
- B denotes the information a user knows.
- A denotes the total learnable space of a neural network.
- the intersection [2] of A and B denotes the data which the user gives to the network for training.
- U denotes the total universe of all knowledge both known and unknown.
- Figure 8B represents the basic training process in a learning machine. The user injects his knowledge [2] in form of annotated datasets. This mechanism illustrates the traditional supervised setting.
- Figure 8C shows the individual probabilities which are modelled between phase 1 and phase 2.
- As shown in Fig. 8A, a user B ([1]+[2]) typically injects his knowledge [2] into a neural network A.
- This process is termed supervised learning. It is typically done by a human annotating a dataset [Fig 8B:[2]], and a learning algorithm fitting its parameters onto it.
- this knowledge injection phase is denoted by the left bolded portion of Figure 8C.
- the knowledge [2] that a user B injects into the machine via datasets can be denoted by A ∩ B.
- a typical neural network gets trained on this dataset. Mathematically, it is denoted by the conditional probability in [Fig. 8C:[5]].
- Figure 9A, 9B and 9C illustrates another step of injecting remaining knowledge into the machine as per the state of the art.
- figure 9A represents a basic diagram to show the probability based explanation for training learning machines.
- Figure 9B represents a diagram showing that the learning algorithm now exposes a feedback mechanism through which the user injects his remaining knowledge [1] into the machine. This is also known as semi-supervised training in the prior art.
- Figure 9C represents the individual probabilities which are modelled between phase 1 and phase 2.
- the grey region [2] denotes the step 1 of learning, i.e. training on user provided datasets.
- the portion of user knowledge denoted by [1] is still not provided to the neural network [5] during learning.
- the mechanism to inject this information gets denoted in Figure 9B.
- a learning algorithm [5] exposes a feedback mechanism [6] to the user. It refers to a network training on dataset [2], till the time it reaches a particular intelligence threshold. Then, unlabelled samples are passed through the machine and its own predictions are treated as the ground truths. This allows the system to create an unsupervised dataset for its operation. This method is known as semi-supervised setting in existing art.
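The semi-supervised step described above can be sketched as follows; `model_predict` and the confidence cut-off are illustrative assumptions standing in for the trained network and its intelligence threshold:

```python
def pseudo_label(model_predict, unlabelled, confidence=0.8):
    # Once the network passes an intelligence threshold, its own confident
    # predictions on unlabelled samples are treated as ground truths,
    # creating an unsupervised dataset for further training.
    dataset = []
    for x in unlabelled:
        label, score = model_predict(x)
        if score >= confidence:
            dataset.append((x, label))
    return dataset
```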
- Step 1 and Step 2 of training learning machines need to be performed one after another. This is shown by the two parallel branches in Fig. 9C, which meet at [5]. Both need to be completed for the entire system to be trained.
- a major limitation is that the effective training time increases since learning process is now sequential in nature. Context Resolution solves this by bridging steps 1 and 2 into a SINGLE step.
- Figures 10A, 10B and 10C illustrate a mathematical mechanism of training-based context resolution methods, according to an embodiment of the present disclosure.
- Figure 10A represents a diagram showing the individual probabilities which are modelled between phase 1 and phase 2 through steps 1002 till 1006.
- Figure 10B represents a diagram showing the individual probabilities which are modelled between phase 1 and phase 2.
- the learning algorithm [5] in step 1012 receives data from these two phases of step 1010. Since it is conditioned separately during each phase, its probability distribution can be considered the sum of two complex terms.
- Figure 10C represents a diagram showing the steps 1014 till 1018, in which context resolution allows the learning algorithm to combine the phase 1 and phase 2 of Figure 10B into a single phase.
- the convolution network or context resolution 304 of Fig. 3B operates based on learning the input (provided at the step 1014) from the plurality of users in at-least one single phase.
- the context resolution block in Fig. 10C is able to combine the phase 1 and phase 2 of Figure 10B into a single block vide step 1016, and perform the network's training simultaneously vide step 1018.
- In FIG. 10A, a typical setting in which full supervision [1] and semi-supervision [2] are performed at the same time in a network is shown vide steps 1002 till 1008.
- Figure 10B denotes the actual successive supervision/semi-supervision which occurs in existing neural networks vide steps 1010 till 1012. The same network gets conditioned to both the user dataset and semi-supervision iteratively.
- Figure 10C shows that the context resolution block is able to bridge the two steps. This is because the user B is directly kept in the input pipeline through augmented cross stitch units vide steps 1014 till 1018. There is no need to break the input B into separate regions.
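A minimal sketch of how an augmented cross stitch unit could keep every user in the input pipeline and emit one shared representation is given below; the linear-mixing form and all names are assumptions for illustration only, not the disclosed architecture.

```python
import numpy as np

class AugmentedCrossStitch:
    """Illustrative stand-in: mixes per-user features into a single
    shared representation so that supervision and semi-supervision can
    proceed in one phase, with every user kept in the input pipeline."""
    def __init__(self, n_users, seed=0):
        w = np.random.default_rng(seed).random(n_users)
        self.alpha = w / w.sum()          # one mixing weight per user

    def forward(self, user_feats):        # user_feats: (n_users, dim)
        return self.alpha @ user_feats    # shared representation: (dim,)

unit = AugmentedCrossStitch(n_users=2)
feats = np.array([[1., 0.], [0., 1.]])   # user A and user B features
shared = unit.forward(feats)
assert np.isclose(unit.alpha.sum(), 1.0) and shared.shape == (2,)
```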
- Figures 11A to 11C illustrate a loss function mechanism, according to an embodiment of the present disclosure.
- Figure 11A represents a diagram showing the mechanism to build decision boundaries in neural networks.
- Point P denotes the current output of context resolution 302 as the shared representation 304 of Fig. 3B.
- a constraint is required that point P should have the maximum contribution from user 3, and minimum contributions from user 1 and user 2.
- Figure 11B represents a diagram showing how the constraints introduced in Figure 11A can be realized in the context resolution network.
- a clustering distance [loss] is calculated between shared representation and each user's cluster.
- a loss function that penalizes individual weights of augmented cross stitch unit for each user is applied.
- As shown in FIGS. 11A and 11C, the augmented cross stitch unit forming a part of context resolution is able to process multi-user input to generate a shared representation.
- Figure 11A denotes the t-SNE 2D projection of various outputs of the context resolution block. This subspace consists of several points. Each marked point denotes the prototypical representation which was calculated for the corresponding user cluster.
- a random point P, generated as output of the context resolution block should follow two constraints:
- Constraint 1: The distance of point P to the true class should be minimized, and maximized for the remaining classes. This can be done by minimizing the Euclidean norm of (P - P3).
- Constraint 2: Point P should be made to lie in the user cluster 3 [Figure 11A:[2]]. This is done by minimizing the contributions of user 1 and user 2 and maximizing the contribution of user 3. This constraint is achieved by penalizing the cross stitch weights of each such user.
- the said constraints can be implemented in the form of any mathematical kernel.
- the basic concepts of enforcing these two constraints shall remain the same. Architecturally, this is shown as the clustering loss in Fig 11A:[2].
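The two constraints may be sketched as a clustering loss of the following shape. The squared-distance pull, margin push, and weight penalty are one assumed kernel among many, consistent with the statement that any mathematical kernel may be used.

```python
import numpy as np

def clustering_loss(P, prototypes, target, alphas, margin=1.0):
    """Illustrative loss. Constraint 1: pull P toward its true prototype
    and push it away from the others. Constraint 2: penalize the cross
    stitch weights (alphas) of every non-target user."""
    pull = np.linalg.norm(P - prototypes[target]) ** 2
    push = sum(max(0.0, margin - np.linalg.norm(P - p)) ** 2
               for i, p in enumerate(prototypes) if i != target)
    weight_penalty = sum(a ** 2 for i, a in enumerate(alphas) if i != target)
    return pull + push + weight_penalty

prototypes = [np.zeros(2), np.array([3., 0.]), np.array([0., 3.])]
P = np.array([0.0, 2.9])   # output lying near user 3's cluster (index 2)
loss_good = clustering_loss(P, prototypes, target=2, alphas=[0.05, 0.05, 0.9])
loss_bad  = clustering_loss(P, prototypes, target=0, alphas=[0.9, 0.05, 0.05])
assert loss_good < loss_bad
```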
- Figure 12 illustrates a working of a conditioning unit 305 as a part of the broader system flow, according to the embodiment of the present disclosure.
- a main objective of the present mechanism is to generate different predictions from the neural network 306 based on multi-user input, as may be observed in Fig. 3B. This behavior is achieved by the conditioning unit 305.
- the present disclosure models a different type of conditioning constraints into the network.
- Figure 13 illustrates a neural network model for implementing the conditioning unit 305 of Figure 12, according to an embodiment of the present disclosure.
- figure 13 shows the total modellable universe of a neural network or an autonomous network corresponding to the neural network 306 as referred in previous figures.
- the network contains a total of 1000 parameters, which it must effectively utilize to learn two separate hypotheses [5.1 and 5.2].
- the system broadly switches among such hypotheses by hard conditioning, and allows for finer adjustments via soft conditioning.
- Figure 13 represents a simpler neural-network containing 1000 parameters.
- This network receives shared representation corresponding to two user groups [Group 1 and Group 2].
- each group contains a large number of users, whose simultaneous input cannot be processed by the network. Therefore, the conditioning unit 305 of figure 12 helps the autonomous network learn a default template for each group. To adjust the behavior of the neural network, two decisions need to be made.
- the network is allowed to change its decisions slightly in response to mild fluctuations between the preferences of the users belonging to a same cluster.
- the process of Fig. 13 may be represented as receiving the shared representation from the cross stitch model 302 corresponding to the plurality of clusters.
- the execution of the hard conditioning criteria is defined by learning a plurality of default templates for each cluster, and selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation.
- the execution of soft conditioning criteria is defined by communicating the at-least one user identity from the ANN 306 for triggering dynamic clustering based on said user identity corresponding to a single entity. The communicating of the at-least one user identity may be prohibited based on said user identity corresponding to multiple entities.
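A toy sketch of the two conditioning decisions is given below: hard conditioning selects the learned default template of the cluster, and soft conditioning makes a finer adjustment for mild fluctuations within that cluster. The templates, the `soft_rate` parameter and the linear update rule are illustrative assumptions.

```python
import numpy as np

# Default templates learned per user group (hypothetical values).
templates = {0: np.array([1.0, 0.0]),   # Group 1
             1: np.array([0.0, 1.0])}   # Group 2

def condition(shared, cluster_id, soft_rate=0.1):
    base = templates[cluster_id]               # hard conditioning: switch
    return base + soft_rate * (shared - base)  # soft conditioning: adjust

out = condition(np.array([0.9, 0.2]), cluster_id=0)
assert np.allclose(out, [0.99, 0.02])  # template nudged toward the input
```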
- Figure 14 illustrates internal contents of the conditioning unit 305, according to the embodiment of the present disclosure. The working of the conditioning unit will be explained referring to Figure 14.
- the shared representation [1.1] is received as input.
- the representation [1.1] is generated as the output of the context resolution block 302 of Fig. 3B and corresponds to 304 of Fig. 3B.
- the integer label obtained from classifier [2.2] is passed through an embedding unit [2.3].
- the input [1.1] is a higher dimensional feature vector, and not a discrete integer like 1-9.
- the cascaded combination of a classifier unit, on top of the embedding unit is required in accordance with the present subject matter.
- the execution of the hard conditioning criteria comprises converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters.
- the plurality of discrete integers is fed from the classifier to an embedding unit to project a plurality of cluster identities.
- the plurality of cluster identities are forwarded from the embedding unit to the ANN to enable selection of the user identity, said selected user identity corresponding to the maximum contributing user.
- the input is fed to the neural network or the network 306.
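The classifier-plus-embedding cascade described above can be sketched as follows; the nearest-prototype classifier and the random embedding table are hypothetical stand-ins for units [2.2] and [2.3].

```python
import numpy as np

def classify(shared, prototypes):
    """Classifier unit: converts the shared representation (a higher
    dimensional feature vector) into a discrete integer label."""
    return int(np.argmin([np.linalg.norm(shared - p) for p in prototypes]))

# Embedding unit: projects each integer label to a cluster-identity vector.
embedding = np.random.default_rng(0).standard_normal((2, 4))

prototypes = [np.zeros(2), np.ones(2) * 3.0]
label = classify(np.array([0.1, 0.2]), prototypes)  # nearest cluster
cluster_identity = embedding[label]                 # forwarded to the ANN 306
assert label == 0 and cluster_identity.shape == (4,)
```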
- the loss function (clustering constraints) used to achieve soft conditioning is a unique mechanism.
- the execution of the soft conditioning criteria comprises computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation.
- a plurality of constraints associated with the clusters is computed.
- a loss-function representing soft conditioning is computed based on the computed constraints.
- the soft conditioning is re-initiated based on iteratively aggregating the shared representation with prototype estimate from the observer module 308.
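One assumed concrete form of the soft conditioning computation, cross-correlation among channels of the feature stream with an off-diagonal penalty, is sketched below for illustration only.

```python
import numpy as np

def soft_conditioning_loss(features):
    """Compute the channel cross-correlation matrix of a feature stream
    (samples x channels) and penalize off-diagonal correlation, pushing
    channels to carry decorrelated, cluster-specific information."""
    f = features - features.mean(axis=0)
    cov = f.T @ f / (len(f) - 1)
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    off = corr - np.diag(np.diag(corr))
    return float((off ** 2).sum())

rng = np.random.default_rng(1)
decorrelated = rng.standard_normal((100, 4))
correlated = np.repeat(rng.standard_normal((100, 1)), 4, axis=1)
correlated += 0.01 * rng.standard_normal((100, 4))
assert soft_conditioning_loss(decorrelated) < soft_conditioning_loss(correlated)
```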
- Figure 15 illustrates a mechanism by which data from the autonomous plain or the neural network 306 gets advertised to the observer plain or the clustering logic 308, according to the embodiment of the present disclosure.
- the system consists of a Magnitude Tracer 1502 and a Dynamic Clustering block 308.
- the system implements an additional block called the Magnitude Tracer.
- the operation of the Magnitude Tracer module is as follows:
- Magnitude tracer receives an input from the context resolution block 302.
- the machine then sorts the users in decreasing order of norm and picks the user with maximal contribution.
- the autonomous plane 306 shares two things with the observer plane 308. First, it shares the integer id of the user group which contributes the maximal amount to the shared representation. Secondly, the autonomous plane 306 predicts a representation which can be said to be drawn from p(A/B).
- the said user identity can change in real time as the user with maximal contribution inside the context resolution block 302 changes.
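The Magnitude Tracer operation may be sketched as follows (user ids and vectors are illustrative): it takes the norm of each user's contribution, sorts in decreasing order, and advertises the maximal contributor's id to the observer plane.

```python
import numpy as np

def magnitude_tracer(user_contributions):
    """Illustrative Magnitude Tracer 1502: rank users by the norm of
    their contribution to the shared representation and return the
    maximal contributor's id plus the full decreasing-order ranking."""
    norms = {uid: np.linalg.norm(v) for uid, v in user_contributions.items()}
    ranking = sorted(norms, key=norms.get, reverse=True)
    return ranking[0], ranking

contribs = {1: np.array([0.1, 0.1]),
            2: np.array([0.9, 0.4]),
            3: np.array([0.3, 0.2])}
top, order = magnitude_tracer(contribs)
assert top == 2 and order == [2, 3, 1]
```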
- when the observer module 308 receives a sample drawn from a predicted distribution p(A/B) of clusters and the predicted user identity from the ANN, the observer module 308 projects the received distribution to a subspace, wherein said subspace comprises predictions from a plurality of ANNs.
- the process involves grouping data corresponding to the projected representation into a number of classes determined based on the user identity, wherein each class is associated with a label or integer associated with the user identity.
- a prototypical representation is computed for at least one class of the plurality of classes based on sensing mismatch between the user identity and output of the ANN.
- At least one of the context resolution, the hard conditioning criteria, the soft conditioning criteria is trained to adjust prediction of the ANN to correct the user identity upon sensing the mismatch.
- Figures 16A, 16B and 16C illustrate a working of the dynamic clustering block within the observer plane 308, according to an embodiment of the present disclosure.
- the observer plane 308 receives a plurality of inputs from the autonomous plane, i.e. a sample predicted from p(A/B) and the predicted user identity.
- the step 1602 corresponds to receipt of input by the observer module 308 as depicted in Fig. 15.
- the system clusters the plurality of generated p(A/B) samples.
- the step 1604 corresponds to a comparison between the user-identity at the observer module 308 as shared by the magnitude tracer 1502 and the output of the ANN 306. If there is a mismatch, then the flow proceeds to the step 1606.
- Also at step 1608, for a SAME CLASS, a hypothesis p̂(A/B) is estimated from the clustered p(A/B) samples.
- the estimated p̂(A/B), the prototype and the cluster class id are shared with the autonomous plane.
- p̂(A/B) is used to estimate a posterior p̂(B/A) which is used to train the weights of the context resolution block 302.
- At step 1612, the prototype is merged with the shared representation predicted from the context resolution block 302.
- the new representation is used for soft conditioning.
- At step 1614, the hard conditioning is ONLY done when the cluster class id being shared by the observer plane is different from the class id used in the conditioning unit.
- the class id used in the conditioning unit is given by the user who contributed the maximal amount to the shared representation.
- Figure 16C further illustrates dynamic clustering and corresponds to steps 1608 till 1614, i.e. the flow from the observer plane to the user plane, according to an embodiment of the present disclosure.
- Figure 16C shows a Bayesian flow diagram illustrating the probability flow in the user plane.
- the block [4.2] is used to calculate posteriors, which drive the input selection mechanism at [4.4]. This is used to train the context resolution block 302 and further adjust the behavior of the autonomous plane neural network 306.
- process steps 1602 till 1610 of the dynamic clustering operation may be defined as follows:
- Step 1602 receiving, by an observer module, one or more samples drawn from a predicted distribution p(A/B) of clusters, and the predicted user identity from the ANN;
- Step 1602 determining an integer value representative of the predicted user identity, said integer value defining a cluster class identifier
- Step 1602 and 1604 clustering the samples based on a number of classes associated with the received user identity
- Step 1606 computing prototypical estimates for each of the classes of clustered samples based on averaging inputs received from the plurality of ANNs;
- Step 1608 estimating a hypothesis p̂(A/B) for each cluster of the clustered samples based on receipt of inputs from the plurality of ANNs;
- Step 1610 sharing the estimated hypothesis p̂(A/B), the prototypical estimates, and the cluster class identifier by the observer module with the ANN.
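Steps 1602 till 1610 above can be sketched, under the assumption of simple mean-averaging for the prototypical estimates, as:

```python
import numpy as np

def dynamic_clustering(samples, class_ids):
    """Illustrative observer step: group the received samples by the
    predicted cluster class id (steps 1602-1604) and compute one
    prototypical estimate per class by averaging (step 1606)."""
    prototypes = {}
    for cid in set(class_ids):
        members = [s for s, c in zip(samples, class_ids) if c == cid]
        prototypes[cid] = np.mean(members, axis=0)
    return prototypes   # shared back with the ANN at step 1610

samples = [np.array([0., 0.]), np.array([0., 2.]), np.array([4., 4.])]
protos = dynamic_clustering(samples, class_ids=[0, 0, 1])
assert np.allclose(protos[0], [0., 1.]) and np.allclose(protos[1], [4., 4.])
```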
- the training of the context resolution 302 comprises computing, based on a probabilistic graphical model, a posterior representation p̂(B/A) from the prototype estimate.
- a plurality of weights associated with the context resolution is updated based on said posterior representation.
- the training of the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution 302.
- the training of the hard conditioning is based on providing the cluster class identifier by the observer module 308 to the classifier associated with the hard conditioning. Thereafter, a difference between the cluster class identifier of the observer module 308 and a class identifier associated with the user identity predicted at the ANN or the network 306 is computed.
- the user identity associated with the user who contributed the maximum to the shared representation is computed by the context resolution 302.
- Figures 17A and 17B illustrate a mechanism to calculate Prototype Estimation, according to an embodiment of the present disclosure and corresponding to steps 1604 till 1610 of Fig. 16A.
- the system needs to re-compute the linear combinations of multi-user input.
- the autonomous system or neural network 306 had modelled the probability p(A/B). In this mechanism, an estimate of p(A) is computed.
- Step 1702 denotes training of the autonomous neural network as being divided into two phases [Pre phase 1 & phase 1].
- sensitivity of a neuron to a particular feature is calculated using R1. This helps us to converge upon a smaller subset of neurons which represent the additional features the machine would learn in future.
- step 1706 isolation of 1000 such neurons is performed. In the learning phase, only 100 out of these would be learned on datasets. However, the rest 900 represent the expansion capacity of the system.
- the 'weights' of a neuron are clipped if the activation value reaches a pre-defined threshold. This is different from the existing regularization techniques, since existing methods clip weights based on gradients.
- the present step proposes clipping based on activation thresholds.
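Activation-threshold clipping, as distinct from gradient-based clipping, may be sketched as below; the threshold value and shrink factor are illustrative assumptions.

```python
import numpy as np

def clip_by_activation(weights, activations, threshold=1.0, scale=0.5):
    """Shrink the weights of any neuron whose activation value reaches
    the pre-defined threshold (unlike gradient-based clipping, which
    operates on gradients rather than activations)."""
    hot = np.abs(activations) >= threshold   # neurons to clip
    out = weights.copy()
    out[hot] *= scale
    return out

w = np.array([1.0, 2.0, 3.0])
a = np.array([0.2, 1.5, 0.8])   # only neuron 1 exceeds the threshold
assert np.allclose(clip_by_activation(w, a), [1.0, 1.0, 3.0])
```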
- the estimate p(A) obtained helps us compute p(B/A).
- the prediction is then minimized with respect to the original inputs used to train the context resolution block.
- estimating p(B/A) in this manner is just one of the possible ways. It is possible to implement a parametric model like a VAE to model the optimal input combinations p(B) for training the context resolution block without deviating from the scope of the invention.
- the steps of 1702 till 1716 may be referred to as comprising the steps of:
- Figures 18A and 18B illustrate unsupervised clustering.
- Figure 18B explains how the basic unsupervised clustering algorithms work.
- figure 18A explains the flow of Context Resolution for Unsupervised Clustering.
- in figure 18B, unsupervised clustering is performed by taking the two closest points and assigning them to the same cluster.
- the algorithm iterates till the system achieves a fixed number of two clusters, [4] and [5], where the number of clusters to be formed is given as an input to the algorithm.
- figure 18A describes the operation of a context resolution network, in a variety of different embodiments which are separate from the main flow of the invention.
- the context resolution network operates in combination with an autonomous network and a conditioning unit to achieve intelligence.
- the mechanism of clustering in the UNSUPERVISED setting, as done in the prior state of the art, proceeds by 'greedily' estimating the pair of points which are closest to each other and gradually collapsing [or merging] them into one, till they end up in the number of clusters a person desires.
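The prior-art greedy procedure of Figure 18B can be sketched as agglomerative merging of the two closest clusters until the requested number remains:

```python
import numpy as np

def greedy_merge(points, n_clusters):
    """Illustrative prior-art clustering: repeatedly merge the two
    clusters whose means are closest, until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = np.linalg.norm(np.mean(clusters[i], axis=0) -
                                   np.mean(clusters[j], axis=0))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # collapse the closest pair
    return clusters

pts = [np.array(p) for p in [(0., 0.), (0., 1.), (5., 5.), (5., 6.)]]
out = greedy_merge(pts, 2)
assert len(out) == 2 and len(out[0]) == 2
```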
- Figure 18A specifically illustrates an improved unsupervised clustering to achieve clustering at a 'coarser' level, according to an embodiment of the present disclosure.
- Figure 18A describes how context resolution can be used to perform clustering in UNSUPERVISED settings, to obtain the same sort of output as defined in Figure 18B. The only new thing is that context resolution is now being used to perform the clustering. As per figure 18A, context resolution can help achieve unsupervised clustering at a 'coarser' level than the state of the art.
- the system takes two points as input at a time as opposed to only a single point to calculate clusters.
- the said clustering is achieved by passing a plurality of points as the input representation to the context resolution block.
- the steps 1802 till 1810 defining the clustering based on unlabelled input data are defined by:
- Figures 19A and 19B illustrate supervised clustering based on labelled data.
- Figure 19A's [6.1] denotes that the context resolution block can be used to generate a single shared representation out of two points. Effectively, it is performing clustering of 4 points. This is achieved by taking two separate context resolution blocks [with each receiving 2 inputs]. The problem is then reduced to merging two shared representations, which can be solved by the KNN method.
- the advantage of this embodiment is that it introduces a coarser decision step on top of an existing method such as the KNN clustering method known in the state of the art.
- FIG. 19B describes a flow, using Context Resolution, to take a decision step among different clusters in the subspace (assuming cluster classes are pre-known). It assumes that the input data is SUPERVISED in nature, i.e. the individual classes are known [and clusters have been formed beforehand]. In such an embodiment, the context resolution is used to estimate how different two clusters are from each other.
- the control flow may be explained as follows:
- a plurality of points (belonging to different-clusters) is passed.
- a difference threshold is defined.
- at step 1906, the difference of their contribution ratios is calculated.
- at step 1908, if the difference is found to be greater than a predefined threshold, then the separate clusters are said to be 'sufficiently different' vide step 1910 and a decision step gets taken. Otherwise, the clusters are adjudged as similar vide step 1912.
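Steps 1902 till 1912 reduce to a threshold test on the difference of contribution ratios; a minimal sketch follows, with the threshold and ratio values chosen purely for illustration.

```python
def cluster_decision(ratio_a, ratio_b, threshold=0.3):
    """Illustrative decision step: two clusters are 'sufficiently
    different' (step 1910) if the difference of their contribution
    ratios exceeds the threshold, else similar (step 1912)."""
    return "different" if abs(ratio_a - ratio_b) > threshold else "similar"

assert cluster_decision(0.8, 0.2) == "different"   # decision step taken
assert cluster_decision(0.55, 0.45) == "similar"   # clusters adjudged similar
```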
- the steps 1902 till 1912 defining the clustering based on labelled input data are defined by:
- Figure 20 illustrates a practical application of Figure 19, according to an embodiment of the present disclosure.
- figure 20 shows that the 2 planes follow the same relativistic concepts of Newtonian mechanics, i.e. a single plane can be both user/observer plane.
- [4] consists of a selection process that takes two planes as input, and outputs which plane is observer/user.
- Plane [1] denotes a professor whose knowledge is superior to a school teacher [2].
- Plane [2] denotes a teacher who knows more than a student.
- the information flow in planes is in decreasing order of knowledge.
- in plane [2], a teacher grades many students in [3]. From this perspective, [2] becomes an observer plane, and [3] becomes the user plane.
- the teacher's plane now becomes a user plane.
- the said invention gives two advantages, i.e. a coarser method for achieving unsupervised clustering [previous slide], and defining a plane selection [4] algorithm based on the relative context of different users [2,3].
- Figure 21 illustrates a swapping of dynamic clustering and context resolution blocks, according to embodiment of the present disclosure.
- a single context resolution block [3] will not be able to form a shared representation for these many users because of limited model capacity.
- the dynamic clustering block [1] in the observer plane can be implemented in the user plane [2].
- a plurality of users is clustered into respective groups.
- a single representation of each group is computed via mechanisms like representation averaging.
- the context resolution now contains inputs of these "group representations" instead of individual user inputs.
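The swap of Figure 21 can be sketched as clustering users into groups and reducing each group to a single averaged representation before context resolution; the user names, grouping, and the averaging choice are illustrative assumptions.

```python
import numpy as np

def group_representations(user_vectors, group_of):
    """Illustrative dynamic-clustering-in-the-user-plane step: bucket
    users into groups, then compute one representation per group via
    representation averaging. Only these group representations feed
    the context resolution block."""
    groups = {}
    for uid, vec in user_vectors.items():
        groups.setdefault(group_of[uid], []).append(vec)
    return {g: np.mean(v, axis=0) for g, v in groups.items()}

users = {"u1": np.array([1., 0.]), "u2": np.array([3., 0.]),
         "u3": np.array([0., 2.])}
reps = group_representations(users, {"u1": "A", "u2": "A", "u3": "B"})
assert np.allclose(reps["A"], [2., 0.]) and np.allclose(reps["B"], [0., 2.])
```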
- the same embodiment would contain a situation in which the context resolution block [3] has calculated a LOCAL shared representation.
- [3] queries a database for GLOBAL representations. However, it does not find those representations. In that case, a context resolution block [4] would be implemented in the observer plane, where the said GLOBAL representations of each class are combined to form optimal GLOBAL representations. This global representation [4] is then transferred to the local system [3] as has been explained in the best embodiment.
- Figure 22 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to the state of the art. According to figure 22, the following scenarios were described.
- the system [7] was receiving both User 1 and User 2 input.
- User 1's decision to not stop was given a higher priority even when User 2 wanted to drop off at the park [4].
- although User 2 is a child, he knows the park better than User 1.
- the system thought that since User 1 is an elder, he is always right. This problem arose because the system 'permanently' assigned a higher priority to User 1's decision.
- Figure 23 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to present embodiment of the present disclosure.
- our context resolution block assigns higher priority to User 2. It calculates that the current car location [park] is more correlated to User 2 than to User 1. So, it learns to assign a higher weight to User 2's decision and automatically stops the car. Once User 2 drops off, the system reverts to its default behaviour of prioritizing User 1's input. The users who are given priority at each processing step are shown in bold.
- context resolution between two users changes the default ranking of multiple-user input based on the relevance of each user to an external environmental trigger. Performing the said change in the ranking by a neural network instead of database querying leads to a lower time to achieve context resolution.
- Figure 24 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to the state of the art.
- Marshall [1] is watching his favourite movies channel. From the system's perspective, the system [4] is aware of the users watching the TV [2] and contains a channel list [4] from which a user [1] selects the channel he wants to view. [2] Lily requests Marshall to change to a sports channel. Marshall agrees and switches to sports. This "agreement" step was not handled by the system, but required Marshall and Lily to talk together.
- Figure 25 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to present embodiment of the present disclosure.
- [1] Marshall's representation [1] is fused with Lily's [5] to obtain a shared blob [6].
- [2] Based on the newly created user [6], the system checks the relevance of channels in the channel list. A single sports channel is selected and shown to Marshall. Computationally, it corresponds to the selection of the higher edge weight [7] in the matching graph. Thus, the system automatically resolves decisions of multiple users by context resolution. This paves the way for new applications like graph matching [knowledge representation].
- Figure 26 illustrates use case 3 for combining "different" user clusters in context resolution, according to the state of the art.
- Mr. John [1] is an assistant professor, who wants to prepare a lecture on computer vision. He interacts with a flip [2] and queries a server [3] for all the presentations which took place at his workplace in the past. [2] But all the server contains is the lectures grouped into graduate-student and professor clusters. Mr. John is not satisfied since content from cluster [4] is too easy, and cluster [5] is too complex.
- Figure 27 illustrates the use case 3 for combining "different" user clusters in context resolution, according to the present embodiment of the present disclosure.
- the proposed [1] context resolution block combines the representations from two clusters [4] and [5] and generates a new shared representation for an assistant professor [6]. This is then shared with Mr. John. [2] Note that the concept of forming shared clusters can be extended to multi modal data, i.e. images, audio and texts.
- Figure 28 illustrates a use case 4 for Fluctuations of Heart during a match in dynamic clustering, according to the state of the art.
- This scenario represents 3 friends viewing a match via AR/VR [in a pandemic situation].
- Existing systems consider each user to possess a separate identity. Hence, there is no notion of clustering users into multiple-sized groups according to personal preference.
- Figure 29 illustrates a use case 4 for Fluctuations of Heart during a match in dynamic clustering, according to present embodiment of the present disclosure.
- Jaz and Sam are clustered into one group, while Lin is in a separate cluster. These clusters were formed based upon the direct relevance of each user to the match, i.e. whether they supported a team or not.
- dynamic clustering of a user (Mr. Lin) based on change of purpose (increase of heartbeat) is done.
- generating better personalization for people in the same cluster (Mr. Lin and Ms. Jaz are now in the same cluster).
- Figures 30A, 30B and 30C illustrate a solution to the scenarios of Figure 1, according to an embodiment of the present disclosure.
- the present mechanism advertises the data of each user.
- the figure 30A addresses the issues as shown in figure 1A. Accordingly, the traffic policeman [user 1] sees a self-driving car. The alcohol levels of all the people inside the car are shown on the dashboard. Now user 1 can see the car's dashboard [2] from outside and fine the defaulters.
- the figure 30B devises a method which can combine the data of multiple users into a SINGLE piece of data.
- the figure 30B addresses the issues as shown in the figure 1B. Accordingly, inside the car, a system observes the alcohol contents of all occupants.
- the occupant whose alcohol content is maximum gets advertised on the car's dashboard [3].
- the dashboard [3] contains only one piece of information (the max alcohol level) instead of the information of all the occupants.
- user 1 has to read only a single alcohol level for each self-driving car.
- the figure 30C devises a method for mapping between person identity and shared representation.
- the figure 30C addresses the issues as shown in the figure 1C. Accordingly, the person's name who violated the traffic rule is also visible on the dashboard [3].
- the information ⁇ occupant_id, alcohol_level> gets advertised on the dashboard.
- user 1 stops the car, and only fines that particular person instead of penalizing the driver every time.
- Figure 31 illustrates machine learning based system 1000 for prediction and clustering, in accordance with an embodiment of the present invention.
- the present implementation of the machine learning (ML) based system 1000 for prediction and clustering may be implemented in hardware, software, firmware, or any combination thereof.
- the ML based system 1000 includes an input and interaction module 1001 which is adapted for interpreting input accepted in the form of user's input and generating a response to the user.
- the input is compared to a database of interrelated concepts, which may be employed through ML specification hardware 1002.
- the ML based system 1000 further includes a virtual personal assistant (VPA) 1003 which can interact with one or more general-purpose hardware and drivers 1004 to provide access to information.
- the ML based system 1000 further includes an ML specification application programming interface (API) 1005.
- the ML specification API 1005 may provide current knowledge regarding virtual personal assistance.
- the ML specification API 1005 can also, change, update, and/or modify the virtual personal assistant's information, based on explicit and/or implicit feedback based on user data such as user profiles, and from learning a person's preferences.
- a multimedia database 1010 in collaboration with an ML logic 1007 may be provided.
- the ML logic 1007 may assist in updating the database by adding new concepts and relationships that can be developed or strengthened based on machine learning.
- the ML logic 1007 may include software logic modules that enable a virtual personal assistant to adapt the database for the user's usage patterns, preferences, and priorities, etc.
- the ML engine 1007 is also adapted to index various observations according to a set of pre-determined features, where these features define the characteristics of observation data that are of interest to the virtual personal assistant.
- a separate VPA 1003 may be provided which is adapted to interpret conversational user input and determine an appropriate response to the input.
- the VPA 1003 is also adapted to provide a response which can be easily understood and interpreted by the user.
- a plurality of software components may be implemented to accomplish such task.
- the VPA 1003 may be communicatively coupled with a network from where the VPA 1003 may fetch information from one or more websites.
- the websites may include the API.
- the VPA 1003 may optionally be incorporated into other sub-systems or interactive software applications, for example, operating systems, middleware or framework, such as VPA specification API 1006 software, and/or user-level applications software (e.g., another interactive software application, such as a search engine, web browser or web site application, or a user interface for a computing device).
- Such applications may also include position-infotainment systems, position-based VPA applications, Smart devices, etc.
- the ML based system 1000 may further perform simulation in a simulation engine 1008 based on the responses received from the VPA specification API 1006, the ML logic 1007 and one or more objects databases 1011 to generate output and presentation 1009. In this way, the ML based system 1000 can have the ability to adapt to a user's needs, preferences, lingo and more.
- a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device or a "virtual machine" running on one or more computing devices).
- a machine-readable medium may include any suitable form of volatile or non-volatile memory.
- the processors refer to any type of computational circuit, such as, but not limited to, a microcontroller, a microprocessor, a complex instruction set computing microprocessor, a reduced instruction set computing microprocessor, a very long instruction word microprocessor, an explicitly parallel instruction computing microprocessor, a graphics processor, a digital signal processor, or any other type of processing circuit.
- the processors may also include embedded controllers, such as generic or programmable logic devices or arrays, application-specific integrated circuits, single-chip computers, smart cards, and the like.
- an embodiment of the present invention may be implemented by using hardware only, or by using software and a necessary universal hardware platform.
- the present invention may be implemented in the form of a procedure, function, module, etc. that implements the functions or operations described above. Based on such understanding, the technical solution of the present invention may be embodied in the form of software.
- the software may be stored in a non-volatile or non-transitory storage medium/module, which can be a compact disk read-only memory (CD-ROM), USB flash disk, or a removable hard disk or a cloud environment.
- execution may correspond to a simulation of the logical operations as described herein.
- the software product may additionally or alternatively include a number of instructions that enable a computing device to execute operations for configuring or programming a digital logic apparatus in accordance with embodiments of the present invention.
- the NLP/ML mechanism and VPA simulations underlying the present architecture 1300 may be remotely accessible and cloud-based, thereby being remotely accessible through a network connection.
- a computing device, such as a VPA device configured for remotely accessing the NLP/ML modules and simulation modules, may comprise skeleton elements such as a microphone, a camera, a screen/monitor, a speaker, etc.
- one of the plurality of modules may be implemented through AI based on ML/NLP logic.
- a function associated with AI may be performed through the non-volatile memory, the volatile memory, and the processor constituting the first hardware module i.e. specialized hardware for ML/NLP based mechanisms.
- the processor may include one or a plurality of processors.
- one or a plurality of processors may be a general purpose processor, such as a central processing unit (CPU), an application processor (AP), or the like, a graphics-only processing unit such as a graphics processing unit (GPU), a visual processing unit (VPU), and/or an AI-dedicated processor such as a neural processing unit (NPU).
- the aforesaid processors collectively correspond to the processor.
- the one or a plurality of processors control the processing of the input data in accordance with a predefined operating rule or artificial intelligence (AI) model stored in the non-volatile memory and the volatile memory.
- the predefined operating rule or artificial intelligence model is provided through training or learning.
- learning means that, by applying a learning logic/technique to a plurality of learning data, a predefined operating rule or AI model of a desired characteristic is made.
- the learning may be performed in a device itself in which AI according to an embodiment is performed, and/or may be implemented through a separate server/system.
- the AI model may consist of a plurality of neural network layers. Each layer has a plurality of weight values, and performs a layer operation through a calculation between the output of a previous layer and the plurality of weights.
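By way of illustration only, the per-layer operation described above may be sketched as follows; the array shapes and the tanh non-linearity are assumptions chosen for the example, not taken from the disclosure:

```python
import numpy as np

def layer(prev_output, weights, bias):
    """One layer: combine the previous layer's output with this layer's weights."""
    return np.tanh(prev_output @ weights + bias)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                      # input vector
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # first-layer weights
w2, b2 = rng.normal(size=(8, 2)), np.zeros(2)    # second-layer weights

h = layer(x, w1, b1)   # hidden layer uses the input as its "previous layer"
y = layer(h, w2, b2)   # output layer uses the hidden layer's output
```

Each call computes exactly the operation described: the previous layer's output multiplied against the layer's plurality of weights, followed by a non-linearity.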
- Examples of neural-networks include, but are not limited to, convolutional neural network (CNN), deep neural network (DNN), recurrent neural network (RNN), restricted Boltzmann Machine (RBM), deep belief network (DBN), bidirectional recurrent deep neural network (BRDNN), generative adversarial networks (GAN), and deep Q-networks.
- the ML/NLP logic is a method for training a predetermined target device (for example, a robot) using a plurality of learning data to cause, allow, or control the target device to make a determination or prediction.
- learning techniques include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning.
Abstract
The present invention generally relates to an autonomous system. In particular, the present invention relates to providing a system and method for context resolution and purpose driven clustering in autonomous systems by generating different predictions from neural network models based on multi-user input. It decouples the operational mechanics of an AI-operated system into two parts, i.e. a user plane and an autonomous plane. In particular, the method and system provide context resolution for resolving the data from multiple users in the user plane to obtain a shared representation, and use the said representation to obtain a discrete integer that conditions the behavior of a neural network in the autonomous plane. Identity mapping is then provided for mapping the identity of a single user (the one who contributes the maximum amount to the shared representation) to the identity of the autonomous plane. Further, dynamic clustering is provided for performing clustering based on the dynamic identities of the autonomous plane. The said identity can change in real time as the user contribution towards the shared representation changes.
Description
The present invention generally relates to an autonomous system. In particular, the present invention relates to providing a system and method for context resolution and purpose driven clustering in autonomous systems by generating different predictions from neural network models based on multi-user input.
Recent developments in autonomous systems require more precise decisions and correct data handling from multiple users. Existing systems require the data of all the users present in a group to be shared with a server, and the server executes a group-based decision. Other methods, such as federated averaging, treat each entity in a group as represented by a separate feature vector. However, the data of each user in the group still needs to be shared with a server.
An outside person will be confused about what decision he should make when he sees the data of so many people. Transmitting this much data violates user privacy and chokes network bandwidth. In an alternate state-of-the-art implementation, existing art might compute a 'single representation' for a group of entities, which models the behavior of the group as a single entity. However, the mechanism of formation of the said 'single representation' assumes that the nature of the individual entities of a group remains constant. Overall, current implementations only form clusters if the information inside them is 100% correct. For example, a "Prefers Cricket" cluster would denote that all the users lying inside it prefer to play cricket 100% of the time. They cannot handle partial cases like a user who plays cricket 60% of the time and hockey 40% of the time. The best they can do is to create two separate clusters (Cricket and Hockey) and assign the same user to both. But such multiple cluster assignment would create confusion. Thus, the number of clusters increases linearly as the number of classes increases. In another example, if users could play additional games like "Golf" and "Baseball", 4 separate clusters would have to be formed (Cricket, Hockey, Golf & Baseball). Thus, there is poor decision making due to the "static" nature of clusters.
Figure 1 shows an example scenario as per the state-of-the-art solution. As per figure 1A, a user 1 wants to make sure that people sitting inside the self-driving autonomous cars are "not drunk". Earlier, he had to stop each car and manually check the alcohol level of each driver. But now all the cars are fully autonomous. They do not make mistakes while running on roads even if the users inside them are drunk. User 1 has no way of knowing which of the self-driving cars he should stop, since all are following the traffic rules. Thus, a possible solution could have been a system that advertises the data of each user sitting inside the car. In another example, as per figure 1B, a self-driving car carries 2 people. User 1 sees the car. The alcohol levels of both people inside the car are being shown on the dashboard 2. User 1 is standing on the road while this car passes him. He does not get time to read the alcohol levels of "all 2 people", since the car is moving very fast. User 1 thinks of simpler days, when the dashboard contained the alcohol level of only "one" of the car occupants. Thus, it is required that the information of "only 1 user" be advertised by the autonomous system. Otherwise, all sorts of problems related to data accuracy and interpretation are faced by an outside observer.
The existing methods would work if systems were configured to share data only for a "single" user. For example, in a naive state-of-the-art method, the data of either User 1 or User 2 would have been shared. Then, traditional approaches would choose only User 1 or User 2 according to who had the higher alcohol content/heartbeat. This logic is wrong since "both" User 1 and User 2 violate a particular constraint, i.e. the alcohol level/heartbeat of a person should not be high. Further, even if the system advertises as shown in figure 1B, only advertising the representation is not enough. That is to say, the observer as per figure 1B is unable to identify "which particular passenger" of the car is drunk, since the advertised information on the dashboard does not contain the passenger's identity.
Furthermore, yet another existing-art mechanism could have shared the data of both User 1 and User 2 together. But this violates the notion of privacy, since the identities of both User 1 and User 2 shall also be shared. Alternatively, another existing art could have 'stripped' off the identities altogether and shared the data of both User 1 and User 2 together. However, the larger amount of such data would take a toll on the bandwidth and thereby cause malfunction. Therefore, such problems call for the need to construct a mechanism which can generate a single joint representation from multi-user input, and can share an identity with an external observer without violating the privacy constraints of an individual. There is a need to assign 'identity(s)' to a joint representation, such that the said identity does not violate the identities of the individual parts which contributed to the joint representation.
Consider the scenario shown in figure 1C. As can be seen, the user 1 observes two self-driving cars on a road. The first one contains a person with a high alcohol level [2]. The second one contains a person who is having a heart attack and needs to be rushed to the hospital [3]. But all user 1 sees is a car. For him, all the self-driving cars are equal. So, he stops both of the cars to check their alcohol levels. Due to the delay while waiting, the patient dies of the heart attack on the road. Thus, owing to the incapacity of static clusters, there is a need to advertise individual user information outside, such as to an outside observer.
Thus, there exists a need for a "context resolution mechanism" that operates upon the readings of multiple users. It must output a single reading that gets advertised by the autonomous system. The identity of a single user must be mapped to the shared representation obtained from context resolution. Further, there exists a need for a system that has the ability to generate dynamic clusters.
Thus, as may be seen, there exists a need to provide a methodology in order to overcome one of the above-mentioned problems.
Provided is a system and method for context resolution and purpose driven clustering in autonomous systems.
In accordance with an aspect of the disclosure, a method of context-resolution in a computing environment comprises: receiving (102) input data from a plurality of users; clustering (104) the users into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users; computing (106) a prototypical representation for each cluster based on the received input data to obtain a plurality of prototypical representations; and processing (108), by a convolution network, the plurality of prototypical representations to perform a context-resolution with respect to the plurality of users based on at least one of: computing (110) a shared representation of the plurality of prototypical representations based on learning a contribution parameter of each of the plurality of users, and determining (112) a data contribution of each of the plurality of users within the shared representation based on the learned contribution parameter.
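Purely as an illustrative sketch of the receive–cluster–prototype–share pipeline above: the six user vectors, the naive 2-means standing in for the clustering step, and the fixed contribution parameters below are all assumptions for the example, not part of the disclosure (the disclosed method learns the contribution parameters).

```python
import numpy as np

# Hypothetical readings from six users, each a 3-dim feature vector;
# two natural groups are built in for illustration.
users = np.array([
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0],   # group near the origin
    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [4.9, 5.0, 5.0],   # group far away
])

# Step 1: cluster the users into a plurality of clusters (naive 2-means).
centers = users[[0, 3]].copy()
for _ in range(5):
    labels = np.argmin(((users[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2), axis=1)
    centers = np.array([users[labels == k].mean(axis=0) for k in (0, 1)])

# Step 2: the prototypical representation of each cluster is the member mean.
prototypes = centers

# Step 3: the shared representation combines the prototypes using contribution
# parameters; fixed here, whereas the method would learn them.
contrib = np.array([0.8, 0.2])
shared = (contrib[:, None] * prototypes).sum(axis=0)
```

The contribution parameters also directly answer the second branch: each user's data contribution within `shared` is read off the learned weight of its cluster.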
The convolution network operates based on learning the input from the plurality of users in at least one single phase.
The clustering corresponds to one or more of an unsupervised or supervised machine learning and defined by clustering of correlated or uncorrelated users within the plurality of users.
The correlated users correspond to users exhibiting similar data and located near in embedding space; and the uncorrelated users correspond to users exhibiting dissimilar data and located far in embedding space.
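The near/far criterion above may be sketched as a simple distance test; the embedding values and the distance threshold are illustrative assumptions:

```python
import numpy as np

def correlated(emb_a, emb_b, threshold=1.0):
    """Treat two users as correlated when their embeddings lie near each other."""
    return np.linalg.norm(np.asarray(emb_a) - np.asarray(emb_b)) < threshold

user_a = [0.10, 0.20]
user_b = [0.15, 0.25]   # similar data -> nearby in embedding space -> correlated
user_c = [3.00, -2.0]   # dissimilar data -> far in embedding space -> uncorrelated
```

In use, `correlated(user_a, user_b)` holds while `correlated(user_a, user_c)` does not, matching the definition of correlated and uncorrelated users.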
The clustering based on an unlabelled input data is defined by, receiving unlabelled input data as input points at the convolution network, examining a contribution-parameter associated with each of the input points by the cross-stitch unit and clustering the input points in a same cluster based on a degree of equivalence of the contribution parameter.
The clusters are classified as similar or different based on one or more of receiving the clusters as labelled input points at the convolution network defining a cross stitch unit, estimating difference of contribution amongst the clusters based on a contribution parameter associated with each of the cluster and deciding at least two clusters from amongst the clusters as different based on the estimated difference contribution between the at least two clusters exceeding a threshold.
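The two decision rules above, grouping points whose contribution parameters are nearly equivalent, and declaring clusters different when their contribution gap exceeds a threshold, may be sketched as follows; the tolerance and threshold values are assumptions:

```python
def same_cluster(contrib_i, contrib_j, tol=0.05):
    """Unlabelled points with nearly equivalent contribution parameters share a cluster."""
    return abs(contrib_i - contrib_j) <= tol

def clusters_differ(contrib_a, contrib_b, threshold=0.2):
    """Two clusters are 'different' when their estimated contribution gap exceeds a threshold."""
    return abs(contrib_a - contrib_b) > threshold
```

For example, `same_cluster(0.42, 0.44)` groups two points together, while `clusters_differ(0.7, 0.1)` separates two labelled clusters whose contribution difference exceeds the threshold.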
The method further comprising receiving the shared representation from the cross stitch model corresponding to the plurality of clusters, predicting, from an artificial neural network (ANN), at least one user identity for associating with the shared representation based on the steps of a) executing a hard conditioning criteria defined by learning a plurality of default templates for each cluster, selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation, and b) executing a soft conditioning criteria, communicating the at least one user identity from the ANN for triggering dynamic clustering based on said user identity corresponding to a single entity, prohibiting the communicating of at least one user identity based on said user identity corresponding to multiple entities.
The execution of the hard conditioning criteria comprises, converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters, feeding the plurality of discrete integers from the classifier to an embedding unit to project a plurality of cluster identities, forwarding the plurality of cluster identities from the embedding unit to the ANN to enable selection of the user identity for associating with the shared representation, said selected user identity corresponding to the maximum contributing user.
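A minimal sketch of the hard conditioning path above, classifier discretization, embedding lookup, then identity selection; every matrix, table, and contribution value is a hypothetical stand-in, not taken from the disclosure:

```python
import numpy as np

shared = np.array([0.2, 1.5, -0.3, 0.7])     # shared representation (stand-in)

# Classifier unit: project onto per-cluster logits and discretize with argmax,
# yielding one integer per cluster (or cluster combination).
W_cls = np.array([
    [ 0.1, -0.2, 0.0],
    [ 0.9,  0.1, 0.2],
    [-0.3,  0.4, 0.1],
    [ 0.2,  0.3, 0.8],
])
cluster_id = int(np.argmax(shared @ W_cls))

# Embedding unit: look up an identity vector for the discrete integer.
embedding_table = np.eye(3)                  # one row per cluster identity (stand-in)
cluster_identity = embedding_table[cluster_id]

# The ANN would then associate the shared representation with the identity of
# the maximum-contributing user; these contributions are hypothetical.
contributions = {"user_1": 0.7, "user_2": 0.3}
selected_user = max(contributions, key=contributions.get)
```

The discrete integer is what conditions downstream behavior: only the cluster id and the single selected identity leave this stage, never the raw multi-user data.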
The execution of the soft conditioning criteria comprises, computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation, computing a plurality of constraints associated with the clusters, computing a loss-function representing soft conditioning based on the computed constraints, re-initiating soft conditioning based on iteratively aggregating the shared representation with prototype estimate from an observer module.
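The channel cross-correlation and constraint-based loss of soft conditioning may be sketched as below; the feature-stream shape and the particular decorrelation constraint are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Feature stream associated with the shared representation: C channels x T steps.
features = rng.normal(size=(4, 16))

# Cross-correlation among the channels (normalized -> correlation matrix).
centered = features - features.mean(axis=1, keepdims=True)
unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
corr = unit @ unit.T                          # shape (4, 4), diagonal = 1

# One possible constraint: off-diagonal channels should stay decorrelated,
# so the soft-conditioning loss penalizes squared off-diagonal correlation.
off_diag = corr - np.diag(np.diag(corr))
loss = float((off_diag ** 2).sum())
```

Re-initiating soft conditioning would then iterate: aggregate the shared representation with the observer's prototype estimate and recompute this loss.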
The method further comprises receiving, by an observer module, a sample drawn from a predicted distribution p(A|B) of clusters, and the predicted user identity from the ANN, projecting the received distribution to a subspace, said subspace comprising predictions from a plurality of ANNs, grouping data corresponding to the projected representation into a number of classes determined based on the user identity, wherein each class is associated with a label or integer associated with the user identity, computing a prototypical representation for at least one class of the plurality of classes based on sensing a mismatch between the user identity and the output of the ANN, and training at least one of the context resolution, the hard conditioning criteria, and the soft conditioning criteria to adjust the prediction of the ANN to correct the user identity upon sensing the mismatch.
The method further comprises receiving, by an observer module, one or more samples drawn from a predicted distribution p(A|B) of clusters, and the predicted user identity from the ANN, determining an integer value representative of the predicted user identity, said integer value defining a cluster class identifier, clustering the samples based on a number of classes associated with the received user identity, computing prototypical estimates for each of the classes of clustered samples based on averaging inputs received from the plurality of ANNs, estimating a hypothesis p^(A|B) for each cluster of the clustered samples based on receipt of inputs from the plurality of ANNs, and sharing the estimated hypothesis p^(A|B), the prototypical estimates, and the cluster class identifier by the observer module with the ANN.
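The observer's group-by-class-and-average step for the prototypical estimates may be sketched as follows; the three incoming predictions and their class identifiers are hypothetical:

```python
import numpy as np

# Hypothetical predictions reaching the observer from client ANNs, each
# tagged with the integer cluster class identifier it predicted.
samples = [
    (0, np.array([1.0, 0.0])), (0, np.array([0.8, 0.2])),   # class 0
    (1, np.array([0.1, 0.9])),                              # class 1
]

# Group by class identifier and average to get the prototypical estimate per class.
prototypes = {}
for class_id in {cid for cid, _ in samples}:
    members = [vec for cid, vec in samples if cid == class_id]
    prototypes[class_id] = np.mean(members, axis=0)
```

These per-class prototypes are what the observer shares back with the ANNs alongside its hypothesis estimates.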
The training of the context resolution comprises, computing, based on a probabilistic graphical model, a posterior representation p^(B|A) from the prototype estimate, updating a plurality of weights associated with the context resolution based on said posterior representation.
The training of the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution.
The training of the hard conditioning is based on, providing the cluster class identifier by the observer module to the classifier associated with the hard conditioning, observing a difference between the cluster class identifier of the observer module and a class identifier associated with the user identity predicted at the ANN, said user identity associated with user who contributed maximum to the shared representation computed by the context resolution.
The computing of the posterior representation p^(B|A) comprises, conducting an initial training of the ANN to shortlist a subset of neurons required to compute a hypothesis space p(A) of the ANN, decrementing weights of the subset of neurons if an activation value associated with the ANN reaches a pre-defined threshold to compute the hypothesis space p(A) and computing the posterior representation p^(B|A) based on the computed hypothesis space P(A).
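The shortlist-then-decrement step above may be sketched as follows; the weight and activation values, the median-based shortlisting rule, and the decay factor are all assumptions for illustration:

```python
import numpy as np

weights = np.array([0.5, -1.2, 0.8, 2.0, -0.3, 1.5, 0.1, -0.7])
activations = np.array([0.2, 1.4, 0.9, 2.3, 0.1, 1.1, 0.05, 0.6])

# Shortlist the subset of neurons considered relevant to the hypothesis space p(A).
shortlist = activations > np.median(activations)

# Decrement the weights of shortlisted neurons whose activation reaches the
# pre-defined threshold, damping over-active units before computing p(A).
threshold, decay = 1.0, 0.9
original = weights.copy()
over = shortlist & (activations >= threshold)
weights[over] *= decay
```

Only neurons that are both shortlisted and over-threshold are decremented; all other weights pass through unchanged into the computation of p(A) and, from it, p^(B|A).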
The context resolution, the conditioning and the ANN correspond to a client configuration and wherein the observer module corresponds to a global server configuration.
In accordance with an aspect of the disclosure, a system for context-resolution in a computing environment comprises: a receiving module for receiving (102) input data from a plurality of users; a clustering module (302) for clustering (104) the users into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users, and for computing (106) a prototypical representation for each cluster based on the received input data to obtain a plurality of prototypical representations; and a convolution network (302) for processing (108) the plurality of prototypical representations to perform a context-resolution with respect to the plurality of users based on at least one of: computing (110) a shared representation of the plurality of prototypical representations based on learning a contribution parameter of each of the plurality of users, and determining (112) a data contribution of each of the plurality of users within the shared representation based on the learned contribution parameter.
The convolution network operates based on learning the input from the plurality of users in at least one single phase.
The clustering module is configured for clustering in accordance with one or more of an unsupervised or supervised machine learning and defined by clustering of correlated or uncorrelated users within the plurality of users.
The correlated users correspond to users exhibiting similar data and located near in an embedding space and the uncorrelated users correspond to users exhibiting dissimilar data and located far in the embedding space.
The clustering module is configured for clustering based on an unlabelled input data defined by the steps of receiving unlabelled input data as input points at the convolution network, examining a contribution-parameter associated with each of the input points by the cross-stitch unit; and clustering the input points in a same cluster based on a degree of equivalence of the contribution parameter.
The clustering module is configured for determining the clusters as similar or different based on one or more of, receiving the clusters as labelled input points at the convolution network defining a cross stitch unit, estimating difference of contribution amongst the clusters based on a contribution parameter associated with each of the cluster and deciding at least two clusters from amongst the clusters as different based on the estimated difference contribution between the at least two clusters exceeding a threshold.
The system further comprises a conditioning-module configured for, receiving the shared representation from the cross stitch model corresponding to the plurality of clusters, predicting, through an artificial neural network (ANN), at least one user identity for associating with the shared representation based on the steps of a) executing a hard conditioning criteria defined by learning a plurality of default templates for each cluster, selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation, and b) executing a soft conditioning criteria, communicating the at least one user identity from the ANN for triggering dynamic clustering based on said user identity corresponding to a single entity, prohibiting the communicating of the at least one user identity based on said user identity corresponding to multiple entities.
The execution of the hard conditioning criteria by the conditioning module comprises, converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters, feeding the plurality of discrete integers from the classifier to an embedding unit to project a plurality of cluster identities, forwarding the plurality of cluster identities from the embedding unit to the ANN to enable selection of the user identity for associating with the shared representation, said selected user identity corresponding to the maximum contributing user.
The execution of the soft conditioning criteria by the conditioning module comprises, computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation, computing a plurality of constraints associated with the clusters, computing a loss-function representing soft conditioning based on the computed constraints, re-initiating soft conditioning based on iteratively aggregating the shared representation with prototype estimate from an observer module.
The system further comprises an observer module configured for receiving a sample drawn from a predicted distribution p(A|B) of clusters, and the predicted user identity from the ANN, projecting the received distribution to a subspace, said subspace comprising predictions from a plurality of ANNs, grouping data corresponding to the projected distribution into a number of classes determined based on the user identity, wherein each class is associated with a label or integer associated with the user identity, computing a prototypical representation for at least one class of the plurality of classes based on sensing a mismatch between the user identity and the output of the ANN, and training at least one of the context resolution of the convolution network, the hard conditioning criteria, and the soft conditioning criteria associated with the conditioning module to adjust the prediction of the ANN to correct the user identity upon sensing the mismatch.
The system further comprises an observer module configured for receiving one or more samples drawn from a predicted distribution p(A|B) of clusters, and the predicted user identity from the ANN, determining an integer value representative of the predicted user identity, said integer value defining a cluster class identifier, clustering the samples based on a number of classes associated with the received user identity, computing prototypical estimates for each of the classes of clustered samples based on averaging inputs received from the plurality of ANNs, estimating a hypothesis p^(A|B) for each cluster of the clustered samples based on receipt of inputs from the plurality of ANNs, and sharing the estimated hypothesis p^(A|B), the prototypical estimates, and the cluster class identifier with the ANN.
The training of the convolution network for the context resolution is defined by the steps of computing, based on a probabilistic graphical model, a posterior representation p^(B|A) from the prototype estimate, updating a plurality of weights associated with the context resolution based on said posterior representation.
The training of the conditioning module comprising the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution.
The training of the conditional module comprising the hard conditioning is based on the steps of, providing the cluster class identifier by the observer module to the classifier associated with the hard conditioning, observing a difference between the cluster class identifier of the observer module and a class identifier associated with the user identity predicted at the ANN, said user identity associated with user who contributed maximum to the shared representation computed by the context resolution.
The observer module, for computing the posterior representation p^(B|A), is configured for conducting an initial training of the ANN to shortlist a subset of neurons required to compute a hypothesis space p(A) of the ANN, decrementing weights of the subset of neurons if an activation value associated with the ANN reaches a pre-defined threshold to compute the hypothesis space p(A), and computing the posterior representation p^(B|A) based on the computed hypothesis space p(A).
These and other features, aspects, and advantages of the present invention will become better understood when the following detailed description is read with reference to the accompanying drawings in which like characters represent like parts throughout the drawings, wherein:
Figure 1A, 1B and 1C shows an example scenario as per the state-of-the-art solution;
Figure 2A and 2B illustrates a method of context-resolution in a computing-environment;
Figure 3A illustrates a high-level block diagram, according to an embodiment of the present disclosure;
Figure 3B illustrates a diagram showing the input/output black box-view, according to an embodiment of the present disclosure;
Figure 4 illustrates a broader flow of the context resolution block, according to an embodiment of the present disclosure;
Figure 5 illustrates a multi-task learning mechanism, according to the state-of-the-art techniques;
Figure 6A and 6B illustrates Multi-Task Learning Architecture according to the state of the art.
Figure 7 illustrates a new "augmented" cross stitch unit, according to an embodiment of the present disclosure.
Figure 8A, 8B and 8C illustrates a mathematical mechanism of the existing machine learning (ML) based training methods as per the state of the art.
Figure 9A, 9B and 9C illustrates another step of injecting remaining knowledge into the machine as per the state of the art.
Figure 10A, 10B and 10C illustrates a mathematical mechanism of training based context resolution methods, according to the embodiment of the present disclosure.
Figure 11A and 11B illustrates a loss function mechanism, according to the embodiment of the present disclosure.
Figure 12 illustrates a working of a conditioning unit as a part of the broader system flow, according to the embodiment of the present disclosure.
Figure 13 illustrates a neural network model for implementation conditioning unit of figure 12, according to the embodiment of the present disclosure.
Figure 14 illustrates internal contents of the conditioning unit, according to the embodiment of the present disclosure.
Figure 15 illustrates a mechanism by which data from the autonomous plane gets advertised to the observer plane, according to the embodiment of the present disclosure.
Figure 16A, 16B and 16C illustrates a working of the dynamic clustering block, according to an embodiment of the present disclosure.
Figure 17A and 17B illustrates a mechanism to calculate prototype Estimation, according to an embodiment of the present disclosure.
Figure 18A and 18B illustrates unsupervised clustering, according to an embodiment of the present disclosure.
Figure 19A and 19B illustrates an improved supervised clustering to achieve clustering at a 'coarser' level, according to an embodiment of the present disclosure.
Figure 20 illustrates a practical application of Figure 21, according to an embodiment of the present disclosure.
Figure 21 illustrates a swapping of dynamic clustering and context resolution blocks, according to an embodiment of the present disclosure.
Figure 22 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to the state of the art.
Figure 23 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to the present embodiment of the present disclosure.
Figure 24 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to the state of the art.
Figure 25 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to the present embodiment of the present disclosure.
Figure 26 illustrates a use case 3 for combining "different" user clusters together in context resolution, according to the state of the art.
Figure 27 illustrates a use case 3 for Channel Switching and Graph Matching in dynamic clustering, according to the present embodiment of the present disclosure.
Figure 28 illustrates a use case 4 for fluctuations of heart rate during a match in dynamic clustering, according to the state of the art.
Figure 29 illustrates the use case 4 for fluctuations of heart rate during a match in dynamic clustering, according to an embodiment of the present disclosure.
Figures 30A, 30B and 30C illustrate a solution to the scenarios of Figure 1, according to an embodiment of the present disclosure.
Figure 31 illustrates another system architecture implementing various modules and sub-modules in accordance with the implementation, according to an embodiment of the present disclosure.
Further, skilled artisans will appreciate that elements in the drawings are illustrated for simplicity and may not necessarily have been drawn to scale. For example, the flow charts illustrate the method in terms of the most prominent steps involved to help improve understanding of aspects of the present invention. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the drawings by conventional symbols, and the drawings may show only those specific details that are pertinent to understanding the embodiments of the present invention so as not to obscure the drawings with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
For the purpose of promoting an understanding of the principles of the invention, reference will now be made to the embodiment illustrated in the drawings and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the illustrated system, and such further applications of the principles of the invention as illustrated therein being contemplated as would normally occur to one skilled in the art to which the invention relates.
It will be understood by those skilled in the art that the foregoing general description and the following detailed description are explanatory of the invention and are not intended to be restrictive thereof.
Reference throughout this specification to "an aspect", "another aspect" or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrase "in an embodiment", "in another embodiment" and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such process or method. Similarly, one or more devices or sub-systems or elements or structures or components proceeded by "comprises... a" does not, without more constraints, preclude the existence of other devices or other sub-systems or other elements or other structures or other components or additional devices or additional sub-systems or additional elements or additional structures or additional components.
Figure 2A illustrates a method of context-resolution in a computing-environment. Figure 2B illustrates a flowchart representing a system flow within the user-plain in line with Fig. 2A, according to an embodiment of the present disclosure. The user plain is the plain where multiple users are present. Data from multiple such users are processed simultaneously.
At step 102, the method comprises receiving (102) input data from a plurality of users. Collective data from multiple users is sent as an input.
At step 104, the users are clustered (104) into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users. The clustering corresponds to one or more of an unsupervised or supervised machine learning and is defined by clustering of correlated or uncorrelated users within the plurality of users. The correlated users correspond to users exhibiting similar data and located near one another in the embedding space. The uncorrelated users correspond to users exhibiting dissimilar data and located far apart in the embedding space.
At step 106, a prototypical-representation R1 for each cluster is computed (106) based on the received input data to obtain a plurality of prototypical-representations.
At step 108, as a part of generating a shared representation, a convolution network processes (108) the plurality of prototypical-representations to perform a context-resolution with respect to the plurality of the users.
At step 110, a shared representation is computed (110) of the plurality of prototypical representations based on learning a contribution parameter or a contribution ratio of each of the plurality of users. The shared representation corresponds to a single new user out of multiple users.
At step 112, data contribution of each of the plurality of users is determined (112) within the shared representation based on the learned contribution parameter.
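The clustering and prototype-computation steps (104 and 106) above can be sketched as follows. This is a minimal illustrative sketch, not the claimed implementation: the use of k-means-style grouping, the cluster mean as the prototypical-representation, and all function names are assumptions.

```python
import random

def dist2(a, b):
    # squared Euclidean distance in the embedding space
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vecs):
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

def cluster_users(embeddings, k, iters=20, seed=0):
    """Steps 104-106 sketch: cluster correlated users (near in embedding
    space) and compute one prototypical-representation per cluster, here
    simply the cluster mean."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(embeddings, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for e in embeddings:
            # correlated users (nearby embeddings) fall into the same group
            j = min(range(k), key=lambda c: dist2(e, centroids[c]))
            groups[j].append(e)
        centroids = [mean(g) if g else centroids[j] for j, g in enumerate(groups)]
    return groups, centroids  # centroids serve as the prototypical representations
```

In this sketch, correlated users land in the same cluster and each cluster is compressed into its mean, matching the role of the prototypical-representation R1 of step 106.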
Figure 3A illustrates a high-level block diagram, according to an embodiment of the present disclosure. Figure 3B illustrates a diagram showing the input/output black-box view, according to an embodiment of the present disclosure.
Referring to figures 3A and 3B, Phase 1 is the Context Resolution and corresponds to steps 102-112 of Figs. 2A and 2B. The mechanism of Phase 1, the context resolution phase, is as follows with respect to Fig. 3B:
1. Initially, collective data from multiple users (example 3 users of Fig. 3B) is sent as an input to the context resolution block 302.
2. A single new user 304 is generated out of the example three input users, as can be seen, in particular, in figure 3B. The user 304 is now a "mixture" of several users.
3. One possible ratio of the different users present in the obtained joint representation is shown in the table on the right side of Figure 3B.
4. The generated user 304 is formed by the maximal contribution (70%) of user 1. Thus, user 1's identity [i.e. the integer 1] is exposed to the autonomous plain 306, as can be seen, in particular, in figure 3B. User privacy is preserved since only the integer 1 is shared, and not the private representation of user 1.
Further, Phase 2 of Fig. 3A and Fig. 3B refers to the Conditioning Unit 305 and the Neural Network 306. The mechanism of the conditioning unit 305 phase is as follows:
1. The decision-making step typically performed by the autonomous plain's neural network 306 takes as input the shared representation generated by the context resolution block 302, as can be seen, in particular, in figure 3B.
2. The behavior of the said network 306 is adjusted in two ways.
a) Hard Conditioning:
·The three users can have different behaviours. The user who contributes the maximum (70%, user 1 in Fig. 3B) is used to condition the network 306 via an embedding unit.
·In particular, this refers to making the network choose one of the possible hypotheses corresponding to the three users. Since this conditioning is done explicitly, it is termed hard conditioning.
b) Soft Conditioning
·The shared representation from the context resolution block 302 is directly fed into the neural network 306.
·Although the network is hard conditioned on User 1 (as example of Fig. 3B), such soft conditioning allows minor fluctuations in the decision making process.
·In particular, these fluctuations mean that the network 306 is now able to explore the regions which lie closer to the decisions made by user 1.
Further, Phase 3 of Fig. 3A and Fig. 3B refers to Dynamic Clustering. The mechanism of the dynamic clustering phase is as follows:
1. The decision made by the autonomous network 306 is exposed to the observer plain.
2. An external observer sees the generated decision, along with the identity of the user who contributed maximally to that decision which is formed based on the shared representation provided by context resolution in the user plain. In another example, the identity of the user may be directly fetched from the shared representation based on a contribution ratio and the decision from the autonomous network 306 may not be required.
3. Naturally, an external observer groups the data coming from an autonomous plain into tags based on the generated decision and identity.
4. A clustering logic 308 operates as follows:
- [A]: Data from multiple autonomous plains is grouped into <generated representation, identity> tags.
- [B]: This data is then grouped into various clusters depending on different identities. In particular, generated representations can change in real time.
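The grouping performed by the clustering logic 308 in [A] and [B] amounts to a group-by over the advertised tags; a minimal sketch follows, in which the function name and tuple layout are assumptions for illustration.

```python
from collections import defaultdict

def dynamic_cluster(advertised):
    """[A]-[B] sketch: group the <generated representation, identity> tags
    advertised by multiple autonomous plains into one cluster per identity."""
    clusters = defaultdict(list)
    for representation, identity in advertised:
        clusters[identity].append(representation)
    return dict(clusters)
```

Because the generated representations can change in real time, re-running this grouping over the latest tags yields the dynamic clusters.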
Table 1 compares the problems in the existing state of the art with the solutions provided by the proposed mechanism. Reference is made to figure 3B.
Sr No. | Problem in the existing state of the art | Solution provided by the proposed mechanism
1 | [T1] An outside observer cannot directly see the contents inside an autonomous system. [T2] An outside observer can "only see" the contents inside an autonomous system if they are "explicitly advertised". | [T1] The observer in the observer plain cannot see the users in the autonomous plain; all the observer sees is the shared representation. [T2] The shared representation is explicitly advertised to the observer plain.
2 | [T3] Information of "only 1 user" should be advertised by the autonomous system; otherwise, all sorts of problems related to data interpretation and accuracy are faced by an outside observer. [T4] There must exist a "context resolution mechanism" that operates upon the readings of multiple users and finally outputs a single reading that gets advertised by the autonomous system. | [T3] The shared representation is a single representation formed by mixing the different users of Fig. 3B. [T4] The output of the proposed context resolution block, i.e. the shared representation, is advertised as a "single user" to an observer in the observer plain.
3 | [T7] The identity of a single user must be mapped with the shared representation obtained from context resolution. | [T7] The shared representation is mapped with the identity of the maximally contributing user (user 1 in Fig. 3B).
4 | [T8] Existing clusters are STATIC since they don't advertise any user information. | [T8] Autonomous systems share user data with an outside observer, which allows the external observer to cluster these vehicles dynamically.
Figure 4 illustrates a broader flow of the context resolution block 302, according to an embodiment of the present disclosure. The present disclosure assumes the existence of multiple user models, whose data is given as input simultaneously to the context resolution block 302. There might be many such users, as depicted in Fig. 3B; thus, it would become impossible to process their data at the same time. Therefore, such users are clustered together into several groups. For each such group, a single representation called the prototypical representation is computed. This is used to compress the total number of actual inputs which are sent to the context resolution block [1]. Next, multiple such inputs [1.1] make a forward pass through a novel cross-stitch unit. This model allows the system to generate a single shared representation [1.2]. The context resolution concept can be better understood with the help of the existing state-of-the-art techniques as provided in Fig. 5 and Figs. 6A and 6B.
Figure 5 illustrates a multi-task learning mechanism, according to the state-of-the-art techniques. It shows that detecting a horse consists of two parts: first, a person can search an image for an animal which has four legs; finally, he can draw a bounding box around it. According to figure 5, the notion of combining many things [5.1] to generate a shared representation [output of 5.2] is already known as multi-task learning. According to Fig. 5, the sign [5.1] refers to the task of drawing a bounding box around a horse. This could be performed in two ways. Firstly, the box around the horse could be detected directly [5.3]; this is single-task learning. Secondly, the detection task could be considered a sequence of steps. First, the image is roughly searched for an animal which has four legs [5.2] by cropping the relevant region containing the horse in the image. Then the knowledge from [5.2] is used to construct the bounding box around the horse [5.3]. This is called multi-task learning [consisting of image classification followed by spatial localization in the image]. In the said detection task of Fig. 5, both [5.2] and [5.3] have to do with detecting the horse.
However, the present mechanism poses no such constraints. According to the present disclosure, each of these tasks now represents a separate user. The data from multiple such users can be entirely different [uncorrelated] from each other. Now the architectural difference will be explained in detail with respect to state of art description in Fig. 6 and the description of context resolution 302 of present subject matter in Fig. 7.
Figures 6A and 6B illustrate a Multi-Task Learning architecture according to the state of the art. In particular, Figure 6A illustrates a pre-known setting for multi-task learning with two separate tasks [Task 1 and Task 2]. The idea is to learn a single model that inputs features from both Task 1 and Task 2 to perform a particular goal (detecting the horse as shown in figure 5). This is made possible by learning the parameters of a cross-stitch unit as shown in figure 6B. In Task 1, the final feature vector [6.5] is formed by a linear combination of the feature vectors from [6.1] and [6.3]. This is shown by a solid line. Now, figure 6B shows how different tasks get combined by the cross-stitch unit. According to figure 6B, multi-task learning is made possible by learning the parameters of a cross-stitch unit. In figure 6B, the system learns to perform a shared Task 2 using the linear combinations [6.6] of the feature vectors from both [6.2] and [6.4]. This is shown as a dotted line. Mathematically, such a feature combination can be represented by the following equations:
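The equations themselves do not survive in this text. For reference, the standard cross-stitch combination from the multi-task learning literature, written in the Task 1/Task 2 notation of Figure 6B, has the following form (the α symbols are an assumption, since the original equations are missing):

```latex
\begin{bmatrix} \tilde{x}_{1} \\ \tilde{x}_{2} \end{bmatrix}
=
\begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
```

Here x1 and x2 are the feature vectors of [6.1]-[6.4], and the learned scalars α control how strongly each task's features flow into the other; setting α12 and α21 to zero recovers two independent single-task networks.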
A major problem with the existing cross-stitch unit is that the networks of both Task 1 and Task 2 get trained. However, in the proposed mechanism, the problem is a bit different. Here, many users are given whose data have to be combined to form a shared representation. Accordingly, there is no need to train the user models separately. Instead, the data from each user model contributes feature representations to a COMMON shared task.
Figure 7 illustrates a new "augmented" cross-stitch unit, according to an embodiment of the present disclosure. The operation of the new "augmented" cross-stitch unit is provided in detail below:
1. Take several user models. Train them. Finally, take their frozen versions.
2. Initialize a SINGLE network that represents the shared user [9] as corresponding to 304 in Fig. 3B.
3. Each user model gets bridged to the COMMON SHARED USER network via the augmented cross-stitch unit.
5. The user whose contribution parameters α and β are maximum is said to be the user who contributes most to the global shared representation. Mathematically, this maximal contribution is found by calculating the norm of each user's contribution parameters (α and β).
6. The identity of the user obtained in [5] gets mapped to the obtained shared representation 304 of Fig. 3B. User 1, who contributed 70% to the grey (shared) user, was preferred over user 2 and user 3 as depicted in Fig. 3B.
7. Finally, the shared representation [9] is fed to the subsequent conditioning unit.
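Steps 1 till 7 above can be sketched as follows. This is an illustrative sketch only: the linear bridging form and the names alpha/beta (standing in for the contribution parameters of step 5) are assumptions.

```python
def augmented_cross_stitch(user_features, alphas, betas):
    """Augmented cross-stitch sketch: frozen user models feed their feature
    vectors into a single shared-user network; each user i is bridged via
    learned scalars (alpha_i, beta_i). The shared representation is the
    weighted combination, and the identity mapped to it (step 6) is the
    user whose contribution-parameter norm is largest (step 5)."""
    dim = len(user_features[0])
    shared = [0.0] * dim
    for feats, a, b in zip(user_features, alphas, betas):
        for i in range(dim):
            shared[i] += a * feats[i] + b  # linear bridge into the shared user
    norms = [(a * a + b * b) ** 0.5 for a, b in zip(alphas, betas)]
    identity = max(range(len(norms)), key=norms.__getitem__) + 1  # 1-indexed
    return shared, identity
```

The returned identity is what gets mapped to the shared representation 304 and later exposed to the observer, while the user features themselves stay private.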
Figures 8A, 8B and 8C illustrate the mathematical mechanism of existing machine learning (ML) based training methods as per the state of the art. In particular, Figure 8A represents a probabilistic formulation of a learning system. B denotes the information a user knows. A denotes the total learnable space of a neural network. The intersection [2] of A and B denotes the data which the user gives to the network for training. "U" denotes the total universe of all knowledge, both known and unknown. Figure 8B represents the basic training process in a learning machine. The user injects his knowledge [2] in the form of annotated datasets. This mechanism illustrates the traditional supervised setting. Figure 8C shows the individual probabilities which are modelled between phase 1 and phase 2.
Now, consider Figure 8A. A user B (Fig 8A: [1]+[2]) typically injects his knowledge [2] into a neural network A. This process is termed supervised learning. It is typically done by a human annotating a dataset [Fig 8B:[2]], and a learning algorithm fitting its parameters onto it. Mathematically, this knowledge injection phase is denoted by the left bolded portion of Figure 8C. The knowledge [2] that a user B injects into the machine via datasets can be denoted by p(A∩B). Finally, a typical neural network gets trained on this dataset. Mathematically, this is denoted by the conditional probability p(A|B) in [Fig. 8C:[5]].
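Where the probability symbols in this passage did not survive extraction, they can be filled in consistently with the figure's notation: treating the injected knowledge [2] as the event A∩B, the trained conditional of [Fig. 8C:[5]] relates to it by the standard identity (this reconstruction is an assumption):

```latex
p(A \mid B) = \frac{p(A \cap B)}{p(B)}
```

That is, the network's trained behaviour is the portion of its learnable space A that is supported by the user knowledge B it was shown.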
Figures 9A, 9B and 9C illustrate another step of injecting the remaining knowledge into the machine as per the state of the art.
In particular, figure 9A represents a basic diagram showing the probability-based explanation for training learning machines. Figure 9B represents a diagram showing that the learning algorithm now exposes a feedback mechanism through which the user injects his remaining knowledge [1] into the machine. This is also known as semi-supervised training in the prior art. Figure 9C represents the individual probabilities which are modelled between phase 1 and phase 2.
Now, consider Figure 9A: the grey region [2] denotes step 1 of learning, i.e. training on user-provided datasets. However, the portion of user knowledge denoted by [1] is still not provided to the neural network [5] during learning. The mechanism to inject this information is denoted in Figure 9B. A learning algorithm [5] exposes a feedback mechanism [6] to the user. It refers to a network training on the dataset [2] until the time it reaches a particular intelligence threshold. Then, unlabelled samples are passed through the machine and its own predictions are treated as the ground truths. This allows the system to create an unsupervised dataset for its operation. This method is known as the semi-supervised setting in the existing art. Its visual aspect is shown in Fig 9A:[1]+[2] when the entire oval B gets collapsed inside A. This denotes that the machine [5] has learned the entire user knowledge B. Mathematically, the portion of user knowledge [1] ingested by the machine [5] during this step 2 gets denoted by p(B|A). These two steps face a fundamental problem.
The problem of iterative knowledge injection is that Steps 1 and 2 of training learning machines need to be performed one after another. This is shown by the two parallel branches in Fig 9C that meet at [5]. Both need to be completed for the entire system to be trained. A major limitation is that the effective training time increases, since the learning process is now sequential in nature. Context resolution solves this by bridging Steps 1 and 2 into a SINGLE step.
Figures 10A, 10B and 10C illustrate the mathematical mechanism of the training-based context resolution method, according to an embodiment of the present disclosure.
In particular, Figure 10A represents a diagram showing the individual probabilities which are modelled between phase 1 and phase 2 through steps 1002 till 1006.
Figure 10B represents a diagram showing the individual probabilities which are modelled between phase 1 and phase 2. The learning algorithm [5] in step 1012 receives data from the two phases of step 1010. Since it is conditioned separately during each phase, its probability distribution can be considered the sum of two complex terms. Figure 10C represents a diagram showing, through steps 1014 till 1018, that context resolution allows the learning algorithm to combine phase 1 and phase 2 of Figure 10B into a single phase.
According to figure 10, the convolution network of the context resolution block 302 of Fig. 3B operates based on learning the input (provided at step 1014) from the plurality of users in at-least one single phase. The context resolution block in Fig. 10C is able to combine phase 1 and phase 2 of Figure 10B into a single block vide step 1016, and perform the network's training simultaneously vide step 1018.
In figure 10A, a typical setting is shown, vide steps 1002 till 1008, in which full supervision [step 1] and semi-supervision [2] are performed at the same time in a network. Such a form of training does not currently exist in the art, and is the probability setting aimed to be achieved. Figure 10B denotes the actual successive supervision/semi-supervision which occurs in existing neural networks vide steps 1010 till 1012. The same network gets conditioned on both the user dataset and semi-supervision iteratively. Figure 10C shows that the context resolution block is able to bridge the two steps. This is because the user B is directly kept in the input pipeline through the augmented cross-stitch units vide steps 1014 till 1018. There is no need to break the input B into the separate terms p(A∩B) and p(B|A).
Figures 11A and 11B illustrate a loss function mechanism, according to an embodiment of the present disclosure. In particular, Figure 11A represents a diagram showing the mechanism to build decision boundaries in neural networks. Point P denotes the current output of the context resolution block 302, i.e. the shared representation 304 of Fig. 3B. A constraint is required that point P should have the maximum contribution of user 3, and minimum contributions of user 1 and user 2. Figure 11B represents a diagram showing how the constraints introduced in Figure 11A can be realized in the context resolution network. A clustering distance [loss] is calculated between the shared representation and each user's cluster. Finally, a loss function that penalizes the individual weights of the augmented cross-stitch unit for each user is applied.
According to Figures 11A and 11B, the augmented cross-stitch unit forming a part of context resolution is able to process multi-user input to generate a shared representation. A detailed design of the loss function for training the novel context resolution block is explained hereafter. Figure 11A denotes the t-SNE 2D projection of various outputs of the context resolution block. This subspace consists of several points. Point P3 denotes the prototypical representation which was calculated for user cluster 3. A random point P, generated as the output of the context resolution block, should follow two constraints:
Constraint 1: The distance of point P to the true class should be minimized, and maximized for the remaining classes. This can be done by minimizing the Euclidean norm of the difference between P and P3.
Constraint 2: Point P should be made to lie in user cluster 3 [Figure 11A:[2]]. This is done by minimizing the contributions of user 1 and user 2 and maximizing the contribution of user 3. This constraint is achieved by penalizing the weights α and β of each such user.
The said constraints can be implemented in the form of any mathematical kernel. The basic concept of enforcing these two constraints shall remain the same. Architecturally, this is shown as the clustering loss in Fig 11A:[2].
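The two constraints can be sketched as a single loss term. As the text notes, any mathematical kernel may be used; the squared-distance form, the regularization weight, and the names alpha/beta below are therefore assumptions shown only for illustration.

```python
def clustering_loss(P, prototypes, true_idx, alphas, betas, reg=0.1):
    """Constraint 1: pull the shared point P toward its true cluster
    prototype while pushing it from the remaining prototypes.
    Constraint 2: penalize the cross-stitch weights (alpha, beta) of every
    user other than the true one, so P lands in the true user's cluster."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    attract = d2(P, prototypes[true_idx])                       # minimize
    repel = sum(d2(P, q) for i, q in enumerate(prototypes)
                if i != true_idx)                               # maximize
    penalty = sum(a * a + b * b
                  for i, (a, b) in enumerate(zip(alphas, betas))
                  if i != true_idx)                             # shrink others
    return attract - repel + reg * penalty
```

A point inside the true cluster yields a lower loss than a point near a wrong prototype, which is exactly the gradient signal needed to train the augmented cross-stitch weights.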
Figure 12 illustrates the working of the conditioning unit 305 as a part of the broader system flow, according to an embodiment of the present disclosure. A main objective of the present mechanism is to generate different predictions from the neural network 306 based on multi-user input, as may be observed in Fig. 3B. This behavior is achieved by the conditioning unit 305. The present disclosure models different types of conditioning constraints into the network.
Figure 13 illustrates a neural network model for implementing the conditioning unit 305 of Figure 12, according to an embodiment of the present disclosure.
In particular, figure 13 shows the total modellable universe of a neural network or an autonomous network corresponding to the neural network 306 as referred to in the previous figures. The network contains a total of 1000 parameters, which it must effectively utilize to learn two separate hypotheses [5.1 and 5.2]. The system broadly switches among such hypotheses by hard conditioning, and allows for finer adjustments via soft conditioning.
Specifically, Figure 13 represents a simple neural network containing 1000 parameters. This network receives a shared representation corresponding to two user groups [Group 1 and Group 2]. As mentioned earlier, each group contains a large number of users, whose simultaneous input cannot be processed by the network. Therefore, the conditioning unit 305 of figure 12 helps the autonomous network learn a default template for each group. To adjust the behavior of the neural network, two decisions need to be made.
- Hard conditioning -> Switching between the prototypes of different user groups. The neural network must switch its decision for users belonging to a particular cluster. This step is denoted by the microscope in Figure 13:[5.2].
- Soft conditioning -> Switching within the same cluster.
Once a cluster is isolated, the network is allowed to change its decisions slightly in response to mild fluctuations between the preferences of the users belonging to the same cluster. Next, it is detailed how these constraints of hard conditioning and soft conditioning are actually realized inside the conditioning unit 305.
In an embodiment, the process of Fig. 13 may be represented as receiving the shared representation from the cross stitch model 302 corresponding to the plurality of clusters. From an artificial neural network (ANN) 306, at least one user identity is predicted subsequent to hard conditioning or soft conditioning.
In an example, the execution of the hard conditioning criteria is defined by learning a plurality of default templates for each cluster, and selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation. The execution of soft conditioning criteria is defined by communicating the at-least one user identity from the ANN 306 for triggering dynamic clustering based on said user identity corresponding to a single entity. The communicating of the at-least one user identity may be prohibited based on said user identity corresponding to multiple entities.
Figure 14 illustrates internal contents of the conditioning unit 305, according to the embodiment of the present disclosure. The working of the conditioning unit will be explained referring to Figure 14.
1. Input: The shared representation [1.1] is received as input. The representation [1.1] is generated as the output of the context resolution block 302 of Fig. 3B and corresponds to 304 of Fig. 3B.
2. Hard Conditioning:
·The continuous feature-representation [1.1] is converted to a discrete integer by a sigmoidal classifier unit [2.2].
·Each such integer represents one of the possible user groups which was given as input to the context resolution block.
·The mechanism to train this classifier, is explained in the flow diagram of Fig. 15.
·The integer label obtained from classifier [2.2] is passed through an embedding unit [2.3]. Traditional embedding layers, condition on labels in a dataset. E.g. it can take a dataset of numbers [with images from 1-9] and condition a network on labels like 1-9, to generate respective images of different numbers.
·However, in accordance with the present subject matter, the input [1.1] is a higher-dimensional feature vector, and not a discrete integer like 1-9. Thus, the cascaded combination of a classifier unit on top of the embedding unit is required.
·The output [2.3] is fed to the network [3], i.e. the network 306. Changing the values of the generated integers from [2.2] allows the network to automatically switch among multiple hypotheses.
In an embodiment, the execution of the hard conditioning criteria comprises converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters. The plurality of discrete integers is fed from the classifier to an embedding unit to project a plurality of cluster identities. The plurality of cluster identities are forwarded from the embedding unit to the ANN to enable selection of the user identity, said selected user identity corresponding to the maximum contributing user.
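The hard-conditioning path of this embodiment (classifier unit [2.2] feeding an embedding unit [2.3]) can be sketched as follows; the classifier weights and the embedding table here are illustrative stand-ins for the trained units, not the claimed implementation.

```python
import math

def hard_condition(shared_rep, classifier_w, group_embeddings):
    """Hard-conditioning sketch: a small classifier maps the continuous
    shared representation [1.1] to a discrete group integer [2.2], which
    indexes an embedding table [2.3] whose vector conditions the network
    306. classifier_w holds one weight row per user group."""
    logits = [sum(w * x for w, x in zip(row, shared_rep)) for row in classifier_w]
    # softmax gives a probability over groups (the "sigmoidal classifier unit")
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    probs = [e / sum(exps) for e in exps]
    group = max(range(len(probs)), key=probs.__getitem__)
    return group, group_embeddings[group]  # integer label + conditioning vector
```

Changing the shared representation changes the selected integer, which in turn swaps the embedding vector and makes the network switch hypotheses.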
3. Soft Conditioning:
·The continuous representation [1.1] is communicated to a channel expansion unit [2.1] of the conditioning unit 305.
·This simulates the idea of computing cross-correlations among channels of a CNN feature stream as in the Squeeze Nets.
·Finally, the input is fed to the neural network or the network 306. The loss function (clustering constraints) used to achieve soft conditioning is a unique mechanism.
In an embodiment, the execution of the soft conditioning criteria comprises computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation. A plurality of constraints associated with the clusters is computed. A loss-function representing soft conditioning is computed based on the computed constraints. The soft conditioning is re-initiated based on iteratively aggregating the shared representation with prototype estimate from the observer module 308.
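The soft-conditioning computation can be sketched in the spirit of the channel expansion unit [2.1] and the squeeze-style channel cross-correlation it simulates; the exact gating function below is an assumption, shown only to illustrate the idea of rescaling channels by a cross-channel summary.

```python
def soft_condition(channels):
    """Soft-conditioning sketch: squeeze each channel of the feature stream
    to a summary value, derive a per-channel gate from the cross-channel
    pattern, and rescale, letting the network drift slightly within the
    chosen user's cluster."""
    squeezed = [sum(c) / len(c) for c in channels]       # per-channel pooling
    scale = max(abs(s) for s in squeezed) or 1.0
    gates = [0.5 + 0.5 * s / scale for s in squeezed]    # gates in [0, 1]
    return [[g * x for x in c] for g, c in zip(gates, channels)]
```

The dominant channels pass nearly unchanged while weaker ones are attenuated, producing the minor fluctuations around the hard-conditioned decision described earlier.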
Figure 15 illustrates a mechanism by which data from the autonomous plain, or the neural network 306, gets advertised to the observer plain, or the clustering logic 308, according to an embodiment of the present disclosure. The system consists of a Magnitude Tracer 1502 and the Dynamic Clustering block 308. The system implements an additional block called the Magnitude Tracer.
Operation of Magnitude Tracer Module is as follows:
1. Magnitude tracer receives an input from the context resolution block 302.
2. The coefficients for each user that contributes to the formation of shared representation are stored.
3. The machine then sorts the users in decreasing order of norm and picks the user with the maximal contribution.
4. The autonomous plain 306 shares two things with the observer plain 308. First, it shares the integer of the user group which contributes the maximal amount to the shared representation. Secondly, the autonomous plain 306 predicts a representation which can be said to be drawn from p(A|B). Therefore, the system shares combined data, <sample predicted from the p(A|B) distribution modelled by the ANN or network 306, the user identity>, with the observer plain. In addition, the weights of the autonomous network 306 are shared with the observer plain 308.
5. The said user identity can change in real time as the user with the maximal contribution inside the context resolution block 302 changes.
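The Magnitude Tracer steps above can be sketched as follows; the dictionary layout, the (alpha, beta) coefficient pairs, and the function name are assumptions for illustration.

```python
def magnitude_tracer(coefficients, predicted_sample):
    """Magnitude Tracer sketch (steps 1-4): store the (alpha, beta)
    coefficients of every user contributing to the shared representation,
    sort users in decreasing order of norm, and advertise the pair
    <sample predicted from p(A|B), identity of the maximal contributor>
    to the observer plain."""
    norms = {uid: (a * a + b * b) ** 0.5 for uid, (a, b) in coefficients.items()}
    ranked = sorted(norms, key=norms.get, reverse=True)
    return predicted_sample, ranked[0]  # <p(A|B) sample, user identity>
```

As the coefficients inside the context resolution block change, re-running the tracer yields a new maximal contributor, which is how the advertised identity changes in real time.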
In an embodiment, when the observer module 308 receives a sample drawn from a predicted distribution p(A|B) of clusters, and the predicted user identity from the ANN, the observer module 308 projects the received distribution to a subspace, wherein said subspace comprises predictions from a plurality of ANNs.
Next, the detailed operation of the dynamic clustering block, which receives the shared data in the observer plain 308, will be explained. More specifically, the process involves grouping data corresponding to the projected representation into a number of classes determined based on the user identity, wherein each class is associated with a label or integer from the user identity. A prototypical representation is computed for at least one class of the plurality of classes based on sensing a mismatch between the user identity and the output of the ANN. At least one of the context resolution, the hard conditioning criteria, or the soft conditioning criteria is trained to adjust the prediction of the ANN to correct the user identity upon sensing the mismatch.
Figures 16A, 16B and 16C illustrate the working of the dynamic clustering block within the observer plain 308, according to an embodiment of the present disclosure. Accordingly, the observer plain 308 receives a plurality of inputs from the autonomous plain, i.e. <sample predicted from the p(A|B) distribution modelled by the ANN, the user identity, and the weights of the autonomous plain network 306>.
The dynamic clustering operation referred to in Fig. 16A proceeds as follows:
·The step 1602 corresponds to receipt of input by the observer module 308 as depicted in Fig. 15. The system clusters the plurality of generated p(A|B) samples according to the classes given by the shared user identity.
·The step 1604 corresponds to a comparison between the user identity at the observer module 308, as shared by the magnitude tracer 1502, and the output of the ANN 306. If there is a mismatch, then the flow proceeds to the step 1606.
·Next, at step 1606, the prototypical estimates for each of the classes are calculated using the averaging approach of Figure 16B.
·At step 1608, for a SAME CLASS, p(A|B) samples shared via MULTIPLE autonomous planes are used to estimate a hypothesis p̂(A|B), such that p̂(A|B) ≈ p(A|B).
·Next, at step 1610, the estimated p̂(A|B) prototype and the cluster class id are shared with the autonomous plane. p̂(A|B) is used to estimate a posterior p̂(B|A), which is used to train the weights of the context resolution block 302.
·At step 1612, the prototype is merged with the shared representation predicted by the context resolution block 302. The new representation is used for soft conditioning.
·At step 1614, the hard conditioning is ONLY done when the cluster class id being shared by the observer plane is different from the class id used in the conditioning unit. As mentioned previously, the class id used in the conditioning unit is given by the user who contributed the maximal amount to the shared representation.
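The mismatch detection and prototype averaging of steps 1602 through 1614 can be sketched in a few lines; the function name, the flat sample vectors, and the single-integer identities are illustrative stand-ins for the data exchanged between the planes, not the disclosed implementation:

```python
import numpy as np

def dynamic_clustering_step(samples, labels, shared_identity, predicted_identity):
    """Sketch of steps 1602-1614 (hypothetical names): group p(A|B)
    samples by class label and, if the shared user identity disagrees
    with the ANN's prediction, compute per-class prototypes by
    averaging (Fig. 16B)."""
    # Step 1602: cluster the samples according to their class labels.
    classes = {}
    for sample, label in zip(samples, labels):
        classes.setdefault(label, []).append(np.asarray(sample))
    # Step 1604: detect a mismatch between the two identities.
    if shared_identity == predicted_identity:
        return None  # no correction needed
    # Step 1606: prototypical estimate per class = mean of its samples.
    prototypes = {c: np.mean(np.stack(v), axis=0) for c, v in classes.items()}
    # Step 1610: the prototypes and cluster class ids would then be shared
    # back with the autonomous plane to retrain the context resolution block.
    return prototypes

protos = dynamic_clustering_step(
    samples=[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],
    labels=[0, 0, 1],
    shared_identity=0,
    predicted_identity=1,
)
```

Here a mismatch between the shared identity (0) and the ANN's prediction (1) triggers the per-class averaging of Fig. 16B.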
Figure 16C further illustrates dynamic clustering and corresponds to steps 1608 through 1614, i.e. the flow from the observer plane to the user plane, according to an embodiment of the present disclosure. In particular, Figure 16C shows a Bayesian flow diagram illustrating the probability flow in the user plane. The block [4.2] is used to calculate posteriors, which drive the input selection mechanism at [4.4]. This is used to train the context resolution block 302 and further adjust the behavior of the autonomous plane neural network 306.
In an embodiment, the process steps 1602 through 1610 of the dynamic clustering operation may be defined as follows:
Step 1602: receiving, by an observer module, one or more samples drawn from a predicted distribution p(A|B) of clusters, and the predicted user identity from the ANN;
Step 1604: determining an integer value representative of the predicted user identity, said integer value defining a cluster class identifier;
Step 1606: computing prototypical estimates for each of the classes of clustered samples based on averaging inputs received from the plurality of ANNs;
Step 1608: estimating a hypothesis p̂(A|B) for each cluster of the clustered samples based on receipt of inputs from the plurality of ANNs; and
Step 1610: sharing the estimated hypothesis p̂(A|B), the prototypical estimates, and the cluster class identifier by the observer module with the ANN.
In an example, the training of the context resolution block 302 comprises computing, based on a probabilistic graphical model, a posterior representation p̂(B|A) from the prototype estimate. A plurality of weights associated with the context resolution block is updated based on said posterior representation.
In an embodiment, the training of the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution block 302.
In an embodiment, the training of the hard conditioning is based on providing the cluster class identifier by the observer module 308 to the classifier associated with the hard conditioning. Thereafter, a difference between the cluster class identifier of the observer module 308 and a class identifier associated with the user identity predicted at the ANN or the network 306 is computed. The user identity associated with the user who contributed the maximum to the shared representation is computed by the context resolution block 302.
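The soft-conditioning aggregation described above can be sketched as a simple merge of the prototype estimate with the shared representation; the convex combination and the mixing weight `alpha` are assumptions, since the disclosure does not fix the exact aggregation rule:

```python
import numpy as np

def soft_condition(shared_rep, prototype, alpha=0.5):
    """Hypothetical sketch of step 1612: aggregate the prototype estimate
    with the shared representation from the context resolution block; the
    merged vector is then used as the soft-conditioning signal."""
    shared_rep = np.asarray(shared_rep, dtype=float)
    prototype = np.asarray(prototype, dtype=float)
    # A simple convex combination; alpha is an assumed mixing weight,
    # not a value given in the disclosure.
    return alpha * shared_rep + (1.0 - alpha) * prototype

merged = soft_condition([1.0, 0.0], [0.0, 1.0])
```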
Figures 17A and 17B illustrate a mechanism to calculate the prototype estimation, according to an embodiment of the present disclosure, corresponding to steps 1604 through 1610 of Fig. 16A. Now, to adjust the behavior of the context resolution block 302 [for correcting the advertised user identity], the system needs to re-compute the linear combinations of the multi-user input. Thus, it is explained in terms of probabilities how this input-selection process is governed. The autonomous system or neural network 306 has modelled the probability p(A|B). To get an optimal user input, p(B|A) is estimated.
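In standard Bayesian terms, the estimate sought here follows Bayes' rule, with p(A|B) being the likelihood modelled by the network 306, p(B) a prior over user inputs, and p(A) the evidence term whose estimation is addressed next:

```latex
\hat{p}(B \mid A) = \frac{p(A \mid B)\, p(B)}{p(A)}
```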
Accordingly, it is needed to estimate p(A). Until now, this had been a problem since "A" represents the total hypothesis space of a neural network. The activations of individual neurons inside it govern multiple possible hypotheses which a network can model. Since there is no unique p(A) that can be defined, it had been impossible to compute p(B|A). This problem is shown graphically by the shaded region [3] in Figure 17B. As can be seen, [2] represents the knowledge that can be injected by a user via datasets. However, [3] is a region which denotes the capacity of the network which has not yet been exposed to an external user B. For estimating this p(A) component, a constraining mechanism is proposed. Its working is explained in the forthcoming paragraphs.
The clipping mechanism for estimating p(A), in accordance with Fig. 17A, is explained below:
· Step 1702 denotes the training of the autonomous neural network as being divided into two phases [pre-phase 1 & phase 1].
· In step 1704, the sensitivity of a neuron to a particular feature is calculated using R1. This helps to converge upon a smaller subset of neurons which represent the additional features the machine would learn in the future.
· At step 1706, isolation of 1000 such neurons is performed. In the learning phase, only 100 of these would be learned on datasets. However, the remaining 900 represent the expansion capacity of the system.
· At steps 1708 and 1710, during the training process, the 'weights' of a neuron are clipped if its activation value reaches a pre-defined threshold. This is different from existing regularization techniques, since existing methods clip weights based on gradients. The present step proposes clipping based on activation thresholds.
· At step 1712, the estimate of p(A) obtained helps compute p(B|A) [governing the predicted user input].
· At steps 1714 and 1716, the prediction is then minimized with respect to the original inputs used to train the context resolution block.
The said procedure of estimating p(B|A) is just one of the possible ways. It is possible to implement a parametric model, such as a VAE, to model the optimal input combinations p(B) for training the context resolution block without deviating from the scope of the invention.
In an embodiment, the steps 1702 through 1716 may be referred to as comprising the steps of:
conducting an initial training of the ANN to shortlist a subset of neurons required to compute a hypothesis space p(A) of the ANN;
decrementing weights of the subset of neurons if an activation value associated with the ANN reaches a pre-defined threshold to compute the hypothesis space p(A); and
computing the posterior representation p̂(B|A) based on the computed hypothesis space p(A).
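The activation-threshold clipping of steps 1708 and 1710 can be sketched as follows; the decay factor and the per-neuron weight layout are assumptions, since the disclosure specifies only that weights are clipped when an activation reaches a pre-defined threshold:

```python
import numpy as np

def clip_weights_by_activation(weights, activations, threshold=1.0, decay=0.9):
    """Sketch of steps 1708-1710 under assumed names: unlike gradient-based
    clipping, a neuron's incoming weights are decremented whenever its
    activation value reaches a pre-defined threshold."""
    weights = np.array(weights, dtype=float)      # one row per neuron
    activations = np.asarray(activations, dtype=float)
    over = activations >= threshold               # neurons whose activation saturates
    weights[over] *= decay                        # clip (shrink) only those rows
    return weights

w = clip_weights_by_activation(
    weights=[[1.0, 1.0], [2.0, 2.0]],
    activations=[0.5, 1.5],   # only the second neuron reaches the threshold
)
```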
Figures 18A and 18B illustrate unsupervised clustering. In particular, Figure 18B explains how basic unsupervised clustering algorithms work, and Figure 18A explains the flow of context resolution for unsupervised clustering.
According to Figure 18B, unsupervised clustering proceeds by taking the two closest points and assigning them to the same cluster. The algorithm iterates until the system achieves a fixed number of two clusters, [4] and [5], where the number of clusters to be formed is given as an input to the algorithm.
Referring to Figure 18B, the algorithm takes two close points [4.1 and 4.2] in the subspace and keeps merging them until the desired number of clusters is obtained. However, the problem is that only two points can be taken at a time during clustering. If such a system were asked to take 3 points at the same time, the best it could do is take the three closest points in terms of distance. There is no notion of taking clustering decisions based on far-away [uncorrelated] points in the embedding space.
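The prior-art merging of Figure 18B can be sketched as a greedy centroid-linkage loop; the 1-D points and cluster count below are illustrative:

```python
import numpy as np

def greedy_merge(points, n_clusters):
    """Sketch of the prior-art flow of Fig. 18B: repeatedly merge the two
    closest clusters (distance between centroids) until the requested
    number of clusters remains."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per point
    pts = np.asarray(points, dtype=float)
    while len(clusters) > n_clusters:
        best, pair = None, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = pts[clusters[i]].mean(axis=0)
                cj = pts[clusters[j]].mean(axis=0)
                d = np.linalg.norm(ci - cj)
                if best is None or d < best:
                    best, pair = d, (i, j)
        i, j = pair
        clusters[i] += clusters.pop(j)  # merge the two closest clusters
    return clusters

out = greedy_merge([[0.0], [0.1], [5.0], [5.1]], n_clusters=2)
```

Note the limitation the text describes: each iteration can only consider the single closest pair.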
To address the aforesaid, Figure 18A describes the operation of a context resolution network in a variety of embodiments which are separate from the main flow of the invention. In the most generic form of the invention as described earlier, the context resolution network operates in combination with an autonomous network and a conditioning unit to achieve intelligence.
The mechanism of clustering in the UNSUPERVISED setting, as done in the prior state of the art, proceeds by 'greedily' estimating the pair of points which are closest to each other and gradually collapsing [or merging] them into one, until the number of clusters a person desires remains.
Figure 18A specifically illustrates an improved unsupervised clustering to achieve clustering at a 'coarser' level, according to an embodiment of the present disclosure. Figure 18A describes how context resolution can be used to perform clustering in UNSUPERVISED settings, to obtain the same sort of output as defined in Figure 18B. The only new aspect is that context resolution is now being used to perform the clustering. As per Figure 18A, context resolution can help achieve unsupervised clustering at a 'coarser' level than the state of the art.
At step 1802, the system takes two points as input at a time, as opposed to only a single point, to calculate clusters. The said clustering proceeds by passing a plurality of points as the input representation to the context resolution block.
At steps 1804 and 1806, the contribution norms inside the cross-stitch unit for each of the input points are examined.
At steps 1808 and 1810, if two points have equal [or almost equal, within some tolerance] contribution ratios, then they are said to belong to the same cluster.
In an embodiment, the steps 1802 through 1810, defining the clustering based on unlabelled input data, are defined by:
receiving unlabelled input data as input points at the convolution network;
examining a contribution-parameter associated with each of the input points by the cross-stitch unit; and
clustering the input points in the same cluster based on a degree of equivalence of the contribution parameter.
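The steps above can be sketched as follows; the contribution values stand in for the contribution norms learned inside the cross-stitch unit, and the tolerance is an assumed parameter:

```python
def cluster_by_contribution(contribs, tol=0.05):
    """Sketch of steps 1802-1810: points whose contribution ratios inside
    the cross-stitch unit are equal within a tolerance are assigned to the
    same cluster. `contribs` is a hypothetical stand-in for the learned
    contribution norms of each input point."""
    labels = [-1] * len(contribs)
    next_label = 0
    for i, c in enumerate(contribs):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j in range(i + 1, len(contribs)):
            # Steps 1808-1810: (almost) equal contribution -> same cluster.
            if labels[j] == -1 and abs(contribs[j] - c) <= tol:
                labels[j] = next_label
        next_label += 1
    return labels

labels = cluster_by_contribution([0.50, 0.52, 0.90, 0.91])
```

Unlike the distance-based merging of Figure 18B, points are grouped here by their learned contribution ratios rather than by pairwise proximity alone.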
Figures 19A and 19B illustrate supervised clustering based on labelled data. Figure 19A's [6.1] denotes that the context resolution block can be used to generate a single shared representation out of two points. Effectively, it is performing clustering of 4 points. This is achieved by taking two separate context resolution blocks [with each receiving 2 inputs]. The problem is then reduced to merging two shared representations, which can be solved by a KNN method. The advantage of this embodiment is that it introduces a coarser decision step on top of existing methods such as the KNN clustering method known in the state of the art.
Further, Figure 19B describes a flow, using context resolution, to take a decision step among different clusters in the subspace (assuming the cluster classes are pre-known). It assumes that the input data is SUPERVISED in nature, i.e. the individual classes are known [and the clusters have been formed beforehand]. In such an embodiment, the context resolution is used to estimate how different two clusters are from each other. The control flow may be explained as follows:
At step 1902, a plurality of points (belonging to different-clusters) is passed.
At step 1904 a difference threshold is defined.
At step 1906, the difference of their contribution ratios is calculated.
At step 1908, if the difference is found to be greater than the predefined threshold, then the separate clusters are said to be 'sufficiently different' vide step 1910 and a decision step is taken. Otherwise, the clusters are adjudicated as similar vide step 1912.
In an embodiment, the steps 1902 through 1912, defining the clustering based on labelled input data, are defined by:
receiving the clusters as labelled input points at the convolution network defining a cross stitch unit;
estimating difference of contribution amongst the clusters based on a contribution parameter associated with each of the clusters; and
deciding at least two clusters from amongst the clusters as different based on the estimated difference contribution between the at least two clusters exceeding a threshold.
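The decision flow of steps 1902 through 1912 can be sketched as a simple threshold test; the per-cluster contribution ratios and the threshold value are illustrative assumptions:

```python
def clusters_sufficiently_different(contrib_a, contrib_b, threshold=0.3):
    """Sketch of steps 1902-1912: two pre-formed clusters are judged
    'sufficiently different' when the difference of their contribution
    ratios exceeds a predefined threshold (step 1904); otherwise they
    are adjudicated as similar. The contribution values are hypothetical."""
    diff = abs(contrib_a - contrib_b)  # step 1906: difference of contribution ratios
    return diff > threshold            # steps 1908-1912: decide different vs similar

decision = clusters_sufficiently_different(0.8, 0.3)   # diff well above threshold
similar = clusters_sufficiently_different(0.45, 0.40)  # diff below threshold
```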
Figure 20 illustrates a practical application of Figure 19, according to an embodiment of the present disclosure. In particular, Figure 20 shows that the two planes follow the same relativistic concepts as Newtonian mechanics, i.e. a single plane can be either a user or an observer plane. [4] consists of a selection process that takes two planes as input and outputs which plane is the observer/user.
According to Figure 20, multiple users can be approximated as points in the subspace. A plane selection algorithm [4] then leverages the context resolution block to cluster these users into three separate planes [1, 2, 3]. Next, a hierarchy of information flow is established. Plane [1] denotes a professor whose knowledge is superior to that of a school teacher [2]. Plane [2] denotes a teacher who knows more than a student. Thus, the information flow in the planes is in decreasing order of knowledge. Consider plane [2]. A teacher grades many students in [3]. From this perspective, [2] becomes an observer plane, and [3] becomes the user plane. Similarly, for the plane [1]-[2] pairing, the teacher's plane now becomes a user plane. Thus, the said invention gives two advantages, i.e. a coarser method for achieving unsupervised clustering [as described above], and a plane selection algorithm [4] based on the relative context of different users [2, 3].
Figure 21 illustrates a swapping of the dynamic clustering and context resolution blocks, according to an embodiment of the present disclosure. In yet another embodiment as shown in Figure 21, there can be a situation where many users are present in the user plane. A single context resolution block [3] will not be able to form a shared representation for so many users because of limited model capacity. Thus, the dynamic clustering block [1] in the observer plane can be implemented in the user plane [2]. The plurality of users is clustered into respective groups. A single representation of each group is computed via mechanisms such as representation averaging. The context resolution now receives these "group representations" as inputs instead of individual user inputs. The same embodiment may cover a situation in which the context resolution block [3] has calculated a LOCAL shared representation. [3] queries a database for GLOBAL representations. However, it does not find those representations. In that case, a context resolution block [4] would be implemented in the observer plane, where the said GLOBAL representations of each class are combined to form optimal GLOBAL representations. This global representation [4] is then transferred to the local system [3], as has been explained in the best embodiment.
Figure 22 illustrates a use case 1 for resolving the decisions among multiple passengers in context resolution, according to the state of the art. According to Figure 22, the following scenarios are described.
1. Narration Settings
Consider a scenario where User 1 and User 2 are travelling in a self-driving car [1]. User 2 wants to complete his homework. The car's route is being adjusted according to wearable inputs from both passengers [7]. User 1 wants to reach the office on time. However, the car encounters a park along the route [2]. User 2 insists on being dropped at the park to complete his work with his friends. User 1 refuses [3] and takes User 2 to his office. User 2 is forced to work alone there [6].
2. Existing System (Traditional)
The system [7] was receiving both User 1's and User 2's input. When the car reached the park [2], User 1's decision to not stop was given a higher priority even though User 2 wanted to drop off at the park [4]. Although User 2 is a child, he knows the park better than User 1. Yet the system assumed that since User 1 is an elder, he is always right. This problem arose because the system 'permanently' assigned a higher priority to User 1's decision.
Figure 23 illustrates the use case 1 for resolving the decisions among multiple passengers in context resolution, according to an embodiment of the present disclosure. According to the proposed system, when the park comes [2], the context resolution block assigns a higher priority to User 2. It calculates that the current car location [the park] is more correlated to User 2 than to User 1. So, it learns to assign a higher weight to User 2's decision and automatically stops the car. Once User 2 drops off, the system reverts to its default behaviour of prioritizing User 1's input. The users who are given priority at each processing step are shown in bold. Thus, context resolution between two users changes the default ranking of multiple user inputs based on the relevance of each user to an external environmental trigger. Performing the said change in the ranking by a neural network, instead of database querying, leads to a lower time to achieve context resolution.
Figure 24 illustrates a use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to the state of the art. According to Figure 24, [1] Marshall [1] is watching his favourite movies channel. From the system's perspective, the system [4] is aware of the users watching the TV [2] and contains a channel list [4] from which a user [1] selects the channel he wants to view. [2] Lily requests Marshall to change to a sports channel. Marshall agrees and switches to sports. This "agreement" step was not handled by the system, but required Marshall and Lily to talk to each other.
Figure 25 illustrates the use case 2 for Channel Switching and Graph Matching in dynamic clustering, according to an embodiment of the present disclosure. According to Figure 25, [1] Marshall's representation [1] is fused with Lily's [5] to obtain a shared blob [6]. [2] Based on the newly created user [6], the system checks the relevance of the channels in the channel list. A single sports channel is selected and shown to Marshall. Computationally, it corresponds to the selection of the higher edge weight [7] in the matching graph. Thus, the system automatically resolves the decisions of multiple users by context resolution. This paves the way for new applications like graph matching [knowledge representation].
Figure 26 illustrates a use case 3 for combining "different" user clusters in context resolution, according to the state of the art. According to Figure 26, [1] Mr. John [1] is an assistant professor who wants to prepare a lecture on computer vision. He interacts with a flip [2] and queries a server [3] for all the presentations which took place at his workplace in the past. [2] But all the server contains are the lectures grouped into graduate-student and professor clusters. Mr. John is not satisfied since the content from cluster [4] is too easy, and that from cluster [5] too complex.
Figure 27 illustrates the use case 3 for combining "different" user clusters in context resolution, according to an embodiment of the present disclosure. The proposed [1] context resolution block combines the representations from the two clusters [4] and [5] and generates a new shared representation for an assistant professor [6]. This is then shared with Mr. John. [2] Note that the concept of forming shared clusters can be extended to multi-modal data, i.e. images, audio and text.
Figure 28 illustrates a use case 4 for Fluctuations of Heart during a match in dynamic clustering, according to the state of the art. This scenario represents 3 friends viewing a match via AR/VR [in a pandemic situation]. Existing systems consider each user to possess a separate identity. Hence, there is no notion of clustering users into groups of multiple sizes according to personal preference.
Figure 29 illustrates the use case 4 for Fluctuations of Heart during a match in dynamic clustering, according to an embodiment of the present disclosure. [2] Jaz and Sam are clustered into one group, while Lin is in a separate cluster. These clusters were formed based upon the direct relevance of each user to the match, i.e. whether they supported a team or not. [3] Now consider a third, indirect factor, i.e. eye-gaze. Lin sees Jaz, and their heartbeats increase. The system detects this user fluctuation and automatically switches Lin's cluster. Now, Lin and Jaz watch the match together and become better friends due to better personalization. Thus, dynamic clustering of a user (Mr. Lin) based on a change of purpose (increase of heartbeat) is performed, generating better personalization for people in the same cluster (Mr. Lin and Ms. Jaz are now in the same cluster).
Figures 30A, 30B and 30C illustrate a solution to the scenarios of Figure 1, according to an embodiment of the present disclosure. As seen from Figure 30A, the present mechanism advertises the data of each user. Figure 30A addresses the issues shown in Figure 1A. Accordingly, the traffic policeman [user 1] sees a self-driving car. The alcohol levels of all people inside the car are shown on the dashboard. Now user 1 can see the car's dashboard [2] from outside and fine the defaulters. Similarly, Figure 30B devises a method which can combine the data of multiple users into a SINGLE data item. Figure 30B addresses the issues shown in Figure 1B. Accordingly, inside the car, a system observes the alcohol contents of all occupants. The occupant whose alcohol content is maximum gets advertised on the car's dashboard [3]. The dashboard [3] contains only one item of information (the maximum alcohol level) instead of the information of all the occupants. Now user 1 has to read only a single alcohol level for each self-driving car. Further, Figure 30C devises a method for mapping between a person's identity and the shared representation. Figure 30C addresses the issues shown in Figure 1C. Accordingly, the name of the person who violated the traffic rule is also visible on the dashboard [3]. In simpler terms, instead of just <alcohol_level>, the information <occupant_id, alcohol_level> gets advertised on the dashboard. Thus, user 1 stops the car and fines only that particular person, instead of penalizing the driver every time.
Figure 31 illustrates machine learning based system 1000 for prediction and clustering, in accordance with an embodiment of the present invention. The present implementation of the machine learning (ML) based system 1000 for prediction and clustering may be implemented in hardware, software, firmware, or any combination thereof.
The ML based system 1000 includes an input and interaction module 1001 which is adapted to interpret the user's input and to generate a response to the user. The input is compared to a database of interrelated concepts, which may be employed through ML specification hardware 1002.
The ML based system 1000 further includes a virtual personal assistant (VPA) 1003 which can interact with one or more general-purpose hardware and drivers 1004 to provide access to information.
The ML based system 1000 further includes an ML specification application programming interface (API) 1005. On the basis of the identification of the user's input by the ML specification hardware 1002, the ML specification API 1005 may provide current knowledge regarding virtual personal assistance. The ML specification API 1005 can also change, update, and/or modify the virtual personal assistant's information, based on explicit and/or implicit feedback derived from user data such as user profiles, and from learning a person's preferences. Further, a multimedia database 1010 in collaboration with an ML logic 1007 may be provided. The ML logic 1007 may assist in updating the database by adding new concepts; relationships can be developed or strengthened based on machine learning.
In various implementations, the ML logic 1007 may include software logic modules that enable a virtual personal assistant to adapt the database to the user's usage patterns, preferences, priorities, etc. The ML logic 1007 is also adapted to index various observations according to a set of pre-determined features, where these features define the characteristics of observation data that are of interest to the virtual personal assistant.
As a non-limiting factor, a separate VPA 1003 may be provided which is adapted to interpret conversational user input and determine an appropriate response to the input. The VPA 1003 is also adapted to provide a response which can be easily understood and interpreted by the user. As a non-limiting factor, a plurality of software components may be implemented to accomplish such a task.
The VPA 1003 may be communicatively coupled with a network from where the VPA 1003 may fetch information from one or more websites. As a non-limiting factor, the websites may include the API. In some embodiments, the VPA 1003 may optionally be incorporated into other sub-systems or interactive software applications, for example, operating systems, middleware or framework, such as VPA specification API 1006 software, and/or user-level applications software (e.g., another interactive software application, such as a search engine, web browser or web site application, or a user interface for a computing device). Such applications may also include position-infotainment systems, position-based VPA applications, Smart devices, etc.
The ML based system 1000 may further perform simulation in a simulation engine 1008 based on the responses received from the VPA specification API 1006 and the ML logic 1007 and one or more objects databases 1011 to generate output and presentation 1009. In this way, the ML based system 1000 can adapt to a user's needs, preferences, lingo and more.
Further, some embodiments may also be implemented as instructions stored on one or more machine-readable media, which may further be read and executed by one or more processors. A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device or a "virtual machine" running on one or more computing devices). For example, a machine-readable medium may include any suitable form of volatile or non-volatile memory.
In an implementation, a processor as used herein refers to any type of computational circuit, such as, but not limited to, a microcontroller, a microprocessor, a complex instruction set computing microprocessor, a reduced instruction set computing microprocessor, a very long instruction word microprocessor, an explicitly parallel instruction computing microprocessor, a graphics processor, a digital signal processor, or any other type of processing circuit. The processors may also include embedded controllers, such as generic or programmable logic devices or arrays, application-specific integrated circuits, single-chip computers, smart cards, and the like.
An embodiment of the present invention may be implemented by using hardware only, or by using software together with a necessary universal hardware platform. The present invention may be implemented in the form of a procedure, function, module, etc. that implements the functions or operations described above. Based on such understanding, the technical solution of the present invention may be embodied in the form of software. The software may be stored in a non-volatile or non-transitory storage medium/module, which can be a compact disk read-only memory (CD-ROM), a USB flash disk, a removable hard disk, or a cloud environment. For example, such execution may correspond to a simulation of the logical operations as described herein. The software product may additionally or alternatively include a number of instructions that enable a computing device to execute operations for configuring or programming a digital logic apparatus in accordance with embodiments of the present invention.
In an example, the NLP/ML mechanism and VPA simulations underlying the present architecture 1300 may be remotely accessible and cloud-based, thereby being remotely accessible through a network connection. A computing device, such as a VPA device configured for remotely accessing the NLP/ML modules and simulation modules, may comprise skeleton elements such as a microphone, a camera, a screen/monitor, a speaker, etc.
Further, at least one of the plurality of modules may be implemented through AI based on ML/NLP logic. A function associated with AI may be performed through the non-volatile memory, the volatile memory, and the processor constituting the first hardware module, i.e. specialized hardware for ML/NLP based mechanisms. The processor may include one or a plurality of processors. At this time, the one or a plurality of processors may be a general purpose processor, such as a central processing unit (CPU), an application processor (AP), or the like, a graphics-only processing unit such as a graphics processing unit (GPU) or a visual processing unit (VPU), and/or an AI-dedicated processor such as a neural processing unit (NPU). The aforesaid processors collectively correspond to the processor.
The one or a plurality of processors control the processing of the input data in accordance with a predefined operating rule or artificial intelligence (AI) model stored in the non-volatile memory and the volatile memory. The predefined operating rule or artificial intelligence model is provided through training or learning.
Here, being provided through learning means that, by applying a learning logic/technique to a plurality of learning data, a predefined operating rule or AI model of a desired characteristic is made. The learning may be performed in a device itself in which AI according to an embodiment is performed, and/or may be implemented through a separate server/system.
The AI model may consist of a plurality of neural network layers. Each layer has a plurality of weight values, and performs a layer operation through calculation between the output of a previous layer and the plurality of weights. Examples of neural networks include, but are not limited to, convolutional neural networks (CNN), deep neural networks (DNN), recurrent neural networks (RNN), restricted Boltzmann machines (RBM), deep belief networks (DBN), bidirectional recurrent deep neural networks (BRDNN), generative adversarial networks (GAN), and deep Q-networks.
The ML/NLP logic is a method for training a predetermined target device (for example, a robot) using a plurality of learning data to cause, allow, or control the target device to make a determination or prediction. Examples of learning techniques include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The systems, methods, and examples provided herein are illustrative only and not intended to be limiting.
While specific language has been used to describe the present subject matter, any limitations arising on account thereto, are not intended. As would be apparent to a person in the art, various working modifications may be made to the method in order to implement the inventive concept as taught herein. The drawings and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment.
Claims (15)
- A method of context-resolution in a computing environment, said method comprising:
  receiving (102) input data from a plurality of users;
  clustering (104) the users into a plurality of clusters, wherein each of the plurality of clusters represents a sub-plurality of users;
  computing (106) a prototypical representation for each cluster based on the received input data to obtain a plurality of prototypical representations; and
  processing (108), by a convolution network, the plurality of prototypical representations to perform a context-resolution with respect to the plurality of users based on at least one of:
  computing (110) a shared representation of the plurality of prototypical representations based on learning a contribution parameter of each of the plurality of users; and
  determining (112) a data contribution of each of the plurality of users within the shared representation based on the learned contribution parameter.
- The method as claimed in claim 1, wherein the convolution network operates based on learning the input from the plurality of users in at least one single phase.
- The method as claimed in claim 1, wherein the clustering corresponds to one or more of unsupervised or supervised machine learning and is defined by clustering of correlated or uncorrelated users within the plurality of users.
- The method as claimed in claim 3, wherein the correlated users correspond to users exhibiting similar data and located near each other in an embedding space; and
  wherein the uncorrelated users correspond to users exhibiting dissimilar data and located far from each other in the embedding space.
- The method as claimed in claim 1, wherein the clustering based on an unlabelled input data is defined by:
  receiving unlabelled input data as input points at the convolution network;
  examining a contribution parameter associated with each of the input points by the cross-stitch unit; and
  clustering the input points in a same cluster based on a degree of equivalence of the contribution parameter.
- The method as claimed in claim 1, wherein the clusters are classified as similar or different based on one or more of:
  receiving the clusters as labelled input points at the convolution network defining a cross-stitch unit;
  estimating a difference of contribution amongst the clusters based on a contribution parameter associated with each of the clusters; and
  deciding at least two clusters from amongst the clusters as different based on the estimated difference of contribution between the at least two clusters exceeding a threshold.
- The method as claimed in claim 1, further comprising:
  receiving the shared representation from the cross-stitch model corresponding to the plurality of clusters;
  predicting, from an artificial neural network (ANN), at least one user identity for associating with the shared representation based on the steps of:
  a) executing a hard conditioning criteria defined by:
  learning a plurality of default templates for each cluster; and
  selecting the default template from amongst the plurality of default templates, said selected default template corresponding to the cluster in the shared representation; and
  b) executing a soft conditioning criteria;
  communicating the at least one user identity from the ANN for triggering dynamic clustering based on said user identity corresponding to a single entity; and
  prohibiting the communicating of the at least one user identity based on said user identity corresponding to multiple entities.
- The method as claimed in claim 6, wherein the execution of the hard conditioning criteria comprises:
  converting the shared representation to a plurality of discrete integers by a classifier unit, each integer representing the cluster or a combination of clusters;
  feeding the plurality of discrete integers from the classifier to an embedding unit to project a plurality of cluster identities; and
  forwarding the plurality of cluster identities from the embedding unit to the ANN to enable selection of the user identity for associating with the shared representation, said selected user identity corresponding to the maximum contributing user.
- The method as claimed in claim 7, wherein the execution of the soft conditioning criteria comprises:
  computing cross-correlation among a plurality of channels of a feature stream associated with the shared representation;
  computing a plurality of constraints associated with the clusters;
  computing a loss-function representing soft conditioning based on the computed constraints; and
  re-initiating soft conditioning based on iteratively aggregating the shared representation with a prototype estimate from an observer module.
- The method as claimed in claim 7, further comprising:
  receiving, by an observer module, a sample drawn from a predicted distribution p(A/B) of clusters, and the predicted user identity from the ANN;
  projecting the received distribution to a subspace, said subspace comprising predictions from a plurality of ANNs;
  grouping data corresponding to the projected representation into a number of classes determined based on the user identity, wherein each class is associated with a label or integer derived from the user identity;
  computing a prototypical representation for at least one class of the plurality of classes based on sensing a mismatch between the user identity and the output of the ANN; and
  training at least one of the context resolution, the hard conditioning criteria, or the soft conditioning criteria to adjust the prediction of the ANN to correct the user identity upon sensing the mismatch.
- The method as claimed in claim 7, further comprising:
  receiving, by an observer module, one or more samples drawn from a predicted distribution p(A/B) of clusters, and the predicted user identity from the ANN;
  determining an integer value representative of the predicted user identity, said integer value defining a cluster class identifier;
  clustering the samples based on a number of classes associated with the received user identity;
  computing prototypical estimates for each of the classes of clustered samples based on averaging inputs received from the plurality of ANNs;
  estimating a hypothesis p^(A/B) for each cluster of the clustered samples based on receipt of inputs from the plurality of ANNs; and
  sharing the estimated hypothesis p^(A/B), the prototypical estimates, and the cluster class identifier by the observer module with the ANN.
- The method as claimed in claim 10, wherein the training of the context resolution comprises:
  computing, based on a probabilistic graphical model, a posterior representation p^(B/A) from the prototype estimate; and
  updating a plurality of weights associated with the context resolution based on said posterior representation.
- The method as claimed in claim 10, wherein the training of the soft conditioning comprises aggregating the prototype estimate with the shared representation predicted by the context resolution.
- The method as claimed in claim 10, wherein the training of the hard conditioning is based on:
  providing the cluster class identifier by the observer module to the classifier associated with the hard conditioning; and
  observing a difference between the cluster class identifier of the observer module and a class identifier associated with the user identity predicted at the ANN, said user identity associated with the user who contributed the most to the shared representation computed by the context resolution.
- The method as claimed in claim 12, wherein the computing of the posterior representation p^(B/A) comprises:
  conducting an initial training of the ANN to shortlist a subset of neurons required to compute a hypothesis space p(A) of the ANN;
  decrementing weights of the subset of neurons if an activation value associated with the ANN reaches a pre-defined threshold to compute the hypothesis space p(A); and
  computing the posterior representation p^(B/A) based on the computed hypothesis space p(A).
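The pipeline of claim 1 (receive user data, cluster, compute per-cluster prototypes, combine them into a shared representation weighted by per-user contribution parameters) can be sketched numerically. This is a minimal illustration only, not the claimed cross-stitch network: it assumes a one-pass nearest-seed clustering, mean prototypes, and softmax-normalised contribution parameters standing in for the learned ones; all names and shapes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)

# Step (102): input data from a plurality of users (one embedding per user).
names = [f"user{i}" for i in range(6)]
X = rng.standard_normal((len(names), 3))

# Step (104): cluster correlated users. A one-pass nearest-seed assignment
# stands in for the clustering of the embodiment; two clusters assumed.
seeds = X[:2]
assign = np.argmin(((X[:, None] - seeds[None]) ** 2).sum(-1), axis=1)

# Step (106): a prototypical representation per cluster (here: the mean of
# the cluster's members).
prototypes = np.stack([X[assign == k].mean(0) for k in range(2)])

# Steps (108)-(112): a shared representation formed as a mixture of the
# prototypes, weighted by a contribution parameter per user (softmax-
# normalised here in place of the learned cross-stitch parameters). The
# normalised weights also expose each user's data contribution.
logits = rng.standard_normal(len(names))
contrib = np.exp(logits) / np.exp(logits).sum()
shared = sum(c * prototypes[assign[i]] for i, c in enumerate(contrib))

print(shared.shape)                      # (3,)
print(round(float(contrib.sum()), 6))    # 1.0
```

Because the contribution parameters are normalised, the per-user weight `contrib[i]` directly serves as the "data contribution" of step (112), and the user with the largest weight corresponds to the maximum contributing user referenced in claim 8.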
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN202111022563 | 2021-05-20 | ||
IN202111022563 | 2021-07-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022245134A1 (en) | 2022-11-24 |
Family
ID=84141993
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2022/007123 WO2022245134A1 (en) | 2021-05-20 | 2022-05-18 | A system and method for context resolution and purpose driven clustering in autonomous systems |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022245134A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160364002A1 (en) * | 2015-06-09 | 2016-12-15 | Dell Products L.P. | Systems and methods for determining emotions based on user gestures |
KR20190106950A (en) * | 2019-08-31 | 2019-09-18 | 엘지전자 주식회사 | Artificial device and method for controlling the same |
US20190354555A1 (en) * | 2017-01-04 | 2019-11-21 | International Business Machines Corporation | Dynamic faceting for personalized search and discovery |
US20200019644A1 (en) * | 2018-07-10 | 2020-01-16 | Reflektion, Inc. | Automated Assignment Of User Profile Values According To User Behavior |
US20200073953A1 (en) * | 2018-08-30 | 2020-03-05 | Salesforce.Com, Inc. | Ranking Entity Based Search Results Using User Clusters |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020190112A1 (en) | Method, apparatus, device and medium for generating captioning information of multimedia data | |
WO2020159232A1 (en) | Method, apparatus, electronic device and computer readable storage medium for image searching | |
WO2020138928A1 (en) | Information processing method, apparatus, electrical device and readable storage medium | |
WO2019164251A1 (en) | Method of performing learning of deep neural network and apparatus thereof | |
WO2018128362A1 (en) | Electronic apparatus and method of operating the same | |
AU2018319215B2 (en) | Electronic apparatus and control method thereof | |
WO2019135631A1 (en) | Electronic device for obfuscating and decoding data and method for controlling same | |
WO2016017987A1 (en) | Method and device for providing image | |
WO2022075668A1 (en) | Artificial intelligence model distributed processing system, and method for operating same | |
WO2021261836A1 (en) | Image detection apparatus and operation method thereof | |
EP3545436A1 (en) | Electronic apparatus and method of operating the same | |
WO2021167210A1 (en) | Server, electronic device, and control methods therefor | |
WO2019240562A1 (en) | Electronic device and operating method thereof for outputting response to user input, by using application | |
WO2019135621A1 (en) | Video playback device and control method thereof | |
WO2020246647A1 (en) | Artificial intelligence device for managing operation of artificial intelligence system, and method therefor | |
WO2021132922A1 (en) | Computing device and operation method thereof | |
WO2023080276A1 (en) | Query-based database linkage distributed deep learning system, and method therefor | |
WO2020262721A1 (en) | Control system for controlling plurality of robots by using artificial intelligence | |
WO2023068821A1 (en) | Multi-object tracking device and method based on self-supervised learning | |
WO2018124464A1 (en) | Electronic device and search service providing method of electronic device | |
WO2022245134A1 (en) | A system and method for context resolution and purpose driven clustering in autonomous systems | |
WO2023048537A1 (en) | Server and method for providing recommendation content | |
WO2022045613A1 (en) | Method and device for improving video quality | |
WO2023277663A1 (en) | Image processing method using artificial neural network, and neural processing unit | |
WO2023008599A1 (en) | Video editing device and operation method of video editing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22804988 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22804988 Country of ref document: EP Kind code of ref document: A1 |