US12380360B2 - Interpretable imitation learning via prototypical option discovery for decision making - Google Patents
- Publication number
- US12380360B2 (U.S. application Ser. No. 17/323,475)
- Authority
- US
- United States
- Prior art keywords
- option
- learning
- options
- prototypical
- policy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
Definitions
- the present invention relates to imitation learning and, more particularly, to methods and systems related to interpretable imitation learning via prototypical option discovery.
- a method for learning prototypical options for interpretable imitation learning includes initializing options by bottleneck state discovery, each of the options presented by an instance of trajectories generated by experts, applying segmentation embedding learning to extract features to represent current states in segmentations by dividing the trajectories into a set of segmentations, learning prototypical options for each segment of the set of segmentations to mimic expert policies by minimizing loss of a policy and projecting prototypes to the current states, training option policy with imitation learning techniques to learn a conditional policy, generating interpretable policies by comparing the current states in the segmentations to one or more prototypical option embeddings, and taking an action based on the interpretable policies generated.
- a non-transitory computer-readable storage medium comprising a computer-readable program for learning prototypical options for interpretable imitation learning.
- the computer-readable program when executed on a computer causes the computer to perform the steps of initializing options by bottleneck state discovery, each of the options presented by an instance of trajectories generated by experts, applying segmentation embedding learning to extract features to represent current states in segmentations by dividing the trajectories into a set of segmentations, learning prototypical options for each segment of the set of segmentations to mimic expert policies by minimizing loss of a policy and projecting prototypes to the current states, training option policy with imitation learning techniques to learn a conditional policy, generating interpretable policies by comparing the current states in the segmentations to one or more prototypical option embeddings, and taking an action based on the interpretable policies generated.
- a method for learning prototypical options for interpretable imitation learning includes dividing a task, by a processor, into a plurality of sub-tasks via a learning policy over options, learning, by the processor, different options to solve each of the plurality of sub-tasks by mimicking expert policy, and fine-tuning the learning policy to learn to take an action based on the task.
- FIG. 1 is a block/flow diagram of an exemplary option selection mechanism, in accordance with embodiments of the present invention.
- FIG. 2 is a block/flow diagram of an exemplary prototypical option discovery for interpretable imitation learning (IPOD) architecture, in accordance with embodiments of the present invention.
- FIG. 3 is a block/flow diagram of an exemplary method for employing the IPOD architecture of FIG. 2, in accordance with embodiments of the present invention.
- FIG. 4 is a block/flow diagram of an exemplary method for employing the option initialization, segmentation embedding learning, prototypical option learning, and option policy learning components of FIG. 3, in accordance with embodiments of the present invention.
- FIG. 5 is a block/flow diagram of a practical application of the IPOD architecture, in accordance with embodiments of the present invention.
- FIG. 6 is an exemplary processing system for the IPOD architecture, in accordance with embodiments of the present invention.
- FIG. 7 is a block/flow diagram of an exemplary method for executing the IPOD architecture, in accordance with embodiments of the present invention.
- FIG. 8 illustrates exemplary equations for implementing the IPOD architecture, in accordance with embodiments of the present invention.
- Imitation learning, which mimics experts' behaviors, is beneficial to finding meaningful structure or skills in the experts' demonstrations.
- they are usually considered as “black-boxes” which lack transparency, limiting their application in many decision-making scenarios, e.g., healthcare and finance.
- a variety of methods learn a hidden variable of the variation underlying expert demonstrations to construct the structure of expert policy and visualize the changes in the hidden variable.
- post-hoc explanations do not explain the reasoning process of how the model makes its decisions and can be incomplete or inaccurate in capturing the reasoning process of the original model. Therefore, it is often desirable to have models with built-in interpretability.
- the exemplary embodiments address such issues by defining a form of interpretability in imitation learning that imitates human abstraction and explains its reasoning in a human-understandable manner.
- the exemplary methods employ prototype learning to discover options for built-in interpretable imitation learning.
- Prototype learning, which derives from the study of human reasoning, is a form of case-based reasoning that makes decisions by comparing new inputs with a few data instances (prototypes), in, e.g., image recognition, sequence classification, sequence segmentation, etc.
- the exemplary methods discover prototypical options for interpretable imitation learning.
- the exemplary methods introduce a network architecture referred to as prototypical option discovery (IPOD).
- Each prototypical option is responsible for explaining a group of variable-length segments of the demonstration trajectories.
- IPOD uses an LSTM with a soft-attention mechanism to derive segment embeddings.
- the exemplary methods learn a prototypical contextual policy to take action with states as well as the option embedding, which is determined based on centroids of the segment embedding, as inputs.
- the model is interpretable, in the sense that it has a transparent reasoning process when making decisions.
- the exemplary methods define several criteria for constructing the prototypes, including option diversity and prediction accuracy.
- the exemplary embodiments introduce an imitation learning framework that learns interpretable policy via prototypical options which include segmentation prototypes.
- the exemplary embodiments enable learning the prototypical option embedding by weighted segmentation for sparsity and learn the prototypical option's policy by deriving the option-relevant information via the option embedding.
- the goal is to learn a new policy π, which imitates the expert behavior by maximizing the likelihood of the given demonstration trajectories.
- the behavior of an expert agent can be copied to accomplish a desired task.
- Imitation learning refers to learning a policy that mimics the behavior of experts who demonstrate how to perform the given task.
- Imitation learning has various approaches.
- One approach is behavior cloning (BC), which directly maps from the state to the action. This method usually learns a policy through standard supervised learning. BC does not perform any additional policy interactions with the learning environment, but it suffers from distributional drift.
- Another approach is inverse reinforcement learning (IRL), which learns a policy by recovering the reward function from demonstrations and with dense reward signals provided from the learned reward function.
- Adversarial imitation learning (AIL) and IRL require interacting with the environment to generate the agent's trajectory for comparison with the expert's trajectory.
- imitation learning with neural networks efficiently learns a desired behavior in complex environments.
- these methods are usually considered as “black-boxes,” which lack transparency.
- the exemplary methods introduce an interpretable imitation learning framework for more applications of imitation learning, e.g., healthcare, finance, etc.
- An option is a generalization of an action (also known as a skill, sub-policy or a sub-goal).
- an option is a three-tuple that includes the initiation condition, the termination probability, and the policy of the option.
- Options offer great potential for mitigating the difficulty of solving complex Markov decision processes (MDPs) via temporally extended actions.
- Interpretable modeling mainly falls into two categories, that is, intrinsic explanation which makes the model transparent by restricting the complexity, e.g., decision tree or case-based (prototype-based) model, and post-hoc explanation, which is achieved by analyzing the model after training, e.g., extracting the importance of states via attention and distilling a black-box policy into a simple structure policy.
- a set of post-hoc imitation learning methods has been proposed for generating meaningful policies.
- the intrinsic explanation model is sometimes desirable since post-hoc explanations usually do not fit the original model precisely.
- Prototype learning which draws conclusions for new inputs by comparing them with a few exemplary cases (e.g., prototypes) belongs to the intrinsic explanation method.
- the options framework models skills as options, each of which is a closed-loop policy for solving a sub-task. For example, picking up an object, jumping, etc. are options, which require an agent to take actions over a period of time.
- An option o includes the following components: its initiation condition, I_o(s), which determines whether o can be executed in state s; its termination condition, β_o(s), which determines whether option execution must terminate in state s; and its closed-loop control policy, π_o(s), which maps state s to a low-level action a.
- Prototype theory emerged in 1971 with the work of psychologist Eleanor Rosch, and it has been described as a “Copernican revolution” in the theory of categorization.
- According to prototype theory, any given concept in any given language has a real-world example that best represents this concept. For instance, when asked to give an example of the concept of fruit, an apple is more frequently cited than, say, a durian. This theory claims that the presumed natural prototypes were central tendencies of the categories.
- Prototype theory has also been applied in machine learning, where a prototype is defined as a data instance that is representative of all the data. There are many approaches to find prototypes in the data. Any clustering algorithm that returns actual data points as cluster centers would qualify for selecting prototypes.
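That last point can be sketched concretely. The following toy example (1-D data and pre-formed clusters; all names are illustrative, not from the patent) picks, for each cluster, the actual member closest to the cluster mean, so every prototype is a real data instance rather than a synthetic centroid:

```python
def medoid_prototype(cluster):
    """Return the member of `cluster` nearest to the cluster mean,
    i.e. an actual data instance rather than a synthetic centroid."""
    mean = sum(cluster) / len(cluster)
    return min(cluster, key=lambda x: abs(x - mean))

clusters = [[1.0, 1.2, 0.8], [9.0, 10.5, 10.0]]
prototypes = [medoid_prototype(c) for c in clusters]
print(prototypes)  # each prototype is a real data point: [1.0, 10.0]
```

Any clustering algorithm that returns actual data points as centers (e.g., a k-medoids variant) realizes the same idea in higher dimensions.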
- a prototypical option o is a kind of option that can be represented by an instance of the trajectories generated by the experts.
- a prototypical option o includes four components ⟨I_o, β_o, π_o, g_o⟩, that is, an intra-option policy π_o: S × A → [0, 1], a termination condition β_o: S → [0, 1], an initiation state set I_o ⊆ S, and an option prototype g_o.
- a prototypical option ⟨I_o, β_o, π_o, g_o⟩ is available in state s_t if and only if s_t ∈ I_o. If the option is taken, then actions are selected according to π_o until the option terminates according to β_o.
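These execution semantics can be illustrated in code. The sketch below (a hypothetical `PrototypicalOption` container and a toy 1-D chain environment, neither taken from the patent) shows an option that is applicable only when the state is in its initiation set and whose intra-option policy runs until the termination condition fires:

```python
import random
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class PrototypicalOption:
    """Hypothetical container for the four components <I_o, beta_o, pi_o, g_o>."""
    initiation: Set[int]                  # I_o: states where the option may start
    terminate: Callable[[int], float]     # beta_o(s): termination probability
    policy: Callable[[int], int]          # pi_o(s): low-level action
    prototype: list                       # g_o: exemplar segment explaining the option

def run_option(option, s, step, max_steps=100, rng=random.Random(0)):
    """Execute an option from state s, returning the visited states."""
    assert s in option.initiation, "option not available in this state"
    visited = [s]
    for _ in range(max_steps):
        if rng.random() < option.terminate(s):
            break
        s = step(s, option.policy(s))
        visited.append(s)
    return visited

# Toy 1-D chain: the option "move right until state 5".
move_right = PrototypicalOption(
    initiation={0, 1, 2, 3, 4},
    terminate=lambda s: 1.0 if s >= 5 else 0.0,
    policy=lambda s: +1,
    prototype=[0, 1, 2, 3, 4, 5],
)
trace = run_option(move_right, 0, step=lambda s, a: s + a)
print(trace)  # climbs from 0 until the termination condition fires at 5
```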
- g o is considered as a real-world example to explain the option.
- Options discovery is based on the intuition that it would be easier to solve the long-horizon task from temporal abstraction, e.g., separate or divide the long-horizon task into a set of sub-tasks, and select different options to solve for each sub-task.
- This intuition informs the steps of the algorithm, that is, breaking or dividing the trajectories into a set of sub-tasks via learning a policy π_h over options, learning (or discovering) options that could solve these sub-tasks by mimicking the expert's policy, and, once such options are learned, the exemplary embodiments fine-tune π_h to learn to take an option based on the current task.
- the goal is to first break or divide the trajectories τ into M disjoint segments (g_1, g_2, . . . , g_M).
- the exemplary embodiments leverage prototype learning to introduce an interpretable imitation learning framework by prototypical option discovery, where each prototypical option is responsible for explaining a group of variable-length segments of the demonstration trajectory.
- IPOD 200 addresses interpretable imitation learning tasks with steps to learn prototypical options ⟨I_o, β_o, π_o, g_o⟩.
- the exemplary methods learn a policy π_h(o|s) over options.
- the exemplary methods map each segment into an option embedding
- the exemplary methods learn a prototypical contextual policy π(a|s, o).
- the O(s_t) is updated according to the π_h(o|s).
- IPOD 200 is shown in FIG. 2.
- The policy π_h(o|s) is learned by choosing the admissible prototypical option. Since the exemplary methods utilize imitation learning to learn the intra-option policy, the reward of π_h(o|s) is obtained by the selected option π_o, which takes primitive actions and receives the reward signal. Thus, the reward of the option is the cumulative reward of the actions taken from the current time to the termination of the option: r_{t:t+τ} = r_t + . . . + r_{t+τ}, where τ ∈ [0, T] is the time interval of the option and t+τ is the termination of the option o_t.
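The option-level reward is thus just a sum over the primitive rewards collected while the option runs. A minimal sketch of the undiscounted form described above (variable names are illustrative):

```python
def option_reward(rewards, t, tau):
    """Cumulative reward r_{t:t+tau} = r_t + ... + r_{t+tau} collected
    between the option's start t and its termination t + tau."""
    return sum(rewards[t : t + tau + 1])

rewards = [0.0, 1.0, 0.5, 2.0, 0.0]
total = option_reward(rewards, t=1, tau=2)
print(total)  # r_1 + r_2 + r_3 = 3.5
```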
- the option-value Q(s_t, o_t) refers to the expected rewards for an option o_t taken in a given state s_t.
- the above equations show how the exemplary methods can learn the policy ⁇ h over option and use it for selecting options.
- the exemplary methods must assign appropriate initial parameters to ⁇ h .
- the exemplary methods segment the trajectories by detecting the bottleneck states within the trajectories. Bottlenecks have been defined as those states which appear frequently on successful trajectories to a goal but not on unsuccessful ones or as nodes which allow for densely connected regions of the interaction graph to reach other such regions.
- bottleneck areas have been described as the border states of densely connected areas in the state space or as states that allow transitions to a different part of the environment.
- a more formal definition defines bottleneck areas as those states which are local maxima of betweenness, a measure of centrality on graphs, on a transition graph.
- the exemplary methods extract all the states in the trajectories, and use density-based spatial clustering methods (e.g., DBSCAN) to automatically cluster the states into K groups.
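In practice one would use an off-the-shelf DBSCAN implementation (e.g., scikit-learn's) for this step. The self-contained sketch below captures the spirit of density-based clustering in simplified form: it floods epsilon-neighborhoods greedily and marks isolated states as noise, omitting DBSCAN's full core/border-point machinery (all names and the 1-D "states" are illustrative):

```python
def density_cluster(points, eps=1.0, min_pts=2):
    """Greedy epsilon-neighborhood flooding: a simplified stand-in for DBSCAN.
    Returns a label per point; -1 marks noise (too few neighbors)."""
    labels = [None] * len(points)
    def neighbors(i):
        return [j for j in range(len(points)) if abs(points[i] - points[j]) <= eps]
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if len(neighbors(i)) < min_pts:
            labels[i] = -1          # noise: an isolated state
            continue
        stack = [i]
        while stack:                # flood the dense region
            j = stack.pop()
            if labels[j] in (None, -1):
                labels[j] = cluster_id
                stack.extend(k for k in neighbors(j) if labels[k] is None)
        cluster_id += 1
    return labels

# Two dense groups of 1-D "states" and one outlier.
states = [0.0, 0.2, 0.4, 5.0, 5.1, 9.9]
labels = density_cluster(states, eps=0.5, min_pts=2)
print(labels)
```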
- the exemplary methods aim to learn the option prototype, which is a sub-trajectory or segment generated by the experts. Each option prototype is responsible for explaining a group of variable-length segments of the demonstration trajectory g_m generated by π_h.
- the exemplary methods map each group of segments g_{m,k} individually into a low-dimensional embedding by classifying the segment into the corresponding option's category k.
- the exemplary methods learn o k by minimizing the distance between o k and g m,k .
- the exemplary methods consider the segment which has the smallest distance with o k as the option prototype of o k .
- the exemplary methods aim to learn a meaningful latent space to represent the segments, where they are clustered (in L2-distance) around semantically similar prototypical options, and the clusters from different classes are well-separated.
- the exemplary methods use a long short-term memory (LSTM) to learn the segment's representation
- the exemplary methods minimize the distance between each segment embedding and its closest prototype o_k.
- the exemplary methods leverage both supervised learning and imitation learning regarding the effectiveness and interpretability.
- the exemplary methods attempt to minimize the least-squares loss between g and o_k, and prevent the learning of multiple similar prototypical options.
- the exemplary methods use a diversity regularization term that penalizes prototypical options that are close to each other. Meanwhile, the exemplary methods also consider the downstream task (e.g., imitation learning).
- the first term is for effectiveness, where an imitation learning objective function is conducted to learn the segment embeddings and option prototype embeddings to mimic expert's policy ⁇ E .
- IM loss (reproduced below) can be any imitation learning method, e.g., a behavior cloning loss or an adversarial imitation learning objective.
- the second term is for interpretability where an evidence regularization is used to encourage each prototypical option embedding to be as close to an encoded instance as possible.
- the third term is a diversity regularization term to learn diversified options, where d_min is a threshold that classifies or determines whether two prototypes are close or not. d_min is set to 1.0 in exemplary embodiments. λ_1, λ_2, λ_3 ∈ [0, 1] are the weights used to balance the three loss terms.
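The three-term objective can be sketched numerically. In the toy version below, the imitation term is a stand-in placeholder (just a scalar passed in), while the evidence and diversity regularizers follow the description above; 1-D embeddings, the weights, and all names are illustrative assumptions, not the patent's exact formulation:

```python
def prototype_loss(seg_embs, protos, im_loss, lam=(1.0, 0.1, 0.1), d_min=1.0):
    """L = lam1 * IM + lam2 * evidence + lam3 * diversity (1-D embeddings for brevity)."""
    # Evidence term: every prototype should sit close to at least one encoded segment.
    evidence = sum(min((p - g) ** 2 for g in seg_embs) for p in protos)
    # Diversity term: penalize prototype pairs closer than d_min.
    diversity = sum(
        max(0.0, d_min - abs(protos[i] - protos[j])) ** 2
        for i in range(len(protos)) for j in range(i + 1, len(protos))
    )
    return lam[0] * im_loss + lam[1] * evidence + lam[2] * diversity

segments = [0.0, 0.1, 2.0, 2.2]   # encoded segment embeddings
prototypes = [0.05, 2.1]          # current prototypical option embeddings
loss = prototype_loss(segments, prototypes, im_loss=0.5)
print(loss)  # imitation term dominates; both regularizers are nearly satisfied
```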
- each option o maintains its own policy π_o: s → a_t, which is parameterized by its own parameters θ_o.
- the exemplary methods propose a contextual policy π_θ(a_t|s_t, o) that conditions on the current state and the selected option embedding.
- the exemplary methods train the option policy π_θ(a_t|s_t, o) with imitation learning techniques.
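A sketch of how such a contextual policy can condition on both the state and the selected option embedding (a linear-softmax toy model; the weights, dimensions, and names are made up for illustration and are not the patent's architecture):

```python
import math

def contextual_policy(state, option_emb, W):
    """pi_theta(a | s, o): score each action from the concatenated [state; option] features."""
    x = state + option_emb                      # feature concatenation (list append)
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    z = max(logits)                             # numerically stabilized softmax
    exps = [math.exp(l - z) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

state = [1.0, 0.0]
option_emb = [0.0, 1.0]           # embedding of the selected prototypical option
W = [[1.0, 0.0, 0.0, 0.0],        # one row of weights per discrete action
     [0.0, 0.0, 0.0, 2.0]]
probs = contextual_policy(state, option_emb, W)
print(probs)  # a distribution over actions; the option embedding shifts the preference
```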
- the goal of adversarial imitation learning is to minimize the JS divergence between trajectory distribution generated by the expert's policy and the option's policy.
- the exemplary methods use the same policy loss for both option prototypes and option policy, but the exemplary methods only optimize the parameters of option prototypes or option policy for each optimization step.
- w_1, w_2, w_3 ∈ [0, 1] are hyper-parameters to balance the weights of the three kinds of loss.
- the exemplary methods first initialize K groups of segments, followed by iteratively optimizing ℒ_option + ℒ_IL + ℒ_emb.
- the exemplary embodiments introduce an interpretable imitation learning framework by discovering compositional structure which is called prototypical option discovery imitation learning (IPOD).
- IPOD constructs prototypical options which embed the skills of experts by an option embedding and an option policy via a prototype learning framework.
- IPOD generates interpretable agent policies by comparing the state segmentations to a few prototypical option embeddings followed by taking an action based on the option embedding.
- the exemplary model of the present invention uses a soft attention mechanism to derive prototypical option embedding from trajectory fragments.
- the exemplary methods also use the soft attention mechanism to create a bottleneck in the agent, forcing it to focus on option-relevant information.
- FIG. 3 is a block/flow diagram of an exemplary method 300 for employing the IPOD architecture of FIG. 2 , in accordance with embodiments of the present invention.
- option initialization takes place:
- the IPOD first initializes the options by bottleneck state discovery methodology.
- the exemplary methods identify states that connect different densely connected regions in the state space.
- the exemplary methods use the behavior cloning method with soft attention mechanism to obtain important states with large attention weights.
- the important states can then be found with DBSCAN clustering.
- the dense clusters derived from DBSCAN are used for option initialization.
- the policy over options learning takes place:
- a prototypical option o includes four components ⟨I_o, β_o, π_o, g_o⟩: an intra-option policy π_o: S × A → [0, 1], a termination condition β_o: S → [0, 1], an initiation state set I_o ⊆ S, and its option prototype g_o.
- The policy π_h(o|s) is learned to choose the admissible prototypical option. Since the exemplary methods utilize imitation learning to learn the intra-option policy, the reward of π_h(o|s) is obtained by the selected option π_o, which takes primitive actions and receives the reward signal. Thus, the reward of the option is the cumulative reward of the actions taken from the current time to the termination of the option: r_{t:t+τ} = r_t + . . . + r_{t+τ}, where τ ∈ [0, T] is the time interval of the ongoing option, and t+τ is the termination of the option o_t.
- the exemplary methods update π_h(o|s), taking option o_t at state s_t, according to the policy gradient: ∇J = E_{s∼π_h}[Q(s_t, o_t) ∇ log π_h(o_t|s_t)].
- The option-value Q(s_t, o_t) refers to the expected rewards for an option o_t taken in a given state s_t.
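The policy-gradient update over options can be sketched with a tabular softmax policy. This is a generic REINFORCE-style step that weights the log-probability gradient by Q(s_t, o_t), offered as an illustration rather than the patent's exact procedure; all structures and the toy numbers are assumptions:

```python
import math

def softmax(prefs):
    z = max(prefs)
    exps = [math.exp(p - z) for p in prefs]
    s = sum(exps)
    return [e / s for e in exps]

def pg_update(prefs, chosen, q_value, lr=0.5):
    """One REINFORCE step on option preferences for a single state:
    grad log pi(o) = 1[o == chosen] - pi(o), scaled by the option value Q(s, o)."""
    pi = softmax(prefs)
    return [p + lr * q_value * ((1.0 if o == chosen else 0.0) - pi[o])
            for o, p in enumerate(prefs)]

prefs = [0.0, 0.0, 0.0]          # preferences over three options in one state
for _ in range(50):              # repeatedly reinforce option 1 with positive Q
    prefs = pg_update(prefs, chosen=1, q_value=1.0)
probs = softmax(prefs)
print(probs)  # probability mass shifts toward the reinforced option
```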
- the exemplary methods aim to learn the option prototype, which is a sub-trajectory or segment generated by the experts. Each option prototype is responsible for explaining a group of variable-length segments of the demonstration trajectory g_m generated by π_h.
- the exemplary methods map each group of segments g_{m,k} individually into a low-dimensional embedding by classifying the segment into the corresponding option's category k.
- the exemplary methods learn o k by minimizing the distance between o k and g m,k .
- the exemplary methods consider the segment which has the smallest distance with o k as the option prototype of o k .
- the exemplary methods aim to learn a meaningful latent space to represent the segments, where they are clustered (in L2-distance) around semantically similar prototypical options, and the clusters from different classes are well-separated.
- the exemplary methods minimize the distance between each segment embedding g_m and its closest prototype o_k.
- the optimization problem to be solved is:
- the exemplary methods leverage both supervised learning and imitation learning regarding effectiveness and interpretability.
- the exemplary methods try to minimize the least-squares loss between g and o_k, and to prevent the learning of multiple similar prototypical options.
- the exemplary methods use a diversity regularization term that penalizes prototypical options that are close to each other. Meanwhile, the exemplary methods also consider the downstream task (imitation learning).
- the second term is for interpretability where an evidence regularization is used to encourage each prototypical option embedding to be as close to an encoded instance as possible.
- the third term is a diversity regularization to learn diversified options, and d_min is a threshold that classifies whether two prototypes are close or not.
- the exemplary methods set d_min to 1.0. λ_1, λ_2, λ_3 ∈ [0, 1] are the weights used to balance the three loss terms.
- option policy learning takes place:
- Each option o maintains its own policy π_o: s → a_t, which is parameterized by its own parameters θ_o.
- the exemplary methods propose a contextual policy π_θ(a_t|s_t, o).
- the exemplary methods train the option policy π_o(a_t|s_t) with imitation learning techniques.
- the goal of behavior cloning is to mimic the action of the expert at each time step via supervised learning techniques.
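The behavior-cloning objective reduces to a standard supervised negative log-likelihood over the expert's state-action pairs. A toy sketch with discrete actions and tabular probabilities (the policy table and demonstrations are hypothetical, purely for illustration):

```python
import math

def bc_loss(policy_probs, demos):
    """Mean negative log-likelihood of expert actions under the policy.
    `policy_probs[s][a]` is pi(a|s); `demos` is a list of (state, action) pairs."""
    return -sum(math.log(policy_probs[s][a]) for s, a in demos) / len(demos)

# Hypothetical 2-state, 2-action policy and expert demonstrations.
policy = {0: [0.9, 0.1], 1: [0.2, 0.8]}
expert_demos = [(0, 0), (1, 1), (0, 0)]
loss_val = bc_loss(policy, expert_demos)
print(round(loss_val, 4))  # low loss: the policy already matches the expert
```

Minimizing this loss over the policy parameters is exactly the supervised-learning step described above; no environment interaction is needed, which is why behavior cloning is vulnerable to distributional drift at deployment.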
- the goal of adversarial imitation learning is to minimize the JS divergence between trajectory distribution generated by the expert's policy and the option's policy.
- the same policy loss is used for both option prototypes and the option policy, but the exemplary methods only optimize the parameters of either the option prototypes or the option policy at each optimization step.
- the exemplary methods can further train the option policy with imitation learning algorithms, e.g., behavior cloning and adversarial imitation learning.
- the goal of option policy learning is to mimic the segmentations of demonstrations from the experts.
- FIG. 4 is a block/flow diagram of an exemplary method for employing the option initialization, segmentation embedding learning, prototypical option learning, and option policy learning components of FIG. 3 , in accordance with embodiments of the present invention.
- Imitation learning with neural networks efficiently learns a desired behavior in complex environments.
- these methods are usually considered as “black-boxes” which lack transparency, limiting their application in many decision-making scenarios.
- a variety of methods learn a hidden variable of the variation underlying expert demonstrations to construct the structure of expert policy and visualize the changes in the hidden variable.
- post-hoc explanations do not explain the reasoning process of how the model makes its decisions and can be incomplete or inaccurate in capturing the reasoning process of the original model. Therefore, it is often desirable to have models with built-in interpretability.
- the exemplary embodiments of the present invention define a form of interpretability in imitation learning that imitates human abstraction and explains its reasoning in a human-understanding manner.
- the exemplary methods enable prototype learning to discover options for built-in interpretable imitation learning, which makes decisions by comparing the new inputs with a few data instances (prototypes).
- attention mechanisms and behavior cloning are utilized to extract the most important states considered while mimicking the expert's demonstration.
- DBSCAN is used on the extracted states and the states are automatically clustered into groups.
- imitation learning is utilized to learn the intra-option policy, where the reward is calculated by the cumulative rewards from the primitive actions.
- prototypical options are learned via minimizing the loss of the policy and projecting the prototypes to observed states.
- the option policy is trained with imitation learning algorithms, such as behavior cloning, inverse reinforcement learning, and adversarial imitation learning.
- the exemplary methods introduce a new architecture, that is, prototypical option discovery for interpretable imitation learning (IPOD).
- Each prototypical option includes a set of segmentations from experts' trajectories and is embedded by an option policy.
- the IPOD uses a soft attention mechanism to derive prototypical option embedding from its trajectory fragments.
- the model matches the segmentations from the demonstration to the learned prototypical options, and makes an action based on the learned prototypical option.
- the exemplary methods also use the soft attention mechanism to create a bottleneck in the agent, forcing the agent to focus on option-relevant information. In this way, the model is interpretable, in the sense that it has a transparent reasoning process when making decisions.
- the exemplary methods define several criteria for constructing the prototypes, including option diversity and accuracy.
- Bottleneck state discovery segments the input trajectories into disjoint segments of variable length by, e.g., density-based clustering methods.
- Option projection includes representation learning of the segmentations in each cluster, and prototypical option embedding learning.
- Option refinement takes the low-level actions controlled through the prototypical option embedding and refines each group of segments by matching the segmentation embeddings to prototypical option embeddings.
- FIG. 5 is a block/flow diagram 500 of a practical application of the IPOD architecture, in accordance with embodiments of the present invention.
- a patient 502 needs to receive medication 504 .
- Options are computed for indicating different levels of dosages of the medication 504 .
- the exemplary methods learn a prototypical contextual policy π(a|s, o).
- the IPOD architecture 670 is implemented to enable prototypical option visualization by executing a reasoning process 555 and evaluating policy performance 557 .
- IPOD 670, via the reasoning process 555, can smoothly compose the different options by considering the variant states 506 of the patient 502. In one instance, IPOD 670 can choose the low-dosage option for the patient 502.
- the results 510 (e.g., dosage options)
- FIG. 6 is an exemplary processing system for the IPOD architecture, in accordance with embodiments of the present invention.
- the processing system includes at least one processor (CPU) 604 operatively coupled to other components via a system bus 602 .
- a GPU 605 is operatively coupled to the system bus 602.
- a Read Only Memory (ROM), a Random Access Memory (RAM), and an input/output (I/O) adapter are operatively coupled to the system bus 602.
- an interpretable imitation learning framework 670 can be employed to execute option initialization 303 , policy over options learning 305 , prototypical option learning 307 , prototypical option embedding learning 309 , and option policy learning 311 .
- a storage device 622 is operatively coupled to system bus 602 by the I/O adapter 620 .
- the storage device 622 can be any of a disk storage device (e.g., a magnetic or optical disk storage device), a solid-state magnetic device, and so forth.
- a transceiver 632 is operatively coupled to system bus 602 by network adapter 630 .
- User input devices 642 are operatively coupled to system bus 602 by user interface adapter 640 .
- the user input devices 642 can be any of a keyboard, a mouse, a keypad, an image capture device, a motion sensing device, a microphone, a device incorporating the functionality of at least two of the preceding devices, and so forth. Of course, other types of input devices can also be used, while maintaining the spirit of the present invention.
- the user input devices 642 can be the same type of user input device or different types of user input devices.
- the user input devices 642 are used to input and output information to and from the processing system.
- a display device 652 is operatively coupled to system bus 602 by display adapter 650 .
- the processing system may also include other elements (not shown), as readily contemplated by one of skill in the art, as well as omit certain elements.
- various other input devices and/or output devices can be included in the system, depending upon the particular implementation of the same, as readily understood by one of ordinary skill in the art.
- various types of wireless and/or wired input and/or output devices can be used.
- additional processors, controllers, memories, and so forth, in various configurations can also be utilized as readily appreciated by one of ordinary skill in the art.
- FIG. 7 is a block/flow diagram of an exemplary method for executing the IPOD architecture, in accordance with embodiments of the present invention.
- FIG. 8 illustrates exemplary equations 800 for implementing the IPOD architecture, in accordance with embodiments of the present invention.
- the equations include a loss function for segmentation embedding learning, an objective function, and policy losses.
- the terms “data,” “content,” “information” and similar terms can be used interchangeably to refer to data capable of being captured, transmitted, received, displayed and/or stored in accordance with various example embodiments. Thus, use of any such terms should not be taken to limit the spirit and scope of the disclosure.
- when a computing device is described herein to receive data from another computing device, the data can be received directly from that other computing device or indirectly via one or more intermediary computing devices, such as, for example, one or more servers, relays, routers, network access points, base stations, and/or the like.
- similarly, the data can be sent directly to the other computing device or indirectly via one or more intermediary computing devices, such as, for example, one or more servers, relays, routers, network access points, base stations, and/or the like.
- aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module,” “calculator,” “device,” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
- the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
- a computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
- a computer readable storage medium may be any tangible medium that can include, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof.
- a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
- Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
- the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
- These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks or modules.
- the computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks or modules.
- processor as used herein is intended to include any processing device, such as, for example, one that includes a CPU (central processing unit) and/or other processing circuitry. It is also to be understood that the term “processor” may refer to more than one processing device and that various elements associated with a processing device may be shared by other processing devices.
- memory as used herein is intended to include memory associated with a processor or CPU, such as, for example, RAM, ROM, a fixed memory device (e.g., hard drive), a removable memory device (e.g., diskette), flash memory, etc. Such memory may be considered a computer readable storage medium.
- input/output devices or “I/O devices” as used herein is intended to include, for example, one or more input devices (e.g., keyboard, mouse, scanner, etc.) for entering data to the processing unit, and/or one or more output devices (e.g., speaker, display, printer, etc.) for presenting results associated with the processing unit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Image Analysis (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Here, v_m ∈ [1, T] are segment boundary indicator variables with v_0 = 0, v_M = T, and v_m ≥ v_{m′}, where m′ = m − 1; e.g., g_o = s_{2:4}, so that g_o = [s_2, s_3, s_4]. The segments are grouped into K clusters and each cluster's prototypical options are learned, where G_k = {g_m}, m ∈ {0, 1, . . . , M}, indicates the k-th group of segments.
- and clusters them to find K central nodes as option prototypes g_o, o ∈ {1, . . . , K}. As for learning the intra-option policy π_o, the exemplary methods learn a prototypical contextual policy λ(a|s, o) to take actions based on the states, as well as the option embedding.
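One way to obtain the K central nodes is a plain k-means pass over precomputed segment embeddings; the sketch below is an assumption-laden stand-in for whatever clustering the exemplary methods actually use, and `find_option_prototypes` is a hypothetical name:

```python
import numpy as np

def find_option_prototypes(embeddings, k, iters=20, seed=0):
    """Cluster segment embeddings and return K central nodes as option
    prototypes (a plain k-means sketch; the framework only requires
    *some* clustering that yields K centers)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(embeddings, dtype=float)
    # Fancy indexing copies, so updating `centers` leaves X untouched.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        assign = np.linalg.norm(
            X[:, None] - centers[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers

# Two obvious clusters of toy segment embeddings.
embs = np.array([[0.0, 0.0], [0.1, 0.0], [4.0, 4.0], [4.1, 3.9]])
protos = find_option_prototypes(embs, k=2)
```

On this toy input the two returned centers converge to the two cluster means regardless of initialization.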
- r_{t:t+δ} = r_t + . . . + r_{t+δ},
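The segment-level reward simply accumulates the per-step rewards from t through t+δ inclusive, e.g.:

```python
def option_reward(rewards, t, delta):
    """r_{t:t+delta} = r_t + ... + r_{t+delta}, inclusive of both ends."""
    return sum(rewards[t:t + delta + 1])

r = option_reward([1.0, 2.0, 3.0, 4.0], t=1, delta=2)  # 2 + 3 + 4 = 9.0
```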
- ∇J = E_{s∼π_h}[Q(s, π_h(o_t|s_t))]
- and the embeddings of prototypical option o_k, where v_m = t indicates the current segment generated by π_h. To force the segment embeddings and the option prototypes to be in the same space, the exemplary methods minimize the distance between each segment embedding and its closest prototype o_k.
- L_emb = Σ_{m=1}^{M} min_{k=1}^{K} ∥f_φ(s_{v_{m−1}:v_m}) − o_k∥²
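The segmentation-embedding loss sums, over segments, the squared distance from each segment embedding to its nearest prototype. A minimal numpy sketch, with f_φ abstracted away as precomputed embeddings (names hypothetical):

```python
import numpy as np

def embedding_loss(segment_embeddings, prototypes):
    """L_emb = sum_m min_k || f_phi(s_{v_{m-1}:v_m}) - o_k ||^2,
    with f_phi applied beforehand so inputs are already embeddings."""
    seg = np.asarray(segment_embeddings, dtype=float)
    proto = np.asarray(prototypes, dtype=float)
    sq = ((seg[:, None, :] - proto[None, :, :]) ** 2).sum(-1)  # (M, K)
    return sq.min(axis=1).sum()

# Segment 0 matches a prototype exactly (0); segment 1 is distance^2 = 1
# from its nearest prototype, so the loss is 1.0.
loss = embedding_loss([[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [1.0, 0.0]])
```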
- L_Full = w_1 · L_option + w_2 · L_IL
- (s_t) = {o_i | I_{o_i} . . .
- ∇J = E_{s∼π_h}[Q(s, π_h(o_t|s_t))]
- where the first term is for effectiveness, and an imitation learning objective function is used to learn the segment embeddings and option prototype embeddings to mimic the expert's policy π_E. The second term is for interpretability, where an evidence regularization is used to encourage each prototypical option embedding to be as close to an encoded instance as possible. The third term is a diversity regularization to learn diversified options, and d_min is a threshold that classifies whether two prototypes are close. The exemplary methods set d_min to 1.0. λ_1, λ_2, λ_3 ∈ [0, 1] are the weights used to balance the three loss terms.
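Read this way, the three-term trade-off can be sketched numerically as below; the imitation term is supplied as a precomputed scalar, the diversity term is rendered as a hinge penalty at d_min, and every name here is a hypothetical illustration rather than the patent's actual implementation:

```python
import numpy as np

def full_objective(il_loss, prototypes, instance_embeddings,
                   lambda1=1.0, lambda2=0.1, lambda3=0.1, d_min=1.0):
    """Sketch of the three-term objective: effectiveness (imitation
    loss, precomputed), evidence (each prototype near its closest
    encoded instance), and diversity (penalize prototype pairs closer
    than d_min)."""
    P = np.asarray(prototypes, dtype=float)
    E = np.asarray(instance_embeddings, dtype=float)
    # Evidence: distance from each prototype to its nearest instance.
    evidence = np.linalg.norm(P[:, None] - E[None], axis=-1).min(axis=1).sum()
    # Diversity: hinge penalty on prototype pairs closer than d_min.
    pd = np.linalg.norm(P[:, None] - P[None], axis=-1)
    iu = np.triu_indices(len(P), k=1)
    diversity = np.clip(d_min - pd[iu], 0.0, None).sum()
    return lambda1 * il_loss + lambda2 * evidence + lambda3 * diversity
```

With prototypes sitting exactly on instances and farther apart than d_min, both regularizers vanish and only the imitation term remains.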
Claims (18)
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/323,475 US12380360B2 (en) | 2020-05-26 | 2021-05-18 | Interpretable imitation learning via prototypical option discovery for decision making |
| JP2022572280A JP7466702B2 (en) | 2020-05-26 | 2021-05-19 | Interpretable Imitation Learning by Discovering Prototype Options |
| PCT/US2021/033107 WO2021242585A1 (en) | 2020-05-26 | 2021-05-19 | Interpretable imitation learning via prototypical option discovery |
| US19/230,357 US20250299112A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
| US19/230,344 US20250299111A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202063029754P | 2020-05-26 | 2020-05-26 | |
| US202063033304P | 2020-06-02 | 2020-06-02 | |
| US17/323,475 US12380360B2 (en) | 2020-05-26 | 2021-05-18 | Interpretable imitation learning via prototypical option discovery for decision making |
Related Child Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US19/230,344 Continuation US20250299111A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
| US19/230,357 Continuation US20250299112A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20210374612A1 US20210374612A1 (en) | 2021-12-02 |
| US12380360B2 true US12380360B2 (en) | 2025-08-05 |
Family
ID=78705053
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/323,475 Active 2044-04-29 US12380360B2 (en) | 2020-05-26 | 2021-05-18 | Interpretable imitation learning via prototypical option discovery for decision making |
| US19/230,344 Pending US20250299111A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
| US19/230,357 Pending US20250299112A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
Family Applications After (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US19/230,344 Pending US20250299111A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
| US19/230,357 Pending US20250299112A1 (en) | 2020-05-26 | 2025-06-06 | Interpretable imitation learning via prototypical option discovery for decision making |
Country Status (3)
| Country | Link |
|---|---|
| US (3) | US12380360B2 (en) |
| JP (1) | JP7466702B2 (en) |
| WO (1) | WO2021242585A1 (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20230186107A1 (en) * | 2021-12-14 | 2023-06-15 | International Business Machines Corporation | Boosting classification and regression tree performance with dimension reduction |
| CN115204387B (en) * | 2022-07-21 | 2023-10-03 | 法奥意威(苏州)机器人系统有限公司 | Learning methods, devices and electronic equipment under hierarchical goal conditions |
| JP7786689B1 (en) * | 2024-11-11 | 2025-12-16 | ソフトバンク株式会社 | Information processing device, information processing method, and control program |
Citations (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2872831A1 (en) * | 2011-05-08 | 2012-11-15 | Infinetics Technologies, Inc. | Flexible radix switch |
| KR20130049201A (en) * | 2010-09-30 | 2013-05-13 | 인텔 코오퍼레이션 | Storage drive management |
| CN105393264A (en) * | 2013-07-12 | 2016-03-09 | 微软技术许可有限责任公司 | Interaction Segment Extraction in Human-Computer Interaction Learning |
| CN105893256A (en) * | 2016-03-30 | 2016-08-24 | 西北工业大学 | Software failure positioning method based on machine learning algorithm |
| JP2017142549A (en) * | 2016-02-08 | 2017-08-17 | ブレインズコンサルティング株式会社 | Troubleshooting support apparatus, troubleshooting support program, and storage medium |
| EP3462385A1 (en) * | 2017-09-28 | 2019-04-03 | Siemens Aktiengesellschaft | Sgcnn: structural graph convolutional neural network |
| EP2504776B1 (en) * | 2009-11-24 | 2019-06-26 | Zymeworks Inc. | Density based clustering for multidimensional data |
| US20190324795A1 (en) * | 2018-04-24 | 2019-10-24 | Microsoft Technology Licensing, Llc | Composite task execution |
| CN108805877B (en) * | 2017-05-03 | 2019-11-19 | 西门子保健有限责任公司 | Multiscale Deep Reinforcement Machine Learning for N-Dimensional Segmentation in Medical Imaging |
| CN110491171A (en) * | 2019-09-17 | 2019-11-22 | 南京莱斯网信技术研究院有限公司 | A kind of water transportation supervision early warning system and method based on machine learning techniques |
| WO2020162680A1 (en) * | 2019-02-08 | 2020-08-13 | 아콘소프트 주식회사 | Microservice system and method |
| CN111712862A (en) * | 2018-02-14 | 2020-09-25 | 通腾运输公司 | Method and system for generating traffic volume or traffic density data |
| US20200334093A1 (en) * | 2019-04-17 | 2020-10-22 | Microsoft Technology Licensing, Llc | Pruning and prioritizing event data for analysis |
| CN111950950A (en) * | 2019-05-17 | 2020-11-17 | 北京京东尚科信息技术有限公司 | Planning method, device, computer medium and electronic device for order delivery route |
| WO2020235693A1 (en) * | 2019-05-23 | 2020-11-26 | 国立大学法人神戸大学 | Learning method, learning device, and learning program for ai agent that behaves like human |
| US20210295171A1 (en) * | 2020-03-19 | 2021-09-23 | Nvidia Corporation | Future trajectory predictions in multi-actor environments for autonomous machine applications |
| CN109739585B (en) * | 2018-12-29 | 2022-02-18 | 广西交通科学研究院有限公司 | Spark cluster parallelization calculation-based traffic congestion point discovery method |
| JP7390126B2 (en) * | 2019-07-31 | 2023-12-01 | 株式会社日立製作所 | Trajectory data analysis system |
-
2021
- 2021-05-18 US US17/323,475 patent/US12380360B2/en active Active
- 2021-05-19 JP JP2022572280A patent/JP7466702B2/en active Active
- 2021-05-19 WO PCT/US2021/033107 patent/WO2021242585A1/en not_active Ceased
-
2025
- 2025-06-06 US US19/230,344 patent/US20250299111A1/en active Pending
- 2025-06-06 US US19/230,357 patent/US20250299112A1/en active Pending
Patent Citations (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2504776B1 (en) * | 2009-11-24 | 2019-06-26 | Zymeworks Inc. | Density based clustering for multidimensional data |
| KR20130049201A (en) * | 2010-09-30 | 2013-05-13 | 인텔 코오퍼레이션 | Storage drive management |
| CA2872831A1 (en) * | 2011-05-08 | 2012-11-15 | Infinetics Technologies, Inc. | Flexible radix switch |
| CN105393264A (en) * | 2013-07-12 | 2016-03-09 | 微软技术许可有限责任公司 | Interaction Segment Extraction in Human-Computer Interaction Learning |
| JP2017142549A (en) * | 2016-02-08 | 2017-08-17 | ブレインズコンサルティング株式会社 | Troubleshooting support apparatus, troubleshooting support program, and storage medium |
| CN105893256A (en) * | 2016-03-30 | 2016-08-24 | 西北工业大学 | Software failure positioning method based on machine learning algorithm |
| CN108805877B (en) * | 2017-05-03 | 2019-11-19 | 西门子保健有限责任公司 | Multiscale Deep Reinforcement Machine Learning for N-Dimensional Segmentation in Medical Imaging |
| EP3462385A1 (en) * | 2017-09-28 | 2019-04-03 | Siemens Aktiengesellschaft | Sgcnn: structural graph convolutional neural network |
| CN111712862A (en) * | 2018-02-14 | 2020-09-25 | 通腾运输公司 | Method and system for generating traffic volume or traffic density data |
| US20190324795A1 (en) * | 2018-04-24 | 2019-10-24 | Microsoft Technology Licensing, Llc | Composite task execution |
| CN109739585B (en) * | 2018-12-29 | 2022-02-18 | 广西交通科学研究院有限公司 | Spark cluster parallelization calculation-based traffic congestion point discovery method |
| WO2020162680A1 (en) * | 2019-02-08 | 2020-08-13 | 아콘소프트 주식회사 | Microservice system and method |
| US20200334093A1 (en) * | 2019-04-17 | 2020-10-22 | Microsoft Technology Licensing, Llc | Pruning and prioritizing event data for analysis |
| CN111950950A (en) * | 2019-05-17 | 2020-11-17 | 北京京东尚科信息技术有限公司 | Planning method, device, computer medium and electronic device for order delivery route |
| WO2020235693A1 (en) * | 2019-05-23 | 2020-11-26 | 国立大学法人神戸大学 | Learning method, learning device, and learning program for ai agent that behaves like human |
| JP7390126B2 (en) * | 2019-07-31 | 2023-12-01 | 株式会社日立製作所 | Trajectory data analysis system |
| CN110491171A (en) * | 2019-09-17 | 2019-11-22 | 南京莱斯网信技术研究院有限公司 | A kind of water transportation supervision early warning system and method based on machine learning techniques |
| US20210295171A1 (en) * | 2020-03-19 | 2021-09-23 | Nvidia Corporation | Future trajectory predictions in multi-actor environments for autonomous machine applications |
Non-Patent Citations (6)
| Title |
|---|
| Abbeel et al., "Apprenticeship Learning via Inverse Reinforcement Learning", Proceedings of the 21st International Conference on Machine Learning. Jul. 5-9, 2004. pp. 1-8. |
| Eysenbach et al., "Diversity is All You Need: Learning Skills Without a Reward Function", arXiv:1802.06070v6 [cs.AI]. Oct. 9, 2018. pp. 1-22. |
| Ho et al., "Generative Adversarial Imitation Learning", arXiv:1606.03476v1 [cs.LG]. Jun. 10, 2016. pp. 1-14. |
| Li et al., "InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations", arXiv:1703.08840v2 [cs.LG]. Nov. 14, 2017. pp. 1-14. |
| Ming et al., "Interpretable and Steerable Sequence Learning via Prototypes", 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Aug. 4-8, 2019. pp. 1-11. |
| Tomar et al., "Successor Options: An Option Discovery Framework for Reinforcement Learning", arXiv:1905.05731v1 [cs.LG]. May 14, 2019. pp. 1-7. |
Also Published As
| Publication number | Publication date |
|---|---|
| JP7466702B2 (en) | 2024-04-12 |
| WO2021242585A1 (en) | 2021-12-02 |
| US20210374612A1 (en) | 2021-12-02 |
| US20250299112A1 (en) | 2025-09-25 |
| JP2023527341A (en) | 2023-06-28 |
| US20250299111A1 (en) | 2025-09-25 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20230153622A1 (en) | Method, Apparatus, and Computing Device for Updating AI Model, and Storage Medium | |
| US20250299112A1 (en) | Interpretable imitation learning via prototypical option discovery for decision making | |
| JP7316453B2 (en) | Object recommendation method and device, computer equipment and medium | |
| EP3782080B1 (en) | Neural networks for scalable continual learning in domains with sequentially learned tasks | |
| US11636347B2 (en) | Action selection using interaction history graphs | |
| US20240046128A1 (en) | Dynamic causal discovery in imitation learning | |
| CN106548210B (en) | Credit user classification method and device based on machine learning model training | |
| CN114616577A (en) | Identifying optimal weights to improve prediction accuracy in machine learning techniques | |
| CN114270365B (en) | Clustering based on elastic centroid | |
| US20200143498A1 (en) | Intelligent career planning in a computing environment | |
| US11176491B2 (en) | Intelligent learning for explaining anomalies | |
| CN114556331A (en) | New frame for less-lens time action positioning | |
| WO2025167876A1 (en) | Object category recognition model training method and apparatus, and object category recognition method and apparatus | |
| WO2022012347A1 (en) | Predictive models having decomposable hierarchical layers configured to generate interpretable results | |
| Lin | Online semi-supervised learning in contextual bandits with episodic reward | |
| Zhai et al. | Classification of high-dimensional evolving data streams via a resource-efficient online ensemble | |
| CN112348161B (en) | Neural network training method, neural network training device and electronic equipment | |
| Qi et al. | Fedvad: Enhancing federated video anomaly detection with gpt-driven semantic distillation | |
| Asif et al. | A generalized meta-loss function for distillation based learning using privileged information for classification and regression | |
| CN117056595A (en) | An interactive project recommendation method, device and computer-readable storage medium | |
| CN113095592A (en) | Method and system for performing predictions based on GNN and training method and system | |
| US20250181964A1 (en) | Machine learning model method and system for cross-domain recommendations | |
| US20250259072A1 (en) | Automated single-to-grouped cloud computing optimization | |
| US20250371100A1 (en) | Efficient sampling for theorem proving | |
| US20250006077A1 (en) | Auto-scaling, simulated reality task training |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: NEC LABORATORIES AMERICA, INC., NEW JERSEY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YU, WENCHAO;CHEN, HAIFENG;CHENG, WEI;SIGNING DATES FROM 20210512 TO 20210513;REEL/FRAME:056276/0245 |
|
| FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| AS | Assignment |
Owner name: NEC CORPORATION, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEC LABORATORIES AMERICA, INC.;REEL/FRAME:071486/0094 Effective date: 20250621 |
|
| STCF | Information on status: patent grant |
Free format text: PATENTED CASE |