CN113268143B - Multimodal man-machine interaction method based on reinforcement learning - Google Patents

Multimodal man-machine interaction method based on reinforcement learning

Info

Publication number
CN113268143B
CN113268143B
Authority
CN
China
Prior art keywords
data
agent
interaction method
human
computer interaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110773626.6A
Other languages
Chinese (zh)
Other versions
CN113268143A (en)
Inventor
印二威
裴育
闫慧炯
谢良
艾勇保
罗治国
闫野
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center
National Defense Technology Innovation Institute PLA Academy of Military Science
Original Assignee
Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center
National Defense Technology Innovation Institute PLA Academy of Military Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center, National Defense Technology Innovation Institute PLA Academy of Military Science filed Critical Tianjin (binhai) Intelligence Military-Civil Integration Innovation Center
Publication of CN113268143A
Application granted
Publication of CN113268143B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Aiming at the performance bottleneck caused by the mismatch between segmented data and streaming data in traditional human-computer interaction methods, the invention discloses a multimodal human-computer interaction method based on reinforcement learning, which comprises the following steps: acquiring user data, wherein the user wears a corresponding wearable sensor, the wearable sensor records the user data, and the recorded data form a training set and a test set; constructing a classification algorithm model offline on the streaming data set; and applying the constructed classification algorithm model to perform human-computer interaction. For the synchronous human-computer interaction method, the data are segmented according to the instruction synchronization labels and fed into the classification algorithm model for classification; for the asynchronous human-computer interaction method, the data are cut from the synchronization start point, and the cut data are used as input samples of the classification model. The invention builds the model directly from streaming data, avoids the complicated development process and low performance ceiling of traditional human-computer interaction methods, and offers better stability.

Description

Multimodal man-machine interaction method based on reinforcement learning
Technical Field
The invention relates to the field of human-computer interaction and wearable sensors, and in particular to a human-computer interaction method based on reinforcement learning.
Background
Human-computer interaction (HCI) is a technical discipline that studies the communication and mutual understanding between people and computers, so that the computer can perform information management, services, processing, and related functions for people to the greatest extent possible and truly become a harmonious assistant in people's work and study.
In recent years, with the development of integrated electronics, electronic sensors have become smaller and more powerful, and human-computer interaction methods based on wearable sensors are increasingly widely used. According to the type of information captured by the sensors, human-computer interaction methods can be divided into gesture-based, eye-movement-based, and other methods. A human-computer interaction method based on gesture recognition requires the user to wear a pair of data gloves containing motion sensors; the gloves collect the user's hand motion information in real time, and the computer recognizes and infers the user's behavioral intention, thereby achieving human-computer cooperative work and interaction. An eye-movement method places a pair of high-speed miniature cameras near the forehead to capture eye images in real time and judge eye-movement information, thereby achieving human-computer interaction. According to the online control strategy, human-computer interaction methods can also be classified into synchronous and asynchronous methods. The biggest difference between the two is whether the algorithm model can accurately obtain the starting time point of each action during online application. In a synchronous method, the user must deliberately follow the rhythm of the system when issuing an instruction, so the algorithm model can accurately identify the starting time of each action. An asynchronous method, by contrast, must correctly recognize an action started at any time point, which places high demands on the algorithm model.
In asynchronous human-computer interaction methods, it is generally difficult to design a threshold for deciding whether the user has started an action. Further improving the performance of an asynchronous interactive system requires an online dynamic decision method, which goes beyond the framework of a static classification model. Current human-computer interaction therefore has a very important drawback: the data used to construct the classification algorithm model are segmented, whereas in practice the data arrive as a continuous stream. This difference in data form leads to a starting-point threshold that is hard to choose and an online dynamic strategy that is hard to design, and these two problems have become the technical bottleneck restricting the performance of existing human-computer interaction methods. To break through this bottleneck, the recognition model needs to be built directly from streaming data, so that the data form in the offline model-construction stage is consistent with that in the online application stage; the performance of the human-computer interaction method can then be expected to improve further.
Reinforcement learning is learning by an agent in a trial-and-error manner: by interacting with the environment and being guided by rewards, the agent aims to obtain the maximum cumulative reward. Reinforcement learning differs from supervised learning in connectionist learning mainly in the reinforcement signal: the reinforcement signal provided by the environment evaluates the quality of an action rather than telling the reinforcement learning system (RLS) how to generate the correct action, and it is usually a scalar. A classic problem in reinforcement learning is the inverted pendulum (cart-pole) problem: each time the control system observes the position, velocity, angle, and angular velocity of the inverted pendulum, it must apply a force of +10 N or -10 N so that the pendulum stays balanced and does not topple. In this problem the reinforcement learning model faces continuously arriving streaming observations, which are very similar to the human behavior data observed through wearable sensors in a human-computer interaction method; reinforcement learning is therefore well suited to dynamic decisions on streaming data. Accordingly, the invention introduces a reinforcement learning framework into the design of the human-computer interaction method, which is expected to break through the performance bottleneck caused by the mismatch between segmented data and streaming data in traditional interaction system design.
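As a concrete, hedged illustration of the streaming-observation setting described above, the sketch below runs the cart-pole problem with the third-party gymnasium library and a trivial hand-coded policy; the library, environment name, and policy are assumptions made only for illustration and are not part of the invention.

```python
# Minimal cart-pole ("inverted pendulum") illustration. The agent observes
# [cart position, cart velocity, pole angle, pole angular velocity] at every step
# and chooses a discrete push-left / push-right force, analogous to an agent
# consuming a continuous stream of wearable-sensor frames.
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
done = False
while not done:
    # Trivial hand-coded policy: push in the direction the pole is leaning.
    action = 1 if obs[2] > 0 else 0
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print("episode return:", total_reward)
```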
Disclosure of Invention
Aiming at the performance bottleneck caused by the mismatch between segmented data and streaming data in traditional interaction system design, the invention discloses a multimodal human-computer interaction method based on reinforcement learning, comprising the following steps:
S1, collecting user data. The user is required to wear a corresponding wearable sensor and to make the corresponding actions according to a prompt interface; the wearable sensor records the user data, the recorded data are cut into segments according to the instruction synchronization labels and the time of each action, and the segments form a training set and a test set that serve as the streaming data set for constructing the classification algorithm model.
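A minimal sketch of step S1 under stated assumptions (array shapes, function names, and the split ratio are illustrative, not taken from the patent): the continuously recorded sensor stream is cut into fixed-length labeled segments at the instruction synchronization labels and then split into a training set and a test set.

```python
# Sketch of step S1: cut a recorded sensor stream into labeled segments at the
# instruction synchronization labels, then split into training and test sets.
import numpy as np

def cut_segments(stream, sync_events, window_len):
    """stream: (T, channels) array; sync_events: list of (start_idx, label)."""
    samples, labels = [], []
    for start, label in sync_events:
        seg = stream[start:start + window_len]
        if len(seg) == window_len:          # drop incomplete trailing segments
            samples.append(seg)
            labels.append(label)
    return np.stack(samples), np.array(labels)

def train_test_split(samples, labels, test_ratio=0.2, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_test = int(len(samples) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return (samples[train_idx], labels[train_idx]), (samples[test_idx], labels[test_idx])
```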
S2, constructing a classification algorithm model offline on the streaming data set.
S3, applying the classification algorithm model constructed in step S2 to perform human-computer interaction. For the synchronous human-computer interaction method, the data received from the sensor in real time are segmented according to the instruction synchronization labels into the same data format used when the classification algorithm model was built offline, and the segmented data are then fed into the classification algorithm model to obtain the classification result. For the asynchronous human-computer interaction method, a threshold is set to judge whether the user has started an action; that time point is taken as the synchronization start point, and the data are cut to a preset time-window length and used as an input sample of the classification model.
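The following sketch illustrates how the two online modes of step S3 might be implemented; the onset detector, window length, and the model.predict interface are assumptions for illustration only.

```python
# Sketch of step S3. The synchronous path segments incoming data at the instruction
# synchronization labels; the asynchronous path detects an action onset with an
# amplitude threshold and cuts a fixed-length window starting from it.
import numpy as np

def synchronous_classify(model, stream, sync_starts, window_len):
    results = []
    for start in sync_starts:
        window = stream[start:start + window_len]
        results.append(model.predict(window))
    return results

def asynchronous_classify(model, stream, threshold, window_len):
    # Simple onset detector: first sample whose energy exceeds the threshold.
    energy = np.linalg.norm(stream, axis=1)
    onsets = np.flatnonzero(energy > threshold)
    if len(onsets) == 0:
        return None                              # no action detected yet
    start = onsets[0]                            # synchronization start point
    window = stream[start:start + window_len]
    return model.predict(window)
```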
The step S2 specifically includes:
A reinforcement learning model is applied to construct the classification algorithm model. The reinforcement learning model comprises two components: the agent and the environment. The agent observes data from the environment, that is, data flow from the environment to the agent. Based on the observed data, the agent makes a decision on the environment, that is, it issues an instruction. After receiving the instruction from the agent, the environment feeds back the corresponding reward to the agent, then changes its state and continues to send data to the agent. The agent comprises a decision module and a data temporary storage area. At each sampling instant the agent receives an observation from the environment, namely user behavior and action data O_t from the wearable sensor. The agent forms a time window from O_t and the data temporary storage area, and the decision module decides the system action A_t (Action) according to this time window. After the instruction is output, the agent updates the data temporary storage area by adding O_t and discarding the environment observation at the farthest moment.
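A minimal sketch of the agent structure just described, under the assumption that the decision module is a callable mapping a time window to a system action; class and attribute names are illustrative, not taken from the patent.

```python
# Sketch of the agent: a fixed-length data temporary storage area plus a decision
# module. At each sampling instant the newest frame O_t is appended, the frame at
# the farthest time is discarded, and the decision module maps the resulting time
# window to a system action A_t.
from collections import deque
import numpy as np

class Agent:
    def __init__(self, decision_module, window_len, n_channels):
        self.decision_module = decision_module
        # maxlen makes append() automatically drop the oldest observation.
        self.buffer = deque([np.zeros(n_channels)] * window_len, maxlen=window_len)

    def step(self, o_t):
        self.buffer.append(np.asarray(o_t))   # update the temporary storage area
        window = np.stack(self.buffer)        # (window_len, n_channels) time window
        return self.decision_module(window)   # system action A_t
```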
During construction of the classification algorithm model, the agent samples randomly from the training set, that is, it randomly selects a section of continuous time-signal data and feeds it into the agent frame by frame. The agent outputs an instruction at every frame, the output time point of the first non-wait instruction is selected, and the agent obtains a reward according to the reward function rule. The agent samples randomly several times, accumulates the reward values, and takes their average. The decision module of the agent contains learnable parameters, which are updated toward a better direction using a gradient method.
The learnable parameters are updated toward a better direction using a gradient method: the gradient of each parameter in the agent's decision module is calculated from the averaged reward value, the learnable parameters are updated by gradient ascent, and this process is repeated until a preset number of iterations is reached.
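The following sketch writes the described training procedure as a REINFORCE-style policy-gradient loop in PyTorch. The patent states only that segments are sampled randomly, fed frame by frame until the first non-wait instruction, rewarded, averaged, and that the parameters are updated by gradient ascent; the specific gradient estimator, optimizer, and all names below are assumptions.

```python
# Sketch of the offline training loop: random segment sampling, frame-by-frame
# decisions ending at the first non-wait instruction, averaged rewards, and a
# gradient-ascent update of the decision module's learnable parameters.
import torch

def run_episode(policy, segment, true_label, reward_fn, wait_id, window_len):
    """Feed one continuous segment frame by frame; stop at the first non-wait
    instruction and return the summed log-probability and accumulated reward."""
    log_probs, total_reward = [], 0.0
    for t in range(len(segment)):
        window = segment[max(0, t + 1 - window_len): t + 1]   # buffered recent frames
        dist = torch.distributions.Categorical(logits=policy(window))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        total_reward += reward_fn(action.item(), true_label, t)
        if action.item() != wait_id:      # output time of the first non-wait instruction
            break
    return torch.stack(log_probs).sum(), total_reward

def train(policy, sample_segment, reward_fn, wait_id, window_len,
          iters=1000, episodes=16, lr=1e-3):
    optimizer = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(iters):                    # preset number of iterations
        losses = []
        for _ in range(episodes):             # random sampling several times
            segment, label = sample_segment()
            log_prob, reward = run_episode(policy, segment, label,
                                           reward_fn, wait_id, window_len)
            losses.append(-log_prob * reward)  # negated: ascend the expected reward
        optimizer.zero_grad()
        torch.stack(losses).mean().backward()  # gradient from the averaged reward signal
        optimizer.step()
    return policy
```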
In the classification algorithm model, the user behavior action data form a finite set; the behavior action data set is {left, right, stop, forward} ∪ {wait}, where wait means that no judgment result is output and data collection continues.
The decision module is implemented by a convolutional neural network, and learnable parameters are used inside the decision module.
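A possible decision module is sketched below as a small 1-D convolutional network in PyTorch; the patent specifies only that a convolutional neural network with learnable parameters is used, so the layer sizes and structure are assumptions.

```python
# Sketch of a convolutional decision module. Input: a time window of shape
# (window_len, n_channels); output: one logit per action, including "wait".
import torch
import torch.nn as nn

class DecisionModule(nn.Module):
    def __init__(self, n_channels, n_actions):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # tolerates windows of varying length
        )
        self.head = nn.Linear(64, n_actions)

    def forward(self, window):
        # window: (window_len, n_channels) -> (1, n_channels, window_len)
        x = window.t().unsqueeze(0)
        x = self.conv(x).squeeze(-1)          # (1, 64)
        return self.head(x).squeeze(0)        # (n_actions,) action logits
```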
The reward function in the reinforcement learning model is set as:
$$ r(O_t, a_t) = \begin{cases} +1, & a_t = \mathrm{realLabel} \\ -1, & a_t \neq \mathrm{realLabel} \text{ and } a_t \neq \mathrm{wait} \\ -\lambda\, t^{p}, & a_t = \mathrm{wait} \end{cases} $$
where O_t is the observation over a period after time t, that is, a buffered segment of continuously sampled data; a_t is the decision value at time t, that is, the output value of the agent at that time (the predicted label); and realLabel is the true label at that time. If the output value is correct, the agent obtains a reward of +1; if the agent outputs a wrong value, it obtains a reward of -1; and if the agent outputs wait, it is penalized, with the penalty increasing as the response time grows. λ and p are balance factors that influence the agent's trade-off between outputting a result earlier and outputting a more accurate result; their values are determined as required.
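The reward rule can be sketched as follows; the exact -λ·t^p form of the wait penalty is an assumption consistent with the stated roles of the balance factors λ and p, and all names are illustrative.

```python
# Sketch of the reward function described above.
WAIT = 4  # index of the "wait" action in {left, right, stop, forward, wait}

def reward(a_t, real_label, t, lam=0.01, p=1.0):
    if a_t == WAIT:
        return -lam * (t ** p)      # waiting is penalized more as response time grows
    return 1.0 if a_t == real_label else -1.0
```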
The invention has the beneficial effects that:
(1) The invention builds the model directly from streaming data, avoiding the traditional human-computer interaction development process in which the data are first segmented, a classification algorithm model is trained on the segmented data set, and the model is then applied online. In the traditional development method, the difference in data form between the offline stage and the online stage keeps the performance ceiling of the interactive system low; by modeling the human-computer interaction problem directly from continuous data with a reinforcement learning methodology, this problem is solved, so a human-computer interaction method developed and designed according to the invention has better performance and stability.
(2) The invention changes the modeling approach of the traditional human-computer interaction method, recasting it as a dynamic sequential decision problem that is solved by reinforcement learning. Reinforcement learning is a rapidly developing subfield of artificial intelligence in recent years, and its continued development supports further iterative updating of the method and can raise the performance ceiling of the system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention, are incorporated in and constitute a part of this specification. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention and not to limit the present invention.
FIG. 1 is a diagram of the basic architecture of reinforcement learning used in the present invention.
FIG. 2 is a diagram showing the structure of an Agent used in the present invention.
Detailed Description
For a better understanding of the present disclosure, an example is given here.
The embodiment of the invention provides a human-computer interaction method based on reinforcement learning, which comprises the following steps:
s1, collecting user data. The user is required to wear a corresponding wearable sensor, corresponding actions are made according to a prompt interface, the wearable sensor records user data, the recorded data are cut into segmented data according to the instruction synchronization label and the time of each action, and then a training set and a test set are formed and used as a streaming data set to construct a classification algorithm model.
And S2, constructing a classification algorithm model on the streaming data set in an off-line manner.
And S3, applying the classification algorithm model constructed in the step S2 to perform human-computer interaction. For the synchronous man-machine interaction method, the data received from the sensor in real time is segmented according to the same data format when a classification algorithm model is constructed offline according to an instruction synchronous label, and then the segmented data is sent to the classification algorithm model to obtain a classification result; for the asynchronous human-computer interaction method, a threshold value is set to judge whether a user starts to act, the time point is used as a synchronous time starting point, and data are cut according to the length of a preset time window and used as an input sample of a classification model.
Facing the asynchronous human-computer interaction method, the invention determines the action start point without manually setting a threshold: the data collected by the sensor at each frame are simply fed into the reinforcement learning model, and the output of the model is issued as the instruction.
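A minimal sketch of this online asynchronous use, reusing the buffered agent sketched earlier; function and variable names are assumptions.

```python
# Sketch of online asynchronous use: no onset threshold is needed. Every sensor
# frame is pushed into the agent, and whenever the agent emits a non-wait action
# it is issued as the interaction instruction.
def run_online(agent, sensor_frames, wait_id, issue_instruction):
    for o_t in sensor_frames:          # continuous stream, frame by frame
        a_t = agent.step(o_t)
        if a_t != wait_id:
            issue_instruction(a_t)
```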
The step S2 specifically includes:
A reinforcement learning model is applied to construct the classification algorithm model. The reinforcement learning model comprises two components: the agent and the environment. The agent observes data from the environment, that is, data flow from the environment to the agent. Based on the observed data, the agent makes a decision on the environment, that is, it issues an instruction. After receiving the instruction from the agent, the environment feeds back the corresponding reward to the agent, then changes its state and continues to send data to the agent. The agent corresponds to the classification algorithm model in a traditional human-computer interaction design, and the environment corresponds to the wearable sensor in the human-computer interaction method. The agent comprises a decision module and a data temporary storage area. At each sampling moment the agent receives an observation from the environment, namely user behavior and action data O_t from the wearable sensor. The agent forms a time window from O_t and the data temporary storage area, and the decision module decides the system action A_t (Action) according to this time window. After the instruction is output, the agent updates the data temporary storage area by adding O_t and discarding the environment observation at the farthest moment.
During construction of the classification algorithm model, the agent samples randomly from the training set, that is, it randomly selects a section of continuous time-signal data and feeds it into the agent frame by frame. The agent outputs an instruction at every frame, the output time point of the first non-wait instruction is selected, and the agent obtains a reward according to the reward function rule. The agent samples randomly several times, accumulates the reward values, and takes their average. The decision module of the agent contains learnable parameters, and the decision module updates the learnable parameters toward a better direction using a gradient method.
The decision module updates the learnable parameters toward a better direction using a gradient method: the gradient of each parameter in the agent's decision module is calculated from the averaged reward value, the learnable parameters are updated by gradient ascent, and this process is repeated until a preset number of iterations is reached.
In the classification algorithm model, the user behavior action data form a finite set; the behavior action data set is {left, right, stop, forward} ∪ {wait}, where wait means that no judgment result is output and data collection continues.
The decision module is implemented by a convolutional neural network, and learnable parameters are used inside the decision module.
A temporary storage space is provided inside the agent to store the behavior observations of the recent past, so the observation used by the agent is no longer a single-frame observation but a continuous segment of observations. This is necessary for a human-computer interaction method: a single-frame observation cannot capture enough information, because most of the data information in a human-computer interaction method is hidden in the time domain and the frequency domain.
The performance evaluation indicators of the human-computer interaction method include response time, accuracy, and false alarm rate. The reward function in the reinforcement learning model is set as:
$$ r(O_t, a_t) = \begin{cases} +1, & a_t = \mathrm{realLabel} \\ -1, & a_t \neq \mathrm{realLabel} \text{ and } a_t \neq \mathrm{wait} \\ -\lambda\, t^{p}, & a_t = \mathrm{wait} \end{cases} $$
where O_t is the observation over a period after time t, that is, a buffered segment of continuously sampled data; a_t is the decision value at time t, that is, the output value of the agent at that time (the predicted label); realLabel is the true label at that time; and p is a time-variable exponent. If the output value is correct, the agent obtains a reward of +1; if the agent outputs a wrong value, it obtains a reward of -1; and if the agent outputs wait, it is penalized, with the penalty increasing as the response time grows. λ and p are balance factors that influence the agent's trade-off between outputting a result earlier and outputting a more accurate result; their values are determined as required.
For the training process of the reinforcement learning model, the agent must interact with the environment continuously, and the parameters of the agent's decision network are optimized according to the rewards obtained.
Fig. 1 is a diagram of the basic architecture of reinforcement learning. The architecture mainly comprises two components: 1) the agent and 2) the environment. The agent corresponds to the classification algorithm model in a traditional human-computer interaction design, and the environment corresponds to the wearable sensor in the human-computer interaction method. At each sampling instant, the decision module inside the agent receives an observation from the environment, namely user behavior data O_t from the wearable sensor, and decides the system action a_t (Action) according to O_t. Taking a gesture human-computer interaction method as an example, the action space is a finite set, which may be {left, right, stop, go} ∪ {wait}, where wait means that no judgment result is output and data collection continues. An optional implementation of the decision module is a convolutional neural network, one of the most popular neural networks in recent years, which has very strong representation capability.
Fig. 2 is a diagram of the agent structure adapted to the field of human-computer interaction. In many human-computer interaction modalities, the features are mainly reflected in the time domain. Therefore, when the reinforcement learning framework is introduced into the field of human-computer interaction, its observation O_t must be adjusted. As shown in Fig. 2, a temporary storage space inside the agent stores the most recent observations; that is, the adjusted observation of the agent is no longer a single-frame observation but a continuous segment of observations.
The training of the reinforcement learning model differs from supervised machine learning, which obtains information from the labels of samples. In reinforcement learning training, the agent must interact with the environment continuously, and the parameters of its decision network are optimized according to the rewards obtained.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present application shall be included in the scope of the claims of the present application.

Claims (4)

1. A multimodal man-machine interaction method based on reinforcement learning is characterized by comprising the following steps:
s1, collecting user data; requiring a user to wear a corresponding wearable sensor, making corresponding actions according to a prompt interface, recording user data by the wearable sensor, cutting the recorded data into segmented data according to the instruction synchronization label and the time of each action, and then forming a training set and a test set which are used as a streaming data set to construct a classification algorithm model;
s2, constructing a classification algorithm model on the streaming data set in an off-line manner;
s3, applying the classification algorithm model constructed in the step S2 to perform human-computer interaction; for the synchronous human-computer interaction method, according to the instruction synchronous label, segmenting data received from the sensor in real time according to the same data format when a classification algorithm model is established in an off-line mode, and then sending the segmented data into the classification algorithm model to obtain a classification result; setting a threshold value to judge whether a user starts to act or not for the asynchronous human-computer interaction method, taking the time point as a synchronous time starting point, and cutting data by the length of a preset time window to be used as an input sample of a classification model;
the step S2 specifically includes:
applying a reinforcement learning model to construct the classification algorithm model, wherein the reinforcement learning model comprises two components: an agent and an environment; the agent observes data from the environment, namely the data flow from the environment to the agent; the agent makes a decision on the environment according to the observed data, namely issues an instruction; after receiving the instruction from the agent, the environment feeds back the corresponding reward to the agent, then changes the state of the environment and continues to send data to the agent; the agent comprises a decision module and a data temporary storage area; the agent receives an observation value from the environment at each sampling moment, namely user behavior and action data O_t from the wearable sensor; the agent forms a time window from O_t and the data temporary storage area, and the decision module decides the system action A_t according to the time window; after the instruction is output, the agent updates the data temporary storage area by adding O_t to it and discarding the environment observation value at the farthest moment;
in the classification algorithm model building process, the agent samples randomly from the training set, namely randomly selects a section of continuous time-signal data, and the data are fed into the agent frame by frame; the agent outputs an instruction at each frame and the output time point of the first non-wait instruction is selected; the agent obtains a reward according to the reward function rule; the agent samples randomly several times, accumulates the reward values, and takes their average; the decision module of the agent contains learnable parameters, and the decision module updates the learnable parameters toward a better direction using a gradient method.
2. The reinforcement learning-based multi-modal human-computer interaction method as claimed in claim 1, wherein the decision module updates the learnable parameters toward a better direction using a gradient method, the gradient of each parameter in the agent's decision module is calculated from the averaged reward values, the learnable parameters are updated using a gradient ascent method, and the process is repeated until a preset number of iterations is reached.
3. The reinforcement learning-based multi-modal human-computer interaction method as claimed in claim 1, wherein in the classification algorithm model, the user behavior and action data form a finite set, the behavior and action data set is {left, right, stop, go} ∪ {wait}, and wait indicates that no judgment result is output and data collection continues.
4. The reinforcement learning-based multi-modal human-computer interaction method as claimed in claim 1, wherein the decision module is implemented by a convolutional neural network, and learnable parameters are used inside the decision module.
CN202110773626.6A 2020-09-29 2021-07-08 Multimodal man-machine interaction method based on reinforcement learning Active CN113268143B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011053448.1A CN112181148A (en) 2020-09-29 2020-09-29 Multimodal man-machine interaction method based on reinforcement learning
CN2020110534481 2020-09-29

Publications (2)

Publication Number Publication Date
CN113268143A CN113268143A (en) 2021-08-17
CN113268143B true CN113268143B (en) 2022-11-04

Family

ID=73946701

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202011053448.1A Pending CN112181148A (en) 2020-09-29 2020-09-29 Multimodal man-machine interaction method based on reinforcement learning
CN202110773626.6A Active CN113268143B (en) 2020-09-29 2021-07-08 Multimodal man-machine interaction method based on reinforcement learning

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202011053448.1A Pending CN112181148A (en) 2020-09-29 2020-09-29 Multimodal man-machine interaction method based on reinforcement learning

Country Status (1)

Country Link
CN (2) CN112181148A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113449265A (en) * 2021-06-28 2021-09-28 湖南汇视威智能科技有限公司 Waist-borne course angle calculation method based on stacked LSTM
CN113778580B (en) * 2021-07-28 2023-12-08 赤子城网络技术(北京)有限公司 Modal user interface display method, electronic device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105809144B (en) * 2016-03-24 2019-03-08 重庆邮电大学 A kind of gesture recognition system and method using movement cutting
US11687822B2 (en) * 2016-07-13 2023-06-27 Metric Masters Ltd. Automated functional understanding and optimization of human/machine systems
CN106648068A (en) * 2016-11-11 2017-05-10 哈尔滨工业大学深圳研究生院 Method for recognizing three-dimensional dynamic gesture by two hands
CN107909042B (en) * 2017-11-21 2019-12-10 华南理工大学 continuous gesture segmentation recognition method
CN108985342A (en) * 2018-06-22 2018-12-11 华南理工大学 A kind of uneven classification method based on depth enhancing study

Also Published As

Publication number Publication date
CN112181148A (en) 2021-01-05
CN113268143A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
WO2021082749A1 (en) Action identification method based on artificial intelligence and related apparatus
Wu et al. Learning to anticipate egocentric actions by imagination
CN113268143B (en) Multimodal man-machine interaction method based on reinforcement learning
CN104616028B (en) Human body limb gesture actions recognition methods based on space segmentation study
CN112527113B (en) Training method and device for gesture recognition and gesture recognition network, medium and equipment
CN104766038A (en) Palm opening and closing action recognition method and device
CN106648078B (en) Multi-mode interaction method and system applied to intelligent robot
CN110909762B (en) Robot posture recognition method and device based on multi-sensor fusion
CN107862295A (en) A kind of method based on WiFi channel condition informations identification facial expression
CN113989943B (en) Distillation loss-based human body motion increment identification method and device
CN111680660B (en) Human behavior detection method based on multi-source heterogeneous data stream
CN111860117A (en) Human behavior recognition method based on deep learning
CN111158476B (en) Key recognition method, system, equipment and storage medium of virtual keyboard
CN113723378A (en) Model training method and device, computer equipment and storage medium
CN111046742B (en) Eye behavior detection method, device and storage medium
CN111898420A (en) Lip language recognition system
CN114332711A (en) Method, device, equipment and storage medium for facial motion recognition and model training
CN112052795B (en) Video behavior identification method based on multi-scale space-time feature aggregation
Razmah et al. LSTM Method for Human Activity Recognition of Video Using PSO Algorithm
CN113887501A (en) Behavior recognition method and device, storage medium and electronic equipment
CN111339983A (en) Method for fine-tuning face recognition model
CN115188080A (en) Traffic police gesture recognition method and system based on skeleton recognition and gated loop network
CN111274443A (en) Video clip description generation method and device, electronic equipment and storage medium
CN115645929A (en) Method and device for detecting plug-in behavior of game and electronic equipment
CN112989088B (en) Visual relation example learning method based on reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant