WO2017152531A1 - Ultrasonic-based in-air gesture recognition method and system - Google Patents

Ultrasonic-based in-air gesture recognition method and system

Info

Publication number
WO2017152531A1
Authority
WO
WIPO (PCT)
Prior art keywords
palm
model
trend
gesture
gesture recognition
Prior art date
Application number
PCT/CN2016/085475
Other languages
English (en)
French (fr)
Inventor
陈益强
杨晓东
于汉超
曾辉
黄伟城
钟习
胡子昂
Original Assignee
中国科学院计算技术研究所
深圳市车音网科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院计算技术研究所, 深圳市车音网科技有限公司 filed Critical 中国科学院计算技术研究所
Publication of WO2017152531A1 publication Critical patent/WO2017152531A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • the invention relates to the field of human-computer interaction, and more particularly to ultrasonic-based in-air gesture recognition technology, in particular ultrasonic in-air gesture recognition for intelligent mobile terminals.
  • in EMG-based approaches, the electromyographic signal of the forearm is measured by means of a wristband; since different gestures produce different electromyographic signals, different static gestures can be identified.
  • such methods can be found in Chinese patent applications published as CN105138133A, CN105139038A, and the like. They require the user to wear a dedicated electromyographic wristband that must fit closely against the user's forearm, which may cause discomfort and adds cost.
  • an ultrasonic-based in-air gesture recognition method, including:
  • Step 1) using a pre-trained palm movement trend model, recognizing palm movement trends from the acquired ultrasonic signal reflected back by a human hand, and obtaining a palm movement trend time series comprising a series of palm movement trends; wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;
  • Step 2) using a pre-trained gesture recognition model, performing gesture recognition on the palm movement trend time series obtained in step 1); wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.
  • Step 1) of the above method includes:
  • features are extracted from the acquired ultrasonic signal, palm movement trends are recognized from the extracted features, the recognized trends are buffered until the gesture ends, and the buffered trends are assembled into a palm movement trend time series, in which:
  • Step 11) extracting features from one frame of the ultrasonic signal to obtain input data;
  • Step 12) checking the current state; if the current state indicates that a gesture is in progress, performing step 14), otherwise performing step 13);
  • Step 13) using the pre-trained palm movement trend model to recognize a palm movement trend from the input data; if the recognition result indicates no palm movement trend, returning to step 11) for the next acquired frame of the ultrasonic signal; otherwise setting the current state to "gesture in progress", buffering the recognized palm movement trend, and returning to step 11) for the next acquired frame;
  • Step 14) determining whether the gesture has ended; if so, assembling the buffered palm movement trends into a palm movement trend time series and performing step 2); otherwise returning to step 11) for the next acquired frame of the ultrasonic signal.
  • after step 2), the above method further includes: clearing the buffered palm movement trends, setting the current state to "not in a gesture", and returning to step 1).
  • the above method also includes training the palm movement trend model according to the following steps:
  • Step a) extracting features from each acquired frame of the ultrasonic signal reflected back by the human hand to obtain one training sample per frame, and assigning a corresponding category to each training sample, forming a first training data set;
  • Step b) training the palm motion trend model according to the first training data set.
  • step b) comprises: training the palm movement trend model using an ELM (Extreme Learning Machine) model in combination with cross-validation.
  • the above method further includes training the gesture recognition model according to the following steps:
  • Step c) performing gesture segmentation on the results obtained by passing the first training data set through the trained palm movement trend model, forming training samples corresponding to different gestures, and assigning a corresponding category to the training samples of each gesture, forming a second training data set;
  • Step d) according to the second training data set, training a corresponding gesture recognition model for each gesture.
  • step d) includes: training a corresponding gesture recognition model for each gesture using an HMM model in combination with cross-validation.
  • the ultrasonic signal is continuously transmitted by the built-in speaker of the smart mobile terminal, and the ultrasonic signal reflected back by the human hand is collected by the built-in microphone of the smart mobile terminal.
  • the ultrasonic wave has a frequency of 18 kHz to 22 kHz.
  • an ultrasonic-based in-air gesture recognition system, including:
  • a palm movement trend recognition device configured to use a pre-trained palm movement trend model to recognize palm movement trends from the acquired ultrasonic signal reflected back by a human hand and to obtain a palm movement trend time series comprising a series of palm movement trends;
  • the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;
  • a gesture recognition device configured to perform gesture recognition on the palm movement trend time series obtained in step 1) using a pre-trained gesture recognition model; wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.
  • the ultrasonic signal is continuously transmitted by the built-in speaker of the intelligent mobile terminal, and the ultrasonic signal reflected back by the human hand is collected by the built-in microphone of the intelligent mobile terminal.
  • the ultrasonic-based in-air gesture recognition method provided by the present invention is a hierarchical method that combines the acoustic features and the temporal features of the acoustic signal: at the first level, acoustic features are extracted from the ultrasonic signal data acquired at each moment (hereinafter referred to as a frame), the palm movement trend of each frame is recognized, and a time series of palm movement trends is obtained; at the second level, the recognized palm movement trend sequences are classified, thereby realizing in-air gesture recognition. The method performs further gesture classification on top of the recognized palm movement trends, achieving in-air gesture recognition with high accuracy and high robustness.
  • the data to be processed in the present invention is relatively simple; no additional data gloves, electromyographic sensors, or the like are required; the method is suitable for a smart mobile terminal and can transmit and receive ultrasound using the terminal's built-in microphone and speaker, and is therefore low-cost and easy to deploy.
  • FIG. 1 is a flow chart of a model offline training method in accordance with one embodiment of the present invention.
  • FIG. 2 is a flow chart of an ultrasonic-based in-air gesture recognition method in accordance with one embodiment of the present invention;
  • FIG. 3 is a schematic diagram of the hierarchical recognition process in an ultrasonic-based in-air gesture recognition method, in accordance with one embodiment of the present invention.
  • an ultrasonic-based in-air gesture recognition method is provided.
  • the method captures the user's hand motion by transmitting and receiving ultrasound and performing (two-level) palm movement trend recognition and gesture recognition on the received ultrasonic signal, thereby recognizing in-air gestures.
  • palm movement trend recognition is performed using the pre-trained palm movement trend model, and
  • gesture recognition is performed using the pre-trained gesture recognition model.
  • ultrasound here refers to audio frequencies inaudible to a normal adult (i.e., not lower than 18 kHz); the in-air gestures include, but are not limited to, push forward, pull back, click, double click, and the like.
  • the ultrasonic-based in-air gesture recognition method provided by the present invention is described below in two stages: a model training stage and a gesture recognition stage.
  • the model training method includes the following steps:
  • the ultrasonic wave is continuously played by the ultrasonic wave transmitting device and the ultrasonic wave reflected back by the human hand is collected by the ultrasonic wave receiving device.
  • ultrasound can be played using the built-in speaker of the smart mobile terminal.
  • the ultrasonic waves used may have a frequency ranging from 18 kHz to 22 kHz, preferably 18 kHz.
  • the smart mobile terminals referred to herein include, but are not limited to, smartphones, tablet computers, and wearable smart devices such as smart watches.
  • the ultrasound reflected back by the human hand can be collected at a predetermined sampling frequency using the built-in microphone of the smart mobile terminal (i.e., the terminal transmitting the ultrasound).
  • the human hand can be located anywhere relative to the smart mobile terminal, preferably within half a meter of it.
  • the ultrasonic receiving device of the present invention preferably acquires the periodic acoustic signal (i.e., the reflected ultrasound) at a sampling frequency of 44.1 kHz.
  • the acquired ultrasonic signal data is segmented into consecutive frames by a sliding Hamming window of length L (i.e., each time series of L sampling points constitutes one frame), with no overlap between adjacent frames.
  • the pre-processing includes preliminary smoothing to increase the final recognition accuracy.
  • the intra-frame time-domain signal can be filtered by an N-order moving average filter, whose response function is given as formula (1) in the description below.
  • features are extracted from the pre-processed frame data to form a training sample set of the palm motion trend model.
  • the pre-processed data per frame is used as a unit to extract features corresponding to each frame (such as acoustic features), thereby forming a training sample set of the palm motion trend model.
  • the extracted features may include the spectral peak value, the spectral peak position, the frequency spectrum, the mean, the zero-crossing rate, the standard deviation, and the like.
  • the features extracted here are not limited to the typical features listed above; any one of these features, or a combination of them, can be used to train the palm movement trend model.
  • in the case where each frame of data includes 2048 sampling points,
  • the corresponding features of each frame may be extracted as follows:
  • frequency-domain data over 1025 bins is obtained by FFT time-frequency conversion, and the spectral peak value together with the spectral values of the n bins (n a positive integer) on each side of the peak are extracted as the feature vector of the frame.
  • the bin containing the spectral peak can be determined in either of two ways: 1) in the high-frequency region (>17000 Hz) of the frequency-domain data, take the bin with the maximum spectral value as the peak bin;
  • 2) take the bin containing the carrier frequency as the peak bin, where, for an 18 kHz carrier (i.e., the 18 kHz ultrasound above), a 44.1 kHz sampling frequency, and 2048 sampling points per frame, the index of the carrier bin is about 835.
  • a corresponding palm movement trend category label is assigned to each frame of data (for example, label 1 for the forward palm movement trend and label 2 for the backward palm movement trend), so that the feature vector extracted from each frame (the training sample set of the palm movement trend model described above) and the frame category form the first-level training data set TrainDataSet_1.
  • the outputs of the palm movement trend model may include, but are not limited to, "null", "forward", and "backward", denoting, respectively, no palm movement trend, movement toward the microphone, and movement away from the microphone.
  • the first-level training data set TrainDataSet_1 is divided into m parts for cross-validation, and a classification model (for example, Naive Bayes, SVM, or DNN) is trained.
  • the ELM (Extreme Learning Machine) model is combined with cross-validation to train the palm movement trend model. Specifically, the model parameters are first set randomly; the model is then trained on m-1 parts of the data and its accuracy is tested on the remaining part; the whole process is repeated m times; the mean accuracy is taken as the final accuracy, and the model parameters are output.
  • the ELM model is used to build a palm movement trend model from the training set composed of 9 groups of training samples, and the resulting model is evaluated on the test data to obtain the training accuracy and the test accuracy; this process is repeated 9 more times, each time selecting a different group as test data and the other 9 groups as training data, to obtain the corresponding training and test accuracies.
  • the results obtained by passing TrainDataSet_1 through the trained palm movement trend model (time series of palm movement trends) are segmented into gestures to form training sample sets for the different gestures; within these sample sets, sequence-length normalization and removal of repeated gestures can be performed.
  • the training samples of each gesture are assigned corresponding category numbers (for example, label 1 for "click" and label 2 for "double click"), so that the per-gesture training sample sets and their categories form the second-level training data set TrainDataSet_2 for the gesture recognition model.
  • a model that exploits temporal features, such as a CRF or an HMM, is used for model training.
  • the HMM model (Hidden Markov Model) is used for training.
  • the second-level training data set TrainDataSet_2 is divided into m parts for cross-validation (for example, using ten-fold cross-validation), for each gesture in the gesture set (which includes, but is not limited to, "forward", "backward", "click", "double click", "reverse click", and the like).
  • during training, one HMM is trained per gesture, so that a palm movement trend sequence has a likelihood under each gesture's model, and the gesture with the maximum likelihood is the recognized gesture.
  • the HMM is combined with cross-validation (the cross-validation procedure is similar to step 5) to train the gesture recognition model, and the model's output parameters are obtained after training.
  • the ultrasonic-based in-air gesture recognition method includes the following steps.
  • Step 1: continuously play ultrasound and receive the ultrasound reflected back by the human hand.
  • the ultrasonic transmitting device continuously plays ultrasonic waves.
  • an ultrasonic transmitting device (such as the built-in speaker of an intelligent mobile terminal)
  • is used to play ultrasound of 18 to 20 kHz, preferably 18 kHz.
  • the ultrasonic receiving device (for example, the built-in microphone of the intelligent mobile terminal that transmits the ultrasound) receives and collects the ultrasound reflected back by the human hand at a sampling frequency of 44.1 kHz. Each time a certain amount of ultrasonic signal has been acquired, the method proceeds to step 2 for subsequent processing of that signal.
  • the acquired ultrasonic signal data is segmented into consecutive frames by a sliding Hamming window of length L (such as 2048);
  • each time a frame of L sampling points has been acquired, the method proceeds to step 2 with that frame.
  • Step 2 Pre-process the acquired signal and extract features from the pre-processed signal to form input data of the palm motion trend model
  • this step performs preprocessing and feature extraction on the frame data collected in the first step, including:
  • Step 3 Identify palm movement trends based on input data
  • This step includes the following substeps:
  • the current state indicates whether a gesture is currently in progress (i.e., whether the user is in the middle of making a gesture). If a gesture is in progress, the current state is represented as "gesture", otherwise as "wait" (i.e., not in a gesture). If the state is "gesture", sub-step 3 is performed; otherwise (i.e., "wait") sub-step 2 is performed. Herein, the initial state is "wait".
  • the palm movement trend is identified for the above input data according to the pre-trained palm movement trend model.
  • the recognition results of the palm movement trend model include, but are not limited to, "forward", "backward", and "null", denoting, respectively, movement toward the microphone, movement away from it, and no palm movement trend.
  • the input data is fed into the palm movement trend model to obtain a recognition result; the result is checked to determine whether the current palm movement trend is "null" (i.e., no palm movement trend). If it is "null", the method returns to step 1 to preprocess and further process the next acquired frame; otherwise the current state is set to "gesture", the recognized current palm movement trend (for example, "backward" or "forward") is buffered, and the method then returns to step 1 to preprocess and further process the next acquired frame.
  • three consecutive "null"s may mark the end of a gesture: if the palm movement trend model has most recently recognized three consecutive "null"s, the method proceeds to step 4; otherwise it returns to step 1.
  • Step 4 Gesture recognition of cached palm movement trends
  • the previously buffered palm movement trends form a time series of palm movement trends (i.e., a palm movement trend sequence); this sequence is recognized using the pre-trained gesture recognition model, and the gesture recognition result is output.
  • the length of the palm movement trend time series may first be normalized before recognition by the gesture recognition model.
  • an example of a palm movement trend sequence is: (forward, forward, ..., backward, backward).
  • recognition uses the gesture recognition model corresponding to each gesture:
  • the palm movement trend sequence yields a likelihood under each gesture's recognition model, and the gesture with the maximum likelihood is taken as the recognized gesture.
  • Step 5 Respond to gestures
  • in this step, the device (for example, the smart mobile device) responds with the operation corresponding to the recognition result.
  • the cache is cleared, the current state is set to "wait" (ie, non-gesture process), and the first step is returned to perform subsequent processes such as preprocessing according to the next frame acquired.
  • the above method is a hierarchical method, which decomposes the gesture into a sequence of multiple palm movement trends according to the degree of freedom of the natural movement of the hand.
  • the palm movement trend is identified;
  • gesture recognition is performed based on the identified series of palm movement trends.
  • FIG. 3 shows a flow diagram of identifying a gesture by the hierarchical method.
  • an ultrasonic-based in-air gesture recognition system, including:
  • a palm movement trend recognition device configured to use a pre-trained palm movement trend model to recognize palm movement trends from the acquired ultrasonic signal reflected back by a human hand and to obtain a palm movement trend time series comprising a series of palm movement trends;
  • the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;
  • a gesture recognition device configured to perform gesture recognition on the palm movement trend time series obtained in step 1) using a pre-trained gesture recognition model; wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.
  • the in-air gesture recognition system further includes: an ultrasonic transmitting device for continuously transmitting the ultrasonic signal, and an ultrasonic receiving device for collecting the ultrasonic signal reflected back by the human hand.
  • the ultrasonic transmitting device may be the speaker of the smart mobile terminal, and the ultrasonic receiving device may be the microphone of the smart mobile terminal.
  • the smart mobile terminal can be a tablet computer, a smartphone, or the like in a ubiquitous environment.
  • Microphone: the built-in microphone of the experimental equipment.
  • the 7600 frames of palm movement trend data obtained are divided evenly into 10 groups, the ELM activation function is set to Sigmoid and the number of hidden-layer nodes to 90, and the palm movement trend recognition model is then trained and tested with the ELM algorithm using ten-fold cross-validation and compared with a rule-based palm movement trend recognition method; the experimental results of the ELM-based palm movement trend model are shown in the table below.
  • valid palm movement trend sequences are obtained through the palm movement trend model and gesture segmentation, and length normalization, gesture-type labeling, and removal of repeated gesture sequences are performed. Then, using ten-fold cross-validation, the labeled palm movement trend sequences are used as the feature vectors for HMM training,
  • and the HMM algorithm is used to train and test the gesture recognition model; the experimental results of the HMM-based gesture recognition model are shown in the table below.
  • the method and system provided by the present invention are applicable to intelligent mobile terminals and achieve high-accuracy in-air gesture recognition.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

An ultrasonic-based in-air gesture recognition method and system. The method comprises: using a pre-trained palm movement trend model, recognizing palm movement trends from the collected ultrasonic signal reflected back by a human hand, and obtaining a palm movement trend time series comprising a series of palm movement trends, wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand; and using a pre-trained gesture recognition model, performing gesture recognition on the obtained palm movement trend time series, wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series. The method and system are suitable for intelligent mobile terminals and achieve gesture recognition that is both highly accurate and highly robust.

Description

Ultrasonic-based in-air gesture recognition method and system

Technical Field

The present invention relates to the field of human-computer interaction, and more particularly to ultrasonic-based in-air gesture recognition technology, in particular ultrasonic in-air gesture recognition for intelligent mobile terminals.

Background Art

In recent years, with the continuous development of intelligent mobile terminals and wearable technology, many new human-computer interaction scenarios and forms have emerged. The currently dominant touchscreen-based interaction requires the user to touch the screen, which limits the ubiquity and naturalness of the interaction; in particular, in certain scenarios (such as driving a vehicle or cooking food) and with certain device form factors (such as smart glasses and smart wristbands), touching is difficult for the user, so touch interaction cannot fully satisfy the diverse requirements of human-computer interaction. There is therefore an urgent need for an in-air gesture input method suitable for intelligent mobile terminals.

In conventional human-computer interaction systems, existing techniques for in-air gesture recognition mainly fall into the following three categories:

(1) Computer-vision based. The user's hand is detected from color images captured by a camera or from depth images captured by a dedicated depth camera, and the user's gesture is then recognized; such methods can be found in Chinese patent applications published as CN104360742A, CN103136541A, CN103176605A, CN103472916A, CN104915010A, and the like. However, such methods are sensitive to lighting, have high algorithmic complexity and a limited gesture recognition range, consume considerable system resources, and require a dedicated camera sensor, making them unsuitable for wide deployment in ubiquitous environments.

(2) Data-glove based. Hand motions are captured by a data glove to recognize the user's gestures; such methods can be found in Chinese patent applications published as CN104392237A, CN105127973A, CN204765652U, CN204740561U, and the like. This approach requires the user to wear complex position trackers and data gloves, is cumbersome to operate, hinders natural human-computer interaction, and is relatively expensive, so it is likewise unsuitable for wide deployment in ubiquitous environments.

(3) Electromyography (EMG)-sensor based. The EMG signal of the forearm is measured with a wristband; since different gestures produce different EMG signals, different static gestures can be recognized; such methods can be found in Chinese patent applications published as CN105138133A, CN105139038A, and the like. This approach requires the user to wear a dedicated EMG wristband that must fit tightly against the forearm, which may cause discomfort, and the cost is relatively high.

Moreover, none of the above three approaches is suitable for intelligent mobile terminals.
Summary of the Invention

To solve the above problems in the prior art, according to an embodiment of the present invention, an ultrasonic-based in-air gesture recognition method is provided, comprising:

Step 1) using a pre-trained palm movement trend model, recognizing palm movement trends from the collected ultrasonic signal reflected back by a human hand, and obtaining a palm movement trend time series comprising a series of palm movement trends; wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;

Step 2) using a pre-trained gesture recognition model, performing gesture recognition on the palm movement trend time series obtained in step 1); wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.
Step 1) of the above method comprises:

extracting features from the collected ultrasonic signal reflected back by the human hand, recognizing palm movement trends from the extracted features using the pre-trained palm movement trend model, buffering the recognized palm movement trends until the gesture ends, and assembling the buffered palm movement trends into a palm movement trend time series, wherein:

the following steps are performed for each collected frame of the ultrasonic signal, one frame consisting of L consecutive sampling points:

Step 11) extracting features from the frame of the ultrasonic signal to obtain input data;

Step 12) checking the current state; if the current state indicates that a gesture is in progress, performing step 14), otherwise performing step 13);

Step 13) using the pre-trained palm movement trend model to recognize a palm movement trend from the input data; if the recognition result indicates no palm movement trend, returning to step 11) for the next collected frame of the ultrasonic signal; otherwise setting the current state to "gesture in progress", buffering the recognized palm movement trend, and returning to step 11) for the next collected frame;

Step 14) determining whether the gesture has ended; if so, assembling the buffered palm movement trends into a palm movement trend time series and performing step 2); otherwise returning to step 11) for the next collected frame of the ultrasonic signal.

After step 2), the above method further comprises:

clearing the buffered palm movement trends, setting the current state to "not in a gesture", and returning to step 1).
The above method further comprises training the palm movement trend model according to the following steps:

Step a) extracting features from each collected frame of the ultrasonic signal reflected back by the human hand to obtain one training sample per frame, and assigning a corresponding category to each training sample, forming a first training data set;

Step b) training the palm movement trend model on the first training data set.

In the above method, step b) comprises: training the palm movement trend model using an ELM model in combination with cross-validation.

The above method further comprises training the gesture recognition model according to the following steps:

Step c) performing gesture segmentation on the results obtained by passing the first training data set through the trained palm movement trend model, forming training samples corresponding to different gestures, and assigning a corresponding category to the training samples of each gesture, forming a second training data set;

Step d) training a corresponding gesture recognition model for each gesture on the second training data set.

In the above method, step d) comprises: training a corresponding gesture recognition model for each gesture using an HMM model in combination with cross-validation.

In the above method, the ultrasonic signal is continuously emitted by the built-in speaker of an intelligent mobile terminal, and the ultrasonic signal reflected back by the human hand is collected by the built-in microphone of the intelligent mobile terminal.

In the above method, the frequency of the ultrasonic wave is 18 kHz-22 kHz.
According to an embodiment of the present invention, an ultrasonic-based in-air gesture recognition system is also provided, comprising:

a palm movement trend recognition device for using a pre-trained palm movement trend model to recognize palm movement trends from the collected ultrasonic signal reflected back by a human hand and obtain a palm movement trend time series comprising a series of palm movement trends; wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;

a gesture recognition device for using a pre-trained gesture recognition model to perform gesture recognition on the palm movement trend time series obtained in step 1); wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.

In the above system, the ultrasonic signal is continuously emitted by the built-in speaker of an intelligent mobile terminal, and the ultrasonic signal reflected back by the human hand is collected by the built-in microphone of the intelligent mobile terminal.

The ultrasonic-based in-air gesture recognition method provided by the present invention is a hierarchical method that combines the acoustic features and the temporal features of the acoustic signal: at the first level, acoustic features are extracted from the ultrasonic signal data collected at each moment (hereinafter referred to as a frame), the palm movement trend of each frame is recognized, and a time series of palm movement trends is obtained; at the second level, the recognized palm movement trend sequences are classified, thereby realizing in-air gesture recognition. The method performs further gesture classification on top of the recognized palm movement trends, achieving in-air gesture recognition with high accuracy and high robustness.

In addition, compared with the above prior art, the data to be processed in the present invention is relatively simple; no additional data gloves, EMG sensors, or the like are required; the method is suitable for intelligent mobile terminals and can transmit and receive ultrasound using the terminal's built-in microphone and speaker, and is therefore low-cost and easy to deploy.
Brief Description of the Drawings

FIG. 1 is a flow chart of a model offline training method according to an embodiment of the present invention;

FIG. 2 is a flow chart of an ultrasonic-based in-air gesture recognition method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the hierarchical recognition process in an ultrasonic-based in-air gesture recognition method according to an embodiment of the present invention.
Detailed Description

The present invention is described below with reference to the drawings and specific embodiments. It should be understood that the specific embodiments described here merely explain the present invention and do not limit it.

According to an embodiment of the present invention, an ultrasonic-based in-air gesture recognition method is provided.

In summary, the method captures the user's hand motion by transmitting and receiving ultrasound and performing (two-level) palm movement trend recognition and gesture recognition on the received ultrasonic signal, thereby recognizing in-air gestures. Palm movement trend recognition uses a pre-trained palm movement trend model, and gesture recognition uses a pre-trained gesture recognition model. It should be noted that ultrasound here refers to audio frequencies inaudible to a normal adult (i.e., not lower than 18 kHz), and that the in-air gestures include, but are not limited to, push forward, pull back, click, double click, and the like.

The ultrasonic-based in-air gesture recognition method provided by the present invention is described below in two stages: a model training stage and a gesture recognition stage.
I. Model training stage

Before in-air gesture recognition (which includes palm movement trend recognition and gesture recognition) can be performed, the palm movement trend model and the gesture recognition model must be trained in advance. Referring to FIG. 1, the model training method comprises the following steps:

1. Collection of ultrasonic in-air gesture data
In this step, the ultrasonic transmitting device continuously plays ultrasound and the ultrasonic receiving device collects the ultrasound reflected back by the human hand.

According to an embodiment of the present invention, the built-in speaker of an intelligent mobile terminal can be used to play the ultrasound. When an intelligent mobile terminal is used to transmit and receive the ultrasound (see the next paragraph), the frequency of the ultrasonic waves may range from 18 kHz to 22 kHz, preferably 18 kHz. The intelligent mobile terminals referred to herein include, but are not limited to, smartphones, tablet computers, and wearable smart devices such as smart watches.

Meanwhile, the built-in microphone of the intelligent mobile terminal (i.e., the terminal transmitting the ultrasound) can be used to collect, at a predetermined sampling frequency, the ultrasound reflected back by the human hand. For the present invention, the hand may be located anywhere relative to the terminal, preferably within half a meter of it. As for acoustic sampling, although standard sampling frequencies such as 11.025 kHz, 22.05 kHz, and 44.1 kHz exist, for 18 kHz ultrasound the ultrasonic receiving device herein preferably acquires the periodic acoustic signal (i.e., the reflected ultrasound) at a sampling frequency of 44.1 kHz.
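As an illustration of this step, the following minimal Python sketch generates the 18 kHz carrier and plays it while recording the reflection. It is not part of the original disclosure; the sounddevice package, the 5-second duration, and the 0.9 amplitude are assumptions made for the example.

```python
import numpy as np
import sounddevice as sd  # assumed audio backend; any playback/recording API would do

FS = 44100   # sampling frequency (Hz), as preferred in the text
F0 = 18000   # carrier frequency (Hz), the preferred ultrasonic frequency

def carrier(seconds):
    """A continuous 18 kHz sine tone to be looped on the terminal's speaker."""
    t = np.arange(int(seconds * FS)) / FS
    return 0.9 * np.sin(2 * np.pi * F0 * t)

if __name__ == "__main__":
    # play the carrier and record the microphone simultaneously for 5 s
    echo = sd.playrec(carrier(5.0), FS, channels=1)
    sd.wait()
```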
2. Preprocessing the collected signal data

First, the collected ultrasonic signal data is segmented into consecutive frames using a sliding Hamming window of length L (i.e., each time series of L sampling points constitutes one frame), with no overlap between adjacent frames. L may be 2048 (or another power of two, to suit the feature extraction below), so that at the 44.1 kHz sampling frequency each frame spans about 46.44 ms, and each frame of data can be written as Ai = {a1, a2, ..., a2048}, where aj (which may be called the intra-frame time-domain signal) is the integer value obtained by A/D conversion of the analog voltage at sampling point j; consecutive frames form the acoustic signal sequence A = {A1, A2, ..., An}.

Next, the collected frame data is preprocessed. The preprocessing includes an initial smoothing step, which improves the final recognition accuracy. For example, an N-order moving average filter can be applied to the intra-frame time-domain signal; its response function is expressed as follows:
$a'_i = \frac{1}{N}\sum_{j=0}^{N-1} a_{i-j}$    (1)

(The original equation appears only as an image; it is rendered here as the standard N-point moving average implied by the surrounding text.)
After smoothing, the frame data can be written as Ai = {a1', a2', ..., a2048'}.
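The framing and smoothing just described can be sketched in a few lines of Python. This is an illustrative reconstruction rather than code from the patent, and the filter order N=5 is an assumed value (the text leaves N open).

```python
import numpy as np

FS = 44100   # sampling frequency (Hz)
L = 2048     # frame length: about 46.44 ms per frame at 44.1 kHz

def split_frames(signal):
    """Segment the recording into consecutive, non-overlapping frames of L samples."""
    n = len(signal) // L
    return signal[: n * L].reshape(n, L)

def moving_average(frame, N=5):
    """N-order moving average filter of formula (1): each output sample is the
    mean of the current sample and the N-1 preceding ones (edges padded by
    replicating the first sample so the output keeps the frame length)."""
    padded = np.concatenate([np.full(N - 1, frame[0]), frame])
    return np.convolve(padded, np.ones(N) / N, mode="valid")
```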
3. Feature extraction

In this step, features are extracted from the preprocessed frame data to form the training sample set of the palm movement trend model.

Specifically, each preprocessed frame of data is treated as one unit from which the frame's features (such as acoustic features) are extracted, forming the training sample set of the palm movement trend model. The extracted features may include the spectral peak value, the spectral peak position, the frequency spectrum, the mean, the zero-crossing rate, the standard deviation, and the like. Those skilled in the art will understand that the extracted features are not limited to the typical features listed above; any one of these features, or a combination of them, can be used to train the palm movement trend model.

For example, when each frame of data contains 2048 sampling points, the per-frame features may be extracted as follows:

Frequency-domain data over 1025 bins is obtained by FFT time-frequency conversion, and the spectral peak value together with the spectral values of the n bins (n a positive integer) on each side of the peak are extracted as the feature vector of the frame. The bin containing the spectral peak can be determined in either of two ways: 1) in the high-frequency region (>17000 Hz) of the frequency-domain data, take the bin with the maximum spectral value as the peak bin; 2) take the bin containing the carrier frequency as the peak bin, where, for an 18 kHz carrier (i.e., the 18 kHz ultrasound above), a 44.1 kHz sampling frequency, and 2048 sampling points per frame, the index of the carrier bin is about 835.
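A hedged sketch of this feature extractor follows. The Hamming weighting before the FFT and the default n=15 (the value used in the experiments reported below) are assumptions made for the example.

```python
import numpy as np

FS, L = 44100, 2048

def spectral_features(frame, n=15, peak_bin=None):
    """Return the spectral values of the peak bin and the n bins on each side.
    A real FFT of an L=2048-point frame yields L/2 + 1 = 1025 bins."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(L)))
    if peak_bin is None:
        lo = int(np.ceil(17000 * L / FS))              # first bin above 17 kHz
        peak_bin = lo + int(np.argmax(spectrum[lo:]))  # way 1: maximum spectral value
        # way 2 (alternative): peak_bin = round(18000 * L / FS), about 835
    return spectrum[peak_bin - n : peak_bin + n + 1]   # 2n + 1 feature values
```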
4. Labeling the palm movement trend data

According to the categories of palm movement trend (including, for example, forward and backward), each frame of data is assigned a corresponding palm movement trend category label (for example, label 1 for the forward palm movement trend and label 2 for the backward palm movement trend), so that the feature vectors extracted from each frame (the above training sample set of the palm movement trend model) together with the frame categories form the first-level training data set TrainDataSet_1.

5. Training the palm movement trend model

Herein, the outputs of the palm movement trend model may include, but are not limited to, "null", "forward", and "backward", denoting, respectively, no palm movement trend, movement toward the microphone, and movement away from the microphone.

In this step, the first-level training data set TrainDataSet_1 is divided into m parts for cross-validation, and a classification model (for example, Naive Bayes, SVM, or DNN) is trained.

Preferably, the palm movement trend model is trained using an ELM (Extreme Learning Machine) model in combination with cross-validation. Specifically, the model parameters are first set randomly; the model is then trained on m-1 parts of the data and its accuracy is tested on the remaining part; the whole process is repeated m times; the mean accuracy is taken as the final accuracy, and the model parameters are output.

For example, the training samples may be divided evenly into 10 groups (i.e., m = 10), with 1 group used as test data and the other 9 as training data. An ELM palm movement trend model is built from the training set composed of the 9 groups of training samples, and the resulting model is evaluated on the test data, yielding a training accuracy and a test accuracy; this process is repeated 9 more times, each time selecting a different group as test data and the remaining 9 groups as training data, yielding the corresponding training and test accuracies. The means of the training and test accuracies over the 10 runs are taken as the accuracy of the final model, and the model parameters are output.
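The ELM procedure is simple enough to sketch directly: random, untrained hidden-layer weights followed by a closed-form least-squares solve for the output weights. The sketch below assumes a sigmoid activation, integer class labels, and a one-hot target encoding; the hyperparameters follow the experiments reported later (90 hidden nodes, ten-fold cross-validation), but the code is an illustration, not the patent's implementation.

```python
import numpy as np

def train_elm(X, y, n_hidden=90, n_classes=3, seed=0):
    """ELM training: random input weights, sigmoid hidden layer, output weights
    solved in closed form via the Moore-Penrose pseudo-inverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))      # random, never updated
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid activations
    beta = np.linalg.pinv(H) @ np.eye(n_classes)[y]  # one-hot targets, y: int labels
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

def cross_validate(X, y, m=10, seed=0):
    """m-fold cross-validation as in step 5: train on m-1 parts, test on the rest,
    and report the mean accuracy over the m runs."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, m)
    accs = []
    for k in range(m):
        train = np.hstack([folds[j] for j in range(m) if j != k])
        model = train_elm(X[train], y[train])
        accs.append(np.mean(predict_elm(X[folds[k]], *model) == y[folds[k]]))
    return float(np.mean(accs))
```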
6. Gesture data extraction and labeling

First, the results obtained by passing TrainDataSet_1 through the trained palm movement trend model (time series of palm movement trends) are segmented into gestures to form training sample sets for the different gestures; within these sample sets, sequence-length normalization and removal of repeated gestures may be performed.

Next, the training samples of each gesture are assigned a corresponding category number (for example, label 1 for "click" and label 2 for "double click"), so that the per-gesture training sample sets and their categories form the second-level training data set TrainDataSet_2 for the gesture recognition model.

7. Training the gesture recognition model

In this step, for each gesture, a model that exploits temporal features, such as a CRF or an HMM, is used for model training.

Preferably, an HMM (Hidden Markov Model) is used for training. Specifically, the second-level training data set TrainDataSet_2 is divided into m parts for cross-validation (for example, using ten-fold cross-validation); for each gesture in the gesture set (which includes, but is not limited to, "forward", "backward", "click", "double click", "reverse click", and the like), an HMM is trained (it should be noted that during training one HMM is trained per gesture, so that a palm movement trend sequence has a likelihood under each gesture's model, and the gesture with the maximum likelihood is the recognized gesture); the gesture recognition model is trained using the HMM in combination with cross-validation (the cross-validation procedure is similar to step 5), and the model's output parameters are obtained after training.

A corresponding gesture recognition model is thus built for every gesture in the gesture set.
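As a sketch of the per-gesture training, the snippet below fits one discrete-emission HMM per gesture using the hmmlearn package (CategoricalHMM in hmmlearn >= 0.3; older releases expose the same interface under MultinomialHMM). The three hidden states, the symbol encoding, and the iteration count are assumptions; the patent does not fix these details.

```python
import numpy as np
from hmmlearn import hmm  # assumed third-party dependency for the sketch

SYMBOLS = {"forward": 0, "backward": 1}  # assumed encoding of the trend alphabet

def train_gesture_hmm(sequences, n_states=3, seed=0):
    """Fit one HMM on all palm movement trend sequences of a single gesture.
    sequences: list of lists of trend labels, e.g. [["forward", "forward"], ...]"""
    X = np.concatenate([[SYMBOLS[t] for t in s] for s in sequences]).reshape(-1, 1)
    lengths = [len(s) for s in sequences]
    model = hmm.CategoricalHMM(n_components=n_states, n_iter=100, random_state=seed)
    model.fit(X, lengths)
    return model

# one model per gesture, as step 7 specifies (train_dataset_2 is hypothetical):
# models = {g: train_gesture_hmm(seqs) for g, seqs in train_dataset_2.items()}
```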
II. Gesture recognition stage

Referring to FIG. 2, the ultrasonic-based in-air gesture recognition method provided by this embodiment comprises the following steps:

Step 1: continuously play ultrasound and receive the ultrasound reflected back by the human hand

1. The ultrasonic transmitting device continuously plays ultrasonic waves.

As in step 1 of the model training stage, in order for an ultrasonic signal to be present in the environment, an ultrasonic transmitting device (such as the built-in speaker of an intelligent mobile terminal) plays ultrasound of 18-20 kHz, preferably 18 kHz.

2. Meanwhile, the ultrasonic receiving device (for example, the built-in microphone of the intelligent mobile terminal that transmits the ultrasound) receives and collects, at a sampling frequency of 44.1 kHz, the ultrasound reflected back by the human hand. Each time a certain amount of ultrasonic signal has been collected, the method proceeds to step 2 for subsequent processing of that signal.

As described above, the collected ultrasonic signal data is segmented into consecutive frames by a sliding Hamming window of length L (such as 2048); in this step, each time a frame of L sampling points has been collected, the method proceeds to step 2 with that frame.

Step 2: preprocess the collected signal and extract features from the preprocessed signal to form the input data of the palm movement trend model

As in the model training stage, this step preprocesses the frame data collected in step 1 and extracts features, including:

1. filtering the collected frame data using formula (1) above;

2. extracting features from the preprocessed data in the same way as described above, forming the input data of the palm movement trend model.
Step 3: recognize the palm movement trend from the input data

This step comprises the following sub-steps:

1. Check the current state

The current state indicates whether a gesture is currently in progress (i.e., whether the user is in the middle of making a gesture). If a gesture is in progress, the current state is represented as "gesture", otherwise as "wait" (i.e., not in a gesture). If the state is "gesture", sub-step 3 is performed; otherwise (i.e., "wait") sub-step 2 is performed. Herein, the initial state is "wait".

2. Recognize the palm movement trend from the input data

The palm movement trend is recognized from the above input data by the pre-trained palm movement trend model. As described above, the recognition results of the palm movement trend model include, but are not limited to, "forward", "backward", and "null", denoting, respectively, movement toward the microphone, movement away from it, and no palm movement trend.

Specifically, the input data is fed into the palm movement trend model to obtain a recognition result; the result is checked to determine whether the current palm movement trend is "null" (i.e., no palm movement trend). If it is "null", the method returns to step 1 to preprocess and further process the next collected frame; otherwise the current state is set to "gesture", the recognized current palm movement trend (for example, "backward" or "forward") is buffered, and the method then returns to step 1 to preprocess and further process the next collected frame.

3. Determine whether the gesture has ended; if it has, proceed to step 4; otherwise return to step 1 to preprocess and further process the next collected frame.

In one embodiment of the present invention, three consecutive "null" results may mark the end of a gesture: if the palm movement trend model has most recently recognized three consecutive "null"s, the method proceeds to step 4; otherwise it returns to step 1.
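The wait/gesture bookkeeping of step 3 amounts to a small state machine; the following sketch is one possible reading of it (the class name and the configurable threshold parameter are ours, while the three-"null" threshold value comes from this embodiment).

```python
class TrendStateMachine:
    """Online gesture segmentation over per-frame trend labels."""

    def __init__(self, end_nulls=3):
        self.state = "wait"       # initial state, as in sub-step 1
        self.buffer = []          # buffered non-"null" trends of the current gesture
        self.nulls = 0            # consecutive "null" frames seen while in "gesture"
        self.end_nulls = end_nulls

    def feed(self, trend):
        """Feed one trend ("forward"/"backward"/"null"); return the finished
        trend sequence when the gesture ends, otherwise None."""
        if self.state == "wait":
            if trend != "null":                  # a gesture begins
                self.state = "gesture"
                self.buffer = [trend]
                self.nulls = 0
            return None
        if trend == "null":                      # state == "gesture"
            self.nulls += 1
            if self.nulls >= self.end_nulls:     # three "null"s: gesture over
                sequence, self.buffer = self.buffer, []
                self.state = "wait"              # clear buffer, back to "wait"
                return sequence
        else:
            self.nulls = 0
            self.buffer.append(trend)
        return None
```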
Step 4: perform gesture recognition on the buffered palm movement trends

Upon entering step 4, the previously buffered palm movement trends form a time series of palm movement trends (i.e., a palm movement trend sequence); this sequence is recognized by the pre-trained gesture recognition model, and the gesture recognition result is output. The length of the palm movement trend time series may first be normalized before recognition by the gesture recognition model. An example of a palm movement trend sequence is given below:

(forward, forward, ..., backward, backward)

As described above, recognition uses the gesture recognition model corresponding to each gesture: the palm movement trend sequence yields a likelihood under each gesture's recognition model, and the gesture with the maximum likelihood is taken as the recognized gesture.
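The maximum-likelihood decision can be written out explicitly with the forward algorithm. The sketch below scores a trend sequence under each gesture's HMM parameters (start probabilities pi, transitions A, discrete emissions B) and takes the argmax; the parameter layout is an assumption for the illustration, not the patent's code.

```python
import numpy as np
from scipy.special import logsumexp

def log_likelihood(obs, pi, A, B, eps=1e-12):
    """Forward algorithm in log space for a discrete-emission HMM:
    pi (S,) start probabilities, A (S, S) transitions, B (S, V) emissions;
    obs is a sequence of integer symbols in [0, V)."""
    lpi, lA, lB = np.log(pi + eps), np.log(A + eps), np.log(B + eps)
    alpha = lpi + lB[:, obs[0]]
    for o in obs[1:]:
        alpha = logsumexp(alpha[:, None] + lA, axis=0) + lB[:, o]
    return logsumexp(alpha)

def recognize(obs, models):
    """models: {gesture_name: (pi, A, B)}; return the maximum-likelihood gesture."""
    return max(models, key=lambda g: log_likelihood(obs, *models[g]))
```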
Step 5: respond to the gesture

In this step, the device (for example, a smart mobile device) responds with the operation corresponding to the recognition result. At the same time the buffer is cleared, the current state is set to "wait" (i.e., not in a gesture), and the method returns to step 1 to preprocess and further process the next collected frame.

The above method is hierarchical: according to the degrees of freedom of natural hand motion, a gesture is decomposed into a time series of palm movement trends. At the first level (see step 3), palm movement trends are recognized; at the second level (see step 4), gesture recognition is performed on the recognized series of palm movement trends. FIG. 3 shows a schematic flow of recognizing a gesture by this hierarchical method.
According to an embodiment of the present invention, an ultrasonic-based in-air gesture recognition system is also provided, comprising:

a palm movement trend recognition device for using a pre-trained palm movement trend model to recognize palm movement trends from the collected ultrasonic signal reflected back by a human hand and obtain a palm movement trend time series comprising a series of palm movement trends; wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;

a gesture recognition device for using a pre-trained gesture recognition model to perform gesture recognition on the palm movement trend time series obtained in step 1); wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.

In one embodiment, the in-air gesture recognition system further comprises: an ultrasonic transmitting device for continuously transmitting the ultrasonic signal, and an ultrasonic receiving device for collecting the ultrasonic signal reflected back by the human hand. The transmitting device may be the speaker of an intelligent mobile terminal, and the receiving device may be its microphone. The intelligent mobile terminal may be a tablet computer, a smartphone, or the like in a ubiquitous environment.
To verify the effectiveness of the in-air gesture recognition method and system provided by the present invention, the inventors conducted the following experiments:

1. Experimental platform

· Experimental environment: a normal working environment (~30 dB)

· Experimental equipment: MacBook Pro 13-inch; Dell Inspiron 1545

· Speaker: the built-in speaker of the experimental equipment

· Microphone: the built-in microphone of the experimental equipment

2. Gesture data source

Four users (two male and two female) performed "forward", "backward", and "click" gestures at random over one and a half minutes, and a total of 7600 frames of acoustic signal corresponding to palm movement trends were collected. For each frame of the acoustic signal (i.e., each frame of data described above), the spectral peak value and the spectral values of the 15 bins on each side of the peak (frequency range [17677, 18323] Hz) were extracted manually, and each frame was labeled; these served as the training and test data of the initial palm movement trend model, and the palm movement trend sequences constituting gestures (i.e., the valid palm movement trend sequences referred to below) were selected as the training and test data of the gesture recognition model.
3. Palm movement trend model

The 7600 frames of palm movement trend data obtained were divided evenly into 10 groups, the ELM activation function was set to Sigmoid and the number of hidden-layer nodes to 90, and the palm movement trend recognition model was then trained and tested with the ELM algorithm using ten-fold cross-validation and compared with a rule-based palm movement trend recognition method. The experimental results of the ELM-based palm movement trend model are shown in the table below.

Table 1
(Table 1 is reproduced only as an image in the original publication; the tabulated accuracy figures are not recoverable from this rendering.)
4. Gesture recognition model

On the basis of the 7600 frames of palm movement trend data, valid palm movement trend sequences were obtained through the palm movement trend model and gesture segmentation and were subjected to length normalization, gesture-type labeling, and removal of repeated gesture sequences. Then, using ten-fold cross-validation, the labeled palm movement trend sequences were used as the feature vectors for HMM training, and the gesture recognition model was trained and tested with the HMM algorithm. The experimental results of the HMM-based gesture recognition model are shown in the table below.

Table 2
(Table 2 is reproduced only as an image in the original publication; the tabulated accuracy figures are not recoverable from this rendering.)
The above experimental results show that the method and system provided by the present invention are applicable to intelligent mobile terminals and at the same time achieve high-accuracy in-air gesture recognition.

It should be noted and understood that various modifications and improvements can be made to the invention described in detail above without departing from the spirit and scope of the invention as claimed in the appended claims. Accordingly, the scope of the claimed technical solution is not limited by any of the particular exemplary teachings given herein.

Claims (12)

  1. An ultrasonic-based in-air gesture recognition method, the method comprising:
    step 1) using a pre-trained palm movement trend model, recognizing palm movement trends from the collected ultrasonic signal reflected back by a human hand, and obtaining a palm movement trend time series comprising a series of palm movement trends; wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;
    step 2) using a pre-trained gesture recognition model, performing gesture recognition on the palm movement trend time series obtained in step 1); wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.
  2. The method of claim 1, wherein step 1) comprises:
    extracting features from the collected ultrasonic signal reflected back by the human hand, recognizing palm movement trends from the extracted features using the pre-trained palm movement trend model, buffering the recognized palm movement trends until the gesture ends, and assembling the buffered palm movement trends into a palm movement trend time series.
  3. The method of claim 2, wherein step 1) comprises:
    performing the following steps for each collected frame of the ultrasonic signal, one frame consisting of L consecutive sampling points:
    step 11) extracting features from the frame of the ultrasonic signal to obtain input data;
    step 12) checking the current state; if the current state indicates that a gesture is in progress, performing step 14), otherwise performing step 13);
    step 13) using the pre-trained palm movement trend model to recognize a palm movement trend from the input data; if the recognition result indicates no palm movement trend, returning to step 11) for the next collected frame of the ultrasonic signal; otherwise setting the current state to gesture in progress, buffering the recognized palm movement trend, and returning to step 11) for the next collected frame;
    step 14) determining whether the gesture has ended; if so, assembling the buffered palm movement trends into a palm movement trend time series and performing step 2); otherwise returning to step 11) for the next collected frame of the ultrasonic signal.
  4. The method of claim 3, further comprising, after step 2):
    clearing the buffered palm movement trends, setting the current state to not-in-a-gesture, and returning to step 1).
  5. The method of any one of claims 1-4, further comprising training the palm movement trend model according to the following steps:
    step a) extracting features from each collected frame of the ultrasonic signal reflected back by the human hand to obtain one training sample per frame, and assigning a corresponding category to each training sample, forming a first training data set;
    step b) training the palm movement trend model on the first training data set.
  6. The method of claim 5, wherein step b) comprises:
    training the palm movement trend model using an ELM model in combination with cross-validation.
  7. The method of claim 5, further comprising training the gesture recognition model according to the following steps:
    step c) performing gesture segmentation on the results obtained by passing the first training data set through the trained palm movement trend model, forming training samples corresponding to different gestures, and assigning a corresponding category to the training samples of each gesture, forming a second training data set;
    step d) training a corresponding gesture recognition model for each gesture on the second training data set.
  8. The method of claim 7, wherein step d) comprises:
    training a corresponding gesture recognition model for each gesture using an HMM model in combination with cross-validation.
  9. The method of any one of claims 1-4, wherein an ultrasonic signal is continuously emitted by the built-in speaker of an intelligent mobile terminal, and the ultrasonic signal reflected back by the human hand is collected by the built-in microphone of the intelligent mobile terminal.
  10. The method of any one of claims 1-4, wherein the frequency of the ultrasonic wave is 18 kHz-22 kHz.
  11. An ultrasonic-based in-air gesture recognition system, comprising:
    a palm movement trend recognition device for using a pre-trained palm movement trend model to recognize palm movement trends from the collected ultrasonic signal reflected back by a human hand and obtain a palm movement trend time series comprising a series of palm movement trends; wherein the palm movement trend model is a model for recognizing palm movement trends, trained on the acoustic features of the ultrasonic signal reflected back by the human hand;
    a gesture recognition device for using a pre-trained gesture recognition model to perform gesture recognition on the palm movement trend time series obtained in step 1); wherein the gesture recognition model is a model for recognizing gestures, trained on a training data set composed of palm movement trend time series.
  12. The system of claim 11, wherein an ultrasonic signal is continuously emitted by the built-in speaker of an intelligent mobile terminal, and the ultrasonic signal reflected back by the human hand is collected by the built-in microphone of the intelligent mobile terminal.
PCT/CN2016/085475 2016-03-07 2016-06-12 Ultrasonic-based in-air gesture recognition method and system WO2017152531A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610127516.1 2016-03-07
CN201610127516.1A CN105807923A (zh) 2016-03-07 2016-03-07 Ultrasonic-based in-air gesture recognition method and system

Publications (1)

Publication Number Publication Date
WO2017152531A1 true WO2017152531A1 (zh) 2017-09-14

Family

ID=56466783

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/085475 WO2017152531A1 (zh) 2016-03-07 2016-06-12 一种基于超声波的凌空手势识别方法及系统

Country Status (2)

Country Link
CN (1) CN105807923A (zh)
WO (1) WO2017152531A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109947249A (zh) * 2019-03-15 2019-06-28 努比亚技术有限公司 Interaction method for a wearable device, wearable device, and computer storage medium
CN111625087A (zh) * 2020-04-28 2020-09-04 中南民族大学 Gesture acquisition and recognition system
CN112883849A (zh) * 2021-02-02 2021-06-01 北京小米松果电子有限公司 Gesture recognition method and apparatus, storage medium, and terminal device
CN113095386A (zh) * 2021-03-31 2021-07-09 华南师范大学 Gesture recognition method and system based on spatio-temporal fusion of triaxial acceleration features
CN113450537A (zh) * 2021-06-25 2021-09-28 北京小米移动软件有限公司 Fall detection method and apparatus, electronic device, and storage medium
WO2023051272A1 (zh) * 2021-09-28 2023-04-06 华为技术有限公司 Device networking and sound channel configuration method, and electronic device

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446801B (zh) * 2016-09-06 2020-01-07 清华大学 Micro-gesture recognition method and system based on active ultrasonic probing
CN106951861A (zh) * 2017-03-20 2017-07-14 上海与德科技有限公司 Ultrasonic gesture recognition method and apparatus
CN107450724A (zh) * 2017-07-31 2017-12-08 武汉大学 Gesture recognition method and system based on the Doppler effect in two-channel audio
CN109857245B (zh) * 2017-11-30 2021-06-15 腾讯科技(深圳)有限公司 Gesture recognition method and terminal
CN107943300A (zh) * 2017-12-07 2018-04-20 深圳大学 Ultrasonic-based gesture recognition method and system
CN108562890B (zh) * 2017-12-29 2023-10-03 努比亚技术有限公司 Calibration method and apparatus for ultrasonic feature values, and computer-readable storage medium
CN108959866B (zh) * 2018-04-24 2020-10-23 西北大学 Continuous identity authentication method based on high-frequency acoustic waves
CN112740219A (zh) * 2018-11-19 2021-04-30 深圳市欢太科技有限公司 Method and apparatus for generating a gesture recognition model, storage medium, and electronic device
CN110031827B (zh) * 2019-04-15 2023-02-07 吉林大学 Gesture recognition method based on the principle of ultrasonic ranging
CN110780741B (zh) * 2019-10-28 2022-03-01 Oppo广东移动通信有限公司 Model training method, application running method, apparatus, medium, and electronic device
CN111124108B (zh) * 2019-11-22 2022-11-15 Oppo广东移动通信有限公司 Model training method, gesture control method, apparatus, medium, and electronic device
CN111562843A (zh) * 2020-04-29 2020-08-21 广州美术学院 Positioning method, apparatus, device, and storage medium for gesture capture

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982315A (zh) * 2012-11-05 2013-03-20 中国科学院计算技术研究所 Gesture segmentation and recognition method and system capable of automatically detecting non-gesture patterns
US20130182539A1 (en) * 2012-01-13 2013-07-18 Texas Instruments Incorporated Multipath reflection processing in ultrasonic gesture recognition systems
CN104616028A (zh) * 2014-10-14 2015-05-13 北京中科盘古科技发展有限公司 Human limb posture and action recognition method based on space-partition learning
CN104751141A (zh) * 2015-03-30 2015-07-01 东南大学 ELM gesture recognition algorithm based on the gray values of all pixels of a feature image
CN104766038A (zh) * 2014-01-02 2015-07-08 株式会社理光 Method and apparatus for recognizing palm opening and closing actions

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130182539A1 (en) * 2012-01-13 2013-07-18 Texas Instruments Incorporated Multipath reflection processing in ultrasonic gesture recognition systems
CN102982315A (zh) * 2012-11-05 2013-03-20 中国科学院计算技术研究所 Gesture segmentation and recognition method and system capable of automatically detecting non-gesture patterns
CN104766038A (zh) * 2014-01-02 2015-07-08 株式会社理光 Method and apparatus for recognizing palm opening and closing actions
CN104616028A (zh) * 2014-10-14 2015-05-13 北京中科盘古科技发展有限公司 Human limb posture and action recognition method based on space-partition learning
CN104751141A (zh) * 2015-03-30 2015-07-01 东南大学 ELM gesture recognition algorithm based on the gray values of all pixels of a feature image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG, XIAODONG ET AL.: "Ultrasonic Waves Based Gesture Recognition Method for Wearable Equipment", COMPUTER SCIENCE, vol. 42, no. 10, 31 October 2015 (2015-10-31), pages 20 - 24 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109947249A (zh) * 2019-03-15 2019-06-28 努比亚技术有限公司 Interaction method for a wearable device, wearable device, and computer storage medium
CN109947249B (zh) * 2019-03-15 2024-03-19 努比亚技术有限公司 Interaction method for a wearable device, wearable device, and computer storage medium
CN111625087A (zh) * 2020-04-28 2020-09-04 中南民族大学 Gesture acquisition and recognition system
CN112883849A (zh) * 2021-02-02 2021-06-01 北京小米松果电子有限公司 Gesture recognition method and apparatus, storage medium, and terminal device
CN113095386A (zh) * 2021-03-31 2021-07-09 华南师范大学 Gesture recognition method and system based on spatio-temporal fusion of triaxial acceleration features
CN113095386B (zh) * 2021-03-31 2023-10-13 华南师范大学 Gesture recognition method and system based on spatio-temporal fusion of triaxial acceleration features
CN113450537A (zh) * 2021-06-25 2021-09-28 北京小米移动软件有限公司 Fall detection method and apparatus, electronic device, and storage medium
WO2023051272A1 (zh) * 2021-09-28 2023-04-06 华为技术有限公司 Device networking and sound channel configuration method, and electronic device

Also Published As

Publication number Publication date
CN105807923A (zh) 2016-07-27

Similar Documents

Publication Publication Date Title
WO2017152531A1 (zh) Ultrasonic-based in-air gesture recognition method and system
Dash et al. Detection of COVID-19 from speech signal using bio-inspired based cepstral features
Mouawad et al. Robust detection of COVID-19 in cough sounds: using recurrence dynamics and variable Markov model
CN110364144B (zh) 一种语音识别模型训练方法及装置
CN110853618B (zh) 一种语种识别的方法、模型训练的方法、装置及设备
Hou et al. Signspeaker: A real-time, high-precision smartwatch-based sign language translator
CN110838286B (zh) 一种模型训练的方法、语种识别的方法、装置及设备
CN111461176B (zh) 基于归一化互信息的多模态融合方法、装置、介质及设备
CN110069199B (zh) 一种基于智能手表的皮肤式手指手势识别方法
CN110853617B (zh) 一种模型训练的方法、语种识别的方法、装置及设备
CN110310623A (zh) 样本生成方法、模型训练方法、装置、介质及电子设备
CN113158727A (zh) 一种基于视频和语音信息的双模态融合情绪识别方法
Kim et al. Finger language recognition based on ensemble artificial neural network learning using armband EMG sensors
CN110946554A (zh) 咳嗽类型识别方法、装置及系统
Jiang et al. Interpretable features for underwater acoustic target recognition
CN113674767A (zh) 一种基于多模态融合的抑郁状态识别方法
CN105118356B (zh) 一种手语语音转换方法及装置
Wang et al. Automatic hypernasality detection in cleft palate speech using cnn
CN115620727A (zh) 音频处理方法、装置、存储介质及智能眼镜
Zeng et al. mSilent: Towards general corpus silent speech recognition using COTS mmWave radar
CN110728993A (zh) 一种变声识别方法及电子设备
CN113192537B (zh) 唤醒程度识别模型训练方法及语音唤醒程度获取方法
Sharan Cough sound detection from raw waveform using SincNet and bidirectional GRU
WO2020000523A1 (zh) 一种信号处理方法及装置
CN111913575B (zh) 一种手语词的识别方法

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16893178

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16893178

Country of ref document: EP

Kind code of ref document: A1