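WO2019218725A1 - Intelligent input method and system based on bone conduction vibration and machine learning - Google Patents
Intelligent input method and system based on bone conduction vibration and machine learning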
- Publication number
- WO2019218725A1 (application PCT/CN2019/073514)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- vibration
- machine learning
- vibration signal
- neural network
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/02—Preprocessing
- G06F2218/04—Denoising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/014—Hand-worn input/output arrangements, e.g. data gloves
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
Definitions
- The invention relates to an intelligent input method, in particular to an intelligent input method based on bone conduction vibration and machine learning, and to an intelligent input system that adopts this method.
- The technical problem to be solved by the present invention is to provide an intelligent input method based on bone conduction vibration and machine learning that makes text input simpler and more convenient, and further to provide an intelligent input system that uses this method.
- The present invention provides an intelligent input method based on bone conduction vibration and machine learning, comprising the following steps:
- Step S1: collecting the vibration signal produced when the user taps the back of the hand;
- Step S2: performing filtering, noise reduction and endpoint segmentation on the collected vibration signal;
- Step S3: aligning the vibration signal after endpoint segmentation;
- Step S4: extracting signal features from the aligned vibration signal;
- Step S5: assembling the extracted features into a training set and passing it to a neural network classification model for training, to obtain a trained neural network classification model.
- A further improvement of the present invention is that, in step S3, the vibration signal after endpoint segmentation is aligned by the general cross-correlation method. The alignment calculates the offset between two vibration signals, shifts the current vibration signal by that offset, and then keeps only the complete part shared by the two vibration signals.
- A denotes the first signal of length 3n obtained by zero-padding vibration signal a with a part of length n on each side;
- B denotes the vibration signal b of length n;
- P(A, B) denotes the position of the length-n segment of the first signal A that has the highest correlation with the second signal B;
- O(A, B) is the calculated offset between the first signal A and the second signal B.
- In step S4, the power spectral density feature of the aligned vibration signal is extracted, and this feature together with the amplitude feature of the vibration signal before alignment is used as the extracted signal feature.
- A further improvement of the present invention is that, in step S4, the power spectral density feature PSD of the aligned vibration signal is extracted by a formula in which f_s is the sampling frequency of the vibration signal, n is the signal length, k denotes the signal of length n, FFT(k) denotes the Fourier transform of the signal k, and abs(FFT(k)) denotes the absolute value of FFT(k).
- In step S5, a predetermined number of training samples is collected for each position on the back of the hand, the corresponding signal features are extracted, and the signal features of the training samples and their labels are passed as the training set to the neural network classification model for training. Once the trained model is obtained, a vibration signal is input to it and the model returns the position on the back of the hand that corresponds to the vibration signal, thereby implementing the user's input operation.
- A further improvement of the present invention is that the neural network classification model of step S5 comprises one input layer, one hidden layer and one output layer. The number of input-layer nodes is the total dimension of the signal feature, the number of hidden-layer nodes is twice the number of input-layer nodes, and the number of output-layer nodes is the number of keys the user requires.
- In step S2, the collected vibration signal is filtered and denoised with a Butterworth filter: a high-pass filter with a cutoff frequency of 20 Hz removes the DC component and low-frequency noise, and a low-pass filter with a cutoff frequency of 300 Hz removes high-frequency noise.
- In the endpoint segmentation of step S2, the whole vibration signal is first divided into frames, and the variance of each frame is used as the criterion. When the variance of a frame exceeds a given threshold, a tap signal is considered to have appeared, and a signal of a certain length before and after that frame is taken as the vibration signal after endpoint segmentation.
- The present invention also provides an intelligent input system based on bone conduction vibration and machine learning, which employs the intelligent input method described above.
- Compared with the prior art, the beneficial effects of the invention are as follows. By treating the back of the hand as a virtual keyboard on the basis of bone conduction vibration, combined with a machine-learning neural network classification model, text input is recognized with a high recognition rate and a fast response, which improves the text input efficiency of hand-worn devices and the user experience.
- The interaction mode of the invention is novel, interesting, convenient and fast, can meet the usage requirements of various wearable devices, and is widely applicable.
- FIG. 1 is a schematic diagram of a workflow of an embodiment of the present invention
- FIG. 2 is a schematic diagram of a piezoelectric ceramic vibration sensor for collecting a vibration signal according to an embodiment of the present invention
- FIG. 3 is a structural diagram of a piezoelectric ceramic vibration sensor for collecting a vibration signal according to an embodiment of the present invention
- FIG. 4 is a simulated illustration of the signals before alignment according to an embodiment of the present invention.
- FIG. 5 is a simulated illustration of the signals after alignment according to an embodiment of the present invention.
- FIG. 6 is a schematic diagram of a neural network classification model according to an embodiment of the present invention.
- FIG. 7 is a schematic diagram of the effect of a virtual keyboard according to an embodiment of the present invention.
- As shown in FIG. 1, the present invention provides an intelligent input method based on bone conduction vibration and machine learning, comprising the following steps:
- Step S1: collecting the vibration signal produced when the user taps the back of the hand;
- Step S2: performing filtering, noise reduction and endpoint segmentation on the collected vibration signal;
- Step S3: aligning the vibration signal after endpoint segmentation;
- Step S4: extracting signal features from the aligned vibration signal;
- Step S5: assembling the extracted features into a training set and passing it to a neural network classification model for training, to obtain a trained neural network classification model.
- As shown in FIG. 7, this example performs input on the back of the hand using the principle of bone conduction vibration: the back of the hand is treated as a virtual keyboard (a key of the virtual keyboard can be any position on the back of the hand) to implement the user's input function.
- The back of the hand is large enough, and the machine-learning algorithm responds quickly enough, to solve the problems of poor recognition rate, slow text input, and difficult input on a small screen in the prior art.
- Tapping the back of the hand as an input method can also be extended to many other interesting applications.
- The specific technical solution of this example is as follows. First, a vibration sensor (which can be embedded in a smart watch, smart bracelet or other hand-worn smart device) collects the vibration signal generated when a finger taps the back of the hand; the schematic diagram and structure diagram of the acquisition are shown in FIG. 2 and FIG. 3, respectively. After filtering and denoising, and segmentation by endpoint detection, the user's tap signal (the vibration signal after segmentation) is extracted. Then the general cross-correlation (GCC) method is used to align the segmented signal (tap signal), and signal features such as amplitude and frequency spectral density are extracted. Finally, the neural network classification model learns the collected signal features and their corresponding positions on the back of the hand to train a mapping model.
- A vibration signal collected afterwards can then be mapped by the trained neural network classification model to the corresponding position on the back of the hand, identifying which position the user tapped. The positions on the back of the hand can in turn be put in one-to-one correspondence with keyboard keys, implementing the intelligent input method based on bone conduction vibration and machine learning.
- For prediction, the input signal (the vibration signal or the tap signal obtained after processing) only needs to be fed into the trained neural network classification model to obtain the result directly. The time required is linear, so the response is fast.
- If the positions on the back of the hand are put in one-to-one correspondence with a nine-key (3×3) keypad, as shown in FIG. 7, fast text input can be achieved. In testing, the recognition rate exceeded 95%, which can greatly improve the user's text input experience.
- In step S1 of this example, a piezoelectric ceramic vibration sensor (or another sensor capable of detecting vibration) is embedded in a smart watch or other hand-worn smart device. It detects the vibration signal produced when the user taps the back of the hand and converts it into an electrical signal, which is then converted into a digital signal for processing.
- FIG. 2 and FIG. 3 show the schematic diagram and the structure diagram of the piezoelectric ceramic vibration sensor.
- Owing to the piezoelectric effect, the internal polarity of the piezoelectric ceramic vibration sensor changes, which appears externally as a change in voltage.
- In step S2 of this example, a Butterworth filter performs band-pass filtering over the 20 to 300 Hz band to filter and denoise the collected vibration signal. More specifically, this example uses a high-pass filter with a cutoff frequency of 20 Hz to remove the DC component and low-frequency noise, and a low-pass filter with a cutoff frequency of 300 Hz to remove high-frequency noise.
- In step S2 of this example, the endpoint segmentation is also called endpoint detection. The whole vibration signal is first divided into frames, and the variance of each frame is used as the criterion. When the variance of a frame exceeds a given threshold, a tap signal is considered to have appeared, and a signal of a certain length before and after that frame is taken as the vibration signal after endpoint segmentation, which is also called the tap signal.
- The given threshold can be set according to the user's needs, or values from the training sample library can be used as a reference.
- In step S3 of this example, the vibration signal after endpoint segmentation is aligned by the general cross correlation (GCC) method. The alignment calculates the offset between two vibration signals, shifts the current vibration signal by that offset, and then keeps only the complete part shared by the two vibration signals.
- The alignment described in this example can align all vibration signals, which helps improve the classification accuracy of the machine-learning algorithm.
- Simulated signals before and after alignment are shown in FIG. 4 and FIG. 5.
- A denotes the first signal of length 3n obtained by zero-padding vibration signal a with a part of length n on each side;
- B denotes the vibration signal b of length n;
- P(A, B) denotes the position of the length-n segment of the first signal A that has the highest correlation with the second signal B;
- O(A, B) is the calculated offset between the first signal A and the second signal B.
- In step S4 of this example, the power spectral density (PSD) feature of the aligned vibration signal is extracted, and this feature together with the amplitude feature of the vibration signal before alignment is used as the extracted signal feature.
- Preferably, the power spectral density feature PSD of the aligned vibration signal is extracted by a formula in which f_s is the sampling frequency of the vibration signal, n is the signal length, k denotes the signal of length n, FFT(k) denotes the Fourier transform of the signal k, and abs(FFT(k)) denotes the absolute value of FFT(k).
- In step S5 of this example, a predetermined number of training samples is collected for each position on the back of the hand, the corresponding signal features are extracted, and the signal features of the training samples and their labels are passed as the training set to the neural network classification model for training. Once the trained model is obtained, a vibration signal is input to it and the model returns the position on the back of the hand that corresponds to the vibration signal, thereby implementing the user's input operation.
- The predetermined number can be set and adjusted according to the user's needs; in this example it is preferably 30.
- Once trained, the neural network classification model can be used for information input. The smart device monitors the vibration signal in real time, and a tap on the back of the hand produces a vibration signal with relatively large energy.
- When the device detects such a signal, it extracts the signal and applies filtering and denoising, endpoint detection, GCC alignment and signal feature extraction. The resulting signal features are used as the input of the neural network classification model.
- The result returned by the model may include the classification label entered during training, such as the position; this result is the position on the back of the hand that the user tapped.
- As shown in FIG. 6, the neural network classification model of step S5 in this example comprises one input layer, one hidden layer and one output layer. The number of input-layer nodes is the total dimension of the signal feature, the number of hidden-layer nodes is twice the number of input-layer nodes, and the number of output-layer nodes is the number of keys the user requires.
- Initially, the neural network classification model outputs a random result for an input vibration signal. The result is a 1×N' matrix, that is, the values of the N' output-layer nodes, and the values in the matrix are random.
- The neural network classification model is then trained. The training set contains the vibration signal features of the positions on the back of the hand that the user wants to use as keys, together with the corresponding position labels. A position label is represented as a 1×N' matrix, where N' is the total number of positions on the back of the hand to be used as keys.
- Each element of the matrix corresponds to one position on the back of the hand. In the label for a given vibration signal, the element corresponding to its position has the value 1 and all other elements have the value 0.
- After training on the training set, the model's output for a vibration signal tends toward the true label of that signal. For a new vibration signal, the trained model therefore outputs a 1×N' matrix, and the position corresponding to the element whose value is closest to 1 is the position on the back of the hand that corresponds to the vibration signal.
- The neural network classification model is computed as follows. The value of each layer's nodes is given by a formula in which x_i is the value of the i-th node of the previous layer, w_ij is the weight of the connection from the i-th node of the previous layer to the j-th node of the next layer, a_j is the bias unit of the previous layer, N is the number of nodes in the previous layer, g(x) is the activation function, and H_j is the value of the j-th node of the next layer.
- The activation function g(x) is the logsig function, g(x) = 1 / (1 + e^(-x)), where e is the natural constant, approximately 2.71828, and x is any real number. The indices i and j denote node numbers.
- This example also provides an intelligent input system based on bone conduction vibration and machine learning, which employs the intelligent input method described above.
- In summary, this example treats the back of the hand as a virtual keyboard on the basis of bone conduction vibration, combined with a machine-learning neural network classification model, so that text input is recognized with a high recognition rate and a fast response. This improves the text input efficiency of hand-worn devices and the user experience.
- The interaction mode of the invention is novel, interesting, convenient and fast, can meet the usage requirements of various wearable devices, and is widely applicable.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
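The present invention provides an intelligent input method and system based on bone conduction vibration and machine learning. The intelligent input method comprises the following steps: step S1, collecting the vibration signal produced when the user taps the back of the hand; step S2, performing filtering, noise reduction and endpoint segmentation on the collected vibration signal; step S3, aligning the vibration signal after endpoint segmentation; step S4, extracting signal features from the aligned vibration signal; step S5, assembling the extracted features into a training set and passing it to a neural network classification model for training, to obtain a trained neural network classification model. By treating the back of the hand as a virtual keyboard on the basis of bone conduction vibration, combined with a machine-learning neural network classification model, the invention achieves text input with a high recognition rate and a fast response, improves the text input efficiency of hand-worn devices, and improves the user experience. The interaction mode of the invention is novel, interesting, convenient and fast, and is widely applicable.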
Description
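The present invention relates to an intelligent input method, in particular to an intelligent input method based on bone conduction vibration and machine learning, and to an intelligent input system that adopts this method.
At present, wearable smart sensing devices are developing rapidly, and hand-worn devices such as smart bracelets and smart watches are popular. However, owing to constraints such as size and cost, text input on a watch is not user-friendly: the small screen prevents the user from typing easily. The main current solutions are a conventional keyboard and speech recognition. Carrying a conventional keyboard makes the device bulky, while speech recognition is easily affected by ambient noise and is not fast enough. In addition, for reasons of privacy and consideration for others, voice input is awkward to use in public places. Techniques such as finger tracking, studied by many research teams, can also implement typing, but because their operation does not match user habits and they are slow, they do not solve the problem of inconvenient text input well.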
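Summary of the Invention
The technical problem to be solved by the present invention is to provide an intelligent input method based on bone conduction vibration and machine learning that makes text input simpler and more convenient, and further to provide an intelligent input system that adopts this method.
To this end, the present invention provides an intelligent input method based on bone conduction vibration and machine learning, comprising the following steps:
Step S1: collecting the vibration signal produced when the user taps the back of the hand;
Step S2: performing filtering, noise reduction and endpoint segmentation on the collected vibration signal;
Step S3: aligning the vibration signal after endpoint segmentation;
Step S4: extracting signal features from the aligned vibration signal;
Step S5: assembling the extracted features into a training set and passing it to a neural network classification model for training, to obtain a trained neural network classification model.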
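A further improvement of the present invention is that, in step S3, the vibration signal after endpoint segmentation is aligned by the general cross-correlation method. The alignment calculates the offset between two vibration signals, shifts the current vibration signal by that offset, and then keeps only the complete part shared by the two vibration signals.
A further improvement of the present invention is that, in step S3, the offset O(A, B) between two vibration signals is calculated by the formulas for C(a, b) and P(A, B) [formula image missing from this text] together with O(A, B) = P(A, B) - n, where a and b denote two vibration signals of length n, a(i) denotes the amplitude of the i-th point of vibration signal a, b(i) denotes the amplitude of the i-th point of vibration signal b, and C(a, b) denotes the correlation between vibration signals a and b; A denotes the first signal of length 3n obtained by zero-padding vibration signal a with a part of length n on each side; B denotes the vibration signal b of length n; P(A, B) denotes the position of the length-n segment of the first signal A that has the highest correlation with the second signal B; and O(A, B) is the calculated offset between the first signal A and the second signal B.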
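A further improvement of the present invention is that, in step S4, the power spectral density feature of the aligned vibration signal is extracted, and this feature together with the amplitude feature of the vibration signal before alignment is used as the extracted signal feature.
A further improvement of the present invention is that, in step S4, the power spectral density feature PSD of the aligned vibration signal is extracted by a formula [formula image missing from this text] in which f_s is the sampling frequency of the vibration signal, n is the signal length, k denotes the signal of length n, FFT(k) denotes the Fourier transform of the signal k, and abs(FFT(k)) denotes the absolute value of FFT(k).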
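A further improvement of the present invention is that, in step S5, a predetermined number of training samples is collected for each position on the back of the hand, the corresponding signal features are extracted, and the signal features of the training samples and their labels are passed as the training set to the neural network classification model for training, to obtain a trained neural network classification model. A vibration signal is then input to the model, and the model returns the position on the back of the hand that corresponds to the vibration signal, thereby implementing the user's input operation.
A further improvement of the present invention is that the neural network classification model of step S5 comprises one input layer, one hidden layer and one output layer. The number of input-layer nodes is the total dimension of the signal feature, the number of hidden-layer nodes is twice the number of input-layer nodes, and the number of output-layer nodes is the number of keys the user requires.
A further improvement of the present invention is that, in step S2, the collected vibration signal is filtered and denoised with a Butterworth filter: a high-pass filter with a cutoff frequency of 20 Hz removes the DC component and low-frequency noise, and a low-pass filter with a cutoff frequency of 300 Hz removes high-frequency noise.
A further improvement of the present invention is that, in the endpoint segmentation of step S2, the whole vibration signal is first divided into frames, and the variance of each frame is used as the criterion. When the variance of a frame exceeds a given threshold, a tap signal is considered to have appeared, and a signal of a certain length before and after that frame is taken as the vibration signal after endpoint segmentation.
The present invention also provides an intelligent input system based on bone conduction vibration and machine learning, which employs the intelligent input method described above.
Compared with the prior art, the beneficial effects of the present invention are as follows. By treating the back of the hand as a virtual keyboard on the basis of bone conduction vibration, combined with a machine-learning neural network classification model, text input is recognized with a high recognition rate and a fast response, which improves the text input efficiency of hand-worn devices and the user experience. The interaction mode of the invention is novel, interesting, convenient and fast, can meet the usage requirements of various wearable devices, and is widely applicable.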
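FIG. 1 is a schematic workflow diagram of an embodiment of the present invention;
FIG. 2 is a schematic diagram of the principle of collecting a vibration signal with a piezoelectric ceramic vibration sensor according to an embodiment of the present invention;
FIG. 3 is a structural diagram of collecting a vibration signal with a piezoelectric ceramic vibration sensor according to an embodiment of the present invention;
FIG. 4 is a simulated illustration of the signals before alignment according to an embodiment of the present invention;
FIG. 5 is a simulated illustration of the signals after alignment according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the neural network classification model of an embodiment of the present invention;
FIG. 7 is a schematic illustration of the virtual keyboard of an embodiment of the present invention.
Preferred embodiments of the present invention are described in further detail below with reference to the accompanying drawings.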
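As shown in FIG. 1, the present invention provides an intelligent input method based on bone conduction vibration and machine learning, comprising the following steps:
Step S1: collecting the vibration signal produced when the user taps the back of the hand;
Step S2: performing filtering, noise reduction and endpoint segmentation on the collected vibration signal;
Step S3: aligning the vibration signal after endpoint segmentation;
Step S4: extracting signal features from the aligned vibration signal;
Step S5: assembling the extracted features into a training set and passing it to a neural network classification model for training, to obtain a trained neural network classification model.
As shown in FIG. 7, this example performs input on the back of the hand using the principle of bone conduction vibration: the back of the hand is treated as a virtual keyboard (a key of the virtual keyboard can be any position on the back of the hand) to implement the user's input function. The back of the hand is large enough, and the machine-learning algorithm responds quickly enough, to solve the problems of poor recognition rate, slow text input, and difficult input on a small screen in the prior art. Tapping the back of the hand as an input method can also be extended to many other interesting applications.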
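The specific technical solution of this example is as follows. First, a vibration sensor (which can be embedded in a smart watch, smart bracelet or other hand-worn smart device) collects the vibration signal generated when a finger taps the back of the hand; the schematic diagram and structure diagram of the acquisition are shown in FIG. 2 and FIG. 3, respectively. After filtering and denoising, and segmentation by endpoint detection, the user's tap signal (the vibration signal after segmentation) is extracted. Then the general cross-correlation (GCC) method is used to align the segmented signal (tap signal), and signal features such as amplitude and frequency spectral density are extracted. Finally, the neural network classification model learns the collected signal features and their corresponding positions on the back of the hand to train a mapping model. A vibration signal collected afterwards can then be mapped by the trained model to the corresponding position on the back of the hand, identifying which position the user tapped. The positions on the back of the hand can in turn be put in one-to-one correspondence with keyboard keys, implementing the intelligent input method based on bone conduction vibration and machine learning.
For prediction, this example only needs to feed the input signal (the vibration signal or the tap signal obtained after processing) into the trained neural network classification model to obtain the result directly. The time required is linear, so the response is fast. If the positions on the back of the hand are put in one-to-one correspondence with a nine-key (3×3) keypad, as shown in FIG. 7, fast text input can be achieved. In testing, the recognition rate exceeded 95%, which can greatly improve the user's text input experience.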
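In step S1 of this example, a piezoelectric ceramic vibration sensor (or another sensor capable of detecting vibration) is embedded in a smart watch or other hand-worn smart device. It detects the vibration signal produced when the user taps the back of the hand and converts it into an electrical signal, which is then converted into a digital signal for processing. FIG. 2 and FIG. 3 show the schematic diagram and the structure diagram of the piezoelectric ceramic vibration sensor. Owing to the piezoelectric effect, the internal polarity of the sensor changes, which appears externally as a change in voltage.
In step S2 of this example, a Butterworth filter performs band-pass filtering over the 20 to 300 Hz band to filter and denoise the collected vibration signal. More specifically, this example uses a high-pass filter with a cutoff frequency of 20 Hz to remove the DC component and low-frequency noise, and a low-pass filter with a cutoff frequency of 300 Hz to remove high-frequency noise.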
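As a rough illustration of the step S2 filtering, the Python sketch below cascades a 20 Hz high-pass and a 300 Hz low-pass Butterworth filter using SciPy. The source gives only the filter type and the two cutoff frequencies; the filter order, the 2000 Hz sampling rate, the use of zero-phase filtering, and the name `bandpass_20_300` are assumptions made for this example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_20_300(x, fs, order=4):
    """Cascade a 20 Hz high-pass and a 300 Hz low-pass Butterworth filter.

    The order and the zero-phase (forward-backward) filtering are
    illustrative assumptions, not taken from the source.
    """
    hp = butter(order, 20.0, btype="highpass", fs=fs, output="sos")
    lp = butter(order, 300.0, btype="lowpass", fs=fs, output="sos")
    # Forward-backward filtering avoids shifting the tap in time.
    return sosfiltfilt(lp, sosfiltfilt(hp, x))

# Synthetic test signal: DC offset + in-band 100 Hz tone + out-of-band 700 Hz tone.
fs = 2000.0  # assumed sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)
raw = 0.5 + np.sin(2 * np.pi * 100 * t) + 0.3 * np.sin(2 * np.pi * 700 * t)
clean = bandpass_20_300(raw, fs)
```

In this sketch the DC offset and the 700 Hz component are strongly attenuated while the 100 Hz component passes almost unchanged.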
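In step S2 of this example, the endpoint segmentation is also called endpoint detection. The whole vibration signal is first divided into frames, and the variance of each frame is used as the criterion. When the variance of a frame exceeds a given threshold, a tap signal is considered to have appeared, and a signal of a certain length before and after that frame is taken as the vibration signal after endpoint segmentation, which is also called the tap signal. The given threshold can be set according to the user's needs, or values from the training sample library can be used as a reference.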
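The variance-based endpoint detection can be sketched in Python as follows. The frame length, threshold, and the amount of signal kept before and after the detected frame are not specified in the source, so the values below and the function name `segment_tap` are hypothetical; the sketch also returns only the first detected tap.

```python
import numpy as np

def segment_tap(x, frame_len, threshold, before, after):
    """Return the samples around the first frame whose variance exceeds
    `threshold`, or None if no frame does."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    variances = frames.var(axis=1)           # one variance per frame
    hits = np.flatnonzero(variances > threshold)
    if hits.size == 0:
        return None
    start = hits[0] * frame_len              # first sample of the detected frame
    lo = max(0, start - before)
    hi = min(len(x), start + frame_len + after)
    return x[lo:hi]

# Synthetic example: low-level noise with a short burst starting at sample 1000.
rng = np.random.default_rng(0)
x = 0.01 * rng.standard_normal(3000)
x[1000:1100] += np.sin(2 * np.pi * 0.1 * np.arange(100))
tap = segment_tap(x, frame_len=50, threshold=0.01, before=100, after=200)
```

Keeping a fixed window around the detected frame gives every tap signal the same length, which the later alignment and feature extraction steps rely on.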
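In step S3 of this example, the vibration signal after endpoint segmentation is aligned by the general cross correlation (GCC) method. The alignment calculates the offset between two vibration signals, shifts the current vibration signal by that offset, and then keeps only the complete part shared by the two vibration signals. The alignment described in this example can align all vibration signals, which helps improve the classification accuracy of the machine-learning algorithm. Simulated signals before and after alignment are shown in FIG. 4 and FIG. 5.
In step S3 of this example, the offset O(A, B) between two vibration signals is calculated by the formulas for C(a, b) and P(A, B) [formula image missing from this text] together with O(A, B) = P(A, B) - n, where a and b denote two vibration signals of length n, a(i) denotes the amplitude of the i-th point of vibration signal a, b(i) denotes the amplitude of the i-th point of vibration signal b, and C(a, b) denotes the correlation between vibration signals a and b; A denotes the first signal of length 3n obtained by zero-padding vibration signal a with a part of length n on each side; B denotes the vibration signal b of length n; P(A, B) denotes the position of the length-n segment of the first signal A that has the highest correlation with the second signal B; and O(A, B) is the calculated offset between the first signal A and the second signal B.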
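The formulas for C(a, b) and P(A, B) are not reproduced in this text. One reading that is consistent with the definitions, and that is assumed in the Python sketch below, is C(a, b) = Σ a(i)·b(i) over i = 1..n, with P(A, B) taken as the start index p (counted from 0 to 2n) of the length-n window of A that maximizes C(A[p : p+n], B). Under that reading P = n means no shift, which matches O(A, B) = P(A, B) - n. The function names `offset` and `align` are hypothetical.

```python
import numpy as np

def offset(a, b):
    """O(A, B) = P(A, B) - n under the assumed unnormalized correlation."""
    n = len(a)
    A = np.concatenate([np.zeros(n), a, np.zeros(n)])  # first signal, length 3n
    # corr[p] = sum_i A[i + p] * b[i] for window starts p = 0 .. 2n
    corr = np.correlate(A, b, mode="valid")
    return int(np.argmax(corr)) - n

def align(a, b):
    """Shift by the offset and keep only the part both signals share."""
    d = offset(a, b)          # a[i + d] lines up with b[i]
    n = len(a)
    if d >= 0:
        return a[d:], b[: n - d]
    return a[: n + d], b[-d:]

# Two copies of the same pulse, 20 samples apart.
pulse = np.hanning(20)
a = np.zeros(200)
a[80:100] = pulse
b = np.zeros(200)
b[60:80] = pulse
d = offset(a, b)              # 20
a_aligned, b_aligned = align(a, b)
```

After alignment both outputs have length 180 and the pulse sits at the same index in each. A normalized correlation could be substituted for the plain dot product without changing the rest of the sketch.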
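In step S4 of this example, the power spectral density (PSD) feature of the aligned vibration signal is extracted, and this feature together with the amplitude feature of the vibration signal before alignment is used as the extracted signal feature. Preferably, in step S4, the power spectral density feature PSD of the aligned vibration signal is extracted by a formula [formula image missing from this text] in which f_s is the sampling frequency of the vibration signal, n is the signal length, k denotes the signal of length n, FFT(k) denotes the Fourier transform of the signal k, and abs(FFT(k)) denotes the absolute value of FFT(k).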
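The PSD formula itself is not reproduced in this text. The symbols it defines (f_s, n, FFT(k), abs(FFT(k))) are consistent with a periodogram-style estimate, so the Python sketch below assumes PSD = 10·log10(abs(FFT(k))² / (f_s·n)). That exact form, the one-sided spectrum, the use of raw samples as the "amplitude feature", and the names `psd_db` and `features` are all assumptions for illustration.

```python
import numpy as np

def psd_db(k, fs):
    """Assumed periodogram in dB: 10 * log10(|FFT(k)|^2 / (fs * n))."""
    n = len(k)
    spectrum = np.fft.rfft(k)                 # one-sided FFT of a real signal
    power = np.abs(spectrum) ** 2 / (fs * n)
    return 10.0 * np.log10(power + 1e-20)     # tiny constant avoids log10(0)

def features(pre_alignment, aligned, fs):
    """Concatenate the amplitude samples of the signal before alignment
    with the PSD of the aligned signal (hypothetical feature layout)."""
    return np.concatenate([pre_alignment, psd_db(aligned, fs)])

# A pure 100 Hz tone sampled at 1000 Hz for one second.
fs = 1000.0
t = np.arange(1000) / fs
k = np.sin(2 * np.pi * 100 * t)
psd = psd_db(k, fs)
feat = features(k, k[:900], fs)
```

With this test tone the PSD peaks at the 100 Hz bin. The length of the concatenated feature vector is what fixes the number of input-layer nodes of the classifier.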
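In step S5 of this example, a predetermined number of training samples is collected for each position on the back of the hand, the corresponding signal features are extracted, and the signal features of the training samples and their labels are passed as the training set to the neural network classification model for training, to obtain a trained neural network classification model. A vibration signal is then input to the model, and the model returns the position on the back of the hand that corresponds to the vibration signal, thereby implementing the user's input operation. The predetermined number can be set and adjusted according to the user's needs; in this example it is preferably 30.
Once the trained neural network classification model is obtained, it can be used for information input. The smart device monitors the vibration signal in real time, and a tap on the back of the hand produces a vibration signal with relatively large energy. When the device detects such a signal, it extracts the signal and applies filtering and denoising, endpoint detection, GCC alignment and signal feature extraction. The resulting signal features are used as the input of the neural network classification model, and the result returned by the model may include the classification label entered during training, such as the position; this result is the position on the back of the hand that the user tapped.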
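As shown in FIG. 6, the neural network classification model of step S5 in this example comprises one input layer, one hidden layer and one output layer. The number of input-layer nodes is the total dimension of the signal feature, the number of hidden-layer nodes is twice the number of input-layer nodes, and the number of output-layer nodes is the number of keys the user requires.
Initially, the neural network classification model outputs a random result for an input vibration signal. The result is a 1×N' matrix, that is, the values of the N' output-layer nodes, and the values in the matrix are random.
The neural network classification model is then trained. The training set contains the vibration signal features of the positions on the back of the hand that the user wants to use as keys, together with the corresponding position labels. A position label is represented as a 1×N' matrix, where N' is the total number of positions on the back of the hand to be used as keys. Each element of the matrix corresponds to one position; in the label for a given vibration signal, the element corresponding to its position has the value 1 and all other elements have the value 0.
After training on the training set, the model's output for a vibration signal tends toward the true label of that signal. For a new vibration signal, the trained model therefore outputs a 1×N' matrix, and the position corresponding to the element whose value is closest to 1 is the position on the back of the hand that corresponds to the vibration signal.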
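The label encoding and the decoding rule described above can be illustrated with a short Python sketch. The helper names `one_hot` and `decode`, the nine-key layout, and the sample output values are hypothetical.

```python
import numpy as np

def one_hot(position, num_keys):
    """1 x N' label: 1 at the tapped position, 0 everywhere else."""
    label = np.zeros((1, num_keys))
    label[0, position] = 1.0
    return label

def decode(output):
    """Index of the output element whose value is closest to 1."""
    return int(np.argmin(np.abs(np.asarray(output) - 1.0)))

label = one_hot(4, 9)  # label for key index 4 on a nine-key layout
model_output = np.array([[0.02, 0.10, 0.91, 0.05, 0.20, 0.01, 0.03, 0.07, 0.04]])
predicted = decode(model_output)  # 2
```

Because the logsig outputs lie strictly between 0 and 1, choosing the element closest to 1 is equivalent to choosing the largest element.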
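The neural network classification model is computed as follows. The value of each layer's nodes is given by a formula [formula image missing from this text] in which x_i is the value of the i-th node of the previous layer, w_ij is the weight of the connection from the i-th node of the previous layer to the j-th node of the next layer, a_j is the bias unit of the previous layer, N is the number of nodes in the previous layer, g(x) is the activation function, and H_j is the value of the j-th node of the next layer. The activation function g(x) is the logsig function, g(x) = 1 / (1 + e^(-x)), where e is the natural constant, approximately 2.71828, and x is any real number. The indices i and j denote node numbers.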
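The per-layer formula is not reproduced in this text. With the symbols defined above, the natural reading is H_j = g(Σ w_ij·x_i + a_j) summed over i = 1..N; the Python sketch below assumes that form, including that the bias a_j is added rather than subtracted. The layer sizes follow the description (hidden layer twice the input size, one output node per key), while the feature dimension of 6, the random weights, and the function names are placeholders for illustration only.

```python
import numpy as np

def logsig(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def layer(x, w, a):
    """Assumed node formula: H_j = g(sum_i w[i, j] * x[i] + a[j])."""
    return logsig(x @ w + a)

def forward(x, w1, a1, w2, a2):
    """Input layer -> hidden layer -> output layer."""
    return layer(layer(x, w1, a1), w2, a2)

rng = np.random.default_rng(0)
d, n_keys = 6, 9                         # placeholder feature dimension and key count
w1, a1 = rng.standard_normal((d, 2 * d)), rng.standard_normal(2 * d)
w2, a2 = rng.standard_normal((2 * d, n_keys)), rng.standard_normal(n_keys)
out = forward(rng.standard_normal(d), w1, a1, w2, a2)   # untrained, so values are arbitrary
```

With untrained weights the output is effectively random, as the description notes; training adjusts `w1`, `a1`, `w2` and `a2` so that the output approaches the one-hot label.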
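This example also provides an intelligent input system based on bone conduction vibration and machine learning, which employs the intelligent input method described above.
In summary, this example treats the back of the hand as a virtual keyboard on the basis of bone conduction vibration, combined with a machine-learning neural network classification model, so that text input is recognized with a high recognition rate and a fast response. This improves the text input efficiency of hand-worn devices and the user experience. The interaction mode of the invention is novel, interesting, convenient and fast, can meet the usage requirements of various wearable devices, and is widely applicable.
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention is not to be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the invention.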
Claims (10)
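- An intelligent input method based on bone conduction vibration and machine learning, characterized by comprising the following steps: step S1, collecting the vibration signal produced when the user taps the back of the hand; step S2, performing filtering, noise reduction and endpoint segmentation on the collected vibration signal; step S3, aligning the vibration signal after endpoint segmentation; step S4, extracting signal features from the aligned vibration signal; step S5, assembling the extracted features into a training set and passing it to a neural network classification model for training, to obtain a trained neural network classification model.
- The intelligent input method based on bone conduction vibration and machine learning according to claim 1, characterized in that, in step S3, the vibration signal after endpoint segmentation is aligned by the general cross-correlation method, the alignment comprising calculating the offset between two vibration signals, shifting the current vibration signal, and after the shift keeping only the complete part shared by the two vibration signals.
- The intelligent input method based on bone conduction vibration and machine learning according to any one of claims 1 to 3, characterized in that, in step S4, the power spectral density feature of the aligned vibration signal is extracted, and the power spectral density feature together with the amplitude feature of the vibration signal before alignment is used as the extracted signal feature.
- The intelligent input method based on bone conduction vibration and machine learning according to any one of claims 1 to 3, characterized in that, in step S5, a predetermined number of training samples is collected for each position on the back of the hand, the corresponding signal features are extracted, and the signal features of the training samples and their labels are passed as the training set to the neural network classification model for training, to obtain a trained neural network classification model; a vibration signal is then input to the neural network classification model, and the model returns the position on the back of the hand corresponding to the vibration signal, so as to implement the user's input operation.
- The intelligent input method based on bone conduction vibration and machine learning according to claim 6, characterized in that the neural network classification model of step S5 comprises one input layer, one hidden layer and one output layer, the number of input-layer nodes being the total dimension of the signal feature, the number of hidden-layer nodes being twice the number of input-layer nodes, and the number of output-layer nodes being the number of keys the user requires.
- The intelligent input method based on bone conduction vibration and machine learning according to any one of claims 1 to 3, characterized in that, in step S2, the collected vibration signal is filtered and denoised with a Butterworth filter, a high-pass filter with a cutoff frequency of 20 Hz being used to remove the DC component and low-frequency noise, and a low-pass filter with a cutoff frequency of 300 Hz being used to remove high-frequency noise.
- The intelligent input method based on bone conduction vibration and machine learning according to any one of claims 1 to 3, characterized in that, in the endpoint segmentation of step S2, the whole vibration signal is first divided into frames, and the variance of each frame is used as the criterion; when the variance of a frame exceeds a given threshold, a tap signal is considered to have appeared, and a signal of a certain length before and after that frame is taken as the vibration signal after endpoint segmentation.
- An intelligent input system based on bone conduction vibration and machine learning, characterized by employing the intelligent input method based on bone conduction vibration and machine learning according to any one of claims 1 to 9.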
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810470755.6 | 2018-05-16 | ||
CN201810470755.6A CN108681709B (zh) | 2018-05-16 | 2018-05-16 | Intelligent input method and system based on bone conduction vibration and machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019218725A1 true WO2019218725A1 (zh) | 2019-11-21 |
Family
ID=63805071
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/073514 WO2019218725A1 (zh) | 2018-05-16 | 2019-01-28 | Intelligent input method and system based on bone conduction vibration and machine learning |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108681709B (zh) |
WO (1) | WO2019218725A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112069962A (zh) * | 2020-08-28 | 2020-12-11 | 中国航发贵阳发动机设计研究所 | Method for identifying vibration spectra against a strong-noise background based on images |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
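CN108681709B (zh) * | 2018-05-16 | 2020-01-17 | Shenzhen University | Intelligent input method and system based on bone conduction vibration and machine learning |
CN109634439B (zh) * | 2018-12-20 | 2021-04-23 | University of Science and Technology of China | Intelligent text input method |
WO2020147098A1 (zh) * | 2019-01-18 | 2020-07-23 | Shenzhen University | Human fall detection system based on ground vibration signals |
WO2020186477A1 (zh) * | 2019-03-20 | 2020-09-24 | Shenzhen University | Intelligent input method and system based on bone conduction |
CN109933202B (zh) * | 2019-03-20 | 2021-11-30 | Shenzhen University | Intelligent input method and system based on bone conduction |
CN110058689A (zh) * | 2019-04-08 | 2019-07-26 | Shenzhen University | Smart device input method based on facial vibration |
CN110363120B (zh) * | 2019-07-01 | 2020-07-10 | Shanghai Jiao Tong University | Touch authentication method and system for smart terminals based on vibration signals |
CN110931031A (zh) * | 2019-10-09 | 2020-03-27 | 大象声科(深圳)科技有限公司 | Deep-learning speech extraction and noise reduction method fusing bone vibration sensor and microphone signals |
CN113342159A (zh) * | 2021-05-07 | 2021-09-03 | Harbin Institute of Technology | Wrist-wearable system based on wrist vibration recognition |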
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
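CN103035236A (zh) * | 2012-11-27 | 2013-04-10 | Hohai University, Changzhou Campus | High-quality voice conversion method based on modeling of signal timing characteristics |
US20160199641A1 (en) * | 2013-08-19 | 2016-07-14 | Advanced Bionics Ag | Device and method for neural cochlea stimulation |
CN107300971A (zh) * | 2017-06-09 | 2017-10-27 | Shenzhen University | Intelligent input method and system based on bone conduction vibration signal propagation |
CN108681709A (zh) * | 2018-05-16 | 2018-10-19 | Shenzhen University | Intelligent input method and system based on bone conduction vibration and machine learning |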
-
2018
- 2018-05-16 CN CN201810470755.6A patent/CN108681709B/zh active Active
-
2019
- 2019-01-28 WO PCT/CN2019/073514 patent/WO2019218725A1/zh active Application Filing
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
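CN112069962A (zh) * | 2020-08-28 | 2020-12-11 | 中国航发贵阳发动机设计研究所 | Method for identifying vibration spectra against a strong-noise background based on images |
CN112069962B (zh) * | 2020-08-28 | 2023-12-22 | 中国航发贵阳发动机设计研究所 | Method for identifying vibration spectra against a strong-noise background based on images |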
Also Published As
Publication number | Publication date |
---|---|
CN108681709B (zh) | 2020-01-17 |
CN108681709A (zh) | 2018-10-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19803178 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 09.03.2021) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19803178 Country of ref document: EP Kind code of ref document: A1 |