CN113703568A - Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium - Google Patents

Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium

Info

Publication number
CN113703568A
Authority
CN
China
Prior art keywords
vibration information
gesture
gesture recognition
tendon
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110786284.1A
Other languages
Chinese (zh)
Inventor
何柏霖
王灿
段声才
李鹏博
吴新宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202110786284.1A priority Critical patent/CN113703568A/en
Publication of CN113703568A publication Critical patent/CN113703568A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 - Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Abstract

The application discloses a gesture recognition method, a gesture recognition device, a gesture recognition system and a storage medium, wherein the method comprises the following steps: acquiring vibration information on the tendon; processing the vibration information to obtain the characteristics corresponding to the vibration information; and classifying the vibration information based on the characteristics to obtain the gesture category corresponding to the tendon. In this manner, gestures can be effectively classified and recognized.

Description

Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium
Technical Field
The present disclosure relates to the field of human-machine recognition technologies, and in particular, to a gesture recognition method, a gesture recognition apparatus, a gesture recognition system, and a storage medium.
Background
In the field of exoskeleton human-computer interfaces, intention recognition has become a research focus, covering gait recognition, gesture recognition, and the like. Research on applying wrist tendon sounds to gesture recognition for rehabilitation exoskeletons is becoming increasingly extensive.
Typically, the sound is collected by a sensor placed on the wrist tendons; some researchers instead record the sound with a microphone coupled to a stethoscope fixed to the skin. When such prior-art schemes are applied to an exoskeleton, the apparatus often collides with the exoskeleton hardware and causes measurement errors, and environmental noise and wind contaminate the collected sound signals, reducing the probability of correctly identifying the gesture type.
Disclosure of Invention
A first aspect of an embodiment of the present application provides a gesture recognition method, including: acquiring vibration information on the tendon; processing the vibration information to obtain the characteristics corresponding to the vibration information; and classifying the vibration information based on the characteristics to obtain the gesture category corresponding to the tendon.
A second aspect of an embodiment of the present application provides a gesture recognition apparatus, where the gesture recognition apparatus includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the computer program to implement the method provided in the first aspect of the embodiment of the present application.
A third aspect of an embodiment of the present application provides a gesture recognition system, including:
a sensor configured to be fixed to a tendon portion of a hand, for acquiring vibration information of the tendon portion;
the processing equipment is used for processing the vibration information to obtain the characteristics corresponding to the vibration information;
the extraction device is used for extracting the characteristics corresponding to the vibration information;
the gesture recognition device is connected with the sensor, the processing device and the extraction device and is used for executing the method provided by the first aspect of the embodiment of the application.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the method provided by the first aspect of embodiments of the present application.
The beneficial effect of this application is: unlike the related art, in this gesture classification and recognition method the acquired tendon vibration information is processed to remove ambient noise and obtain the features corresponding to the hand tendons. Because gesture categories are closely related to tendon vibration, recognizing and classifying the features of the vibration information corresponding to the tendons allows the gesture category corresponding to the tendon to be obtained quickly.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a system framework diagram of the gesture recognition method of the present application;
FIG. 2 is a schematic diagram of gesture types in the gesture recognition method of the present application, wherein FIG. 2(1) shows a five-finger open gesture; FIG. 2(2) shows a gesture to snap the wrist down; FIG. 2(3) shows a gesture in which five fingers are combined into a fist;
FIG. 3 is a schematic flow chart diagram illustrating an embodiment of a gesture recognition method of the present application;
FIG. 4 is a flowchart illustrating an embodiment of step S12 of FIG. 3;
FIG. 5 is a flowchart illustrating an embodiment of step S22 of FIG. 4;
FIG. 6 is a flowchart illustrating an embodiment of step S23 of FIG. 4;
FIG. 7 is a flowchart illustrating an embodiment of step S33 of FIG. 5;
FIG. 8 is a flowchart illustrating an embodiment of step S13 of FIG. 3;
FIG. 9 is a schematic diagram of an experimental result of an embodiment of a gesture recognition method according to the present application;
FIG. 10 is a schematic block diagram of an embodiment of a gesture recognition apparatus of the present application;
FIG. 11 is a schematic block diagram of an embodiment of a gesture recognition system of the present application;
FIG. 12 is a schematic block diagram of one embodiment of a computer-readable storage medium of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
To illustrate the technical solution of the present application, the gesture recognition method provided by the first aspect of the present application is described below through specific embodiments. To better explain the method, please refer to fig. 1, a system framework diagram of the gesture recognition method of the present application. The system comprises at least the following steps: S1: signal collection; S2: preprocessing; S3: energy activation; S4: feature extraction; S5: classification.
Specifically, a sensor is provided on the wrist tendons so that it can measure the slight sound vibrations there. The sound signal converted from the vibration signal is then preprocessed, energy-activated, feature-extracted and classified with preset software, yielding the classification of different gestures.
Still further, the acquisition board may be, for example, an STEVAL-MKIGIBV3 evaluation board, and the sensor may be an LIS25BA; those skilled in the art may select other types of acquisition boards and sensors that meet the requirements, which is not limited herein.
To more conveniently understand the gesture categories, please refer to fig. 2, a schematic diagram of the gesture categories in the gesture recognition method of the present application: fig. 2(1) shows a five-finger open gesture, denoted HS for short; fig. 2(2) shows a wrist-down flexion gesture, denoted FR; fig. 2(3) shows a gesture in which the five fingers close into a fist, denoted FM. Of course, there may be other gestures; these are merely examples and are not limiting.
For better explaining the gesture recognition method provided by the present application, please refer to fig. 3, where fig. 3 is a schematic flowchart of an embodiment of the gesture recognition method, and the recognition method specifically includes the following steps:
S11: acquiring vibration information on the tendon;
Generally, a sensor is disposed on the tendon to sense the vibration information there; specifically, the sensor measures the tendon's vibration to obtain the vibration information. Vibration detection sensors mainly include eddy-current sensors, velocity sensors, acceleration sensors, and the like.
Because the motion associated with gesture changes is of very small magnitude, an acceleration sensor can be adopted to measure it more accurately, so that the vibration information on the tendon can be acquired.
It should be noted that the sensor may be disposed on the tendon, or on the mechanical arm, or disposed on another human body part, as long as the vibration information of the part to be measured can be sensed, which is not limited herein.
S12: processing the vibration information to obtain the characteristics corresponding to the vibration information;
Each piece of vibration information corresponds to a unique feature that distinguishes its gesture category, so processing the vibration information yields the corresponding feature; for example, preset software can be used to extract the feature corresponding to the vibration information.
Specifically, the features corresponding to the vibration information differ across gesture categories. The gesture categories described above include the HS, FR and FM gestures; these three gestures differ in the vibration information they produce, so processing the vibration information yields the feature corresponding to the HS gesture, the feature corresponding to the FR gesture, and the feature corresponding to the FM gesture respectively.
Of course, since there may be other gestures, these three gestures are only used as examples here, and are not limited here specifically.
S13: and classifying the vibration information based on the characteristics to obtain the gesture category corresponding to the tendon.
The vibration information can be classified by using its corresponding features. Specifically, the features corresponding to each gesture category can be stored in a database in advance, and the features extracted from the acquired vibration information are compared against them.

If the extracted feature matches a stored one, the gesture corresponding to that feature is determined to be the pre-stored gesture category for the tendon. If matching fails, the feature may be named and stored in the database for later use in motion recognition and determination. Of course, other methods can also classify the vibration information; this is merely one way of obtaining the gesture category corresponding to the tendon.

Thus, in this gesture classification method, the acquired tendon vibration information is processed to remove ambient noise and obtain the features corresponding to the hand tendons; because gesture categories are closely related to tendon vibration, recognizing and classifying these features quickly yields the gesture category corresponding to the tendon.
Further, referring to fig. 4, fig. 4 is a flowchart illustrating an embodiment of step S12 in fig. 3, where processing the vibration information includes the following steps:
S21: converting the vibration information to obtain a sound signal of the tendon;
Generally, the sensor collects vibration information on the tendons. The term "vibration", as used in physics, refers to the periodic motion of an object or particle, or the periodic variation of some physical quantity, with a certain temporal regularity and period; for example, electromagnetic vibration generates electromagnetic waves.

In the embodiment of the application, the sensor on the tendon acquires the vibration information generated by changes in gesture motion. Because the sensor is placed in the open environment, the surroundings usually contribute some noise, so the vibration information includes not only the sound on the tendon but also sound generated by the surrounding environment.
Therefore, by converting the vibration information, a tendon sound signal can be obtained.
S22: preprocessing the sound signal by adopting preset software to obtain audio information corresponding to the sound signal;
and preprocessing the sound signal converted from the vibration signal by using preset software, specifically preprocessing the sound signal converted from the vibration signal by using MATLAB software, so as to obtain audio information corresponding to the sound signal.
The audio information, i.e. audio signal, is a frequency and amplitude variation information carrier with regular sound waves of voice, music and sound effects. Typically an electrical signal that can be received by an audio device, such as a sound box, and then played back.
Sound signals generally refer to the carrier of sound, i.e. sound waves. The voice signals are preprocessed through the preset software, so that the audio information corresponding to the voice signals can be obtained, the voice signals are converted into electric signals, the voice signals are conveniently processed, and the realizability of the gesture recognition scheme is improved.
S23: performing energy activation on the audio information;
The audio information contains the sound-wave frequency and amplitude. Because the amplitude of the sound signal produced by the tendon during a gesture is small, under ordinary ambient conditions the human ear can barely hear it; that is, the corresponding amplitude is very small.

To find the useful information in the audio more conveniently, energy activation must be performed on it, for example by amplifying the audio at a preset ratio, so that the information meeting the utilization condition can be found.

The audio information may thus be energy-activated by amplifying it at a preset ratio, or by applying other processing over a predetermined segment; those skilled in the art may also adopt other ways of performing energy activation, which is not limited herein.
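As a minimal sketch of the amplification variant in Python, assuming a simple constant preset gain (the function name and the gain value are illustrative, not disclosed in the application):

```python
import numpy as np

def energy_activate(audio, gain=20.0):
    """Amplify the audio at a preset ratio so the faint tendon components
    become usable; the gain of 20 is a hypothetical preset proportion."""
    return np.asarray(audio, dtype=float) * gain
```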
S24: and performing feature extraction on the activated audio information to obtain features corresponding to the vibration information.
After activation, the features corresponding to the vibration information are more salient in the audio, so they can be conveniently extracted from the activated audio information to obtain the features corresponding to the vibration information.

Specifically, the features here may include the audio frequency and amplitude of the audio information, or its mean and variance, which is not specifically limited herein. The extraction process may differ with the feature chosen: frequency features are extracted from the frequency content and amplitude features from the amplitude, which facilitates subsequent confirmation of the gesture category.
Further, the sound signal is preprocessed with preset software; referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of step S22 in fig. 4, which includes the following steps:
S31: filtering the sound signal by using a low-pass filter to obtain audio information;
since the vibration information acquired by the sensor is likely to be noisy due to the surrounding environment, the sound signal may be further filtered by a low-pass filter, which is an electronic filtering device that allows signals below a cutoff frequency to pass but does not allow signals above the cutoff frequency to pass.
Because the vibration produced by the tendon during a gesture is small while the noise produced by the surrounding environment is large, the disparity in signal strength makes the audio amplitude of the tendon's sound signal too small to resolve; in post-processing it is usually difficult to identify components with such small, indistinct amplitudes.

Therefore, the sound signal is filtered through the low-pass filter to obtain denoised audio information, which facilitates the practicability of the subsequent steps.
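A minimal low-pass filtering sketch in Python follows. The Butterworth design, the fourth order, the 1 kHz cutoff and the SciPy implementation are all assumptions for illustration; the application states only that a low-pass filter with a cutoff frequency is used.

```python
from scipy.signal import butter, filtfilt

def lowpass(signal, cutoff_hz=1000.0, fs=24000, order=4):
    """Low-pass filter the tendon sound signal to suppress high-frequency
    ambient noise. Design, order and cutoff are illustrative assumptions."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fs)
    return filtfilt(b, a, signal)  # zero-phase filtering, no time shift
```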
S32: framing the audio information to obtain a plurality of frames corresponding to the audio information;
The denoised audio information comprises a plurality of frames. Since different gestures produce different audio information and hence different frames, the audio information is divided into frames, and the frames are then processed.

Specifically, in the experiment of one embodiment, the sensor uses a 24 kHz sampling frequency, a frame shift of 510, and a frame length of 510. Each subject performed each gesture 40 times, and the audio for the 40 repetitions is divided into frames, each recording yielding about 500 frames.
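A sketch of this framing step, using the frame length and frame shift of 510 samples reported above (with shift equal to length, the frames do not overlap); the function name is illustrative:

```python
import numpy as np

def frame_signal(signal, frame_len=510, frame_shift=510):
    """Split a 1-D audio signal into frames of 510 samples with a shift of
    510 samples, as in the experiment above. Assumes len(signal) >= frame_len."""
    signal = np.asarray(signal)
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    return np.stack([signal[i * frame_shift : i * frame_shift + frame_len]
                     for i in range(n_frames)])
```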
S33: a Hamming window is used to process a plurality of frames, and frames greater than an energy threshold are selected.
The sound signal is a non-stationary time-varying signal whose generation is closely related to the movement of the sound-emitting organ. The state change speed of the sounding organ is much slower than the speed of sound vibration, so the sound signal can be considered as stable for a short time.
Since the actual sound signal is very long, it is generally not possible nor necessary to process very long data once. Generally, the solution is to take one piece of data at a time for analysis, and then take another piece of data for analysis.
To obtain useful frames more quickly, an energy threshold is set for screening the frames. A short segment of audio usually shows no obvious periodicity; applying a Hamming window to the frames gives the data a periodic shape, which makes it easier to select the frames whose energy exceeds the threshold.
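A sketch of the windowing and screening step follows; the energy-threshold value is a hypothetical tuning parameter, since the application does not disclose it:

```python
import numpy as np

def select_active_frames(frames, energy_threshold):
    """Apply a Hamming window to each frame and keep only the frames whose
    energy exceeds the threshold (a hypothetical tuning parameter)."""
    window = np.hamming(frames.shape[1])               # shape: (frame_len,)
    energies = np.sum((frames * window) ** 2, axis=1)  # per-frame energy
    return frames[energies > energy_threshold]
```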
Further, referring to fig. 6, fig. 6 is a schematic flowchart of an embodiment of step S23 in fig. 4, and the method specifically includes the following steps:
S41: performing energy activation on a plurality of frames;
As noted above, the sensor uses a 24 kHz sampling frequency, a frame shift of 510, and a frame length of 510; each subject performed each gesture 40 times, and the audio for the 40 repetitions is divided into about 500 frames per recording.

Specifically, energy activation is performed on these frames: in this experiment, on the one hand, the roughly 500 frames are energy-activated, with 2000 frames corresponding to the 40 actions; on the other hand, since there are different subjects and different gestures, the plurality of frames here may also denote the audio frames corresponding to different gestures, or to different subjects.
S42: performing wavelet transformation processing on the frames after the multiple energy activations to obtain multiple wavelet coefficient characteristics;
A wavelet is a wave whose energy is highly concentrated in the time domain: its energy is finite, concentrated near a single point, and its integral is zero. The Fourier transform decomposes a sound signal into sine waves of various frequencies; likewise, the wavelet transform decomposes a signal into a set of wavelets obtained by shifting and scaling a mother wavelet.

Specifically, the wavelet transform proceeds as follows. Step 1: compare the wavelet w(t) with the beginning of the original function f(t) and compute a coefficient C, which represents the similarity between that portion of the function and the wavelet. Step 2: shift the wavelet right by k units to obtain w(t-k) and repeat step 1, continuing until the end of f. Step 3: dilate the wavelet w(t) to obtain w(t/2) and repeat steps 1 and 2. Step 4: continue dilating the wavelet, repeating steps 1, 2 and 3.

Therefore, performing the wavelet transform on the energy-activated frames yields a plurality of wavelet coefficient features, for example a 31111-dimensional wavelet coefficient feature, or the wavelet coefficient corresponding to each frequency at each time point. These coefficients are used as the input, and a support vector machine is selected as the classification algorithm.
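A sketch of the wavelet feature extraction is given below; the 'db4' mother wavelet, the full-depth decomposition and the PyWavelets library are assumptions, as the application reports only that wavelet coefficients (31111 dimensions in the experiment) serve as the SVM input:

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_features(frame, wavelet="db4"):
    """Wavelet-decompose one energy-activated frame and flatten all
    coefficients into a single feature vector; wavelet choice is assumed."""
    coeffs = pywt.wavedec(frame, wavelet)  # list of coefficient arrays
    return np.concatenate(coeffs)          # one flat feature vector
```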
S43: and inputting the wavelet coefficient characteristics into a training set according to a preset proportion to obtain activated audio information.
The gesture recognition method sets a preset proportion for feeding the obtained wavelet coefficient features into training, which on the one hand eliminates unnecessary wavelet coefficient features and on the other hand screens out the useful ones.

Therefore, the wavelet coefficient features are assigned to the training set according to the preset proportion. For example, with a preset proportion of 70%, 70% of the obtained 31111-dimensional wavelet coefficient features are used as the SVM training set, and the activated audio information is obtained.
Of course, the preset ratio may also be 65%, 75%, 80% or other data, and is specifically selected and set according to actual requirements, which is not limited herein.
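A sketch of the 70/30 split and SVM training described above, using scikit-learn; the RBF kernel, the stratified split and the fixed random seed are assumptions:

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def train_gesture_svm(features, labels):
    """Split the wavelet-coefficient features 70/30 and train an SVM
    classifier; kernel and split details are illustrative assumptions."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, train_size=0.7, stratify=labels, random_state=0)
    clf = SVC(kernel="rbf").fit(X_train, y_train)
    return clf, clf.score(X_test, y_test)  # classifier and test accuracy
```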
Further, a Hamming window is used to process the plurality of frames and select the frames greater than the energy threshold; please refer to fig. 7, a flowchart illustrating an embodiment of step S33 in fig. 5, which includes the following steps:
S51: judging whether the energy value of the frame is greater than an energy threshold value by utilizing a Hamming window;
The Hamming window is in fact a function, used mainly against the fence effect caused by truncation. The fence effect refers to the phenomenon whereby, when the spectrum computed from the audio is restricted to integer multiples of the fundamental frequency, output can be observed only at the corresponding discrete points.

When a Hamming window is applied, only the data in the middle of the window is represented and the information at both ends is lost; however, as the window moves, for example by 1/3 or 1/2 of its length, the data of the previous frame or two lost to the fence effect is represented again. The effective energy of each frame can therefore be judged, for example by using the Hamming window to decide whether a frame's energy value exceeds the energy threshold.

If it does, the process proceeds to step S52: the frame is determined to exceed the energy threshold and is selected. If not, the process proceeds to step S53: the frame is discarded, the next frame is selected, and the process returns to step S51 for the next decision.
Further, the vibration information is classified based on the features to obtain the gesture category corresponding to the tendon; please refer to fig. 8, a flowchart illustrating an embodiment of step S13 in fig. 3, which includes the following steps:
S61: matching the characteristics corresponding to the vibration information with the characteristics corresponding to the gestures in a preset gesture library;
Features corresponding to each gesture category are pre-stored in a preset gesture library. To quickly identify the gesture category corresponding to the vibration information, the features extracted from the vibration information can be matched against the features of the gestures in the library, comparing them for consistency and thereby distinguishing the gesture category corresponding to the vibration information.
S62: judging whether the characteristics corresponding to the vibration information are successfully matched with the characteristics corresponding to the gestures in the preset gesture library or not;
This judgment establishes whether the features corresponding to the vibration information are successfully matched with the features of a gesture in the preset gesture library.

If matching succeeds, step S63 is performed: the gesture corresponding to the vibration information is determined to be the preset gesture category, and specifically a voice prompt or a pop-up prompt may be given. If matching fails, the process goes to step S64: a pop-up warning is displayed. Of course, this is only one feedback mechanism, and other ways of prompting are possible, which is not limited herein.
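A sketch of the matching step follows; the Euclidean distance metric, the tolerance parameter and the gesture_library structure are illustrative assumptions, since the application only requires comparing features for consistency. Returning None corresponds to the failed-match warning:

```python
import numpy as np

def match_gesture(feature, gesture_library, tolerance):
    """Match an extracted feature against pre-stored gesture templates.
    gesture_library maps a label such as 'HS', 'FR' or 'FM' to a stored
    feature vector; distance metric and tolerance are assumptions."""
    best_label, best_dist = None, float("inf")
    for label, template in gesture_library.items():
        dist = np.linalg.norm(np.asarray(feature) - np.asarray(template))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= tolerance else None
```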
Furthermore, to measure the accuracy of the gesture recognition method of the embodiment of the present application, a series of experiments was performed. Please refer to fig. 9, a schematic diagram of the experimental results, in which the ordinate is the accuracy and the abscissa is the subject number.
The experiment collected data from 5 subjects aged between 23 and 27; each subject performed each gesture 40 times, and each subject constituted one group of experiments (the first subject the first group, the second the second group, and so on through the fifth).

The audio for the 40 actions is divided into frames, each recording yielding about 500 frames. The frames are then wavelet-transformed, with 70% used as the training set and the remaining 30% as the test set. The sensor acquires signals at a 24 kHz sampling frequency with a frame shift of 510, a frame length of 510, and a Hamming window. The 31111-dimensional features extracted by the wavelet transform are used as the SVM input. The test results are shown in fig. 9: the highest accuracy, 95.16%, was achieved by the third subject, and the average accuracy across the 5 subjects was 92.332%.
It can be seen that the accuracy of gesture recognition averages 92.332% over multiple measurements. In the first set of experiments, the accuracy of gesture recognition was 93.96%. In the second set of experiments, the accuracy of gesture recognition was 90.87%. In the third set of experiments, the accuracy of gesture recognition was 95.16%. In the fourth set of experiments, the accuracy of gesture recognition was 90.72%. In the fifth set of experiments, the accuracy of gesture recognition was 90.95%.
Therefore, the sound produced at the tendons during hand activity is collected and used for gesture classification, with the following advantages. 1) Compared with schemes requiring many body-contact sensors, this scheme needs only a single sensor, effectively avoiding problems such as wearer discomfort caused by too many sensors. 2) Compared with vision-based schemes, it is easy to develop; as a biological sensing signal, it is acquired with a faster response than vision, so gestures are recognized more quickly. 3) A bone-conduction sensor is used to collect the sound signal, which effectively avoids the influence of environmental noise on the acquired signal.
Further, please refer to fig. 10, fig. 10 is a schematic block diagram of an embodiment of a gesture recognition apparatus according to the present application. A second aspect of the embodiment of the present application provides a gesture recognition apparatus 4, which includes a processor 41 and a memory 42, where the memory 42 stores a computer program 421, and the processor 41 is configured to execute the computer program 421 to implement the recognition method according to the first aspect of the embodiment of the present application, which is not described herein again.
Further, please refer to fig. 11, fig. 11 is a schematic block diagram of an embodiment of the gesture recognition system of the present application. A third aspect of the embodiments of the present application further provides a gesture recognition system 5, where the gesture recognition system 5 includes: a sensor 51 configured to be fixed to a tendon portion of a hand, for acquiring vibration information of the tendon portion; the processing device 52 is used for processing the vibration information to obtain the characteristics corresponding to the vibration information; an extraction device 53 for extracting a feature corresponding to the vibration information; the gesture recognition apparatus 54, the connection sensor 51, the processing device 52, and the extraction device 53 are configured to perform the recognition method according to the first aspect of the embodiment of the present application, and are not described herein again.
Referring to fig. 12, fig. 12 is a schematic block diagram of an embodiment of a computer-readable storage medium of the present application. The fourth aspect of the embodiments of the present application also provides a computer-readable storage medium: if implemented in the form of software functional units and sold or used as a stand-alone product, the solution can be stored in the computer-readable storage medium 60. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied wholly or partly as a software product stored in a storage device, including instructions (computer program 61) for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage device includes various media that can store program code, such as a USB flash drive, a portable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, as well as electronic devices equipped with such storage media, such as computers, mobile phones, notebook computers, tablet computers and cameras.
For the execution process of the computer program in the computer-readable storage medium, reference may be made to the above embodiments of the gesture recognition method of the present application, which will not be described again here.
The above description is only a part of the embodiments of the present application, and not intended to limit the scope of the present application, and all equivalent devices or equivalent processes performed by the content of the present application and the attached drawings, or directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. A method of gesture recognition, the method comprising:
acquiring vibration information on the tendon;
processing the vibration information to obtain the characteristics corresponding to the vibration information;
and classifying the vibration information based on the characteristics to obtain the gesture category corresponding to the tendon.
2. The method of claim 1,
the processing the vibration information includes:
converting the vibration information to obtain a sound signal of the tendon;
preprocessing the sound signal by adopting preset software to obtain audio information corresponding to the sound signal;
energy activating the audio information;
and performing feature extraction on the activated audio information to obtain features corresponding to the vibration information.
3. The method of claim 2,
the preprocessing of the sound signal by adopting preset software comprises the following steps:
filtering the sound signal by using a low-pass filter to obtain the audio information;
framing the audio information to obtain a plurality of frames corresponding to the audio information;
a Hamming window is used to process a plurality of the frames, and the frames larger than the energy threshold are selected.
4. The method of claim 3,
the energy activating the audio information comprises:
activating energy for a plurality of the frames;
performing wavelet transformation processing on the frames after the energy activation to obtain a plurality of wavelet coefficient characteristics;
and inputting the wavelet coefficient characteristics into a training set according to a preset proportion to obtain the activated audio information.
5. The method of claim 3,
processing a plurality of the frames with a hamming window, selecting frames greater than an energy threshold, comprising:
judging whether the energy value of the frame is greater than an energy threshold value by using a Hamming window;
if so, determining to select the frame larger than the energy threshold.
6. The method of claim 1,
classifying the vibration information based on the characteristics to obtain a gesture category corresponding to the tendon, including:
matching the characteristics corresponding to the vibration information with the characteristics corresponding to the gestures in a preset gesture library;
judging whether the characteristics corresponding to the vibration information are successfully matched with the characteristics corresponding to the gestures in a preset gesture library or not;
and if the matching is successful, determining that the gesture corresponding to the vibration information is a preset gesture type.
7. The method of claim 1,
the acquiring of vibration information on the tendon comprises:
and measuring the vibration of the tendon through a sensor to obtain the vibration information.
8. A gesture recognition apparatus, characterized in that the gesture recognition apparatus comprises a processor and a memory, in which a computer program is stored, the processor being adapted to execute the computer program to implement the processing method according to any of claims 1-7.
9. A gesture recognition system, the gesture recognition system comprising:
a sensor configured to be fixed to a tendon portion of a hand, for acquiring vibration information of the tendon portion;
the processing equipment is used for processing the vibration information to obtain the characteristics corresponding to the vibration information;
the extraction device is used for extracting the characteristics corresponding to the vibration information;
a gesture recognition apparatus connected to the sensor, the processing device and the extraction device for performing the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202110786284.1A 2021-07-12 2021-07-12 Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium Pending CN113703568A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110786284.1A CN113703568A (en) 2021-07-12 2021-07-12 Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110786284.1A CN113703568A (en) 2021-07-12 2021-07-12 Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium

Publications (1)

Publication Number Publication Date
CN113703568A (en) 2021-11-26

Family

ID=78648480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110786284.1A Pending CN113703568A (en) 2021-07-12 2021-07-12 Gesture recognition method, gesture recognition device, gesture recognition system, and storage medium

Country Status (1)

Country Link
CN (1) CN113703568A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150370326A1 (en) * 2014-06-23 2015-12-24 Thalmic Labs Inc. Systems, articles, and methods for wearable human-electronics interface devices
EP3193317A1 (en) * 2016-01-15 2017-07-19 Thomson Licensing Activity classification from audio
US20170215768A1 (en) * 2016-02-03 2017-08-03 Flicktek Ltd. Wearable controller for wrist
CN111103976A (en) * 2019-12-05 2020-05-05 深圳职业技术学院 Gesture recognition method and device and electronic equipment
WO2020186477A1 (en) * 2019-03-20 2020-09-24 深圳大学 Intelligent input method and system based on bone conduction



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination