CN111914724B - Continuous Chinese sign language identification method and system based on sliding window segmentation - Google Patents

Info

Publication number
CN111914724B
Authority
CN
China
Prior art keywords
sign language
data
segmentation
sliding window
continuous
Prior art date
Legal status
Active
Application number
CN202010734304.6A
Other languages
Chinese (zh)
Other versions
CN111914724A (en)
Inventor
王青山
王鑫炎
马晓迪
郑志文
朱钰
张江涛
王琦
Current Assignee
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202010734304.6A
Publication of CN111914724A
Application granted
Publication of CN111914724B

Classifications

    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods
    • G06V10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a continuous Chinese sign language identification method based on sliding window segmentation, which comprises the following steps: S1: collecting sEMG and IMU data of the arm through an arm ring; S2: preprocessing the data acquired in step S1; S3: performing feature extraction on the preprocessed data by using a sliding window, which comprises dividing continuous sign language into single sign language words through the sliding window, and performing average segmentation and recombination on each piece of divided data to obtain a plurality of new data; S4: inputting the obtained new data into an LSTM neural network for training to obtain sign language word predicted values; S5: judging and analyzing the plurality of recognized sign language word predicted values by applying a threshold-based multi-voting strategy to obtain an identification result. A continuous Chinese sign language recognition system based on sliding window segmentation is also disclosed. The average accuracy of the sign language recognition system provided by the invention reaches 83.8%, 18.6 percentage points higher than that of an LSTM model.

Description

Continuous Chinese sign language identification method and system based on sliding window segmentation
Technical Field
The invention relates to the technical field of sign language recognition, in particular to a continuous Chinese sign language recognition method and system based on sliding window segmentation.
Background
Communication is a fundamental need of all humans, and people with hearing impairments (the deaf-mute) are no exception. Sign language is commonly used for communication between the deaf-mute and hearing people. Natural sign language is convenient for communication among the deaf-mute, while communication between the deaf-mute and hearing people relies more on grammatical sign language (referred to simply as sign language). However, hearing people who cannot read sign language encounter great obstacles when communicating with the deaf-mute. Sign language recognition (Sign Language Recognition, SLR) systems therefore play a very important role in communication between the deaf-mute and hearing people.
The current mainstream sign language recognition methods can be divided into computer-vision-based methods and sensor-based methods. Koller et al. propose improving label-to-image alignment under weak supervision by embedding convolutional neural networks (Convolutional Neural Network, CNN) into hidden Markov models (Hidden Markov Model, HMM) to correct frame labels, improving accuracy by 10%. Cui et al. propose a weakly supervised framework with deep neural networks, which solves the problem of mapping video clips to glosses by introducing a convolutional neural network for spatio-temporal feature extraction and sequence learning; without additional supervision, the method achieved results comparable to the state of the art at the time. Yang et al. built a continuous SLR model based on fast HMMs, which embeds HMMs into a level building algorithm based on dynamic time warping (Dynamic Time Warping based Level Building, LB-DTW) to improve sentence recognition accuracy; they also propose a coarse segmentation method to provide the maximum number of levels, and use grammar constraints and sign length constraints to reduce insertion, deletion and substitution errors, achieving higher recognition performance and lower computational cost than other prior art. Huang et al. propose a video-based sign language recognition method that uses a new two-stream 3D-CNN to extract global-local spatio-temporal features from sign videos; experiments on the RWTH-PHOENIX-Weather dataset, which contains 7000 weather-forecast sentences from 9 signers, and on a publicly available dataset acquired with Kinect gave recognition accuracies of 61.7% and 82.7%, respectively. Guo et al. propose a hierarchical LSTM for sign language translation, addressing the difficulty that conventional HMMs and CTC (Connectionist Temporal Classification) may fail to resolve the confusable word order corresponding to the visual content of sentences during recognition. Another work proposes a continuous sign language recognition method consisting of offline training and online recognition: it uses a threshold matrix and a rate threshold to detect transitional motion in the offline training stage, and uses coarse segmentation based on the threshold matrix together with fine segmentation based on DTW and a Length-Root method to determine the endpoints of each candidate sign in the online recognition stage; the method resolves some of the errors caused by gesture transition motion, and its validity was verified on Kinect-based datasets. However, the accuracy of computer-vision-based sign language recognition methods is affected by factors such as illumination changes and clothing occlusion.
Another category of methods is sensor-based, using devices such as data gloves, armbands and smart watches. Li et al. use HMM techniques to provide a continuous SLR model framework that is easy to extend, strongly resistant to interference and easy to port, alleviating part of the scalability problem in continuous SLR systems; tested on a vocabulary of 510 words and 1024 test sentences collected with data gloves from five signers, the word accuracy reaches 87.4%. Bukhari et al. designed a real-time ASL recognition glove equipped with a series of sensors, which uses a principal component analysis (Principal Component Analysis, PCA) algorithm to recognize a 23-class dataset, yielding 92% accuracy. Wearing data gloves allows the motion data of hand movements to be read accurately, enabling high-accuracy recognition, but lacks portability.
Sensor-based sign language recognition methods mainly adopt devices such as data gloves, smart watches and armbands to collect sign language gesture data, build a model, and recognize the sign language. Gao et al. propose a SOFM/SRN/HMM model that employs a modified simple recurrent network (Simple Recurrent Network, SRN), segments continuous sign language according to the transformed SOFM representation, uses the SRN output as HMM states, and searches for the best matching word sequence with the Viterbi algorithm; the system achieves 82.9% recognition accuracy on a 5113-sign vocabulary collected with data gloves, and 86.3% accuracy for signer-independent continuous sign language recognition. Benalcazar et al. collect surface electromyography (surface Electromyography, sEMG) signals with an armband sensor and propose a new real-time gesture recognition model based on the k-Nearest Neighbor (KNN) and DTW algorithms, recognizing five gesture signals with 86% accuracy. Yang et al. propose continuous SLR based on an optimized tree-structure framework combining sEMG, accelerometer (ACC) and gyroscope (GYRO) sensors; the algorithm classifies continuous Chinese sign language (Chinese Sign Language, CSL) subwords using an optimized tree structure based on the direction and magnitude components of one or both hands, and experimental results on their 150-subword dataset show an accuracy of 94.31% in user-specific tests and 87.02% in user-independent tests. Engin Kaya et al. propose a new gesture recognition method that uses sEMG signals collected from an information-acquisition armband and extracts seven different time-domain features from the raw EMG signals with a sliding-window method; comparing KNN, support vector machines and artificial neural networks, they found that the system based on the support vector machine classifier achieves the highest accuracy. Bang et al. propose new sEMG and ACC signal acquisition positions, using sEMG and ACC to acquire hand motion signals from the right forearm, wrist and back of the hand, thereby obtaining 18 CSL features; the method divides the sEMG and ACC data with a sliding window, extracts features, combines them into feature vectors, and classifies with linear discriminant analysis (Linear Discriminant Analysis, LDA), with experiments showing a recognition accuracy of 91.4%. Yi et al. apply deep belief networks (Deep Belief Networks, DBN) to wearable-sensor-based Chinese sign language recognition; to obtain the best network structure, three different sensor fusion strategies were explored, including data fusion, feature fusion and decision fusion, with a best recognition accuracy of 95.1% for user-dependent tests and 88.2% for user-independent tests.
However, there is less work on using sensors for continuous sign language recognition. Mittal et al. propose an improved Long Short-Term Memory (LSTM) model, tested on 942 Indian sign language sentences built from 35 different sign words, with an average accuracy of 72.3%. Gupta et al. propose ensembles of classifiers based on features extracted from windows of different lengths, and realize real-time classification of continuous sign language sentences with multi-modal wearable sensors to improve the accuracy of continuous sign language recognition; this study shows that the proposed ensemble method classifies sentences more accurately than a single classifier learned from features extracted with a fixed-duration window.
According to the inventors' research, no existing work uses an armband for continuous Chinese sign language sentence recognition [11]. This problem currently faces two main challenges: how to segment sign language words accurately, and how to improve the accuracy of word recognition given the overlapping deformation between adjacent sign language words in continuous sign language. In view of these two challenges, it is desirable to provide a new sign language recognition system to solve the above problems.
Disclosure of Invention
The invention aims to solve the technical problem of providing a continuous Chinese sign language recognition method and a system based on sliding window segmentation, which can remarkably improve the recognition accuracy of continuous Chinese sign language.
In order to solve the technical problems, the invention adopts a technical scheme that: the continuous Chinese sign language identification method based on sliding window segmentation comprises the following steps:
s1: collecting sEMG and IMU data of the arm through an arm ring;
s2: preprocessing the data acquired in the step S1;
s3: performing feature extraction on the preprocessed data by using a sliding window, which comprises dividing continuous sign language into single sign language words through the sliding window, and performing average segmentation and recombination on each piece of divided data to obtain a plurality of new data;
s4: inputting the new data obtained in the step S3 into an LSTM neural network for training to obtain a sign language word predicted value;
s5: judging and analyzing the plurality of recognized sign language word predicted values by applying a threshold-based multi-voting strategy to obtain a recognition result.
In a preferred embodiment of the present invention, in step S1, when the sEMG and IMU data of the arm are collected, the arm ring is worn on the forearm, and the sEMG sensor provided on the arm ring is located at the front end of the forearm, aligned with the middle finger direction.
In a preferred embodiment of the present invention, in step S2, the preprocessing process includes:
s201: screening out the valid information in the signals by using data normalization and by regularizing the start and end times of the signals;
s202: filtering the screened valid information through a Butterworth filter.
In a preferred embodiment of the present invention, in step S3, the method for dividing the continuous sign language into single sign language words through the sliding window is as follows:
the sliding length of the sliding window is selected based on the average length of single sign language of the words, and one gesture signal is equally divided into a plurality of groups of single sign language words in a unit of one gesture second.
In a preferred embodiment of the present invention, in step S3, the step of performing average segmentation and reassembly on each of the divided data includes:
A sign language word is divided into n groups of data on average; one group of data is deleted in turn, and the remaining n-1 groups are combined into a new piece of data in the original order as input, so that one divided gesture generates n different pieces of data as n different inputs.
In a preferred embodiment of the present invention, in step S4, the LSTM neural network is composed of two fully connected layers, two LSTM models and one fully connected layer, the fully connected layers include a first fully connected layer of 512 neurons and a second fully connected layer of 256 neurons, the LSTM models are bidirectional LSTMs, and each LSTM includes 256 units.
In a preferred embodiment of the present invention, the specific steps of step S5 include:
S501: for the data in each sliding window, denote the n sign language word predicted values obtained through LSTM neural network recognition as s_1, s_2, ..., s_i, ..., s_n, with corresponding probabilities p_1, p_2, ..., p_i, ..., p_n;
S502: set the threshold as D; when the probability p_i ≥ D, s_i is counted as a valid vote; when p_i < D, s_i is counted as an invalid vote;
S503: let the number of valid votes be c_all, then:
(1) if the vote count of a single result is greater than half of the valid votes, c_all/2, that result is taken as the output result of the window;
(2) if two results have equal vote counts and each equals c_all/2, the result with the highest p_i is taken as the output result of the window;
(3) if neither case (1) nor case (2) holds, the window is regarded as having no valid output information.
In order to solve the technical problems, the invention adopts another technical scheme that: provided is a continuous Chinese sign language recognition system based on sliding window segmentation, comprising:
the data acquisition module is used for collecting sEMG and IMU data of the arm through the arm ring;
the data preprocessing module is used for preprocessing sEMG and IMU data of the arm acquired by the data acquisition module;
the data segmentation and recombination module is used for extracting characteristics of the data preprocessed by the data preprocessing module, and the data segmentation and recombination module comprises the steps of dividing single sign language words of continuous sign language through a sliding window, and carrying out average segmentation and recombination on each divided data to obtain a plurality of new data;
the LSTM-based neural network structure is used for training the new data obtained by the data segmentation and recombination module to obtain a plurality of sign word predicted values;
and the threshold-based multi-voting decision module is used for judging and analyzing a plurality of sign language word predicted values recognized by the LSTM-based neural network structure by applying a threshold-based multi-voting strategy to obtain a recognition result.
In a preferred embodiment of the present invention, the data preprocessing module includes an information screening unit and a filtering unit;
the information screening unit is used for screening effective information in the signals by utilizing the data normalization and the start time and the end time of the regular signals;
and the filtering unit is used for filtering the screened effective information by adopting a Butterworth filter.
In a preferred embodiment of the present invention, the data segmentation and reorganization module includes a continuous sign language segmentation unit and a single sign language segmentation reorganization unit;
the continuous sign language segmentation unit is used for dividing single sign language words of continuous sign language through a sliding window;
the single sign language segmentation and recombination unit is used for carrying out average segmentation and recombination on the data divided by each continuous sign language segmentation unit to obtain a plurality of new data.
The beneficial effects of the invention are as follows:
(1) The invention uses sign language data collected by the arm ring to recognize sign language sentences, and uses a sliding window method based on the average word length to solve the problem of continuous sign language segmentation; the idea of segmentation is also used in single sign word recognition: a single sign word is obtained through sliding window division, and parts of it are taken for recognition, so that each single sign word can be recognized multiple times, improving the recognition accuracy;
(2) The average accuracy of the continuous Chinese sign language recognition system based on sliding window segmentation provided by the invention reaches 83.8%, which is 18.6 percentage points higher than that of an LSTM model (a method that directly uses an LSTM neural network after sliding-window feature extraction).
Drawings
FIG. 1 is a flow chart of the continuous Chinese sign language recognition method based on sliding window segmentation of the present invention;
FIG. 2 is a schematic illustration of how the arm ring is worn;
FIG. 3 is a schematic diagram of sentence recognition accuracy for different values of n;
FIG. 4 is a schematic diagram of the LSTM neural network;
FIG. 5 is a data histogram of sentence recognition results using the SSW system and the LSTM model, respectively;
FIG. 6 is a schematic diagram of the stability of the SSW system of the present invention;
FIG. 7 is a data histogram of SSW sentence recognition accuracy as a function of the number of segmentation groups n;
FIG. 8 is a block diagram of the continuous Chinese sign language recognition system based on sliding window segmentation.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the invention can be more easily understood by those skilled in the art and the scope of the invention is thereby clearly defined.
Referring to fig. 1, an embodiment of the present invention includes:
a continuous chinese sign language recognition method based on sliding window segmentation (Split Sliding Window, SSW), comprising the steps of:
s1: collecting sEMG and IMU data of the arm through an arm ring;
in this embodiment, the data acquisition device is a commercial Myo arm ring, the wearing mode is shown in fig. 2, the data acquisition device is worn on the forearm, and the sEMG sensor with badge is located at the front end of the forearm and aligned with the middle finger direction. It has 8 sEMG sensors and an IMU unit, together with 18 bits of data.
S2: preprocessing the data acquired in the step S1; the pretreatment process comprises the following steps:
s201: screening out the valid information in the signals by using data normalization and by regularizing the start and end times of the signals;
s202: filtering the screened valid information through a Butterworth filter.
Specifically, the signal of a whole sentence obtained through the arm ring is

E(x) = (E_1(x), E_2(x), ..., E_17(x), E_18(x)),  (1)

where E_i(x) (1 ≤ i ≤ 18) denotes the i-th signal channel transmitted from the arm ring.
The complete sentence collected by the arm ring contains a starting pause and an ending pause, so the valid information in the signal should be screened out to reduce computational cost and facilitate subsequent training. A sliding window and a sliding length are set, and the mean absolute value (Mean Absolute Value, MVA) within the current sliding window is calculated; if it is smaller than a certain empirical value, the sentence is judged to be starting or ending. In this example, the sEMG signal acquisition frequency is 200 Hz, with 50 data points as the sliding window size and 5 data points as the sliding length, and the MVA is calculated as

MVA = (1/50) Σ_{i=1}^{50} |x_i|,  (2)

where x_i denotes the i-th data point in the current window. From general experience, if the MVA is less than 10, the segment is treated as invalid information and deleted. The data is then converted into the [0,1] interval using normalization and absolute-value operations and denoted W(x):
W(x) = (W_1(x), W_2(x), ..., W_17(x), W_18(x)),  (3)

where W_i(x) (1 ≤ i ≤ 18) denotes E_i(x) after normalization and taking absolute values.
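To make these screening and normalization steps concrete, the following is a minimal Python sketch; the function names, array shapes and NumPy-based implementation are assumptions for illustration, while the 50-point window, 5-point sliding length, empirical MVA threshold of 10 and the [0,1] normalization follow the description above.

```python
import numpy as np

def screen_valid(E, win=50, step=5, threshold=10.0):
    # E: raw arm-ring signal of shape (T, 18). Keep only the segments whose
    # mean absolute value (MVA) within the sliding window reaches the
    # empirical threshold, discarding the start and end pauses.
    keep = np.zeros(len(E), dtype=bool)
    for start in range(0, len(E) - win + 1, step):
        if np.abs(E[start:start + win]).mean() >= threshold:  # MVA, Eq. (2)
            keep[start:start + win] = True
    return E[keep]

def normalize(E):
    # Take absolute values and rescale each channel into [0, 1],
    # producing W(x) as in Eq. (3).
    A = np.abs(E)
    rng = A.max(axis=0) - A.min(axis=0)
    return (A - A.min(axis=0)) / np.where(rng == 0, 1, rng)
```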
The sensor information acquired by the arm ring contains some noise in addition to the necessary data. Denoising is performed by filtering the input signal; the filtering must not remove important information, but should preserve partial differences between sensor signals, ensuring the robustness of the model and improving accuracy. The invention therefore filters with a Butterworth filter. The Butterworth filter is a digital filter whose frequency response is maximally flat in the passband, so a low-pass design can eliminate high-frequency noise interference. Having fewer parameters, the Butterworth filter is computationally cheaper than other filters. Filtering is performed with a third-order Butterworth filter with a cutoff frequency of 10 Hz, and the result is denoted A(x).
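The filtering step can be sketched with SciPy's standard Butterworth design; the third order, 10 Hz cutoff and 200 Hz sampling rate come from the text, while the zero-phase filtfilt call (rather than a causal filter) is an assumption of this sketch.

```python
from scipy.signal import butter, filtfilt

def butterworth_denoise(W, fs=200.0, cutoff=10.0, order=3):
    # W: normalized signal of shape (T, 18); returns the denoised A(x).
    b, a = butter(order, cutoff / (fs / 2), btype='low')
    return filtfilt(b, a, W, axis=0)  # low-pass filtering along the time axis
```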
S3: performing feature extraction on the preprocessed data by using a sliding window, which comprises dividing continuous sign language into single sign language words through the sliding window, and performing average segmentation and recombination on each piece of divided data to obtain a plurality of new data;
by observing the sign language, the average time of one sign language word can be obtained about one second, so that the data of A (x) is divided by taking one gesture as a unit of one second (200 data points). According to the study by Wahid et al, the present example selects the sliding window to have a sliding length of 40 (coverage 80%). Dividing the data to obtain an 18-dimensional data.
After the single sign language words are divided by the sliding window, each piece of divided data is further segmented. Assume that the data of a certain division is S = {s_1, s_2, ..., s_199, s_200}, where s_i = {s_i1, s_i2, ..., s_i17, s_i18} (1 ≤ i ≤ 200). The signal S is equally divided into n groups, denoted S_1, S_2, ..., S_{n-1}, S_n. One group of data is deleted in turn, and the remaining n-1 groups are combined into a new piece of data in the original order as input, so that one divided gesture can generate n different pieces of data as n different inputs.
In this embodiment, ten volunteers were invited to collect 100 sign language words and 20 different sign language sentences under interference-free conditions. After a sign language sentence is divided by the sliding window, each sign language word is divided into n groups of data on average; one group is deleted in turn and the remaining n-1 groups are combined into new data in the original order. Inputting these new data into the LSTM neural network gives the experimental results for different numbers of segmentation groups n, shown in fig. 3. As can be seen from fig. 3, when n ≥ 2 the accuracy rises quickly and exceeds that obtained without this preprocessing; when n ≥ 6 it is basically stable, but the cost of running the algorithm increases greatly. Finally n = 5 is taken, which improves the accuracy while preventing excessive overhead during operation.
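The window division and the n-group recombination can be sketched as follows; the helper names are assumptions, while the 200-point word window, the sliding length of 40 and n = 5 groups (so each recombined input has shape 160×18) follow the text.

```python
import numpy as np

def sliding_windows(A, win=200, stride=40):
    # Divide the preprocessed sentence A (shape (T, 18)) into candidate
    # one-second word windows (80% coverage between adjacent windows).
    return [A[s:s + win] for s in range(0, len(A) - win + 1, stride)]

def split_recombine(S, n=5):
    # S: one word window of shape (200, 18). Split it into n equal groups,
    # delete one group in turn, and concatenate the remaining n-1 groups in
    # the original order, yielding n variants of shape (160, 18).
    groups = np.array_split(S, n, axis=0)
    return [np.concatenate(groups[:i] + groups[i + 1:], axis=0)
            for i in range(n)]
```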
S4: inputting the new data obtained in the step S3 into an LSTM neural network for training to obtain a sign language word predicted value;
the neural network is composed of two fully connected layers, two LSTM models, and one fully connected layer, as shown in fig. 4. After the data processing, the original matrix of 200×18 is changed into a matrix of 160×18 for input. Input data first passes through the fully connected layers of 512 neurons and the fully connected layers of 256 neurons. The output of the two fully connected layers serves as the input to the bi-directional LSTMs, each layer LSTM containing 256 cells. And then, inputting the output to the full-connection layer, and judging to obtain a predicted value. The predicted value is a predicted result of gesture information of the data obtained by inputting the data subjected to segmentation and combination into a neural network.
S5: judging and analyzing the plurality of recognized sign language word predicted values by applying a threshold-based multi-voting strategy to obtain a recognition result.
For the 18-dimensional data in each sliding window, five results are obtained through neural network recognition (n = 5 in this embodiment); denote the results s_1, s_2, s_3, s_4, s_5 with corresponding probabilities p_1, p_2, p_3, p_4, p_5. Set the threshold as D: when the probability p_i ≥ D, s_i is counted as a valid vote; when p_i < D, s_i is counted as an invalid vote. Let the number of valid votes be c_all. The specific operation is as follows:
(1) If the vote count of a single result is greater than half of the valid votes, c_all/2, that result is taken as the output result of the window.
(2) If two results have equal vote counts and each equals c_all/2, the result with the highest p_i is taken as the output result of the window.
(3) If neither case exists, the window is regarded as having no valid output information.
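Voting rules (1)-(3) can be sketched as follows; the function signature and the example value of the threshold D are assumptions, since the text leaves the concrete value open.

```python
from collections import Counter

def threshold_vote(labels, probs, D=0.5):
    # labels: the n predicted sign words for one window; probs: their
    # probabilities. Returns the window's output word, or None when the
    # window has no valid output (case (3)).
    valid = [(s, p) for s, p in zip(labels, probs) if p >= D]
    c_all = len(valid)
    if c_all == 0:
        return None
    counts = Counter(s for s, _ in valid)
    top, top_count = counts.most_common(1)[0]
    if top_count > c_all / 2:                       # case (1): majority
        return top
    tied = [s for s, c in counts.items() if c == c_all / 2]
    if len(tied) == 2:                              # case (2): two-way tie
        return max((sp for sp in valid if sp[0] in tied),
                   key=lambda sp: sp[1])[0]
    return None                                     # case (3)
```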
Based on the method, the invention also provides a continuous Chinese sign language recognition system based on sliding window segmentation, referring to fig. 8, comprising the following modules:
the data acquisition module is used for collecting sEMG and IMU data of the arm through the arm ring;
the data preprocessing module is used for preprocessing sEMG and IMU data of the arm acquired by the data acquisition module;
the data segmentation and recombination module is used for extracting characteristics of the data preprocessed by the data preprocessing module, and the data segmentation and recombination module comprises the steps of dividing single sign language words of continuous sign language through a sliding window, and carrying out average segmentation and recombination on each divided data to obtain a plurality of new data;
the LSTM-based neural network structure is used for training the new data obtained by the data segmentation and recombination module to obtain a plurality of sign word predicted values;
and the threshold-based multi-voting decision module is used for judging and analyzing a plurality of sign language word predicted values recognized by the LSTM-based neural network structure by applying a threshold-based multi-voting strategy to obtain a recognition result.
In a preferred embodiment of the present invention, the data preprocessing module includes an information screening unit and a filtering unit;
the information screening unit is used for screening effective information in the signals by utilizing the data normalization and the start time and the end time of the regular signals;
and the filtering unit is used for filtering the screened effective information by adopting a Butterworth filter.
In a preferred embodiment of the present invention, the data segmentation and reorganization module includes a continuous sign language segmentation unit and a single sign language segmentation reorganization unit;
the continuous sign language segmentation unit is used for dividing single sign language words of continuous sign language through a sliding window;
the single sign language segmentation and recombination unit is used for carrying out average segmentation and recombination on the data divided by each continuous sign language segmentation unit to obtain a plurality of new data.
The invention evaluates the performance of the proposed sign language sentence recognition system through experiments. The experiments use the Myo arm ring as the information collection device. The arm ring has 8 sEMG sensors, 1 triaxial accelerometer and 1 triaxial gyroscope; it can collect arm electromyographic signals at a frequency of 200 Hz and send them to a computer. The computer is equipped with an Intel Core i7-8700 processor, 16GB of memory, an Nvidia GTX 1080 graphics card with 8GB of video memory, and the Windows 10 operating system.
In the present embodiment, the sentence recognition accuracy is defined as [13]

Accuracy = (N − S − I − D) / N × 100%,  (4)

where S, I and D represent the number of substituted words, inserted words and deleted words, respectively, and N represents the number of sign language words in the tested continuous sign language.
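As a quick illustration of the metric, it can be computed directly from the four counts; the helper name is an assumption.

```python
def sentence_accuracy(S, I, D, N):
    # Substitutions S, insertions I and deletions D against N reference words.
    return (N - S - I - D) / N * 100.0

print(sentence_accuracy(S=1, I=0, D=0, N=5))  # one substitution in five words -> 80.0
```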
(1) SSW to LSTM comparison
10 sign language sentences were randomly selected and recognized using the SSW system, compared against the method of directly using an LSTM neural network after sliding-window feature extraction; the experimental results are shown in fig. 5. As can be seen from fig. 5, the accuracy of SSW recognition is far higher than that of the LSTM model: the average accuracy of the SSW system is 83.8%, while that of the LSTM model is 65.2%. The reason is that the SSW system trains each sliding-window division section multiple times and applies the threshold-based multi-voting strategy, which reduces the influence of deviations such as gesture transitions and incomplete sign language actions in the gesture signal and improves the accuracy of word recognition within sentences.
(2) SSW stability
The purpose of the SSW system is to translate the sign language sentences of most deaf-mutes into text, so it should be able to accurately recognize the gestures of any deaf-mute. To verify the stability of the SSW system, 10 volunteers of the same age group whose gestures had not been collected were invited for sign language recognition in this embodiment; the experimental results are shown in fig. 6. As can be seen from fig. 6, the average accuracy for volunteers whose gestures were not collected is 79.84%, slightly lower than the 83.8% for volunteers whose gestures were collected, but still within an acceptable range.
(3) Influence of the number n of segmentation groups on the accuracy of SSW recognition
In this experiment, gesture information from 10 volunteers was collected using the arm ring: 100 gestures and 20 different sign language sentences in total, with 50 samples each. 3 sentences were randomly selected and the number of segmentation groups n was varied to observe the change in the average sign language accuracy of the SSW system; the experimental results are shown in fig. 7, which takes "I have a fever" as an example of how SSW sentence recognition accuracy changes with the number of segmentation groups n. As can be seen from the figure, when the number of segmentation groups n = 5, the accuracy improves to 85% compared with smaller group numbers; when the number of groups continues to increase, the computation grows while the recognition accuracy does not improve significantly, and for some sentences it even decreases.
The invention provides a continuous Chinese sign language recognition system, SSW, based on sliding window segmentation. The system uses the idea of division: it first uses a sliding window, based on the average length of a single sign language word, to divide continuous sign language; it then equally divides each gesture signal into n groups, takes out n-1 groups at a time, combines them into new data in the original order, and performs recognition multiple times, improving the sentence recognition rate. Meanwhile, SSW adopts a threshold-based multi-voting strategy to process the recognition results: a result is counted as a valid vote only if its recognition probability is greater than the threshold, which reduces the influence of strongly deviating parts of the gesture signal and makes the result more credible. SSW increases the computation to some extent but greatly improves the sign language accuracy. Gesture collection and testing on 10 volunteers show that the accuracy reaches 83.8%.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent structures or equivalent processes or direct or indirect application in other related technical fields are included in the scope of the present invention.

Claims (7)

1. A continuous Chinese sign language identification method based on sliding window segmentation comprises the following steps:
s1: collecting sEMG and IMU data of the arm through an arm ring;
s2: preprocessing the data acquired in the step S1;
s3: performing feature extraction on the preprocessed data by using a sliding window, which comprises dividing continuous sign language into single sign language words through the sliding window, and performing average segmentation and recombination on each piece of divided data to obtain a plurality of new data;
the method for dividing continuous sign language into single sign language words through the sliding window is as follows:
selecting the sliding length of the sliding window based on the average duration of a single sign language word, and dividing the continuous gesture signal into a plurality of single sign language words in units of one second per gesture;
the step of performing average segmentation and recombination on each piece of divided data comprises:
dividing a sign language word into n groups of data on average, deleting one group of data in turn, combining the remaining n-1 groups into a new piece of data in the original order as input, so that one divided gesture generates n different pieces of data as n different inputs;
s4: inputting the new data obtained in the step S3 into an LSTM neural network for training to obtain a sign language word predicted value;
s5: judging and analyzing the plurality of recognized sign language word predicted values by applying a threshold-based multi-voting strategy to obtain a recognition result.
2. The continuous Chinese sign language recognition method based on sliding window segmentation according to claim 1, wherein in step S1, when the sEMG and IMU data of the arm are collected, the arm ring is worn on the forearm, and the sEMG sensor provided on the arm ring is positioned at the front end of the forearm, aligned with the middle finger direction.
3. The continuous chinese sign language recognition method based on sliding window segmentation according to claim 1, wherein in step S2, the preprocessing process comprises:
s201: screening out the valid information in the signals by using data normalization and by regularizing the start and end times of the signals;
s202: filtering the screened valid information through a Butterworth filter.
4. The continuous chinese sign language recognition method based on sliding window segmentation as set forth in claim 1, wherein the specific step of step S5 comprises:
S501: for the data in each sliding window, denote the n sign language word predicted values obtained through LSTM neural network recognition as s_1, s_2, ..., s_i, ..., s_n, with corresponding probabilities p_1, p_2, ..., p_i, ..., p_n;
S502: set the threshold as D; when the probability p_i ≥ D, s_i is counted as a valid vote; when p_i < D, s_i is counted as an invalid vote;
S503: let the number of valid votes be c_all, then:
(1) if the vote count of a single result is greater than half of the valid votes, c_all/2, that result is taken as the output result of the window;
(2) if two results have equal vote counts and each equals c_all/2, the result with the highest p_i is taken as the output result of the window;
(3) if neither case (1) nor case (2) holds, the window is regarded as having no valid output information.
5. A continuous chinese sign language recognition system based on sliding window segmentation applying the recognition method according to any one of claims 1 to 4, comprising:
the data acquisition module is used for collecting sEMG and IMU data of the arm through the arm ring;
the data preprocessing module is used for preprocessing sEMG and IMU data of the arm acquired by the data acquisition module;
the data segmentation and recombination module is used for extracting characteristics of the data preprocessed by the data preprocessing module, and the data segmentation and recombination module comprises the steps of dividing single sign language words of continuous sign language through a sliding window, and carrying out average segmentation and recombination on each divided data to obtain a plurality of new data;
the LSTM-based neural network structure is used for training the new data obtained by the data segmentation and recombination module to obtain a plurality of sign word predicted values;
and the threshold-based multi-voting decision module is used for judging and analyzing a plurality of sign language word predicted values recognized by the LSTM-based neural network structure by applying a threshold-based multi-voting strategy to obtain a recognition result.
6. The continuous Chinese sign language identification system based on sliding window segmentation according to claim 5, wherein the data preprocessing module comprises an information screening unit and a filtering unit;
the information screening unit is used for screening effective information in the signals by utilizing the data normalization and the start time and the end time of the regular signals;
and the filtering unit is used for filtering the screened effective information by adopting a Butterworth filter.
7. The continuous chinese sign language recognition system based on sliding window segmentation of claim 5, wherein the data segmentation and reassembly module comprises a continuous sign language segmentation unit, a single sign language segmentation reassembly unit;
the continuous sign language segmentation unit is used for dividing single sign language words of continuous sign language through a sliding window;
the single sign language segmentation and recombination unit is used for carrying out average segmentation and recombination on the data divided by each continuous sign language segmentation unit to obtain a plurality of new data.
CN202010734304.6A 2020-07-27 2020-07-27 Continuous Chinese sign language identification method and system based on sliding window segmentation Active CN111914724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010734304.6A CN111914724B (en) 2020-07-27 2020-07-27 Continuous Chinese sign language identification method and system based on sliding window segmentation

Publications (2)

Publication Number Publication Date
CN111914724A CN111914724A (en) 2020-11-10
CN111914724B (en) 2023-10-27

Family

ID=73281861

Country Status (1)

Country Link
CN (1) CN111914724B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906498A (en) * 2021-01-29 2021-06-04 中国科学技术大学 Sign language action recognition method and device
CN114115531B (en) * 2021-11-11 2022-09-30 合肥工业大学 End-to-end sign language recognition method based on attention mechanism

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10528147B2 (en) * 2017-03-06 2020-01-07 Microsoft Technology Licensing, Llc Ultrasonic based gesture recognition

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5699441A (en) * 1992-03-10 1997-12-16 Hitachi, Ltd. Continuous sign-language recognition apparatus and input apparatus
KR20030030232A (en) * 2001-10-09 2003-04-18 한국과학기술원 Method and System for recognizing continuous sign language based on computer vision
CN109271901A (en) * 2018-08-31 2019-01-25 武汉大学 A kind of sign Language Recognition Method based on Multi-source Information Fusion
CN109902554A (en) * 2019-01-09 2019-06-18 天津大学 A kind of recognition methods of the sign language based on commercial Wi-Fi
CN110286774A (en) * 2019-07-03 2019-09-27 中国科学技术大学 A kind of sign Language Recognition Method based on Wrist-sport sensor
CN111340005A (en) * 2020-04-16 2020-06-26 深圳市康鸿泰科技有限公司 Sign language identification method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Research on Chinese Sign Language Recognition Based on Multi-sensor Information Detection and Fusion; 王文会, 陈香, 阳平, 李云, 杨基海; Chinese Journal of Biomedical Engineering (No. 05), full text *
Research on End-to-End Sentence-Level Chinese Lip-Reading Recognition; 张晓冰, 龚海刚, 杨帆, 戴锡笠; Journal of Software (No. 06), full text *
Gesture Recognition Based on Adaptive Multi-classifier Fusion; 刘肖, 袁冠, 张艳梅, 闫秋艳, 王志晓; Computer Science (No. 07), full text *
A Root-Based Chinese Sign Language Recognition Method; 王春立, 高文, 马继勇, 高秀娟; Journal of Computer Research and Development (No. 02), full text *

Also Published As

Publication number Publication date
CN111914724A (en) 2020-11-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant