CN111428871A - Sign language translation method based on BP neural network - Google Patents

Sign language translation method based on BP neural network

Info

Publication number
CN111428871A
CN111428871A (application CN202010243856.7A / CN202010243856A)
Authority
CN
China
Prior art keywords
sign language
neural network
voltage signals
words
gesture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010243856.7A
Other languages
Chinese (zh)
Other versions
CN111428871B (en)
Inventor
谢张宁
朱惠臣
孙晓光
吴俊杰
李智玮
傅云霞
雷李华
孔明
管钰晴
刘娜
王道档
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Jiliang University
Shanghai Institute of Measurement and Testing Technology
Original Assignee
China Jiliang University
Shanghai Institute of Measurement and Testing Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Jiliang University, Shanghai Institute of Measurement and Testing Technology filed Critical China Jiliang University
Priority to CN202010243856.7A
Publication of CN111428871A
Application granted
Publication of CN111428871B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/014 Hand-worn input/output arrangements, e.g. data gloves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Neurology (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to a sign language translation method based on a BP neural network, characterized by comprising the following steps: 1. collect gesture voltage signals with a Raspberry Pi 3B through a wearable data glove; 2. compile the sign language words and common sign language sentences corresponding to each group of gesture voltage signals into a sign language sentence library using a signal screening program; 3. write a neural network classification program comprising a BP neural network structural framework model, a data transmission module and a storage module, the BP neural network structural framework model adopting a three-layer neural network consisting of an input layer, a hidden layer and an output layer; 4. convert each received group of gesture voltage signals into sign language words through the BP neural network framework model; 5. convert the sign language words obtained in step 4 over a period of time into sign language word groups, match them against the sign language sentence library, and associate and fill them to form the output sentence. The invention combines a neural network with sensing technology to achieve automatic real-time translation and recognition of sign language.

Description

Sign language translation method based on BP neural network
Technical Field
The invention relates to a sign language translation method, and in particular to a sign language translation method based on a BP neural network that combines a neural network with sensing technology to realize automatic translation and recognition of sign language.
Background
In present-day society, because most hearing people cannot understand sign language, a gap exists between deaf-mute people and hearing people: the communication circle of deaf-mute people is limited, which greatly restricts their living and development space. Two kinds of assistive devices for deaf-mute people exist on the market. One is the electrolarynx, dating from the 1950s, which is held against the throat, senses the vibration of the vocal cords and amplifies it to assist speech; however, the device is expensive, and disabled people without social security coverage cannot afford it. The other is sign language translation equipment based on computer vision, which has appeared in recent years; such equipment is not expensive, but limb-motion recognition technology is still in its infancy, and the image processing imposes strict requirements on the acquisition environment.
A neural network is a computational model formed by a large number of interconnected nodes (or neurons). Each node represents a particular output function, called an activation function. Each connection between two nodes carries a weighted value, called a weight, applied to the signal passing through the connection; in this way the neural network simulates human memory. The output of the network depends on its structure, the way it is connected, the weights and the activation functions, and the weights of each layer constitute the model to be stored. Neural networks are widely applied in machine learning, for example in function approximation, pattern recognition, classification, data compression and data mining. It is therefore a good approach to use a neural network to construct a nonlinear data classification model.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by designing a sign language translation method based on a BP (back propagation) neural network. Gesture voltage signals are collected with sensing technology and used as the input of the BP neural network, and the input voltage signals are converted into the corresponding semantic output through computation and weighting among the neurons. This provides a feasible basis for building a gesture-voltage-based sign language translation system on a BP neural network, facilitates communication between disabled and hearing people and shortens the distance between them, so that deaf-mute people can better integrate into society.
The invention is realized as follows: a sign language translation method based on a BP neural network, characterized by comprising the following steps.
Step 1: A Raspberry Pi 3B single-board computer collects gesture voltage signals through the flexible sensors and acceleration sensors arranged on a wearable data glove; after filtering and amplification, the gesture voltage signals are transmitted through the integrated Bluetooth module to a storage device for storage.
The flexible sensors arranged on the wearable data glove in step 1 are strain gauges fixed at the positions of the 10 fingers. The gesture voltage signals are represented by the degree to which the strain gauges bend with the fingers and by the relative positions of two triaxial acceleration sensors fixed on the backs of the left and right hands, so acquiring the gesture voltage signals means acquiring 10 finger-bending signals and 6 gesture-orientation signals, i.e. 16 signals in total.
Step 2: The sign language words and common sign language sentences corresponding to each group of signals are compiled, via a signal screening program, into a sign language library to build the sign language sentence library, and the gesture voltage signals collected over many acquisitions, together with the corresponding sign language words, are divided into a training set and a test set at a ratio of 7:3.
The sign language library in step 2 comprises a sign language word library and a sign language sentence library; it is built by first recording the 16 currently received gesture voltage signals in Excel, normalizing them, and storing them in an Access database.
Step 3: A program for establishing the BP neural network structural framework model is written; it mainly comprises three modules: the neural network structural framework model, a data transmission module and a storage module. The BP neural network structural framework model is trained with the training set from step 2, the trained model is then run on the test set, and once the test result meets the preset requirement the model is stored in the storage module. The BP neural network structural framework model adopts a three-layer neural network consisting of an input layer, a hidden layer and an output layer.
The three-layer neural network of the BP neural network structural framework model in step 3 has 16 neurons in the input layer, 64 neurons in the middle (hidden) layer and 18 neurons in the output layer. Transmission is divided into two segments of 8 voltage signals each, 16 voltage signals in total. The output numbers of the 18 output-layer neurons are 0 to 17 and correspond in order to 18 common phrases, which can be combined at will into 53 common expressions.
Step 4: Convert each group of collected gesture voltage signals into sign language words through the BP neural network framework model, comprising the following steps:
Step 4.1: receive the gesture voltage signals of the wearable data glove and screen out complete signal groups with the signal screening program;
Step 4.2: convert the gesture voltage signals into sign language words through the trained BP neural network structural framework model.
Step 5: Convert the sign language words obtained from the gesture voltage signals in step 4 over a period of time into sign language word groups, match them against the sign language sentence library, and associate and fill them to form the output sentence, comprising the following steps:
Step 5.1: segment the sentences in the sign language sentence library, or the collected word groups, into words and count them; words that appear frequently and carry symbolic meaning in a sentence are defined as element 1, and the remaining words as element 0;
Step 5.2: put all common sign language sentences in the sign language sentence library into the word-frequency vector format specified in step 5.1, generating the corresponding word-frequency vectors;
Step 5.3: convert the sign language words obtained from the gesture voltage signals collected in step 4 over a period of time into sign language word groups, and convert the resulting word groups into the corresponding word-frequency vectors according to the format specified in step 5.1;
Step 5.4: calculate the cosine similarity between the word-frequency vector obtained in step 5.3 and the word-frequency vectors in the sign language sentence library from step 5.2, and select the sign language words in the sign language sentence library with the largest cosine similarity as the output words;
Step 5.5: match the output words obtained in step 5.4 to the corresponding written-language sentences according to the index of the common sign language sentences in the sign language sentence library, and take the matched written-language sentences as the final output results.
The beneficial effects of the invention are as follows. According to statistics, more than 20 million people nationwide have hearing and speech disabilities; they have difficulty communicating with hearing people, and this communication barrier is an important reason why deaf-mute people cannot work normally. In the method of the invention, the Raspberry Pi 3B single-board computer is externally connected to an audio and/or video playback device, so gestures can be converted into normal sentences and output as audio or video; the translation is accurate and the response is fast, which greatly facilitates communication between disabled and hearing people and meets the urgent need of deaf-mute people to communicate with hearing people. The hardware used by the method is inexpensive, can quickly and effectively remove the communication barriers encountered by deaf-mute people, opens up more employment opportunities for them, helps them integrate better into society, and improves their standard of living.
Drawings
FIG. 1 is a schematic block diagram of the flow of the working steps of the method of the present invention.
FIG. 2 is a simplified schematic diagram of a single wearable data glove configuration for acquiring gesture voltage signals according to the method of the present invention.
FIG. 3 is a schematic diagram of the working principle and the working flow of the signal screening program of the method of the present invention.
FIG. 4 is a schematic diagram of a BP neural network structural framework model of the method of the present invention.
FIG. 5 is a schematic diagram of the working flow from the words generated after conversion by the BP neural network to matching, association, filling and sentence output in the method of the present invention.
Detailed Description
As shown in FIG. 1, the sign language translation method based on a BP neural network according to the invention comprises the following steps.
Step 1: A Raspberry Pi 3B single-board computer collects gesture voltage signals through the flexible sensors and acceleration sensors arranged on a wearable data glove; after filtering and amplification, the gesture voltage signals are transmitted through the integrated Bluetooth module to a storage device for storage.
Step 2: The sign language words and common sign language sentences corresponding to each group of signals are compiled, via a signal screening program, into a sign language library to build the sign language sentence library, and the gesture voltage signals collected over many acquisitions, together with the corresponding sign language words, are divided into a training set and a test set at a ratio of 7:3.
Step 2.1: Sign language library recording software is written in C#; the software records each received gesture voltage signal and the semantics represented by the signal in Excel.
Step 2.2: The signal screening program checks whether each received signal group is complete; complete signals are recorded in an Excel table and incomplete ones are rejected.
Step 2.3: The gesture voltage signals collected over many acquisitions and the corresponding sign language words are divided into a training set and a test set at a ratio of 7:3.
Step 2.4: The sign language words and common sign language sentences corresponding to the gesture voltage signals recorded in the Excel table are normalized and then imported into the Access database to build the sign language sentence library.
Step 3: A program for establishing the BP neural network structural framework model is written; it mainly comprises three modules: the neural network structural framework model, a data transmission module and a storage module. The BP neural network structural framework model is trained with the training set from step 2, the trained model is then run on the test set, and once the test result meets the preset requirement the model is stored in the storage module. The BP neural network structural framework model adopts a three-layer neural network consisting of an input layer, a hidden layer and an output layer.
Step 4: Convert each group of collected gesture voltage signals into sign language words through the BP neural network framework model.
Step 4.1: Receive the gesture voltage signals of the wearable data glove and screen out complete signal groups with the signal screening program.
Step 4.2: Convert the gesture voltage signals into sign language words through the trained BP neural network structural framework model.
Step 5: Convert the sign language words obtained from the gesture voltage signals in step 4 over a period of time into sign language word groups, match them against the sign language sentence library, and associate and fill them to form the output sentence.
Step 5.1: Segment the sentences in the sign language sentence library, or the collected word groups, into words and count them; words that appear frequently and carry symbolic meaning in a sentence are defined as element 1, and the remaining words as element 0.
Step 5.2: Put all common sign language sentences in the sign language sentence library into the word-frequency vector format specified in step 5.1, generating the corresponding word-frequency vectors.
Step 5.3: Convert the sign language words obtained from the gesture voltage signals collected in step 4 over a period of time into sign language word groups, and convert the resulting word groups into the corresponding word-frequency vectors according to the format specified in step 5.1.
Step 5.4: Calculate the cosine similarity between the word-frequency vector obtained in step 5.3 and the word-frequency vectors in the sign language sentence library from step 5.2, and select the sign language words in the sign language sentence library with the largest cosine similarity as the output words.
Step 5.5: Match the output words obtained in step 5.4 to the corresponding written-language sentences according to the index of the common sign language sentences in the sign language sentence library, and take the matched written-language sentences as the final output results.
The invention is described in further detail below with reference to the figures and specific examples.
The specific working steps of the sign language translation method based on the BP neural network are as follows.
Step 1: The flexible sensors and acceleration sensors on the wearable data glove are connected in series with fixed resistors to the Raspberry Pi 3B single-board computer; the flexible sensors and acceleration sensors change their voltages according to the bending of the fingers and the relative motion of the hands, and the collected gesture voltage signals are filtered, amplified and then transmitted through the Bluetooth module of the Raspberry Pi 3B single-board computer to a storage device for storage.
As shown in FIG. 2, the flexible sensors arranged on the wearable data glove in step 1 are strain gauges fixed at the positions of the 10 fingers, and the acceleration sensors are two triaxial acceleration sensors fixed on the backs of the left and right hands. Each strain gauge outputs one voltage value, and each triaxial acceleration sensor outputs three voltage values for x, y and z. Every transmission additionally contains a start signal Num_i and an end signal Num_o, giving 18 signals in total, in order: Num_i, X_1, X_2, …, X_16, Num_o, where X_1, X_2, …, X_16 represent the gesture voltage values.
Step 2: The sign language words and common sign language sentences corresponding to each group of collected gesture voltage signals are compiled into a sign language library to build the sign language sentence library. The sign language sentence library is divided into a training set and a test set at a ratio of 7:3.
Step 2.1: The sign language library recording software is written in C#; it records each received gesture voltage signal and the semantics represented by the signal in an Excel table.
Step 2.2: FIG. 3 illustrates the working principle and working steps of the signal screening program, which checks whether each received signal group is complete; complete signals are recorded in the Excel table and incomplete ones are rejected. While receiving gesture voltage signals, the program starts counting when it encounters the start signal Num_i and stops counting when it encounters the end signal Num_o. If the count K equals 16, the transmission is complete and the data are recorded in the Excel table; if K is not 16, counting restarts.
Step 2.3: the gesture voltage signals and corresponding sign language words collected for multiple times are divided into a training set and a data set in a 7:3 ratio.
Step 2.4: and (4) normalizing the gesture voltage signals, the corresponding sign language words and the common sign language sentences in the Excel table, and importing the normalized sign language words and the common sign language sentences into an Access database to prepare a database.
Step 3: A program for establishing the BP neural network structural framework model is written; it mainly comprises three modules: the neural network structural framework model, a data transmission module and a storage module. The BP neural network structural framework model is trained with the training set from step 2, the trained model is then run on the test set, and once the test result meets the preset requirement the model is stored in the storage module. The BP neural network structural framework model adopts a three-layer neural network consisting of an input layer, a hidden layer and an output layer.
Each time a gesture is made, the wearable data glove with the flexible sensors transmits 16 sensor values from which a model is to be built. One possible method is to build an index library that associates different sets of voltage values with gestures. However, Chinese sign language contains many gestures, and establishing such a library takes a great deal of time. Moreover, the voltage value sets produced by different people making the same gesture are not identical, and as the number of users grows, the set of voltage groups corresponding to each gesture becomes larger and larger, so the lookup time increases. In addition, because of the limited sensitivity of the device, the voltage value sets of different gestures do not differ greatly, which imposes many limitations on accurate recognition. To achieve fast and accurate recognition, the invention uses a BP neural network from machine learning to build a classification model of the voltage groups.
The neural network classification program for gesture recognition is composed of three parts: a BP neural network algorithm part, a model prediction part and a data transmission part.
The BP neural network algorithm part comprises forward propagation, backward propagation, model training and evaluation, and model storage. The gesture recognition neural network is written in Python; a neural network built in Python has the advantages that the number of neural units, the number of layers and the activation function can be modified conveniently and large amounts of data can be computed quickly. The programming environment is PyCharm, and the libraries used are NumPy, Pandas and SciPy: NumPy is used to write the BP neural network algorithm, Pandas is used for data import, and SciPy is used to store the output model.
The framework of the BP neural network algorithm part is a BP neural network structural framework model consisting of three layers: an input layer, a hidden layer and an output layer. The input layer has 16 input units and the hidden layer has 64 units. The output layer corresponds to 18 common phrases, with output numbers 0 to 17 corresponding to the 18 phrases in order; these common phrases can be freely combined into 50 common expressions. The correspondence between the common phrases and their numbers is given in the following table.
Table 1: Correspondence between some common phrases and their numbers (the table is provided as an image in the original publication)
Because there are 16 sensor values as input, 16 input neuron units are set. Each group of data X_1, X_2, …, X_16 is normalized, and the normalized data x_1, x_2, …, x_16 are used as the input layer. Let W^(1)_{ij} denote the weight parameter on the connection between the j-th neuron unit of the first layer and the i-th neuron unit of the second layer. Since there are many samples in the sign language library, 64 neuron units are provided in the hidden layer to avoid overfitting, so W^(1) ∈ R^(64×16); the bias term of the i-th unit of the second layer is b^(2)_i.

The node activation function is the Sigmoid function:

    g(z) = 1 / (1 + e^(-z))

The output of each neuron node of the second layer is then:

    a^(2)_1 = g(W^(1)_{1,1} x_1 + W^(1)_{1,2} x_2 + … + W^(1)_{1,16} x_16 + b^(2)_1)
    a^(2)_2 = g(W^(1)_{2,1} x_1 + W^(1)_{2,2} x_2 + … + W^(1)_{2,16} x_16 + b^(2)_2)
    ⋮
    a^(2)_64 = g(W^(1)_{64,1} x_1 + W^(1)_{64,2} x_2 + … + W^(1)_{64,16} x_16 + b^(2)_64)

Let the weighted input sum of the i-th unit of the second layer be

    z^(2)_i = Σ_{j=1}^{16} W^(1)_{ij} x_j + b^(2)_i

Then

    a^(2)_i = g(z^(2)_i),  i.e.  a^(2) = g(z^(2))

Likewise, let W^(2)_{ij} be the weight parameter on the connection between the j-th neuron unit of the second layer and the i-th neuron unit of the third layer, and let b^(3)_i be the bias term of the i-th unit of the third layer. Since the output layer has 18 outputs, it is easy to obtain W^(2) ∈ R^(18×64). The third layer output is a^(3), the node activation function is again the Sigmoid function, and the BP neural network output is

    h_{W,b}(x) = a^(3) = g(z^(3))

where, in the same way,

    z^(3)_i = Σ_{j=1}^{64} W^(2)_{ij} a^(2)_j + b^(3)_i
    a^(3)_i = g(z^(3)_i)

Each forward propagation yields the third-layer output a^(3), which then needs to be corrected. Since the activation function is the Sigmoid function, its derivative g'(z) = g(z)(1 - g(z)) is used, and the difference between a^(3)_i and the target Y_i is substituted into the error term of the output layer (i.e. the third layer):

    δ^(3)_i = (a^(3)_i - Y_i) · a^(3)_i (1 - a^(3)_i)

and into the error term of each neural unit of the hidden layer (i.e. the second layer):

    δ^(2)_i = ( Σ_{j=1}^{18} W^(2)_{ji} δ^(3)_j ) · a^(2)_i (1 - a^(2)_i)

Finally, the weight of each connection is updated. With η the learning-rate constant, the connection weights W^(2)_{ij} between the second and third layers and the connection weights W^(1)_{ij} between the first and second layers are updated as follows:

    W^(2)_{ij} := W^(2)_{ij} - η · δ^(3)_i · a^(2)_j
    W^(1)_{ij} := W^(1)_{ij} - η · δ^(2)_i · x_j
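Read together, the forward pass, error terms and update rules above translate almost line for line into NumPy. The following is a minimal per-sample sketch under the stated shapes (16 inputs, 64 hidden units, 18 outputs); the random initialization scale and the function names are assumptions for illustration, not the patent's actual code, and bias updates are included as the usual companion rule even though the text only writes out the weight updates.

```python
import numpy as np

def sigmoid(z):
    """Node activation function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Shapes from the description: W1 in R^(64x16), W2 in R^(18x64).
rng = np.random.default_rng(0)
W1, b2 = rng.normal(scale=0.1, size=(64, 16)), np.zeros(64)
W2, b3 = rng.normal(scale=0.1, size=(18, 64)), np.zeros(18)

def train_step(x, y, eta=0.1):
    """One forward and one backward pass for a single normalized sample.

    x: (16,) normalized voltages; y: (18,) one-hot target; eta: learning-rate constant.
    Updates the module-level weights and biases in place.
    """
    global W1, b2, W2, b3
    # Forward propagation: a2 = g(z2), a3 = g(z3)
    z2 = W1 @ x + b2;  a2 = sigmoid(z2)
    z3 = W2 @ a2 + b3; a3 = sigmoid(z3)
    # Error terms (the Sigmoid derivative is a * (1 - a))
    d3 = (a3 - y) * a3 * (1.0 - a3)          # output-layer error
    d2 = (W2.T @ d3) * a2 * (1.0 - a2)       # hidden-layer error
    # Updates W := W - eta * delta * activation
    W2 -= eta * np.outer(d3, a2); b3 -= eta * d3
    W1 -= eta * np.outer(d2, x);  b2 -= eta * d2
    return a3
```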
the structural framework model of the BP neural network of the invention is shown in figure 4.
After one forward propagation and one backward propagation have updated the weights, one training iteration of the neural network is complete. One iteration is of course not enough; hundreds or thousands are needed, and the maximum number of training iterations of the neural network is 30,000. The training samples of the invention are imported with the read_excel function of the Pandas library from the sign language library stored as an Excel file. The phrase library is an Excel file in which columns 1 to 16 are the voltage values and column 17 is the corresponding gesture phrase.
Model evaluation is performed after training, using the mean square error. As the glove module is used more often and by more people, the number of samples in the sign language library grows. The voltage values of the same gesture made by different people differ slightly, which can cause the neural network to overfit; to prevent overfitting, the samples are divided into a training set and a test set. The training set is used for training and the test set is used to test the prediction accuracy; with BP neural network prediction the accuracy of the method reaches 80%.
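As a rough sketch, the data import with read_excel, the 7:3 split, the cap of 30,000 iterations and the mean-square-error check described above could be combined as follows; it reuses sigmoid and train_step from the previous sketch, and the file name, the min-max normalization and the stopping threshold are assumptions rather than details fixed by the patent.

```python
import numpy as np
import pandas as pd

# Columns 1-16 are voltage values, column 17 the gesture phrase (per the text);
# the file name is a placeholder.
data = pd.read_excel("sign_language_library.xlsx")
X = data.iloc[:, :16].to_numpy(dtype=float)
labels = data.iloc[:, 16].to_numpy()

# Min-max normalization per channel (the normalization method is assumed).
x_min, x_max = X.min(axis=0), X.max(axis=0)
X = (X - x_min) / (x_max - x_min + 1e-9)

classes, class_idx = np.unique(labels, return_inverse=True)  # phrases -> numbers 0..17
Y = np.eye(len(classes))[class_idx]                          # one-hot targets

# 7:3 split into training and test sets
idx = np.random.permutation(len(X))
cut = int(0.7 * len(X))
train_idx, test_idx = idx[:cut], idx[cut:]

def forward(x):
    """Forward pass using sigmoid, W1, b2, W2, b3 from the sketch above."""
    return sigmoid(W2 @ sigmoid(W1 @ x + b2) + b3)

for epoch in range(30000):                       # at most 30,000 training iterations
    for i in train_idx:
        train_step(X[i], Y[i])
    # Mean-square-error evaluation on the test set
    mse = np.mean([(forward(X[i]) - Y[i]) ** 2 for i in test_idx])
    if mse < 0.01:                               # stopping threshold is an assumption
        break
```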
After training, the model needs to be stored. The trained neural network model consists of the weight of each connection W and the bias term of each neural node b, which in the invention are W^(1), b^(2), W^(2) and b^(3). The savemat function of the SciPy library in Python stores the trained parameter matrices as a data file in Mat format, and the loadmat function imports the data in the Mat file back into the program to be called for prediction.
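Storing and reloading these parameter matrices with SciPy's savemat and loadmat could look like the following; the file name and dictionary keys are illustrative.

```python
from scipy.io import savemat, loadmat

# Save the trained parameters as a Mat-format data file.
savemat("bp_model.mat", {"W1": W1, "b2": b2, "W2": W2, "b3": b3})

# Load them back later for prediction; loadmat returns 2-D arrays,
# so the bias vectors are flattened again.
params = loadmat("bp_model.mat")
W1, b2 = params["W1"], params["b2"].ravel()
W2, b3 = params["W2"], params["b3"].ravel()
```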
The model prediction part differs from the training and evaluation in the neural network algorithm part in that its input data are the gesture voltage values transmitted back in real time from the mobile phone end. After the voltage values transmitted back in real time from the mobile phone end are received, the regrouped voltage values are first normalized, the stored model with the highest accuracy from training and evaluation (i.e. the weights W and bias terms b) is called, and one forward propagation is performed. In the resulting output a^(3), the index of the maximum value is the gesture number obtained, and the corresponding gesture is the predicted result.
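Prediction then reduces to normalizing the 16 returned voltage values, one forward pass with the loaded parameters, and an argmax over the 18 outputs. A sketch, assuming the per-channel normalization bounds x_min and x_max were saved at training time and reusing forward from the training sketch:

```python
import numpy as np

def predict_gesture(raw_voltages, x_min, x_max, phrases):
    """raw_voltages: 16 values returned in real time for one gesture;
    x_min, x_max: per-channel bounds saved from training;
    phrases: the 18 common phrases indexed by output number 0..17."""
    x = (np.asarray(raw_voltages, dtype=float) - x_min) / (x_max - x_min + 1e-9)
    a3 = forward(x)                        # one forward propagation
    return phrases[int(np.argmax(a3))]     # index of the maximum output = gesture number
```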
Step 4: Convert each group of collected gesture voltage signals into sign language words through the BP neural network framework model.
Step 4.1: Receive the gesture voltage signals of the wearable data glove and screen out complete signal groups with the signal screening program.
Step 4.2: Convert the gesture voltage signals into sign language words with the trained BP neural network framework model.
The specific operation is as follows: the gesture voltage signals of the current 2 seconds are acquired and converted into sign language words through the BP neural network framework model.
Step 5: Convert the sign language words obtained from the gesture voltage signals in step 4 over a period of time into sign language word groups, then match, associate and fill them into sentences. The specific working flow from the sign language words generated after conversion by the BP neural network framework model to the matched, associated and filled sentence output is shown in FIG. 5.
Step 5.1: Segment the sentences in the sign language sentence library, or the collected word groups, into words and count them; words that appear frequently and carry symbolic meaning in a sentence are defined as element 1, and the remaining words as element 0. For example, segmenting the sentences in the sign language sentence library ("I received it", "She is beautiful", "Your clothes are poor") yields the words appearing in the library, each with frequency 1, and the word-frequency vector format is specified as [I, you, her, clothes, poor, beautiful, received].
Step 5.2: Put all common sign language sentences in the sign language sentence library into the word-frequency vector format specified in step 5.1 and generate the corresponding word-frequency vectors; for example, the word-frequency vector corresponding to the sentence "I received it" is [1,0,0,0,0,0,1].
Step 5.3: Convert the sign language words obtained from the gesture voltage signals collected in step 4 over a period of time into sign language word groups, and convert the resulting word groups into the corresponding word-frequency vectors according to the format specified in step 5.1; for example, the word-frequency vector corresponding to the word group "she is beautiful" is [0,0,1,0,0,1,0].
Step 5.4: Calculate the cosine similarity between the word-frequency vector obtained in step 5.3 and the word-frequency vectors in the sign language sentence library from step 5.2, and select the sign language words in the sign language sentence library with the largest cosine similarity as the output words.
Step 5.5: Match the output words obtained in step 5.4 to the corresponding written-language sentences according to the index of the common sign language sentences in the sign language sentence library, and take the matched written-language sentences as the final output results.
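A compact sketch of steps 5.1 to 5.5: build word-frequency vectors over the key vocabulary, pick the library entry with the largest cosine similarity, and return its written form. The helper names and the toy sentence library below are illustrative only.

```python
import numpy as np

def to_vector(words, vocabulary):
    """Word-frequency vector: element 1 for each key vocabulary word present, else 0."""
    return np.array([1.0 if v in words else 0.0 for v in vocabulary])

def cosine_similarity(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def match_sentence(sign_words, vocabulary, library):
    """library: list of (key_words, written_sentence) pairs from the sentence library."""
    query = to_vector(sign_words, vocabulary)
    best = max(library,
               key=lambda entry: cosine_similarity(query, to_vector(entry[0], vocabulary)))
    return best[1]

# Toy example following the vocabulary used in the description above.
vocabulary = ["I", "you", "her", "clothes", "poor", "beautiful", "received"]
library = [(["I", "received"], "I have received it."),
           (["her", "beautiful"], "She is beautiful.")]
print(match_sentence(["her", "beautiful"], vocabulary, library))  # -> She is beautiful.
```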
The output result can be delivered through hardware, i.e. an audio and/or video playback device externally connected to the Raspberry Pi 3B single-board computer, so that the translation result is output in the form of sentence audio or video.

Claims (6)

1. A sign language translation method based on a BP neural network, characterized by comprising the following steps:
step (1): a Raspberry Pi 3B single-board computer collects gesture voltage signals through flexible sensors and acceleration sensors arranged on a wearable data glove, and after filtering and amplification the gesture voltage signals are transmitted through its integrated Bluetooth module to a storage device for storage;
step (2): the sign language words and common sign language sentences corresponding to each group of signals are compiled, via a signal screening program, into a sign language library to build the sign language sentence library, and the gesture voltage signals collected over many acquisitions, together with the corresponding sign language words, are divided into a training set and a test set at a ratio of 7:3;
step (3): a program for establishing the BP neural network structural framework model is written, mainly comprising three modules: the neural network structural framework model, a data transmission module and a storage module; the BP neural network structural framework model is trained with the training set of step (2), the trained model is then run on the test set, and after the test result meets the preset requirement the model is stored in the storage module; the BP neural network structural framework model adopts a three-layer neural network consisting of an input layer, a hidden layer and an output layer;
step (4): each group of collected gesture voltage signals is converted into sign language words through the BP neural network framework model;
step (5): the sign language words obtained from the gesture voltage signal conversion in step (4) over a period of time are converted into sign language word groups, matched against the sign language sentence library, and associated and filled to form the output sentence.
2. The sign language translation method based on the BP neural network as claimed in claim 1, wherein: the flexible sensors arranged on the wearable data glove in step (1) are strain gauges fixed at the positions of the 10 fingers; the gesture voltage signals are represented by the degree to which the strain gauges bend with the fingers and by the relative positions of two triaxial acceleration sensors fixed on the backs of the left and right hands; acquiring the gesture voltage signals therefore means acquiring 10 finger-bending signals and 6 gesture-orientation signals, i.e. 16 signals in total.
3. The sign language translation method based on the BP neural network as claimed in claim 1, wherein: the sign language library in step (2) comprises a sign language word library and a sign language sentence library, built by first recording the 16 currently received gesture voltage signals in Excel, normalizing them and storing them in an Access database.
4. The sign language translation method based on the BP neural network as claimed in claim 1, wherein: the three-layer neural network of the BP neural network structural framework model in step (3) has 16 neurons in the input layer, 64 neurons in the middle layer and 18 neurons in the output layer; transmission is divided into two segments of 8 voltage signals each, 16 voltage signals in total; the output numbers of the 18 output-layer neurons are 0 to 17 and correspond in order to 18 common phrases, which can be combined at will into 53 common expressions.
5. The sign language translation method based on the BP neural network as claimed in claim 1, wherein the step (4) of converting the gesture voltage signals into sign language words through the BP neural network framework model comprises the following steps:
step (4.1): receiving the gesture voltage signals of the wearable data glove and screening out complete signal groups with the signal screening program;
step (4.2): converting the gesture voltage signals into sign language words through the trained BP neural network structural framework model.
6. The sign language translation method based on the BP neural network as claimed in claim 1, wherein the step (5) of matching against the sign language sentence library after conversion by the BP neural network framework model and associating and filling to form the output sentence comprises the following steps:
step (5.1): segmenting the sentences in the sign language sentence library, or the collected word groups, into words and counting them, words that appear frequently and carry symbolic meaning in a sentence being defined as element 1 and the remaining words as element 0;
step (5.2): putting all common sign language sentences in the sign language sentence library into the word-frequency vector format specified in step (5.1) and generating the corresponding word-frequency vectors;
step (5.3): converting the sign language words obtained from the gesture voltage signals collected in step (4) over a period of time into sign language word groups, and converting the resulting word groups into the corresponding word-frequency vectors according to the format specified in step (5.1);
step (5.4): calculating the cosine similarity between the word-frequency vector obtained in step (5.3) and the word-frequency vectors in the sign language sentence library from step (5.2), and selecting the sign language words in the sign language sentence library with the largest cosine similarity as the output words;
step (5.5): matching the output words obtained in step (5.4) to the corresponding written-language sentences according to the index of the common sign language sentences in the sign language sentence library, and taking the matched written-language sentences as the final output results.
CN202010243856.7A 2020-03-31 2020-03-31 Sign language translation method based on BP neural network Active CN111428871B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010243856.7A CN111428871B (en) 2020-03-31 2020-03-31 Sign language translation method based on BP neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010243856.7A CN111428871B (en) 2020-03-31 2020-03-31 Sign language translation method based on BP neural network

Publications (2)

Publication Number Publication Date
CN111428871A true CN111428871A (en) 2020-07-17
CN111428871B CN111428871B (en) 2023-02-24

Family

ID=71556171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010243856.7A Active CN111428871B (en) 2020-03-31 2020-03-31 Sign language translation method based on BP neural network

Country Status (1)

Country Link
CN (1) CN111428871B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149540A (en) * 2020-09-14 2020-12-29 东北大学 Yoov 3-based end-to-end sign language recognition technology
CN113081703A (en) * 2021-03-10 2021-07-09 上海理工大学 Method and device for distinguishing direction intention of user of walking aid
CN113111156A (en) * 2021-03-15 2021-07-13 天津理工大学 System for intelligent hearing-impaired people and healthy people to perform man-machine interaction and working method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101539994A (en) * 2009-04-16 2009-09-23 西安交通大学 Mutually translating system and method of sign language and speech
US20100023314A1 (en) * 2006-08-13 2010-01-28 Jose Hernandez-Rebollar ASL Glove with 3-Axis Accelerometers
CN110363077A (en) * 2019-06-05 2019-10-22 平安科技(深圳)有限公司 Sign Language Recognition Method, device, computer installation and storage medium
CN110532912A (en) * 2019-08-19 2019-12-03 合肥学院 A kind of sign language interpreter implementation method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100023314A1 (en) * 2006-08-13 2010-01-28 Jose Hernandez-Rebollar ASL Glove with 3-Axis Accelerometers
CN101539994A (en) * 2009-04-16 2009-09-23 西安交通大学 Mutually translating system and method of sign language and speech
CN110363077A (en) * 2019-06-05 2019-10-22 平安科技(深圳)有限公司 Sign Language Recognition Method, device, computer installation and storage medium
CN110532912A (en) * 2019-08-19 2019-12-03 合肥学院 A kind of sign language interpreter implementation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘婉婉 et al.: "Research on Mongolian-Chinese machine translation based on part-of-speech tagging with gated recurrent neural networks", 《中文信息学报》 (Journal of Chinese Information Processing) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149540A (en) * 2020-09-14 2020-12-29 东北大学 Yoov 3-based end-to-end sign language recognition technology
CN113081703A (en) * 2021-03-10 2021-07-09 上海理工大学 Method and device for distinguishing direction intention of user of walking aid
CN113111156A (en) * 2021-03-15 2021-07-13 天津理工大学 System for intelligent hearing-impaired people and healthy people to perform man-machine interaction and working method thereof
CN113111156B (en) * 2021-03-15 2022-05-13 天津理工大学 System for intelligent hearing-impaired people and healthy people to perform man-machine interaction and working method thereof

Also Published As

Publication number Publication date
CN111428871B (en) 2023-02-24

Similar Documents

Publication Publication Date Title
CN111428871B (en) Sign language translation method based on BP neural network
CN110992987B (en) Parallel feature extraction system and method for general specific voice in voice signal
CN110634491B (en) Series connection feature extraction system and method for general voice task in voice signal
CN112818861B (en) Emotion classification method and system based on multi-mode context semantic features
JP3168779B2 (en) Speech recognition device and method
CN111026847B (en) Text emotion recognition method based on attention network and long-short term memory network
CN112216307B (en) Speech emotion recognition method and device
CN111400461B (en) Intelligent customer service problem matching method and device
CN113723166A (en) Content identification method and device, computer equipment and storage medium
Gorin et al. An experiment in spoken language acquisition
CN112151030A (en) Multi-mode-based complex scene voice recognition method and device
CN113837299B (en) Network training method and device based on artificial intelligence and electronic equipment
CN112115687A (en) Problem generation method combining triples and entity types in knowledge base
CN115862684A (en) Audio-based depression state auxiliary detection method for dual-mode fusion type neural network
CN114724224A (en) Multi-mode emotion recognition method for medical care robot
CN114153942B (en) Event time sequence relation extraction method based on dynamic attention mechanism
CN113948157A (en) Chemical reaction classification method, device, electronic equipment and storage medium
CN116110565A (en) Method for auxiliary detection of crowd depression state based on multi-modal deep neural network
CN117877660A (en) Medical report acquisition method and system based on voice recognition
CN117672268A (en) Multi-mode voice emotion recognition method based on relative entropy alignment fusion
CN116521872A (en) Combined recognition method and system for cognition and emotion and electronic equipment
Anindya et al. Development of Indonesian speech recognition with deep neural network for robotic command
CN114882888A (en) Voiceprint recognition method and system based on variational self-coding and countermeasure generation network
CN111428802B (en) Sign language translation method based on support vector machine
CN112951270B (en) Voice fluency detection method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant