CN105956529A - Chinese sign language identification method based on LSTM type RNN - Google Patents


Publication number
CN105956529A
Authority
CN
China
Prior art keywords
sign language
lstm
feature
type rnn
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610260747.XA
Other languages
Chinese (zh)
Inventor
程树英
林鹏程
吴丽君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-04-25
Filing date
2016-04-25
Publication date
2016-09-21
Application filed by Fuzhou University
Priority to CN201610260747.XA
Publication of CN105956529A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a Chinese sign language recognition method based on an LSTM-type RNN. Multiple groups of sign language features are collected according to the characteristics of Chinese sign language. Features are extracted from the collected data and labeled according to the linguistic meaning corresponding to the feature vectors, forming the training data. The training data is used as the input of the LSTM-type RNN for model training, and the optimal network model parameters are obtained and used as the final recognition model. The trained model is then used to recognize the sign language to be identified: the character sequence with the maximum probability at the output layer is computed and taken as the decoding result, which is converted into the corresponding acoustic sequence; this result is the recognized sign language feature. Because the method can link to distant states, it prevents the decline in a later state's ability to perceive earlier states and improves the accuracy of continuous Chinese sign language recognition.

Description

A Chinese sign language recognition method based on an LSTM-type RNN
Technical field
The present invention relates to the field of Chinese sign language recognition, and in particular to a Chinese sign language recognition method based on an LSTM-type RNN.
Background art
Sign language recognition is a technology that converts sign language information into speech or text so that it can be read aloud or displayed. In this field, continuous sign language recognition is the key problem, so improving sign language recognition comes down to improving the accuracy of continuous sign language recognition.
In the prior art, there are mainly the following methods for continuous sign language recognition:
First, continuous sign language recognition commonly uses an HMM (Hidden Markov Model). This method introduces into the model the influence of the previous state on the current state, and recognizes sign language by maximizing the computed output probability;
Second, continuous sign language recognition may also use a CRF (Conditional Random Field). This method introduces context information into the model; the training features must be extended to the left and right, and hand-crafted feature templates are introduced for training. The traditional approach trains the sign language models separately and then recognizes the sign language to be identified by step-by-step prediction.
However, the above two methods mainly have the following problems:
1. Although extending the features to the left and right introduces the association between preceding and following states to some degree, the extension size is very limited in order to keep the model scale and complexity down. The distance over which context can be linked therefore cannot be too long, which weakens the current moment's ability to perceive earlier positions;
2. With step-by-step prediction, an error made at one step propagates to later steps and degrades the final result.
Summary of the invention
In view of this, the purpose of the present invention is to propose a Chinese sign language recognition method based on an LSTM-type RNN that overcomes the decline in the current time node's perception of earlier positions.
The present invention is realized by the following scheme: a Chinese sign language recognition method based on an LSTM-type RNN, comprising the following steps:
Step S1: collect multiple groups of sign language features;
Step S2: label the collected sign language features according to the linguistic meaning they correspond to, forming training data, wherein the training data is used for training the neural network;
Step S3: use the training data as the input of the LSTM-type RNN to train the model, and obtain the optimal network model parameters as the final recognition model;
Step S4: collect features from the sign language to be recognized and use them as the input of the LSTM-type RNN model, compute the character sequence with the maximum probability at the output layer, and take it as the decoding result, the result being the recognized sign language feature.
Further, step S1 is specifically: sign language features are acquired using a data glove, the data glove comprising bend (flex) sensors, nine-axis sensors, and a microprocessor for data processing, storage and transmission.
Further, step S2 is specifically: the collected sign language features are classified by the linguistic meaning each feature is intended to express; for each linguistic meaning, a certain number of feature groups are randomly selected, and each of the selected feature groups is labeled with its linguistic meaning; the data are organized in matrix form, forming the training data.
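The data organization in step S2 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patent's implementation: the function name `build_training_set`, the dictionary input format, the parameter `per_class`, and the assumption that every feature group has the same shape (time steps by feature dimension) are all hypothetical.

```python
import numpy as np

def build_training_set(samples_by_meaning, per_class, seed=0):
    """Randomly pick `per_class` feature groups per meaning and label them.

    samples_by_meaning: dict mapping a meaning (str) to a list of feature
    groups, each an array of shape (T, D). Equal shapes are an assumption.
    Returns (X, y, labels) with X of shape (N, T, D) and y of shape (N,).
    """
    rng = np.random.default_rng(seed)
    labels = sorted(samples_by_meaning)      # fixed class order
    features, targets = [], []
    for class_id, meaning in enumerate(labels):
        groups = samples_by_meaning[meaning]
        # Randomly select a fixed number of feature groups for this meaning.
        chosen = rng.choice(len(groups), size=per_class, replace=False)
        for idx in chosen:
            features.append(np.asarray(groups[idx], dtype=float))
            targets.append(class_id)         # label = index of the meaning
    X = np.stack(features)                   # training data in matrix form
    y = np.array(targets)
    return X, y, labels
```

Sorting the meanings gives a reproducible mapping between class indices and labels; any fixed ordering would serve the same purpose.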
Further, step S3 is specifically: a corresponding LSTM-type RNN model is constructed according to the sign language features and modeled explicitly; the sign language features and labels of the training data from step S2 are used as input to train the constructed LSTM-type RNN, so as to obtain the weight parameters corresponding to the different sign language features.
Further, the LSTM-type RNN comprises an input layer, an output layer and a hidden layer; the input of the input layer is the sign language feature value sequence O_1 O_2 ... O_T, the output of the output layer is the acoustic sequence S_1 S_2 ... S_L corresponding to the input, and the hidden layer comprises multiple LSTM units; wherein T is the number of time steps and L is the length of the acoustic sequence.
Further, the LSTM unit comprises three control gates, which are used to control the association among the input, the output, and the unit's own internal state across time steps.
Further, step S4 is specifically: the final LSTM-type RNN recognition model generated in step S3 is used to recognize the sign language to be recognized; the features of the sign language to be recognized are first abstracted to extract feature vectors, the sign language to be recognized is predicted according to the LSTM-type RNN model, and acoustic prediction is then performed to generate an acoustic parameter sequence, from which a speech synthesis result is generated.
Further, the LSTM-type RNN in step S4 controls the flow of information using the following formulas:
$$i_t = \sigma(W_{ix} x_t + W_{im} m_{t-1} + W_{ic} c_{t-1} + b_i);$$
$$f_t = \sigma(W_{fx} x_t + W_{fm} m_{t-1} + W_{fc} c_{t-1} + b_f);$$
$$c_t = f_t \odot c_{t-1} + i_t \odot g(W_{cx} x_t + W_{cm} m_{t-1} + b_c);$$
$$o_t = \sigma(W_{ox} x_t + W_{om} m_{t-1} + W_{oc} c_{t-1} + b_o);$$
$$m_t = o_t \odot h(c_t);$$
Wherein, x = (x_1, x_2, ..., x_T) is the given input sequence, T is the length of the input sequence, x_t is the input at time t, W denotes the weight matrices, b denotes the biases, and i, f, o, c and m respectively denote the Input Gate, the Forget Gate, the Output Gate, the state cell and the output of the LSTM structure;
Wherein, σ is the activation function of the three control gates, given by:
$$\sigma(x) = \frac{1}{1 + e^{-x}};$$
Wherein, h is the activation function of the state, given by:
$$h(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.$$
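The gate formulas above can be illustrated with a short NumPy sketch. It is a minimal illustration under assumptions rather than the patent's implementation: the code calls the input at time t `x_t`, the names `init_params`, `lstm_step` and `run_lstm` are hypothetical, the weights are random placeholders, and g is assumed to be tanh because the text defines only σ and h. The output gate reads the previous cell state, as the formulas state.

```python
import numpy as np

def sigmoid(x):
    # Logistic function used by the three control gates.
    return 1.0 / (1.0 + np.exp(-x))

def init_params(D, H, seed=0):
    """Random placeholder weights for D input features and H hidden units."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in ("i", "f", "c", "o"):
        p[f"W_{gate}x"] = rng.normal(0.0, 0.1, (H, D))
        p[f"W_{gate}m"] = rng.normal(0.0, 0.1, (H, H))
        p[f"b_{gate}"] = np.zeros(H)
        if gate != "c":  # the cell update has no connection from the cell state
            p[f"W_{gate}c"] = rng.normal(0.0, 0.1, (H, H))
    return p

def lstm_step(x_t, m_prev, c_prev, p):
    """One time step: returns the new output m_t and cell state c_t."""
    i_t = sigmoid(p["W_ix"] @ x_t + p["W_im"] @ m_prev + p["W_ic"] @ c_prev + p["b_i"])
    f_t = sigmoid(p["W_fx"] @ x_t + p["W_fm"] @ m_prev + p["W_fc"] @ c_prev + p["b_f"])
    # g is assumed to be tanh here.
    c_t = f_t * c_prev + i_t * np.tanh(p["W_cx"] @ x_t + p["W_cm"] @ m_prev + p["b_c"])
    o_t = sigmoid(p["W_ox"] @ x_t + p["W_om"] @ m_prev + p["W_oc"] @ c_prev + p["b_o"])
    m_t = o_t * np.tanh(c_t)
    return m_t, c_t

def run_lstm(xs, p):
    """Apply lstm_step over a sequence xs of shape (T, D); returns (T, H)."""
    H = p["b_i"].shape[0]
    m, c = np.zeros(H), np.zeros(H)
    outputs = []
    for x_t in xs:
        m, c = lstm_step(x_t, m, c, p)
        outputs.append(m)
    return np.stack(outputs)
```

For example, `run_lstm(np.ones((7, 5)), init_params(D=5, H=4))` returns an array of shape `(7, 4)`, one output vector per time step.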
Compared with the prior art, the present invention has the following beneficial effects: feature vectors are extracted from the sign language to be predicted; the pre-trained LSTM-type RNN performs linguistic prediction on the feature vectors to generate a linguistic parameter sequence; and a generation module generates the speech synthesis result from the linguistic parameter sequence. That is, training with the LSTM-type RNN network structure improves the accuracy of continuous sign language recognition.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method of the present invention.
Fig. 2 is a schematic diagram of the basic principle of the LSTM-type RNN in an embodiment of the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawings and an embodiment.
As shown in Fig. 1, this embodiment provides a Chinese sign language recognition method based on an LSTM-type RNN, comprising the following steps:
Step S1: collect multiple groups of sign language features;
Step S2: label the collected sign language features according to the linguistic meaning they correspond to, forming training data, wherein the training data is used for training the neural network;
Step S3: use the training data as the input of the LSTM-type RNN to train the model, and obtain the optimal network model parameters as the final recognition model;
Step S4: collect features from the sign language to be recognized and use them as the input of the LSTM-type RNN model, compute the character sequence with the maximum probability at the output layer, and take it as the decoding result, the result being the recognized sign language feature.
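The decoding in step S4, taking the maximum-probability character at the output layer, can be sketched as a per-time-step argmax. This is a minimal sketch under assumptions: the text does not specify the output layer in detail, so a softmax over a hypothetical `vocabulary` is assumed, and the names `softmax` and `greedy_decode` are illustrative.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def greedy_decode(logits, vocabulary):
    """Pick the most probable symbol at each output step.

    logits: array of shape (L, V), one row of output-layer scores per step.
    vocabulary: list of V symbols (hypothetical label set).
    Returns the decoded symbols and the probability of each chosen symbol.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    best = probs.argmax(axis=-1)
    return [vocabulary[k] for k in best], probs.max(axis=-1)
```

Choosing the best symbol independently at each step is the simplest reading of "maximum probability"; a search over whole sequences would be a different design that the text does not describe.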
In this embodiment, step S1 is specifically: sign language features are acquired using a data glove, the data glove comprising bend (flex) sensors, nine-axis sensors, and a microprocessor for data processing, storage and transmission.
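One way to picture the glove data from step S1 is as a per-frame feature vector. The sketch below is hypothetical: the text does not state the number of flex sensors, the sampling rate or the data layout, so the `GloveFrame` type, its field names and the concatenation order are assumptions. The only fixed point is that a nine-axis sensor provides three accelerometer, three gyroscope and three magnetometer values.

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class GloveFrame:
    flex: Sequence[float]   # one bend reading per flex sensor (count assumed)
    accel: Sequence[float]  # 3-axis accelerometer
    gyro: Sequence[float]   # 3-axis gyroscope
    mag: Sequence[float]    # 3-axis magnetometer

def frame_to_vector(frame: GloveFrame) -> List[float]:
    """Concatenate one frame's readings into a flat feature vector."""
    for name in ("accel", "gyro", "mag"):
        if len(getattr(frame, name)) != 3:
            raise ValueError(f"{name} must have exactly 3 values")
    return [*frame.flex, *frame.accel, *frame.gyro, *frame.mag]

def frames_to_sequence(frames: Sequence[GloveFrame]) -> List[List[float]]:
    """Turn the frames of one sign into a feature sequence, one row per time step."""
    return [frame_to_vector(f) for f in frames]
```

With five flex sensors, for instance, each frame yields a 14-dimensional vector, and a sign recorded over T frames becomes a T-by-14 sequence suitable as network input.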
In this embodiment, step S2 is specifically: the collected sign language features are classified by the linguistic meaning each feature is intended to express; for each linguistic meaning, a certain number of feature groups are randomly selected, and each of the selected feature groups is labeled with its linguistic meaning; the data are organized in matrix form, forming the training data.
In this embodiment, step S3 is specifically: a corresponding LSTM-type RNN model is constructed according to the sign language features and modeled explicitly; the sign language features and labels of the training data from step S2 are used as input to train the constructed LSTM-type RNN, so as to obtain the weight parameters corresponding to the different sign language features.
In this embodiment, the LSTM-type RNN comprises an input layer, an output layer and a hidden layer; the input of the input layer is the sign language feature value sequence O_1 O_2 ... O_T, the output of the output layer is the acoustic sequence S_1 S_2 ... S_L corresponding to the input, and the hidden layer comprises multiple LSTM units; wherein T is the number of time steps and L is the length of the acoustic sequence.
In this embodiment, the LSTM unit comprises three control gates, which are used to control the association among the input, the output, and the unit's own internal state across time steps.
In this embodiment, step S4 is specifically: the final LSTM-type RNN recognition model generated in step S3 is used to recognize the sign language to be recognized; the features of the sign language to be recognized are first abstracted to extract feature vectors, the sign language to be recognized is predicted according to the LSTM-type RNN model, and acoustic prediction is then performed to generate an acoustic parameter sequence, from which a speech synthesis result is generated.
As shown in Fig. 2, the basic idea of the LSTM-type RNN is to control the flow of information through different types of structures: the Input Gate, the Output Gate and the Forget Gate. In this embodiment, the LSTM-type RNN in step S4 controls the flow of information using the following formulas:
$$i_t = \sigma(W_{ix} x_t + W_{im} m_{t-1} + W_{ic} c_{t-1} + b_i);$$
$$f_t = \sigma(W_{fx} x_t + W_{fm} m_{t-1} + W_{fc} c_{t-1} + b_f);$$
$$c_t = f_t \odot c_{t-1} + i_t \odot g(W_{cx} x_t + W_{cm} m_{t-1} + b_c);$$
$$o_t = \sigma(W_{ox} x_t + W_{om} m_{t-1} + W_{oc} c_{t-1} + b_o);$$
$$m_t = o_t \odot h(c_t);$$
Wherein, x = (x_1, x_2, ..., x_T) is the given input sequence, T is the length of the input sequence, x_t is the input at time t, W denotes the weight matrices, b denotes the biases, and i, f, o, c and m respectively denote the Input Gate, the Forget Gate, the Output Gate, the state cell and the output of the LSTM structure;
Wherein, σ is the activation function of the three control gates, given by:
$$\sigma(x) = \frac{1}{1 + e^{-x}};$$
Wherein, h is the activation function of the state, given by:
$$h(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.$$
As can be seen from the structure and the formulas, the LSTM-type RNN caches historical state information and maintains that history through its gates, thereby extending the range over which context information influences the current information and improving the accuracy of continuous sign language recognition.
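A simplified calculation shows why the cell state can carry information over long distances. Writing f_t for the Forget Gate activation, i_t for the Input Gate activation and c_t for the cell state, and assuming for illustration that the gate activations are treated as constants with respect to the earlier cell state (that is, ignoring the gates' own dependence on c_{t-1} and m_{t-1}), the cell update gives:

```latex
c_t = f_t \odot c_{t-1} + i_t \odot g(\cdot)
\quad\Longrightarrow\quad
\frac{\partial c_t}{\partial c_{t-1}} \approx \operatorname{diag}(f_t),
\qquad
\frac{\partial c_t}{\partial c_{t-k}} \approx \prod_{j=t-k+1}^{t} \operatorname{diag}(f_j).
```

Under this approximation, as long as the Forget Gate stays close to 1 along the way, the contribution of a state k steps back is largely preserved instead of shrinking at every step. This is consistent with the long-range context effect described above.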
The foregoing is only a preferred embodiment of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.

Claims (8)

1. A Chinese sign language recognition method based on an LSTM-type RNN, characterized by comprising the following steps:
Step S1: collect multiple groups of sign language features;
Step S2: label the collected sign language features according to the linguistic meaning they correspond to, forming training data, wherein the training data is used for training the neural network;
Step S3: use the training data as the input of the LSTM-type RNN to train the model, and obtain the optimal network model parameters as the final recognition model;
Step S4: collect features from the sign language to be recognized and use them as the input of the LSTM-type RNN model, compute the character sequence with the maximum probability at the output layer, and take it as the decoding result, the result being the recognized sign language feature.
2. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 1, characterized in that step S1 is specifically: sign language features are acquired using a data glove, the data glove comprising bend (flex) sensors, nine-axis sensors, and a microprocessor for data processing, storage and transmission.
3. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 1, characterized in that step S2 is specifically: the collected sign language features are classified by the linguistic meaning each feature is intended to express; for each linguistic meaning, a certain number of feature groups are randomly selected, and each of the selected feature groups is labeled with its linguistic meaning; the data are organized in matrix form, forming the training data.
4. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 1, characterized in that step S3 is specifically: a corresponding LSTM-type RNN model is constructed according to the sign language features and modeled explicitly; the sign language features and labels of the training data from step S2 are used as input to train the constructed LSTM-type RNN, so as to obtain the weight parameters corresponding to the different sign language features.
5. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 4, characterized in that the LSTM-type RNN comprises an input layer, an output layer and a hidden layer; the input of the input layer is the sign language feature value sequence O_1 O_2 ... O_T, the output of the output layer is the acoustic sequence S_1 S_2 ... S_L corresponding to the input, and the hidden layer comprises multiple LSTM units; wherein T is the number of time steps and L is the length of the acoustic sequence.
6. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 5, characterized in that the LSTM unit comprises three control gates, which are used to control the association among the input, the output, and the unit's own internal state across time steps.
7. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 1, characterized in that step S4 is specifically: the final LSTM-type RNN recognition model generated in step S3 is used to recognize the sign language to be recognized; the features of the sign language to be recognized are first abstracted to extract feature vectors, the sign language to be recognized is predicted according to the LSTM-type RNN model, and acoustic prediction is then performed to generate an acoustic parameter sequence, from which a speech synthesis result is generated.
8. The Chinese sign language recognition method based on an LSTM-type RNN according to claim 1, characterized in that the LSTM-type RNN in step S4 controls the flow of information using the following formulas:
$$i_t = \sigma(W_{ix} x_t + W_{im} m_{t-1} + W_{ic} c_{t-1} + b_i);$$
$$f_t = \sigma(W_{fx} x_t + W_{fm} m_{t-1} + W_{fc} c_{t-1} + b_f);$$
$$c_t = f_t \odot c_{t-1} + i_t \odot g(W_{cx} x_t + W_{cm} m_{t-1} + b_c);$$
$$o_t = \sigma(W_{ox} x_t + W_{om} m_{t-1} + W_{oc} c_{t-1} + b_o);$$
$$m_t = o_t \odot h(c_t);$$
Wherein, x = (x_1, x_2, ..., x_T) is the given input sequence, T is the length of the input sequence, x_t is the input at time t, W denotes the weight matrices, b denotes the biases, and i, f, o, c and m respectively denote the Input Gate, the Forget Gate, the Output Gate, the state cell and the output of the LSTM structure;
Wherein, σ is the activation function of the three control gates, given by:
$$\sigma(x) = \frac{1}{1 + e^{-x}};$$
Wherein, h is the activation function of the state, given by:
$$h(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.$$
CN201610260747.XA 2016-04-25 2016-04-25 Chinese sign language identification method based on LSTM type RNN Pending CN105956529A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610260747.XA CN105956529A (en) 2016-04-25 2016-04-25 Chinese sign language identification method based on LSTM type RNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610260747.XA CN105956529A (en) 2016-04-25 2016-04-25 Chinese sign language identification method based on LSTM type RNN

Publications (1)

Publication Number Publication Date
CN105956529A true CN105956529A (en) 2016-09-21

Family

ID=56916848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610260747.XA Pending CN105956529A (en) 2016-04-25 2016-04-25 Chinese sign language identification method based on LSTM type RNN

Country Status (1)

Country Link
CN (1) CN105956529A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101539994A (en) * 2009-04-16 2009-09-23 西安交通大学 Mutually translating system and method of sign language and speech
CN102193633A (en) * 2011-05-25 2011-09-21 广州畅途软件有限公司 dynamic sign language recognition method for data glove
CN105205449A (en) * 2015-08-24 2015-12-30 西安电子科技大学 Sign language recognition method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
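NORIKI NISHIDA et al.: "Multimodal Gesture Recognition Using Multi-stream Recurrent Neural Network", 《IMAGE AND VIDEO TECHNOLOGY》 *
梁军 et al.: "Sentiment analysis based on polarity transfer and LSTM recursive network", 《中文信息学报》 (Journal of Chinese Information Processing) *
王新宇 et al.: "Data glove gesture recognition based on an improved neural network", 《PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE》 *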

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106779467A (en) * 2016-12-31 2017-05-31 成都数联铭品科技有限公司 Enterprises ' industry categorizing system based on automatic information screening
CN106778700A (en) * 2017-01-22 2017-05-31 福州大学 One kind is based on change constituent encoder Chinese Sign Language recognition methods
CN107316067A (en) * 2017-05-27 2017-11-03 华南理工大学 A kind of aerial hand-written character recognition method based on inertial sensor
CN107316067B (en) * 2017-05-27 2019-11-15 华南理工大学 A kind of aerial hand-written character recognition method based on inertial sensor
CN107463878A (en) * 2017-07-05 2017-12-12 成都数联铭品科技有限公司 Human bodys' response system based on deep learning
CN107992746A (en) * 2017-12-14 2018-05-04 华中师范大学 Malicious act method for digging and device
CN108766434A (en) * 2018-05-11 2018-11-06 东北大学 A kind of Sign Language Recognition translation system and method
CN108766434B (en) * 2018-05-11 2022-01-04 东北大学 Sign language recognition and translation system and method
CN109902554A (en) * 2019-01-09 2019-06-18 天津大学 A kind of recognition methods of the sign language based on commercial Wi-Fi
CN109902554B (en) * 2019-01-09 2023-03-10 天津大学 Sign language identification method based on commercial Wi-Fi
WO2020252923A1 (en) * 2019-06-18 2020-12-24 平安科技(深圳)有限公司 Sample data processing method and apparatus, computer apparatus, and storage medium
CN111104960A (en) * 2019-10-30 2020-05-05 武汉大学 Sign language identification method based on millimeter wave radar and machine vision
CN111104960B (en) * 2019-10-30 2022-06-14 武汉大学 Sign language identification method based on millimeter wave radar and machine vision
CN111354246A (en) * 2020-01-16 2020-06-30 浙江工业大学 System and method for helping deaf-mute to communicate
CN111913575B (en) * 2020-07-24 2021-06-11 合肥工业大学 Method for recognizing hand-language words
CN111913575A (en) * 2020-07-24 2020-11-10 合肥工业大学 Method for recognizing hand-language words

Similar Documents

Publication Publication Date Title
CN105956529A (en) Chinese sign language identification method based on LSTM type RNN
CN105513591B (en) The method and apparatus for carrying out speech recognition with LSTM Recognition with Recurrent Neural Network model
CN103049792B (en) Deep-neural-network distinguish pre-training
CN107492382B (en) Voiceprint information extraction method and device based on neural network
CN111160467B (en) Image description method based on conditional random field and internal semantic attention
CN109241255A (en) A kind of intension recognizing method based on deep learning
Verstraeten et al. Reservoir-based techniques for speech recognition
CN107526834A (en) Joint part of speech and the word2vec improved methods of the correlation factor of word order training
CN108229582A (en) Entity recognition dual training method is named in a kind of multitask towards medical domain
CN107133220A (en) Name entity recognition method in a kind of Geography field
CN106652999A (en) System and method for voice recognition
CN110444191A (en) A kind of method, the method and device of model training of prosody hierarchy mark
CN108346436A (en) Speech emotional detection method, device, computer equipment and storage medium
CN107679491A (en) A kind of 3D convolutional neural networks sign Language Recognition Methods for merging multi-modal data
CN112466326B (en) Voice emotion feature extraction method based on transducer model encoder
CN107609572A (en) Multi-modal emotion identification method, system based on neutral net and transfer learning
CN107273355A (en) A kind of Chinese word vector generation method based on words joint training
CN108566627A (en) A kind of method and system identifying fraud text message using deep learning
CN107316654A (en) Emotion identification method based on DIS NV features
CN108197294A (en) A kind of text automatic generation method based on deep learning
CN110517664A (en) Multi-party speech recognition methods, device, equipment and readable storage medium storing program for executing
CN107943784A (en) Relation extraction method based on generation confrontation network
CN108549658A (en) A kind of deep learning video answering method and system based on the upper attention mechanism of syntactic analysis tree
CN106529503A (en) Method for recognizing face emotion by using integrated convolutional neural network
CN106897559A (en) A kind of symptom and sign class entity recognition method and device towards multi-data source

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160921
