CN107798964A - Sign language intelligent interaction device with real-time gesture recognition and interaction method thereof - Google Patents
- Publication number
- CN107798964A (application CN201711184055.2A)
- Authority
- CN
- China
- Prior art keywords
- sign language
- information
- processor
- virtual character
- web
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B21/00—Teaching, or communicating with, the blind, deaf or mute
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Human Computer Interaction (AREA)
- Social Psychology (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Business, Economics & Management (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- Processing Or Creating Images (AREA)
Abstract
The invention discloses a sign language intelligent interaction device with real-time gesture recognition, comprising a network TV, a processor, a display and a camera. The processor comprises an acquisition unit, a processing unit, an editing unit and a decoding unit. The acquisition unit obtains the audio information of the network TV and the image information and depth information of the viewer's sign language gestures. The processing unit converts the audio information into text information. The editing unit distinguishes the different sound sources in the audio information and, according to the sound source, edits the corresponding virtual character sign language animation using virtual character sign language editing technology. The decoding unit decodes the image information and depth information and converts them into text information. The invention can help deaf-mute people understand television programs and can also help them interact with television programs.
Description
Technical field
The present invention relates to the field of artificial intelligence, and specifically to a sign language intelligent interaction device with real-time gesture recognition and an interaction method thereof.
Background art
Sign language is a visual language that expresses meaning mainly through the movements and postures of the hands and arms, supplemented by appropriate facial expressions and mouth shapes. It is an important channel through which deaf-mute people communicate with the outside world.
China currently has a very large deaf-mute population, yet the range of television programs accessible to them is comparatively narrow. A sign language interpreter appears only in some news programs to help viewers with disabilities understand the news content. For general entertainment programs, the market lacks equipment that helps deaf-mute people understand program content.
In recent years, although sign language editing and synthesis technology and virtual human synthesis technology have become relatively mature, applications of multi-virtual-human synthesis to sign language expression remain very rare. Sign language recognition based on the Kinect camera is also maturing. The present invention uses a Kinect camera and related techniques to obtain the image and depth information of sign language gestures, processes it in the processor to recognize the sign language, and converts it into text that serves as the device's feedback information, thereby realizing human-computer interaction.
Summary of the invention
The technical problem to be solved by the invention is to overcome the defects of the above prior art by providing a sign language intelligent interaction device with real-time gesture recognition and an interaction method thereof, which can assist deaf-mute people in understanding the content of general entertainment programs.
To this end, the present invention adopts the following technical solution: a sign language intelligent interaction device with real-time gesture recognition, comprising a network TV, a processor, a display and a camera. The processor is connected to the network TV and to the display, and comprises an acquisition unit, a processing unit, an editing unit and a decoding unit;
The acquisition unit is used to obtain the audio information of the network TV and the image information and depth information of the viewer's sign language gestures;
The processing unit is used to convert the audio information into text information;
The editing unit is used to distinguish the different sound sources in the audio information and, according to the sound source, to edit the corresponding virtual character sign language animation using virtual character sign language editing technology;
The decoding unit is used to decode the image information and depth information and convert them into text information;
The display is connected to the processor and is used to show the virtual character sign language animation;
The network TV is used to send information to the processor, and to receive and display the information sent by the processor;
The camera is connected to the processor and is used to capture the image information and depth information of the viewer's sign language gestures and send them to the processor.
Further, the sound sources are distinguished by timbre or pitch.
Further, the decoding unit uses a convolutional neural network algorithm to process and recognize the image information and depth information of the viewer's sign language gestures.
Further, the camera is a Kinect camera.
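The division of the processor into units described above can be pictured as two data paths: TV audio flowing out as sign language animation, and viewer gestures flowing back as text. The following is only a minimal sketch of that wiring. The class name `Processor`, its field names, and the list that stands in for the network upload are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Processor:
    """Hypothetical wiring of the processor units; all names are illustrative."""
    speech_to_text: Callable[[bytes], str]               # processing unit
    identify_speaker: Callable[[bytes], str]             # editing unit, source discrimination
    render_sign_animation: Callable[[str, str], str]     # editing unit, animation editing
    decode_gesture: Callable[[Any, Any], str]            # decoding unit
    uploaded: List[str] = field(default_factory=list)    # stand-in for the network upload

    def on_tv_audio(self, audio: bytes) -> str:
        """Path 1: network TV audio -> text -> per-speaker sign language animation."""
        text = self.speech_to_text(audio)
        speaker = self.identify_speaker(audio)
        return self.render_sign_animation(text, speaker)

    def on_viewer_gesture(self, image: Any, depth: Any) -> str:
        """Path 2: camera image and depth -> text feedback, queued for upload."""
        text = self.decode_gesture(image, depth)
        self.uploaded.append(text)
        return text
```

Passing the units in as callables keeps the sketch independent of any particular speech recognizer, animation engine, or gesture model.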
The invention also provides a sign language intelligent interaction method with real-time gesture recognition, the method comprising:
Obtaining the audio signal of the network TV and preprocessing it;
Intercepting and dividing the audio signal according to its signal characteristics to obtain the effective sound sources;
Distinguishing the different sound sources in the audio signal according to voice characteristics, judging the gender and body image of the person to whom each sound source belongs, and then selecting the corresponding virtual character from a virtual character library;
Generating a virtual character sign language video using virtual character editing technology and animation synthesis technology;
Displaying the virtual character sign language video on the display.
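The patent does not say which signal characteristics drive the interception and division step. One common choice is short-time energy, sketched below under that assumption. The function names, the frame length, and the threshold are illustrative, and a real system would tune them to the audio source.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def active_segments(samples: Sequence[float], frame_len: int,
                    threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample spans whose frame energy exceeds the threshold."""
    segments = []
    seg_start = None
    energies = frame_energies(samples, frame_len)
    for i, energy in enumerate(energies):
        if energy > threshold and seg_start is None:
            seg_start = i * frame_len                    # a segment opens here
        elif energy <= threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))  # the segment closes here
            seg_start = None
    if seg_start is not None:                            # audio ended mid-segment
        segments.append((seg_start, len(energies) * frame_len))
    return segments
```

Each returned span can then be passed on to the sound source discrimination step, while silent stretches are discarded.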
Further, the sign language intelligent interaction method also comprises:
Training on a large amount of sign language gesture image information and depth information to obtain a convolutional neural network model;
Obtaining the real-time sign language gesture image information and depth information of a person;
Decoding the real-time sign language gesture image information and depth information according to the convolutional neural network model and converting them into text information;
Uploading the text information to the network through the network TV connection.
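The last stage of the decoding step, turning the model's per-frame class scores into text, might look like the sketch below. It assumes the network emits one score vector per frame over a fixed sign vocabulary. The vocabulary, the confidence threshold, and the rule of collapsing consecutive repeats are illustrative choices, not details given in the patent.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one score vector."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_frames(frame_scores: Sequence[Sequence[float]],
                  vocabulary: Sequence[str],
                  min_confidence: float = 0.6) -> str:
    """Map per-frame class scores to words, skipping uncertain frames."""
    words: List[str] = []
    for scores in frame_scores:
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] < min_confidence:
            continue                       # too uncertain to emit a word
        word = vocabulary[best]
        if not words or words[-1] != word:
            words.append(word)             # collapse consecutive repeats
    return " ".join(words)
```

Collapsing repeats reflects the fact that a single sign usually spans several camera frames, so the same class is predicted many times in a row.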
Compared with the prior art, the beneficial effects of the invention are:
(1) Compared with general Kinect-based sign language recognition, the Kinect-based sign language intelligent interaction platform device first preprocesses the acquired sign language gesture image information and depth information, then feeds the preprocessed result into a convolutional neural network model trained on a large number of samples. The model classifies the preprocessed result and outputs the classification result, which is finally converted into text information. The convolutional neural network method effectively improves the accuracy of gesture recognition;
(2) Compared with general virtual character editing and synthesis methods, the Kinect-based sign language intelligent interaction platform device proposes identifying sound sources from the audio signal. After the network TV audio signal is obtained, the different sound sources in it are distinguished, the gender and general body image of each speaker are judged from timbre, pitch and the like, and the corresponding virtual character is then selected from the virtual character library, which helps deaf-mute people understand more complex multi-person dialogue scenes;
(3) In the Kinect-based sign language intelligent interaction platform device, the Kinect camera is connected to the other sign language equipment, so that the sign language gesture image information and depth information of a person can be obtained and transmitted to the device's processor for processing and recognition. The resulting text information serves as feedback information, which is uploaded to the network through the network TV connection;
(4) The device not only makes it convenient for deaf-mute people to enjoy more general entertainment programs, but also strengthens their public participation and greatly enriches their entertainment life.
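The sound source discrimination described in effect (2) relies on timbre and pitch, but the patent gives no algorithm. As one possible illustration of the pitch part only, the sketch below estimates the fundamental frequency by autocorrelation and picks an entry from a virtual character library. The 165 Hz split, the library keys, and the function names are hypothetical, and pitch alone is only a rough proxy for a speaker's characteristics.

```python
import math
from typing import Dict, Sequence


def estimate_pitch(samples: Sequence[float], sample_rate: int,
                   min_f0: float = 60.0, max_f0: float = 400.0) -> float:
    """Estimate the fundamental frequency (Hz) as the lag with maximum autocorrelation."""
    min_lag = int(sample_rate / max_f0)
    max_lag = min(int(sample_rate / min_f0), len(samples) - 1)
    best_lag, best_corr = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        corr = sum(samples[n] * samples[n + lag]
                   for n in range(len(samples) - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag


def select_character(pitch_hz: float, library: Dict[str, str],
                     split_hz: float = 165.0) -> str:
    """Pick a virtual character by pitch; the split frequency is an assumed value."""
    key = "higher_pitch" if pitch_hz >= split_hz else "lower_pitch"
    return library[key]
```

A usage example: for a voiced segment `seg` sampled at 8000 Hz, `select_character(estimate_pitch(seg, 8000), library)` returns the library entry to animate for that speaker.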
Brief description of the drawings
Fig. 1 is a schematic diagram of the main structure of the sign language intelligent interaction device with real-time gesture recognition.
Fig. 2 is a flow chart of the virtual character construction method based on speech recognition.
Description of reference numerals: 1 - network TV, 2 - processor, 3 - display, 4 - viewer, 5 - camera.
Embodiments
The present invention is further elaborated below through specific embodiments and with reference to the accompanying drawings.
As shown in Fig. 1, the invention provides a sign language intelligent interaction device with real-time gesture recognition, comprising a network TV, a processor, a display and a camera. The processor is connected to the network TV and to the display, and comprises an acquisition unit, a processing unit, an editing unit and a decoding unit. The acquisition unit is used to obtain the audio information of the network TV and the image information and depth information of the viewer's sign language gestures. The processing unit is used to convert the audio information into text information. The editing unit is used to distinguish the different sound sources in the audio information and, according to the sound source, to edit the corresponding virtual character sign language animation using virtual character sign language editing technology. The decoding unit is used to decode the image information and depth information and convert them into text information. The display is connected to the processor and is used to show the virtual character sign language animation. The network TV is used to send information to the processor, and to receive and display the information sent by the processor. The camera is connected to the processor and is used to capture the image information and depth information of the viewer's sign language gestures and send them to the processor.
Preferably, the sound sources are distinguished by timbre or pitch.
Preferably, the decoding unit uses a convolutional neural network algorithm to process and recognize the image information and depth information of the viewer's sign language gestures.
Preferably, the camera is a Kinect camera. The Kinect camera can obtain the image information and depth information of a person's sign language gestures and transmit them to the processor.
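One reason depth information is useful alongside the color image is that a signing hand is usually the object nearest the camera, so it can be isolated with a simple depth band. The patent does not describe such a step. The sketch below is an assumed preprocessing stage, with the frame given as rows of millimetre readings in which 0 is taken to mean a missing reading, and the band width chosen arbitrarily.

```python
from typing import List, Optional, Sequence, Tuple


def hand_mask(depth_mm: Sequence[Sequence[int]], band_mm: int = 150) -> List[List[int]]:
    """Mark pixels lying within band_mm of the nearest valid depth reading."""
    valid = [d for row in depth_mm for d in row if d > 0]   # 0 = no reading (assumed)
    if not valid:
        return [[0] * len(row) for row in depth_mm]
    nearest = min(valid)
    return [[1 if 0 < d <= nearest + band_mm else 0 for d in row]
            for row in depth_mm]


def bounding_box(mask: Sequence[Sequence[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the masked region, or None if it is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))
```

The bounding box can be used to crop both the color image and the depth frame to the hand region before they are passed to the recognition model.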
As shown in Fig. 2, the invention also provides a sign language intelligent interaction method with real-time gesture recognition, the method comprising:
Obtaining the audio signal of the network TV and preprocessing it;
Intercepting and dividing the audio signal according to its signal characteristics to obtain the effective sound sources;
Distinguishing the different sound sources in the audio signal according to voice characteristics, judging the gender and body image of the person to whom each sound source belongs, and then selecting the corresponding virtual character from a virtual character library;
Generating a virtual character sign language video using virtual character editing technology and animation synthesis technology;
Displaying the virtual character sign language video on the display.
Preferably, the sign language intelligent interaction method also comprises:
Training on a large amount of sign language gesture image information and depth information to obtain a convolutional neural network model;
Obtaining the real-time sign language gesture image information and depth information of a person;
Decoding the real-time sign language gesture image information and depth information according to the convolutional neural network model and converting them into text information;
Uploading the text information to the network through the network TV connection.
The present invention first preprocesses the acquired sign language gesture image information and depth information, then feeds the preprocessed result into a convolutional neural network model trained on a large number of samples. The model classifies the preprocessed result and outputs the classification result, which is finally converted into text information. The convolutional neural network method effectively improves the accuracy of gesture recognition. The overall system can not only convert the audio signal of general entertainment programs (including multi-person dialogue) into sign language video for display, but can also use the Kinect camera to capture a person's sign language gestures, which the processor processes into feedback information that is uploaded to the network platform.
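For readers unfamiliar with what a convolutional neural network computes, the sketch below runs one convolution filter, a ReLU, and a 2x2 max-pool over a small single-channel patch in plain Python. It illustrates the basic operations only. The patent does not disclose its network architecture, and the kernel used in the usage note is a hand-written edge detector rather than learned weights.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_valid(image: Sequence[Sequence[float]],
                 kernel: Sequence[Sequence[float]]) -> Matrix:
    """Slide the kernel over the image without padding (cross-correlation form)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map: Sequence[Sequence[float]]) -> Matrix:
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map: Sequence[Sequence[float]]) -> Matrix:
    """Keep the largest value in each non-overlapping 2x2 block."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]) - 1, 2)]
            for r in range(0, len(feature_map) - 1, 2)]
```

Applied to a patch whose left columns are 0 and right columns are 1, the kernel `[[-1, 1], [-1, 1]]` responds only where the values step up, which is the kind of local feature a trained network would learn to detect on hand contours.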
The present invention can help deaf-mute people understand most television content and can upload their feedback to the network platform in a timely manner, which strengthens their public participation and greatly enriches their entertainment life. It can also help deaf-mute people understand more complex multi-person dialogue scenes.
The protection scope of the present invention is not limited to the above description. Any product in any other form derived under the teaching of the present invention, regardless of any change in its shape or structure, falls within the protection scope of the present invention provided its technical solution is the same as or similar to that of the present invention.
Claims (6)
1. A sign language intelligent interaction device with real-time gesture recognition, characterized in that it comprises a network TV, a processor, a display and a camera, the processor being connected to the network TV and to the display and comprising an acquisition unit, a processing unit, an editing unit and a decoding unit;
The acquisition unit is used to obtain the audio information of the network TV and the image information and depth information of the viewer's sign language gestures;
The processing unit is used to convert the audio information into text information;
The editing unit is used to distinguish the different sound sources in the audio information and, according to the sound source, to edit the corresponding virtual character sign language animation using virtual character sign language editing technology;
The decoding unit is used to decode the image information and depth information and convert them into text information;
The display is connected to the processor and is used to show the virtual character sign language animation;
The network TV is used to send information to the processor, and to receive and display the information sent by the processor;
The camera is connected to the processor and is used to capture the image information and depth information of the viewer's sign language gestures and send them to the processor.
2. The sign language intelligent interaction device with real-time gesture recognition according to claim 1, characterized in that the sound sources are distinguished by timbre or pitch.
3. The sign language intelligent interaction device with real-time gesture recognition according to claim 1, characterized in that the decoding unit uses a convolutional neural network algorithm to process and recognize the image information and depth information of the viewer's sign language gestures.
4. The sign language intelligent interaction device with real-time gesture recognition according to claim 1, characterized in that the camera is a Kinect camera.
5. A sign language intelligent interaction method with real-time gesture recognition, characterized in that the method comprises:
Obtaining the audio signal of the network TV and preprocessing it;
Intercepting and dividing the audio signal according to its signal characteristics to obtain the effective sound sources;
Distinguishing the different sound sources in the audio signal according to voice characteristics, judging the gender and body image of the person to whom each sound source belongs, and then selecting the corresponding virtual character from a virtual character library;
Generating a virtual character sign language video using virtual character editing technology and animation synthesis technology;
Displaying the virtual character sign language video on the display.
6. The sign language intelligent interaction method according to claim 5, characterized in that the method also comprises:
Training on a large amount of sign language gesture image information and depth information to obtain a convolutional neural network model;
Obtaining the real-time sign language gesture image information and depth information of a person;
Decoding the real-time sign language gesture image information and depth information according to the convolutional neural network model and converting them into text information;
Uploading the text information to the network through the network TV connection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711184055.2A CN107798964A (en) | 2017-11-24 | 2017-11-24 | Sign language intelligent interaction device with real-time gesture recognition and interaction method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711184055.2A CN107798964A (en) | 2017-11-24 | 2017-11-24 | Sign language intelligent interaction device with real-time gesture recognition and interaction method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107798964A true CN107798964A (en) | 2018-03-13 |
Family
ID=61534724
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711184055.2A (Pending; published as CN107798964A) | Sign language intelligent interaction device with real-time gesture recognition and interaction method thereof | 2017-11-24 | 2017-11-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107798964A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1532775A (en) * | 2003-03-19 | 2004-09-29 | 松下电器产业株式会社 | Videophone terminal |
CN101794528A (en) * | 2010-04-02 | 2010-08-04 | 北京大学软件与微电子学院无锡产学研合作教育基地 | Sign language-speech bidirectional translation system |
CN103956167A (en) * | 2014-05-06 | 2014-07-30 | 北京邮电大学 | Web-based visual sign language interpretation method and device |
CN105205475A (en) * | 2015-10-20 | 2015-12-30 | 北京工业大学 | Dynamic gesture recognition method |
CN105868282A (en) * | 2016-03-23 | 2016-08-17 | 乐视致新电子科技(天津)有限公司 | Method and apparatus for information communication by deaf-mute people, and intelligent terminal |
CN107291348A (en) * | 2017-05-31 | 2017-10-24 | 珠海市魅族科技有限公司 | Photographing method and device, computer equipment and computer-readable storage medium |
Non-Patent Citations (1)
Title |
---|
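叶平 (Ye Ping): "Research on Real-Time Sign Language Recognition Technology Based on Kinect" (基于Kinect的实时手语识别技术研究), China Master's Theses Full-text Database, Information Science and Technology series * |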
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108776985A (en) * | 2018-06-05 | 2018-11-09 | 科大讯飞股份有限公司 | Speech processing method, device, equipment and readable storage medium |
CN109446876A (en) * | 2018-08-31 | 2019-03-08 | 百度在线网络技术(北京)有限公司 | Sign language information processing method, device, electronic equipment and readable storage medium |
CN109446876B (en) * | 2018-08-31 | 2020-11-06 | 百度在线网络技术(北京)有限公司 | Sign language information processing method and device, electronic equipment and readable storage medium |
US11580983B2 (en) | 2018-08-31 | 2023-02-14 | Baidu Online Network Technology (Beijing) Co., Ltd. | Sign language information processing method and apparatus, electronic device and readable storage medium |
CN110020442A (en) * | 2019-04-12 | 2019-07-16 | 上海电机学院 | Portable translation machine |
CN110730360A (en) * | 2019-10-25 | 2020-01-24 | 北京达佳互联信息技术有限公司 | Video uploading and playing methods and devices, client equipment and storage medium |
CN112328076A (en) * | 2020-11-06 | 2021-02-05 | 北京中科深智科技有限公司 | Method and system for driving character gestures through voice |
CN112328076B (en) * | 2020-11-06 | 2021-10-29 | 北京中科深智科技有限公司 | Method and system for driving character gestures through voice |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107798964A (en) | | Sign language intelligent interaction device with real-time gesture recognition and interaction method thereof |
CN105681920B (en) | | Network teaching method and system with speech recognition function |
CN102271241A (en) | | Image communication method and system based on facial expression/action recognition |
CN106846940A (en) | | Implementation method of online live-streaming classroom education |
CN109118854A (en) | | Panoramic immersive live-broadcast interactive teaching system |
CN103650002A (en) | | Text-based video generation |
CN109841217A (en) | | AR interactive system and method based on speech recognition |
CN103369289A (en) | | Communication method and device for video simulation images |
CN115209180A (en) | | Video generation method and device |
CN107808191A (en) | | Output method and system for multi-modal interaction of a virtual human |
CN207181987U (en) | | Virtual artificial intelligence companion |
CN110211582A (en) | | Real-time interactive intelligent digital virtual actor facial expression driving method and system |
CN108256458A (en) | | Two-way real-time translation system and method for the natural sign language of deaf people |
CN113132741A (en) | | Virtual live broadcast system and method |
CN106653020A (en) | | Multi-business control method and system for smart sound and video equipment based on deep learning |
CN104505089B (en) | | Spoken language error correction method and equipment |
CN202929567U (en) | | Virtual character animation performance system |
CN116229311B (en) | | Video processing method, device and storage medium |
CN116449958A (en) | | Virtual office system based on the metaverse |
CN113254713B (en) | | Multi-source emotion calculation system and method for generating emotion curve based on video content |
CN114155321A (en) | | Face animation generation method based on self-supervision and mixture density network |
CN208335209U (en) | | Inclusive education classroom assistance system and device for hearing-impaired students |
CN112055167A (en) | | Remote collaboration three-dimensional modeling system and method based on 5G cloud video conference |
CN109101942B (en) | | Expression simulation method and system for intelligent reality interactive communication transfer robot |
CN110491250A (en) | | Teaching system for deaf-mute people |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180313 |