CN106254960A - Video call method and system for persons with communication disorders

Info

Publication number: CN106254960A
Application number: CN201610769730.7A
Authority: CN
Inventors: 洪涛; 孙铭俊
Original assignee: Fuzhou Rockchip Electronics Co Ltd
Current assignee: Fuzhou Rockchip Electronics Co Ltd
Priority date: 2016-08-30
Filing date: 2016-08-30
Publication date: 2016-12-21

Abstract

The present invention provides a video call system for communication disorders. The system comprises a video call originating end, a sign language recognition server, a video call auxiliary function server and a video call destination end. The sign language recognition server and the video call auxiliary function server are connected to the video call originating end and the video call destination end through a communication network. When a participant with a communication disorder communicates in sign language at the video call originating end, the sign language recognition server converts the sign language into text subtitle information. The video call originating end finally packages the video, audio and text subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server. The invention thereby enables a participant with a communication disorder to take part in a video call.

Description

Video call method and system for persons with communication disorders

Technical field

The present invention relates to the technical field of set-top boxes, and in particular to a video call method and system for communication disorders.

Background art

A video call usually refers to a mode of communication, based on the Internet and the mobile Internet (3G Internet), in which voice and images (the user's upper body, photos, objects, etc.) are transmitted in real time between intelligent terminals. A video call mainly transmits images and sound. Special groups of users may face particular difficulties when taking part in a video call. Users who are deaf and mute communicate in sign language, and they cannot communicate effectively with ordinary video call participants.

Summary of the invention

The first technical problem to be solved by the present invention is to provide a video call system for communication disorders that enables a video call participant with a communication disorder to hold a video call with a participant who communicates normally, providing convenience for people with communication disorders.

The first problem is solved as follows: a video call system for communication disorders, the system comprising a video call originating end, a sign language recognition server, a video call auxiliary function server and a video call destination end; the sign language recognition server and the video call auxiliary function server are connected to the video call originating end and the video call destination end through a communication network;

after a participant with a communication disorder communicates in sign language at the video call originating end, the sign language recognition server converts the sign language into text subtitle information and converts the text subtitle information into digital audio information;

the video call originating end finally packages the video, the audio information and the text subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server.
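As an illustration of the packaging step, the following minimal Python sketch combines one video chunk, one audio chunk and one subtitle chunk into a single byte sequence and splits it again at the receiving side. The invention does not specify a container format, so the header layout, the stream-type tags and the function names `pack_call_data` and `unpack_call_data` are assumptions made for this example only.

```python
import struct

# Hypothetical stream-type tags; the patent does not define a container format.
STREAM_VIDEO, STREAM_AUDIO, STREAM_SUBTITLE = 1, 2, 3

# Assumed chunk header: stream type (1 byte), timestamp in ms (8 bytes),
# payload length (4 bytes), all in network byte order.
_HEADER = struct.Struct("!BQI")


def pack_chunk(stream_type: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Prefix a payload with the assumed header."""
    return _HEADER.pack(stream_type, timestamp_ms, len(payload)) + payload


def pack_call_data(timestamp_ms: int, video: bytes, audio: bytes,
                   subtitle_text: str) -> bytes:
    """Package the video, audio and subtitle data into one byte sequence."""
    return b"".join([
        pack_chunk(STREAM_VIDEO, timestamp_ms, video),
        pack_chunk(STREAM_AUDIO, timestamp_ms, audio),
        pack_chunk(STREAM_SUBTITLE, timestamp_ms, subtitle_text.encode("utf-8")),
    ])


def unpack_call_data(data: bytes) -> list:
    """Split a packaged byte sequence back into (type, timestamp, payload) tuples."""
    chunks = []
    offset = 0
    while offset < len(data):
        stream_type, timestamp_ms, length = _HEADER.unpack_from(data, offset)
        offset += _HEADER.size
        chunks.append((stream_type, timestamp_ms, data[offset:offset + length]))
        offset += length
    return chunks
```

A shared timestamp per chunk is one simple way to let the destination end keep subtitles and synthesized audio aligned with the video; a real system would more likely rely on an existing media container or transport protocol.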

Further, the video call originating end is provided with hardware drivers, an operating system module, a video call middleware module, a sign language recognition engine, a sign-language-to-subtitle module, a subtitle-to-speech module, a video/audio/subtitle encoding and packaging module and a video call transmission module;

the hardware drivers are the software interface abstraction of the device hardware;

the operating system module is the basis on which the device runs other software;

the video call middleware module is the general term for all functional interfaces that implement the video call in software;

the sign language recognition engine is used to recognize sign language information;

the sign-language-to-subtitle module converts the collected gesture information into text subtitle information, which includes collecting user image information, recognizing gestures, comparing the gesture information with specific actions, identifying the corresponding sign language meaning, and converting the sign language meaning into text subtitle information;

the subtitle-to-speech module is used to convert text into sound;

the video/audio/subtitle encoding and packaging module, once the gesture information has been recognized and converted into an audio stream and a subtitle stream, repackages the three streams, namely the video stream, the audio stream and the subtitle stream;

the video call transmission module is the transmission function of the video call middleware module.
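The comparison of gesture information with specific actions performed by the sign-language-to-subtitle module can be pictured with the following minimal Python sketch. It assumes that each gesture has already been reduced to a small numeric feature vector and matches it against stored templates by Euclidean distance. The templates, the feature values, the threshold and the function names are hypothetical and are not taken from the invention.

```python
import math

# Hypothetical templates: sign language meaning -> reference feature vector.
TEMPLATES = {
    "hello": (0.9, 0.1, 0.0),
    "thanks": (0.1, 0.8, 0.2),
}


def match_gesture(features, templates=TEMPLATES, max_distance=0.5):
    """Return the meaning of the closest template, or None if nothing is close enough."""
    best_meaning, best_distance = None, float("inf")
    for meaning, reference in templates.items():
        distance = math.dist(features, reference)
        if distance < best_distance:
            best_meaning, best_distance = meaning, distance
    return best_meaning if best_distance <= max_distance else None


def gestures_to_subtitle(gesture_sequence):
    """Convert a sequence of gesture feature vectors into one subtitle string."""
    words = [match_gesture(features) for features in gesture_sequence]
    return " ".join(word for word in words if word is not None)
```

Unrecognized gestures are simply skipped here; a practical recognizer would use a trained model rather than fixed templates.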

Further, the hardware drivers include a processor driver, a communication interface driver, an audio driver and a video hardware-encoding driver.

Further, the sign language recognition engine comprises a sign language recognition interface, a sign language recognition service operation policy module, a sign language recognition implementation module and a sign language recognition management module;

the sign language recognition interface defines the interfaces required by the sign language recognition function logic;

the sign language recognition service operation policy module selects the implementation of the sign language recognition interface that is finally used;

the sign language recognition implementation module carries out the selected implementation;

the sign language recognition management module manages and maintains the concrete implementations of multiple sign language recognition interfaces.
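The division of the sign language recognition engine into an interface, implementations, a management module and an operation policy module can be sketched in Python as follows. This is only one possible reading of that structure; all class names, method names and the fallback rule are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class SignLanguageRecognizer(ABC):
    """Sign language recognition interface (hypothetical signature)."""

    @abstractmethod
    def recognize(self, frames: list) -> str:
        """Return the text recognized from a list of video frames."""


class OwnServerRecognizer(SignLanguageRecognizer):
    """Placeholder implementation standing in for the provider's own server."""

    def recognize(self, frames: list) -> str:
        return f"own-server result for {len(frames)} frames"


class ThirdPartyRecognizer(SignLanguageRecognizer):
    """Placeholder implementation standing in for a third-party server."""

    def recognize(self, frames: list) -> str:
        return f"third-party result for {len(frames)} frames"


class RecognizerManager:
    """Management module: keeps track of the registered implementations."""

    def __init__(self):
        self._implementations = {}

    def register(self, name, implementation):
        self._implementations[name] = implementation

    def get(self, name):
        return self._implementations.get(name)

    def names(self):
        return list(self._implementations)


class OperationPolicy:
    """Operation policy module: selects the implementation that is finally used."""

    def __init__(self, manager, preferred):
        self._manager = manager
        self._preferred = preferred

    def select(self):
        chosen = self._manager.get(self._preferred)
        if chosen is not None:
            return chosen
        # Assumed fallback: use the first registered implementation.
        names = self._manager.names()
        if not names:
            raise LookupError("no sign language recognizer registered")
        return self._manager.get(names[0])
```

Keeping the selection logic separate from the implementations means a recognizer can be swapped by changing configuration rather than the calling code.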

Further, the sign language recognition engine operates as follows: video frames of the person with a communication disorder are captured; the captured images are preprocessed by binarization; sign language tracking and recognition, sign language segmentation, sign language semantic mapping and conversion of the sign language semantics into text are then carried out, thereby completing gesture recognition.

Further, the video call system carries out a video call as follows: the video call originating end captures video frames of the participant and passes them to the sign language recognition server for processing; sign language recognition mainly performs the following operations: the sign language recognition engine is called to recognize the sign language information; the sign-language-to-subtitle module converts the sign language information into text subtitle information; the subtitle-to-speech module is called to convert the subtitle information into sound information; the subtitle information and the sound information are returned to the video call originating end; the video call originating end packages the multimedia data of the video call through the video/audio/subtitle encoding and packaging module, and then calls the video call transmission module of the video call middleware module to transmit the video call data to the video call destination end through the video call auxiliary function server.

The second technical problem to be solved by the present invention is to provide a video call method for communication disorders that enables a video call participant with a communication disorder to hold a video call with a participant who communicates normally, providing convenience for people with communication disorders.

The second problem is solved as follows: a video call method for communication disorders, characterized in that the method requires a video call originating end, a sign language recognition server, a video call auxiliary function server and a video call destination end;

the participant with a communication disorder communicates in sign language at the video call originating end; the sign language recognition server converts the sign language into text subtitle information and converts the text subtitle information into digital audio information;

the video call terminal finally packages the video, the audio information and the text subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server.

The subtitle-to-speech module is used to convert text into sound;

Further, the method is specifically as follows: the video call system carries out a video call as follows: the video call originating end captures video frames of the participant and passes them to the sign language recognition server for processing; sign language recognition mainly performs the following operations: the sign language recognition engine is called to recognize the sign language information; the sign-language-to-subtitle module converts the sign language information into text subtitle information; the subtitle-to-speech module is called to convert the subtitle information into sound information; the subtitle information and the sound information are returned to the video call originating end; the video call originating end packages the multimedia data of the video call through the video/audio/subtitle encoding and packaging module, and then calls the video call transmission module of the video call middleware module to transmit the video call data to the video call destination end through the video call auxiliary function server.

The present invention has the following advantages: a video call participant with a communication disorder communicates in sign language, and the sign language recognition server converts the sign language into text subtitle information. The video call terminal finally packages the video, audio and subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server. A video call participant with a communication disorder can thus hold a video call with a participant who communicates normally, which provides convenience for people with communication disorders.

Brief description of the drawings

The present invention is further described below with reference to the accompanying drawings and embodiments.

Fig. 1 is the overall framework diagram of the system of the present invention.

Fig. 2 is a structural diagram of the modules in the video call terminal of the present invention.

Fig. 3 is a diagram of the operating principle of sign language recognition in the present invention.

Fig. 4 is a flow chart of the operation of the method of the present invention.

Detailed description of the invention

Referring to Fig. 1 to Fig. 3, video call terminals are interconnected through a basic communication network (the Internet, etc.). A video call involves external servers that enhance the call function: the sign language recognition server and the video call auxiliary function server. The servers are divided according to function logic, not physical logic; that is, the sign language recognition server and the video call auxiliary function server may reside on the same server host. The valid combinations of video call participants are: a participant with a communication disorder and another participant with a communication disorder (no special handling needed); a participant who communicates normally and another participant who communicates normally (no special handling needed); a participant with a communication disorder and a participant who communicates normally (special handling needed).

A video call system for communication disorders according to the present invention comprises: a video call originating end (generally used by the participant with a communication disorder), a sign language recognition server, a video call auxiliary function server and a video call destination end (generally used by the participant who communicates normally); the sign language recognition server and the video call auxiliary function server are connected to the video call originating end and the video call destination end through a communication network;

In the present invention, the video call originating end is provided with hardware drivers, an operating system module, a video call middleware module, a sign language recognition engine, a sign-language-to-subtitle module, a subtitle-to-speech module, a video/audio/subtitle encoding and packaging module and a video call transmission module;

the hardware drivers are the software interface abstraction of the device hardware; the hardware drivers include a processor driver, a communication interface driver, an audio driver and a video hardware-encoding driver.

The subtitle-to-speech module is used to convert text into sound;

the sign language recognition engine comprises a sign language recognition interface, a sign language recognition service operation policy module, a sign language recognition implementation module and a sign language recognition management module;

the sign language recognition service operation policy module selects the implementation of the sign language recognition interface that is finally used, that is, it configures which sign language recognition server is used (the provider's own or a third party's).

The sign language recognition management module manages and maintains the concrete implementations of multiple sign language recognition interfaces. To make the sign language recognition engine easy to upgrade, maintain and extend, the preferred implementation is to deploy it on the video call auxiliary function server: the sign language recognition engine is deployed on the video call auxiliary function server, and the sign language recognition interface (API) is deployed in the video call client. The sign language recognition provider management module manages and maintains the concrete implementations of multiple sign language recognition interfaces (APIs); these implementations may reside on third-party sign language recognition servers. The sign language recognition service operation policy module is responsible for selecting the implementation of the sign language recognition interface that is finally used.

The sign language recognition engine operates as follows: video frames of the person with a communication disorder are captured; the captured images are preprocessed by binarization; sign language tracking and recognition, sign language segmentation, sign language semantic mapping and conversion of the sign language semantics into text are then carried out, thereby completing gesture recognition.
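The binarization preprocessing step can be illustrated with a minimal Python sketch. It assumes a grayscale frame represented as a list of rows of pixel values from 0 to 255 and a fixed threshold; the threshold value and both function names are assumptions, and the bounding-box helper is only a crude stand-in for locating the hand region before tracking, not the tracking technique of the invention.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel to 1 (foreground) or 0 (background)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]


def bounding_box(binary):
    """Return (top, left, bottom, right) of the foreground pixels, or None if there are none."""
    coords = [
        (r, c)
        for r, row in enumerate(binary)
        for c, value in enumerate(row)
        if value
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))
```

For example, `binarize([[10, 200], [220, 40]])` yields `[[0, 1], [1, 0]]`. In practice an adaptive threshold or skin-color segmentation would usually be more robust than a fixed value.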

As shown in Fig. 4, the video call system carries out a video call as follows: the video call originating end captures video frames of the participant and passes them to the sign language recognition server for processing; sign language recognition mainly performs the following operations: the sign language recognition engine is called to recognize the sign language information; the sign-language-to-subtitle module converts the sign language information into text subtitle information; the subtitle-to-speech module is called to convert the subtitle information into sound information; the subtitle information and the sound information are returned to the video call originating end; the video call originating end packages the multimedia data of the video call (video, audio and subtitles) through the video/audio/subtitle encoding and packaging module, and then calls the video call transmission module of the video call middleware module to transmit the video call data to the video call destination end through the video call auxiliary function server.
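The sequence of operations in this flow can be summarized with a minimal Python sketch in which each stage is passed in as a function. The names `CallPacket` and `run_call_step` and the stage signatures are hypothetical; the sketch shows only the order of the steps, not how any stage is implemented.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CallPacket:
    """Hypothetical container for one packaged unit of call data."""
    video: bytes
    audio: bytes
    subtitle: str


def run_call_step(
    frames: List[bytes],
    recognize: Callable[[List[bytes]], List[str]],
    to_subtitle: Callable[[List[str]], str],
    to_speech: Callable[[str], bytes],
    send: Callable[[CallPacket], None],
) -> CallPacket:
    """Run one pass: recognize signs, build subtitle and audio, package, transmit."""
    signs = recognize(frames)       # sign language recognition engine
    subtitle = to_subtitle(signs)   # sign-language-to-subtitle module
    audio = to_speech(subtitle)     # subtitle-to-speech module
    packet = CallPacket(video=b"".join(frames), audio=audio, subtitle=subtitle)
    send(packet)                    # transmission via the auxiliary function server
    return packet
```

Passing the stages in as callables mirrors the point that recognition may run on a separate server while packaging and transmission stay at the originating end.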

Referring to Fig. 2 to Fig. 4, a video call method for communication disorders according to the present invention requires a video call originating end, a sign language recognition server, a video call auxiliary function server and a video call destination end;

the video call originating end is provided with hardware drivers, an operating system module, a video call middleware module, a sign language recognition engine, a sign-language-to-subtitle module, a subtitle-to-speech module, a video/audio/subtitle encoding and packaging module and a video call transmission module;

the subtitle-to-speech module is used to convert text into sound;

In the present invention, the method is specifically as follows: the video call system carries out a video call as follows: the video call originating end captures video frames of the participant and passes them to the sign language recognition server for processing; sign language recognition mainly performs the following operations: the sign language recognition engine is called to recognize the sign language information; the sign-language-to-subtitle module converts the sign language information into text subtitle information; the subtitle-to-speech module is called to convert the subtitle information into sound information; the subtitle information and the sound information are returned to the video call originating end; the video call originating end packages the multimedia data of the video call through the video/audio/subtitle encoding and packaging module, and then calls the video call transmission module of the video call middleware module to transmit the video call data to the video call destination end through the video call auxiliary function server.

In summary, the present invention allows a video call participant with a communication disorder to communicate in sign language, and the sign language recognition server converts the sign language into text subtitle information. The video call terminal finally packages the video, audio and subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server. A video call participant with a communication disorder can thus hold a video call with a participant who communicates normally, which provides convenience for people with communication disorders.

Although specific embodiments of the present invention have been described above, those skilled in the art should understand that the specific embodiments described are merely illustrative and do not limit the scope of the present invention. Equivalent modifications and variations made by those skilled in the art in accordance with the spirit of the present invention shall fall within the scope of protection claimed by the present invention.

Claims

1. A video call system for communication disorders, characterized in that: the system comprises a video call originating end, a sign language recognition server, a video call auxiliary function server and a video call destination end; the sign language recognition server and the video call auxiliary function server are connected to the video call originating end and the video call destination end through a communication network;

after a participant with a communication disorder communicates in sign language at the video call originating end, the sign language recognition server converts the sign language into text subtitle information and converts the text subtitle information into digital audio information;

the video call originating end finally packages the video, the audio information and the text subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server.

2. The video call system for communication disorders according to claim 1, characterized in that: the video call originating end is provided with hardware drivers, an operating system module, a video call middleware module, a sign language recognition engine, a sign-language-to-subtitle module, a subtitle-to-speech module, a video/audio/subtitle encoding and packaging module and a video call transmission module;

the sign-language-to-subtitle module converts the collected gesture information into text subtitle information, which includes collecting user image information, recognizing gestures, comparing the gesture information with specific actions, identifying the corresponding sign language meaning, and converting the sign language meaning into text subtitle information;

the subtitle-to-speech module is used to convert text into sound.

3. The video call system for communication disorders according to claim 2, characterized in that: the hardware drivers include a processor driver, a communication interface driver, an audio driver and a video hardware-encoding driver.

4. The video call system for communication disorders according to claim 2, characterized in that: the sign language recognition engine comprises a sign language recognition interface, a sign language recognition service operation policy module, a sign language recognition implementation module and a sign language recognition management module.

5. The video call system for communication disorders according to claim 2, characterized in that: the sign language recognition engine operates as follows: video frames of the person with a communication disorder are captured; the captured images are preprocessed by binarization; sign language tracking and recognition, sign language segmentation, sign language semantic mapping and conversion of the sign language semantics into text are then carried out, thereby completing gesture recognition.

6. The video call system for communication disorders according to claim 2, characterized in that: the video call system carries out a video call as follows: the video call originating end captures video frames of the participant and passes them to the sign language recognition server for processing; sign language recognition mainly performs the following operations: the sign language recognition engine is called to recognize the sign language information; the sign-language-to-subtitle module converts the sign language information into text subtitle information; the subtitle-to-speech module is called to convert the subtitle information into sound information; the subtitle information and the sound information are returned to the video call originating end; the video call originating end packages the multimedia data of the video call through the video/audio/subtitle encoding and packaging module, and then calls the video call transmission module of the video call middleware module to transmit the video call data to the video call destination end through the video call auxiliary function server.

7. A video call method for communication disorders, characterized in that: the method requires a video call originating end, a sign language recognition server, a video call auxiliary function server and a video call destination end;

the participant with a communication disorder communicates in sign language at the video call originating end; the sign language recognition server converts the sign language into text subtitle information and converts the text subtitle information into digital audio information;

the video call terminal finally packages the video, the audio information and the text subtitle data, and the video call data is then delivered to the video call destination end through the video call auxiliary function server.

8. The video call method for communication disorders according to claim 7, characterized in that: the video call originating end is provided with hardware drivers, an operating system module, a video call middleware module, a sign language recognition engine, a sign-language-to-subtitle module, a subtitle-to-speech module, a video/audio/subtitle encoding and packaging module and a video call transmission module;

the subtitle-to-speech module is used to convert text into sound.

9. The video call method for communication disorders according to claim 8, characterized in that: the method is specifically as follows: the video call system carries out a video call as follows: the video call originating end captures video frames of the participant and passes them to the sign language recognition server for processing; sign language recognition mainly performs the following operations: the sign language recognition engine is called to recognize the sign language information; the sign-language-to-subtitle module converts the sign language information into text subtitle information; the subtitle-to-speech module is called to convert the subtitle information into sound information; the subtitle information and the sound information are returned to the video call originating end; the video call originating end packages the multimedia data of the video call through the video/audio/subtitle encoding and packaging module, and then calls the video call transmission module of the video call middleware module to transmit the video call data to the video call destination end through the video call auxiliary function server.

10. The video call method for communication disorders according to claim 8, characterized in that: the hardware drivers include a processor driver, a communication interface driver, an audio driver and a video hardware-encoding driver.

11. The video call method for communication disorders according to claim 8, characterized in that: the sign language recognition engine comprises a sign language recognition interface, a sign language recognition service operation policy module, a sign language recognition implementation module and a sign language recognition management module.

12. The video call method for communication disorders according to claim 8, characterized in that: the sign language recognition engine operates as follows: video frames of the person with a communication disorder are captured; the captured images are preprocessed by binarization; sign language tracking and recognition, sign language segmentation, sign language semantic mapping and conversion of the sign language semantics into text are then carried out, thereby completing gesture recognition.