CN107465887A

CN107465887A - Video call system and video call method

Info

Publication number: CN107465887A
Application number: CN201710826830.3A
Authority: CN
Inventors: 吴萍; 吴小进; 刘云龙; 台流臣
Original assignee: Weifang University
Current assignee: Weifang University
Priority date: 2017-09-14
Filing date: 2017-09-14
Publication date: 2017-12-12

Abstract

The invention discloses a video call system and a video call method, belonging to the field of communication technology. The system mainly comprises a call connection module, a speed measurement feedback module, a selection determination module, a picture generation module, an image display module, an interactive instruction module, a real-time voice translation module and a synchronized subtitle module. The method comprises: (1) sending a call request and testing the network speed; (2) according to the network speed, choosing to cancel the call or to hold a two-dimensional or three-dimensional video call; (3) capturing two-dimensional or three-dimensional images with a 3D camera and performing two-dimensional video synthesis or three-dimensional holographic projection; (4) using the real-time voice translation module and the synchronized subtitle module to translate multiple national languages and local dialects simultaneously and to output bilingual subtitles. In short, the invention offers smooth operation, a high degree of user choice and a realistic user experience.

Description

Video call system and video call method

Technical field

The invention belongs to the field of communication technology, and in particular relates to a video call system and a video call method.

Background art

With the development of science and technology, terminal devices such as smartphones, notebook computers and desktop computers have become an indispensable part of work and daily life. However, cross-platform, cross-terminal instant messaging systems generally rely only on conventional human-computer interaction modes such as text, pictures, animation, audio, video and voice calls, and these modes are somewhat monotonous and limited.

Augmented reality (AR) is a technology that calculates the position and angle of the camera image in real time and adds corresponding images; its goal is to overlay the virtual world on the real world on a screen and allow interaction between them. It can extend the display screen into the real environment, so that computer windows and icons are superimposed on real objects and operated by eye gaze or gesture pointing; three-dimensional objects can interactively change their shape and appearance in the user's panoramic view according to the current task or need; superimposing a virtual scene on real objects produces an enhancement effect similar to X-ray fluoroscopy; map information can be inserted directly into the real landscape to guide a driver; and a virtual window can call up the outdoor scene so that a wall appears transparent. AR extends interaction from a precise location to the whole environment, and from a person simply facing a screen to a person blended into the surrounding space and objects. Using the information system is no longer a conscious, separate action but becomes a natural part of the person's current activity. The interactive system no longer has a definite location but extends to the whole environment.

At present, however, both two-dimensional and three-dimensional video calls place high demands on network speed, especially three-dimensional video calls, which offer the better user experience; a system that supports only three-dimensional video calls therefore has low applicability. A video call system and video call method with stronger adaptability and higher compatibility are needed, which can both guarantee call quality and improve the user experience.

Summary of the invention

In view of the above technical problem, the present invention provides a video call system and a video call method that can improve both the quality of video calls and the user experience.

The technical solution of the present invention is as follows: a video call system mainly comprises a call connection module, a speed measurement feedback module, a selection determination module, a picture generation module, an image display module and an interactive instruction module;

The call connection module comprises a call request unit and a call accept/reject unit. The call request unit is used by the call originating end to send a video call request to at least one call opposite end; the call accept/reject unit is used by the call originating end to hang up unilaterally, or by the call opposite end to accept or reject the call request of the call originating end;

The speed measurement feedback module comprises a signal transmitting unit, a signal receiving unit, a D/A conversion unit, a network speed assessment unit and a storage unit. While the call request unit sends the request, the signal transmitting unit sends a specified signal to the call opposite end at a certain frequency and time interval; the signal receiving unit receives the feedback signal; the storage unit records the specified signal and its sending time, and the feedback signal and its feedback time; the D/A conversion unit converts the specified signal and the feedback signal into corresponding data; and the network speed assessment unit assesses the network speed from the data and the time difference between the sending time and the feedback time;

The selection determination module comprises a network speed display unit, a prompt suggestion unit and a manual switching unit. The network speed display unit displays the real-time network speed data produced by the network speed assessment unit; the prompt suggestion unit gives the optimal suggestion based on the assessment result of the network speed assessment unit and makes an automatic selection; and the manual switching unit is used, after the automatic selection, to choose the two-dimensional planar video call mode or the three-dimensional holographic video call mode according to actual needs;

The picture generation module comprises a collection and acquisition unit, a calculation processing unit, a storage and recording unit and an extraction and application unit. The collection and acquisition unit obtains two-dimensional or three-dimensional images of a person through a camera; the calculation processing unit compresses, quantizes, encodes, sorts and stores each frame of the two-dimensional image to form a continuous video, or calculates, edits, outputs, synthesizes, corrects and renders the three-dimensional image to form a virtual three-dimensional image; and the extraction and application unit extracts and plays the continuous video or the virtual three-dimensional image;

The image display module comprises an electronic screen display unit and a holographic projection display unit. The electronic screen display unit displays the two-dimensional image video; the holographic projection display unit uses a projection mechanism to project the original image as a floating image at a certain scale;

The interactive instruction module comprises a button and touch unit, a tracking and recognition unit, an instruction acquisition unit and an action execution unit. The button and touch unit issues instructions for a two-dimensional video call by means of buttons or by touching the screen; the tracking and recognition unit tracks and recognizes faces and actions; the instruction acquisition unit is used, in a three-dimensional video call, to send the virtual three-dimensional image to at least one call opposite end and to obtain a target action instruction; and the action execution unit controls the virtual three-dimensional image to execute the target action instruction.
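The assessment performed by the speed measurement feedback module can be illustrated with a minimal Python sketch. The source states only that the network speed is assessed from the converted data and the time difference between sending and feedback; the formula used here (probe size divided by that time difference), the use of a median over several probes, and the names `Probe` and `estimate_speed_kb_per_s` are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Probe:
    """One specified signal and its feedback, as the storage unit might record them."""
    sent_at: float       # sending time, in seconds
    feedback_at: float   # time the feedback signal was received, in seconds
    payload_bytes: int   # size of the probe data (assumed to be known)


def estimate_speed_kb_per_s(probes):
    """Estimate the network speed in KB/s from probe size and time difference."""
    rates = []
    for p in probes:
        elapsed = p.feedback_at - p.sent_at
        if elapsed <= 0:
            continue  # skip records whose time difference is unusable
        rates.append(p.payload_bytes / 1024 / elapsed)
    if not rates:
        raise ValueError("no usable probe records")
    # The median damps the effect of a single unusually slow or fast probe.
    return median(rates)


# Three 100 KB probes sent at a fixed one-second interval.
probes = [Probe(0.0, 0.5, 102400), Probe(1.0, 1.4, 102400), Probe(2.0, 2.5, 102400)]
speed = estimate_speed_kb_per_s(probes)
```

Sending several probes at a fixed interval, as the signal transmitting unit does, gives the assessment unit more than one sample to work with, which is why the sketch aggregates them rather than trusting a single measurement.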

Further, the system also comprises a real-time voice translation module, which comprises an original sound recording unit, a language identification unit, an instant translation unit, an audio mixing processing unit and an audio mixing output unit. The original sound recording unit records the live speech of the call originating end or the call opposite end; the language identification unit identifies the language; the instant translation unit translates the identified language into a specified language; the audio mixing processing unit processes the original sound and the translated speech into separate channels; and the audio mixing output unit outputs the channel-separated original sound and translated speech simultaneously.
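One way to picture the channel-separated processing is to place the original sound on one stereo channel and the translated speech on the other, so both can be output at the same time. The following Python sketch assumes this left/right assignment and plain integer PCM samples; neither detail is given in the source, and `mix_to_stereo` is a hypothetical name.

```python
def mix_to_stereo(original, translated):
    """Put the original voice on the left channel and the translation on the right."""
    length = max(len(original), len(translated))
    # Pad the shorter track with silence (0) so both channels have equal length.
    left = list(original) + [0] * (length - len(original))
    right = list(translated) + [0] * (length - len(translated))
    # Interleave as L, R, L, R, ... which is the usual stereo PCM sample order.
    stereo = []
    for l_sample, r_sample in zip(left, right):
        stereo.extend((l_sample, r_sample))
    return stereo


mixed = mix_to_stereo([1, 2, 3], [10, 20])
```

Keeping the two signals on separate channels, rather than summing them, lets a listener attend to either the original sound or the translation.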

Further, the system also comprises a synchronized subtitle module, which comprises an audio interception unit, a recognition and calibration unit, a subtitle synthesis unit and an output display unit. The audio interception unit obtains audio sub-frames; the recognition and calibration unit further recognizes and calibrates the audio sub-frames; the subtitle synthesis unit splices multiple audio sub-frames together to form complete subtitles; and the output display unit outputs the complete subtitles and displays them below the image or video.
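The splicing of recognized audio sub-frames into a complete subtitle might look like the Python sketch below. The source does not say how calibration is performed; here it is modeled, purely as an assumption, by dropping sub-frames whose recognition confidence is low. The `SubFrame` fields and the `min_confidence` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubFrame:
    index: int         # position of the sub-frame in the audio stream
    text: str          # text recognized for this sub-frame
    confidence: float  # hypothetical recognition score in [0, 1]


def splice_subtitle(subframes, min_confidence=0.5):
    """Drop unreliable sub-frames, restore stream order, and join the text."""
    kept = [s for s in subframes
            if s.confidence >= min_confidence and s.text.strip()]
    kept.sort(key=lambda s: s.index)
    return " ".join(s.text.strip() for s in kept)


frames = [
    SubFrame(2, "are you", 0.9),
    SubFrame(1, "hello, how", 0.8),
    SubFrame(3, "uh", 0.2),  # low confidence, removed by the calibration step
]
subtitle = splice_subtitle(frames)
```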

Further, the camera is a 3D high-definition camera. Compared with an ordinary camera, a 3D high-definition camera is better suited to building three-dimensional images and makes three-dimensional video calls easier to implement.

Further, the real-time voice translation module can recognize the mainstream languages of many countries, for example the mainstream official languages of countries such as China, the United Kingdom, France, Japan, South Korea, Russia and Germany, as well as Chinese local dialects, for example those of the Northeast, Shaanxi, Shanxi, Gansu, Beijing, Shandong, Hefei, Shanghai, Guangdong, Hangzhou, Suzhou and Sichuan.

Further, the synchronized subtitle module can output bilingual subtitles; adding subtitles greatly increases the convenience of communication.

Further, the video call method of the video call system comprises the following steps:

S1: The call originating end initiates a video call request to at least one call opposite end through a server. If the call originating end hangs up unilaterally or the call opposite end rejects the call, the call ends; otherwise the method proceeds to the next step;

S2: While the call request unit sends the request, the signal transmitting unit sends a specified signal to the call opposite end at a certain frequency and time interval. The D/A conversion unit then converts the specified signal and the feedback signal into corresponding data, and the network speed assessment unit assesses the network speed from the data and the time difference between the sending time and the feedback time. The network speed is displayed by the network speed display unit, and the prompt suggestion unit gives a prompt;

S3: If the network speed is below 50 KB/s, cancelling the video call is suggested and the call ends. If the network speed is between 50 KB/s and 400 KB/s, opening the two-dimensional video call mode is suggested: a two-dimensional image of the person is obtained by the camera, each frame of the two-dimensional image is compressed, quantized, encoded, sorted and stored to form a continuous video, and the two-dimensional image video call is carried out through the electronic screen display unit. If the network speed is above 400 KB/s, opening the three-dimensional video call mode is suggested: multi-angle two-dimensional images of the person are obtained by the camera and synthesized into a three-dimensional image, the three-dimensional image is calculated, edited, output, synthesized, corrected and rendered to form a virtual three-dimensional image, the projection mechanism projects the virtual three-dimensional image as a floating image at a certain scale relative to the original image, and the holographic projection video call is carried out through the holographic projection display unit;

S4: In the two-dimensional image video call, instructions are transmitted by means of buttons or by touching the screen. In the holographic projection video call, the tracking and recognition unit tracks and recognizes faces and actions, the instruction acquisition unit obtains a target action instruction, and the action execution unit controls the virtual three-dimensional image to execute the target action instruction;

S5: The original sound recording unit records the live speech of the call originating end or the call opposite end, the language identification unit identifies the language, and the instant translation unit translates the identified language into the specified language; the original sound and the translated speech are output simultaneously after channel-separated processing. At the same time, the audio interception unit obtains audio sub-frames, the recognition and calibration unit recognizes and calibrates them, and the subtitle synthesis unit splices multiple audio sub-frames together to form complete subtitles, which are finally displayed synchronously below the two-dimensional image video or the holographic projection video.
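The selection rule in step S3 can be sketched in a few lines of Python. The thresholds of 50 KB/s and 400 KB/s come from the text; treating both boundary values as belonging to the two-dimensional range is an assumption, since the source does not say which mode applies at exactly 50 KB/s or 400 KB/s. The names `CallMode` and `suggest_mode` are illustrative.

```python
from enum import Enum


class CallMode(Enum):
    CANCEL = "cancel video call"
    VIDEO_2D = "two-dimensional video call"
    HOLOGRAM_3D = "three-dimensional holographic video call"


LOW_KB_S = 50    # below this, cancelling the call is suggested
HIGH_KB_S = 400  # above this, the three-dimensional mode is suggested


def suggest_mode(speed_kb_s):
    """Map a measured network speed in KB/s to the suggested call mode."""
    if speed_kb_s < LOW_KB_S:
        return CallMode.CANCEL
    if speed_kb_s <= HIGH_KB_S:  # boundary handling is an assumption
        return CallMode.VIDEO_2D
    return CallMode.HOLOGRAM_3D


print(suggest_mode(310).value)  # two-dimensional video call
print(suggest_mode(782).value)  # three-dimensional holographic video call
```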

Further, at the first prompt of the prompt suggestion unit the system can autonomously select a suitable call mode; during subsequent real-time network speed monitoring, if the network speed changes, the call mode is selected through the manual switching unit.
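The split between automatic selection at the first prompt and manual switching afterwards can be modeled as a small state holder. This Python sketch is one possible reading of the paragraph above: the class name `ModeSelector`, the string labels for the modes and the `pending_suggestion` attribute are all assumptions, and the boundary handling at 50 KB/s and 400 KB/s is likewise assumed.

```python
def suggest(speed_kb_s):
    """Suggested mode for a speed in KB/s, using the 50 and 400 KB/s thresholds."""
    if speed_kb_s < 50:
        return "cancel"
    return "2d" if speed_kb_s <= 400 else "3d"


class ModeSelector:
    """Applies the first suggestion automatically; later changes need the user."""

    def __init__(self):
        self.mode = None                # current call mode
        self.pending_suggestion = None  # suggestion awaiting a manual switch

    def on_speed(self, speed_kb_s):
        suggestion = suggest(speed_kb_s)
        if self.mode is None:
            self.mode = suggestion      # first prompt: automatic selection
        elif suggestion != self.mode:
            self.pending_suggestion = suggestion  # later: prompt only
        else:
            self.pending_suggestion = None
        return self.mode

    def switch_manually(self, mode):
        """Corresponds to the user acting through the manual switching unit."""
        self.mode = mode
        self.pending_suggestion = None


selector = ModeSelector()
selector.on_speed(782)           # first measurement selects "3d" automatically
selector.on_speed(300)           # speed drops: mode stays "3d", "2d" is suggested
selector.switch_manually("2d")   # the user confirms the switch
```

Leaving later switches to the user avoids the call mode flipping back and forth when the measured speed hovers around a threshold.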

Compared with the prior art, the beneficial effects of the present invention are as follows. The video call system and video call method of the present invention provide two call modes, two-dimensional video calls and three-dimensional holographic projection video calls. Speed measurement feedback determines the real-time network speed between the call originating end and the call opposite end, and the call mode is selected on that basis: cancelling the video call is suggested if the network speed is below 50 KB/s, the two-dimensional video call mode is suggested if the network speed is between 50 KB/s and 400 KB/s, and the three-dimensional video call mode is suggested if the network speed is above 400 KB/s. Selecting the call mode in this way both ensures the fluency of the video call and gives the user more choice, and when the network speed is good the user can choose the three-dimensional holographic projection video call, which offers a better experience. In addition, the invention adds a real-time voice translation module and a synchronized subtitle module, which can simultaneously translate multiple national languages and local dialects and display bilingual subtitles, greatly increasing the convenience of communication. In short, the invention has the advantages of smooth operation, a better method, a high degree of choice, a more lifelike user experience and barrier-free communication.

Brief description of the drawings

Fig. 1 is a block diagram of the video call system of Embodiment 1 of the present invention;

Fig. 2 is a flow chart of the steps of the video call method of Embodiment 1 of the present invention;

Fig. 3 is a block diagram of the video call system of Embodiment 2 of the present invention;

Fig. 4 is a flow chart of the steps of the video call method of Embodiment 2 of the present invention.

Reference numerals: 1 - call connection module; 11 - call request unit; 12 - call accept/reject unit; 2 - speed measurement feedback module; 21 - signal transmitting unit; 22 - signal receiving unit; 23 - D/A conversion unit; 24 - network speed assessment unit; 25 - storage unit; 3 - selection determination module; 31 - network speed display unit; 32 - prompt suggestion unit; 33 - manual switching unit; 4 - picture generation module; 41 - collection and acquisition unit; 42 - calculation processing unit; 43 - storage and recording unit; 44 - extraction and application unit; 5 - image display module; 51 - electronic screen display unit; 52 - holographic projection display unit; 6 - interactive instruction module; 61 - button and touch unit; 62 - tracking and recognition unit; 63 - instruction acquisition unit; 64 - action execution unit; 7 - real-time voice translation module; 71 - original sound recording unit; 72 - language identification unit; 73 - instant translation unit; 74 - audio mixing processing unit; 75 - audio mixing output unit; 8 - synchronized subtitle module; 81 - audio interception unit; 82 - recognition and calibration unit; 83 - subtitle synthesis unit; 84 - output display unit; 9a - call originating end; 9b - call opposite end; 10 - server.

Detailed description of the embodiments

The present invention is described in further detail below with reference to Figs. 1-4 and specific embodiments.

Embodiment 1

As shown in Fig. 1, the video call system mainly comprises a call connection module 1, a speed measurement feedback module 2, a selection determination module 3, a picture generation module 4, an image display module 5 and an interactive instruction module 6;

The call connection module 1 comprises a call request unit 11 and a call accept/reject unit 12. The call request unit 11 is used by the call originating end 9a to send a video call request to at least one call opposite end 9b; the call accept/reject unit 12 is used by the call originating end 9a to hang up unilaterally, or by the call opposite end 9b to accept or reject the call request of the call originating end 9a;

The speed measurement feedback module 2 comprises a signal transmitting unit 21, a signal receiving unit 22, a D/A conversion unit 23, a network speed assessment unit 24 and a storage unit 25. While the call request unit 11 sends the request, the signal transmitting unit 21 sends a specified signal to the call opposite end 9b at a certain frequency and time interval; the signal receiving unit 22 receives the feedback signal; the storage unit 25 records the specified signal and its sending time, and the feedback signal and its feedback time; the D/A conversion unit 23 converts the specified signal and the feedback signal into corresponding data; and the network speed assessment unit 24 assesses the network speed from the data and the time difference between the sending time and the feedback time;

The selection determination module 3 comprises a network speed display unit 31, a prompt suggestion unit 32 and a manual switching unit 33. The network speed display unit 31 displays the real-time network speed data produced by the network speed assessment unit 24; the prompt suggestion unit 32 gives the optimal suggestion based on the assessment result of the network speed assessment unit 24 and makes an automatic selection; and the manual switching unit 33 is used, after the automatic selection, to choose the two-dimensional planar video call mode or the three-dimensional holographic video call mode according to actual needs;

The picture generation module 4 comprises a collection and acquisition unit 41, a calculation processing unit 42, a storage and recording unit 43 and an extraction and application unit 44. The collection and acquisition unit 41 obtains two-dimensional or three-dimensional images of a person through a camera. The camera is a 3D high-definition camera; compared with an ordinary camera, a 3D high-definition camera is better suited to building three-dimensional images and makes three-dimensional video calls easier to implement. The calculation processing unit 42 compresses, quantizes, encodes, sorts and stores each frame of the two-dimensional image to form a continuous video, or calculates, edits, outputs, synthesizes, corrects and renders the three-dimensional image to form a virtual three-dimensional image; the extraction and application unit 44 extracts and plays the continuous video or the virtual three-dimensional image;

The image display module 5 comprises an electronic screen display unit 51 and a holographic projection display unit 52. The electronic screen display unit 51 displays the two-dimensional image video; the holographic projection display unit 52 uses a projection mechanism to project the original image as a floating image at a certain scale;

The interactive instruction module 6 comprises a button and touch unit 61, a tracking and recognition unit 62, an instruction acquisition unit 63 and an action execution unit 64. The button and touch unit 61 issues instructions for a two-dimensional video call by means of buttons or by touching the screen; the tracking and recognition unit 62 tracks and recognizes faces and actions; the instruction acquisition unit 63 is used, in a three-dimensional video call, to send the virtual three-dimensional image to at least one call opposite end and to obtain a target action instruction; and the action execution unit 64 controls the virtual three-dimensional image to execute the target action instruction.

As shown in Fig. 2, the video call method of the video call system comprises the following steps:

S1: The call originating end 9a initiates a video call request to at least one call opposite end 9b through the server 10. If the call originating end 9a hangs up unilaterally or the call opposite end 9b rejects the call, the call ends; otherwise the method proceeds to the next step;

S2: While the call request unit 11 sends the request, the signal transmitting unit 21 sends a specified signal to the call opposite end 9b at a certain frequency and time interval. The D/A conversion unit 23 then converts the specified signal and the feedback signal into corresponding data, and the network speed assessment unit 24 assesses the network speed from the data and the time difference between the sending time and the feedback time. The network speed display unit 31 shows that the network speed is 310 KB/s, and the prompt suggestion unit 32 gives a prompt suggesting that the two-dimensional video call mode be opened;

S3: A two-dimensional image of the person is obtained by the camera, each frame of the two-dimensional image is compressed, quantized, encoded, sorted and stored to form a continuous video, and the two-dimensional image video call is carried out through the electronic screen display unit 51;

S4: In the two-dimensional image video call, instructions are transmitted by means of buttons or by touching the screen until the call ends.
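The frame handling in step S3 of this embodiment (quantizing, compressing and encoding each frame, then sorting and storing the frames as a continuous video) can be illustrated with a toy Python pipeline. It is only a sketch: `zlib` stands in for a real video codec, a frame is reduced to a short sequence of 8-bit pixel values, quantization is applied before compression for simplicity, and the function names and the `step` parameter are assumptions not found in the source.

```python
import zlib


def quantize(pixels, step=16):
    """Round 8-bit pixel values down to multiples of `step`; coarser values compress better."""
    return bytes((p // step) * step for p in pixels)


def encode_frame(seq, pixels):
    """Quantize and compress one frame, tagging it with its sequence number."""
    return seq, zlib.compress(quantize(pixels))


def build_video(encoded_frames):
    """Sort frames by sequence number and decode them into a continuous list."""
    ordered = sorted(encoded_frames, key=lambda frame: frame[0])
    return [zlib.decompress(data) for _, data in ordered]


# Frames may arrive out of order; sorting restores the playback order.
received = [
    encode_frame(1, bytes([200, 201, 202])),
    encode_frame(0, bytes([10, 20, 30])),
]
video = build_video(received)
```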

Embodiment 2

As shown in Fig. 3, the video call system mainly comprises a call connection module 1, a speed measurement feedback module 2, a selection determination module 3, a picture generation module 4, an image display module 5, an interactive instruction module 6, a real-time voice translation module 7 and a synchronized subtitle module 8;

The real-time voice translation module 7 comprises an original sound recording unit 71, a language identification unit 72, an instant translation unit 73, an audio mixing processing unit 74 and an audio mixing output unit 75. The original sound recording unit 71 records the live speech of the call originating end 9a or the call opposite end 9b; the language identification unit 72 identifies the language; the instant translation unit 73 translates the identified language into a specified language; the audio mixing processing unit 74 processes the original sound and the translated speech into separate channels; and the audio mixing output unit 75 outputs the channel-separated original sound and translated speech simultaneously. The real-time voice translation module can recognize the mainstream languages of many countries, for example the mainstream official languages of countries such as China, the United Kingdom, France, Japan, South Korea, Russia and Germany, as well as Chinese local dialects, for example those of the Northeast, Shaanxi, Shanxi, Gansu, Beijing, Shandong, Hefei, Shanghai, Guangdong, Hangzhou, Suzhou and Sichuan.

The synchronized subtitle module 8 comprises an audio interception unit 81, a recognition and calibration unit 82, a subtitle synthesis unit 83 and an output display unit 84. The audio interception unit 81 obtains audio sub-frames; the recognition and calibration unit 82 further recognizes and calibrates the audio sub-frames; the subtitle synthesis unit 83 splices multiple audio sub-frames together to form complete subtitles; and the output display unit 84 outputs the complete subtitles and displays them below the image or video. The synchronized subtitle module can output bilingual subtitles; adding subtitles greatly increases the convenience of communication.
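Synchronized bilingual output can be pictured as choosing, at each playback time, the subtitle whose time span covers that moment and rendering the original line above its translation. The Python sketch below assumes that each subtitle carries start and end times; the source does not describe the synchronization mechanism, and `Cue`, `active_cue` and `render` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start_s: float    # time at which the subtitle appears, in seconds
    end_s: float      # time at which the subtitle disappears, in seconds
    original: str     # text in the speaker's language
    translated: str   # text in the specified target language


def active_cue(cues, t):
    """Return the cue that should be on screen at playback time t, if any."""
    for cue in cues:
        if cue.start_s <= t < cue.end_s:
            return cue
    return None


def render(cue):
    """Format a bilingual subtitle: original line first, translation below."""
    return f"{cue.original}\n{cue.translated}"


cues = [Cue(0.0, 2.0, "你好", "Hello"), Cue(2.0, 4.5, "再见", "Goodbye")]
current = active_cue(cues, 2.5)
text = render(current)
```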

As shown in Fig. 4, the video call method of the video call system comprises the following steps:

S2: While the call request unit 11 sends the request, the signal transmitting unit 21 sends a specified signal to the call opposite end at a certain frequency and time interval. The D/A conversion unit 23 then converts the specified signal and the feedback signal into corresponding data, and the network speed assessment unit 24 assesses the network speed from the data and the time difference between the sending time and the feedback time. The network speed display unit 31 shows that the network speed is 782 KB/s, and the prompt suggestion unit 32 gives a prompt suggesting that the three-dimensional video call mode be opened. At the first prompt of the prompt suggestion unit 32 the system can autonomously select a suitable call mode; during subsequent real-time network speed monitoring, if the network speed changes, the call mode is selected through the manual switching unit 33.

S3: Multi-angle two-dimensional images of the person are obtained by the camera and synthesized into a three-dimensional image; the three-dimensional image is calculated, edited, output, synthesized, corrected and rendered to form a virtual three-dimensional image; the projection mechanism projects the virtual three-dimensional image as a floating image at a certain scale relative to the original image; and the holographic projection video call is carried out through the holographic projection display unit 52;

S4: In the holographic projection video call, the tracking and recognition unit 62 tracks and recognizes faces and actions, the instruction acquisition unit 63 obtains a target action instruction, and the action execution unit 64 controls the virtual three-dimensional image to execute the target action instruction;

S5: The original sound recording unit 71 records the live speech of the call originating end 9a or the call opposite end 9b, the language identification unit 72 identifies the language, and the instant translation unit 73 translates the identified language into the specified language; the original sound and the translated speech are output simultaneously after channel-separated processing. At the same time, the audio interception unit 81 obtains audio sub-frames, the recognition and calibration unit 82 recognizes and calibrates them, and the subtitle synthesis unit 83 splices multiple audio sub-frames together to form complete subtitles, which are finally displayed synchronously below the two-dimensional image video or the holographic projection video.

Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A video call system, characterized in that it mainly comprises a call connection module (1), a speed measurement feedback module (2), a selection determination module (3), a picture generation module (4), an image display module (5) and an interactive instruction module (6);

The call connection module (1) comprises a call request unit (11) and a call accept/reject unit (12); the call request unit (11) is used by the call originating end (9a) to send a video call request to at least one call opposite end (9b); the call accept/reject unit (12) is used by the call originating end (9a) to hang up unilaterally, or by the call opposite end (9b) to accept or reject the call request of the call originating end (9a);

The speed measurement feedback module (2) comprises a signal transmitting unit (21), a signal receiving unit (22), a D/A conversion unit (23), a network speed assessment unit (24) and a storage unit (25); the signal transmitting unit (21) is used, while the call request unit (11) sends the request, to send a specified signal to the call opposite end (9b) at a certain frequency and time interval; the signal receiving unit (22) is used to receive the feedback signal; the storage unit (25) is used to record the specified signal and its sending time, and the feedback signal and its feedback time; the D/A conversion unit (23) converts the specified signal and the feedback signal into corresponding data; and the network speed assessment unit (24) assesses the network speed from the data and the time difference between the sending time and the feedback time;

The selection determination module (3) comprises a network speed display unit (31), a prompt suggestion unit (32) and a manual switching unit (33); the network speed display unit (31) is used to display the real-time network speed data produced by the network speed assessment unit (24); the prompt suggestion unit (32) gives the optimal suggestion based on the assessment result of the network speed assessment unit (24) and makes an automatic selection; and the manual switching unit (33) is used, after the automatic selection, to choose the two-dimensional planar video call mode or the three-dimensional holographic video call mode according to actual needs;

The picture generation module (4) comprises a collection and acquisition unit (41), a calculation processing unit (42), a storage and recording unit (43) and an extraction and application unit (44); the collection and acquisition unit (41) obtains two-dimensional or three-dimensional images of a person through a camera; the calculation processing unit (42) compresses, quantizes, encodes, sorts and stores each frame of the two-dimensional image to form a continuous video, or calculates, edits, outputs, synthesizes, corrects and renders the three-dimensional image to form a virtual three-dimensional image; and the extraction and application unit (44) is used to extract and play the continuous video or the virtual three-dimensional image;

The image display module (5) comprises an electronic screen display unit (51) and a holographic projection display unit (52); the electronic screen display unit (51) is used to display the two-dimensional image video; and the holographic projection display unit (52) uses a projection mechanism to project the original image as a floating image at a certain scale;

The interactive instruction module (6) comprises a button and touch unit (61), a tracking and recognition unit (62), an instruction acquisition unit (63) and an action execution unit (64); the button and touch unit (61) issues instructions for a two-dimensional video call by means of buttons or by touching the screen; the tracking and recognition unit (62) is used to track and recognize faces and actions; the instruction acquisition unit (63) is used, in a three-dimensional video call, to send the virtual three-dimensional image to at least one call opposite end and to obtain a target action instruction; and the action execution unit (64) is used to control the virtual three-dimensional image to execute the target action instruction.

2. The video call system as claimed in claim 1, characterized in that it further includes a real-time voice translation module (7); the real-time voice translation module (7) includes an original sound recording unit (71), a language identification unit (72), an instant translation unit (73), a stereo processing unit (74) and an audio mixing output unit (75); the original sound recording unit (71) is used to record the live speech of the call originating end (9a) or the call opposite end (9b); the language identification unit (72) is used to identify the language category; the instant translation unit (73) is used to translate the identified language into a specified language; the stereo processing unit (74) is used to process the original sound and the translated speech into separate channels; the audio mixing output unit (75) is used to output the channel-separated original sound and the translated speech simultaneously.

3. The video call system as claimed in claim 1, characterized in that it further includes a synchronized subtitle module (8); the synchronized subtitle module (8) includes an audio interception unit (81), a recognition and calibration unit (82), a subtitle synthesis unit (83) and an output display unit (84); the audio interception unit (81) is used to obtain audio subframes; the recognition and calibration unit (82) is used to further recognize and calibrate the audio subframes; the subtitle synthesis unit (83) splices and synthesizes multiple audio subframes into complete subtitles; the output display unit (84) outputs the complete subtitles and displays them below the video image.

4. The video call system as claimed in claim 1, characterized in that the camera is a 3D high-definition camera.

5. The video call system as claimed in claim 1, characterized in that the real-time voice translation module can recognize the mainstream languages of multiple countries as well as local Chinese dialects.

6. The video call system as claimed in claim 1, characterized in that the real-time voice translation module can recognize ten mainstream languages, namely English, Chinese, German, French, Russian, Spanish, Japanese, Arabic, Korean and Portuguese, as well as local dialects within China.

7. The video call system as claimed in claim 1, characterized in that the synchronized subtitle module can output bilingual subtitles.

8. A video call method using the video call system according to any one of claims 1-7, characterized in that it includes the following steps:

S1: The call originating end (9a) initiates a video call request to at least one call opposite end (9b) through the server (10); if the call originating end (9a) unilaterally hangs up or the call opposite end (9b) rejects the call, the call ends; otherwise, the method proceeds to the next step;

S2: While the call request unit (11) sends the request, the signal transmitting unit (21) sends a specified signal to the call opposite end (9b) at a certain frequency and time interval; afterwards, the D/A conversion unit (23) converts the specified signal and the feedback signal into corresponding data; finally, the network speed assessment unit (24) assesses the network speed by calculation from the data and the time difference between the transmission time and the feedback time; the network speed is displayed by the network speed display unit (31), and a prompt is given by the prompt suggestion unit (32);

S3: If the network speed is less than 50 KB/s, it is suggested to cancel the video call, and the call ends; if the network speed is between 50 KB/s and 400 KB/s, it is suggested to open the two-dimensional video call mode: a two-dimensional image of the person is obtained by the camera, each frame of the two-dimensional image is compressed, quantized, encoded, sorted and stored to form a continuous video, and finally the two-dimensional video call is carried out through the electronic screen display unit (51); if the network speed is greater than 400 KB/s, it is suggested to open the three-dimensional video call mode: multi-angle two-dimensional images of the person are obtained by the camera and synthesized into a three-dimensional image, the three-dimensional image is calculated, edited, output, synthesized, corrected and developed to form a virtual three-dimensional image, and finally the projection mechanism projects the virtual three-dimensional image, scaled by a certain ratio, as a floating projection, so that the holographic projection video call is carried out through the holographic projection display unit (52);

S4: In the two-dimensional video call, instructions are transmitted by button or by touching the screen; in the holographic projection video call, the tracking recognition unit (62) tracks and recognizes faces and actions, while the instruction acquisition unit (63) obtains a target action instruction and the action execution unit (64) controls the virtual three-dimensional image to perform the target action instruction;

S5: The original sound recording unit (71) records the live speech of the call originating end (9a) or the call opposite end (9b); the language identification unit (72) then identifies the language category, and the instant translation unit (73) translates the identified language into a specified language; the original sound and the translated speech are processed into separate channels and then output simultaneously; at the same time, the audio interception unit (81) obtains audio subframes, the recognition and calibration unit (82) recognizes and calibrates the audio subframes, the subtitle synthesis unit (83) then splices and synthesizes multiple audio subframes into complete subtitles, and finally the subtitles are displayed synchronously below the two-dimensional video or the holographic projection video.
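
The speed-based mode suggestion of steps S2 and S3 can be illustrated with a minimal Python sketch. Only the 50 KB/s and 400 KB/s thresholds come from the claims. The names `CallMode`, `estimate_speed_kb_s`, `suggest_mode` and `resolve_mode` are hypothetical, and so is the speed estimate (payload size divided by the time difference between transmission and feedback), since the claims do not give the exact calculation. The sketch also assumes that manual switching between the two video modes is only offered when the call has not been cancelled.

```python
from enum import Enum
from typing import Optional


class CallMode(Enum):
    CANCEL = "cancel"
    VIDEO_2D = "2d"      # two-dimensional planar video call
    VIDEO_3D = "3d"      # three-dimensional holographic video call


# Thresholds stated in step S3, in KB/s.
MIN_SPEED_KB_S = 50
HOLOGRAM_SPEED_KB_S = 400


def estimate_speed_kb_s(payload_bytes: int, sent_at: float, feedback_at: float) -> float:
    """Assumed estimate: payload size over the send-to-feedback time difference (seconds)."""
    elapsed = feedback_at - sent_at
    if elapsed <= 0:
        raise ValueError("feedback time must be later than transmission time")
    return payload_bytes / 1024 / elapsed


def suggest_mode(speed_kb_s: float) -> CallMode:
    """Automatic suggestion corresponding to the prompt suggestion unit (32)."""
    if speed_kb_s < MIN_SPEED_KB_S:
        return CallMode.CANCEL
    if speed_kb_s <= HOLOGRAM_SPEED_KB_S:
        return CallMode.VIDEO_2D
    return CallMode.VIDEO_3D


def resolve_mode(suggested: CallMode, manual_choice: Optional[CallMode] = None) -> CallMode:
    """Manual switching (33) after the automatic selection.

    Assumption: a cancelled call cannot be overridden, and only the two
    video modes are valid manual choices.
    """
    if suggested is CallMode.CANCEL or manual_choice is None:
        return suggested
    if manual_choice is CallMode.CANCEL:
        raise ValueError("manual choice must be a 2D or 3D video mode")
    return manual_choice


if __name__ == "__main__":
    speed = estimate_speed_kb_s(payload_bytes=204_800, sent_at=0.0, feedback_at=1.0)
    print(speed, resolve_mode(suggest_mode(speed)))
```

In this sketch a speed of exactly 50 KB/s or 400 KB/s falls into the two-dimensional mode, which follows the wording "less than 50 KB/s" and "greater than 400 KB/s" in step S3.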