CN108650484A - Method and device for remote simultaneous interpretation based on audio/video communication
- Publication number: CN108650484A
- Application number: CN201810694423.6A
- Authority: CN (China)
- Prior art keywords: audio, video, live room, cloud server
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention provides a method and device for remote simultaneous interpretation based on audio/video cloud communication. The speaker's audio stream and video stream are captured by the venue microphone and camera and stored on the venue PC. The venue PC uploads the stored audio and video streams to an audio/video cloud server by one-way communication, where they are processed. An interpreter at the interpreter end selects an input-language live room and an output-language live room on the audio/video cloud server and translates the first language stored on the server into a second language; the video stream can be used to improve translation accuracy. Finally, audience members at the venue use their listening devices to select and obtain the language they need from the audio/video cloud server. The method and device use the audio/video cloud server as the endpoint for the different languages, meeting the demand for remote simultaneous interpretation while letting the on-site audience obtain the required target language in time.
Description
Technical field
The invention belongs to the field of remote simultaneous interpretation, and specifically to the field of remote simultaneous interpretation based on audio/video communication.
Background art
Traditional simultaneous interpretation requires the interpreter to sit in a simultaneous interpretation booth at the venue and provide real-time interpretation through dedicated simultaneous interpretation equipment. The interpreter must be on site to provide the service. This greatly limits the flexibility of the interpretation service; in particular, when several interpreters are needed and they are located elsewhere, it both reduces the efficiency of the meeting and increases the cost of the service.
Among existing remote simultaneous interpretation devices, CN201156746Y provides a system that connects the interpreter to the meeting site over the broadband internet. In this approach, if the audience at the meeting includes speakers of different languages, the interpreters must translate from the speaker's language into each audience language. For example, if the speaker speaks French and the audience includes Chinese, German, and English listeners, the corresponding interpreters must be French-Chinese, French-German, and French-English interpreters. This places higher demands on the interpreters and costs more. As another example, CN104427294A provides a cloud-server-based simultaneous interpretation method and device supporting video conferencing, in which the cloud server converts the captured audio data into text data and then generates and outputs audio data in the other required languages from that text data. The audio-to-text-to-audio conversion in this approach is overly cumbersome.
The present invention replaces traditional dedicated simultaneous interpretation equipment with a remote simultaneous interpretation audio/video cloud server. Interpreters need not be at the meeting site; they provide simultaneous interpretation remotely over the internet, and multilingual translation needs are met.
Summary of the invention
The purpose of the present invention is to provide a method and device for remote simultaneous interpretation based on an audio/video cloud server. In use, conventional audio and video capture devices, the interpreters' working equipment, and the audience's listening devices are retained, and data is transferred between the venue end and the interpreter end, and between one interpreter end and another, through live rooms set up on the audio/video cloud server, so as to meet multilingual translation needs.
The invention also introduces the concept of a voice live room into the simultaneous interpretation scenario. The voice of the on-site speaker is stored on the cloud server and defined as the original-audio live room, by analogy with the "original sound channel" of conventional on-site simultaneous interpretation equipment. An interpreter whose source language is the same as the original audio selects the original-audio live room as "input", while an interpreter whose source language differs from the original audio selects the live room of their own source language as "input". This arrangement can significantly reduce the cost of finding suitable interpreters.
In addition to relaying the venue speaker's audio to the interpreter remotely in real time, the invention also sends the speaker's video to the interpreter, who can observe the speaker's gestures and expressions in real time, improving translation quality and efficiency.
Other features and advantages of the invention will become apparent from the following detailed description, or may be learned in part by practice of the invention.
According to a first aspect of the invention, a method for remote simultaneous interpretation based on audio/video communication is provided, characterized in that the method includes:
Step 1, the venue speaker's audio stream and video stream are captured by a microphone and a camera respectively and sent to the venue PC system;
Step 2, the venue PC system transmits the above audio stream and video stream one way over the network to the audio/video cloud server; the audio stream forms the original-audio live room on the cloud server, that is, audio/video content in the speaker's own language is automatically stored in the original-audio live room; the video stream is stored on the cloud server as a common video stream that interpreters at the interpreter end can retrieve;
Step 3, an interpreter at the interpreter end selects the original-audio live room as input, translates the audio stream in the original-audio live room into a first-language audio stream, and outputs it to the audio/video cloud server for storage, forming the first live room;
Step 4, an interpreter at the interpreter end selects the first live room as input, translates the first-language audio stream in the first live room into a second-language audio stream, and outputs it to the audio/video cloud server for storage, forming the second live room;
Step 5, audience members at the venue select the audio stream of the original-audio live room, the first live room, or the second live room as needed and listen to it.
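The five steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the names `LiveRoom`, `CloudServer`, `publish`, `read`, and `interpret` are hypothetical and do not come from the patent, audio chunks are stood in for by strings, and the `translate` callables stand in for the human interpreters.

```python
from dataclasses import dataclass, field


@dataclass
class LiveRoom:
    """A one-way audio room on the cloud server (hypothetical model)."""
    language: str
    chunks: list = field(default_factory=list)


class CloudServer:
    """Holds the live rooms plus the shared speaker video (hypothetical)."""

    def __init__(self):
        self.rooms = {}          # room name -> LiveRoom
        self.common_video = []   # common video stream, kept outside the rooms

    def publish(self, room, language, chunk):
        # Create the room on first use, then append the audio chunk.
        self.rooms.setdefault(room, LiveRoom(language)).chunks.append(chunk)

    def read(self, room):
        return list(self.rooms[room].chunks)


def interpret(server, src_room, dst_room, dst_language, translate):
    """Steps 3 and 4: read one room and write the translation to another."""
    for chunk in server.read(src_room):
        server.publish(dst_room, dst_language, translate(chunk))


server = CloudServer()
# Step 2: the venue PC uploads the speaker's audio and video one way.
server.publish("original", "fr", "bonjour")
server.common_video.append("frame-0")
# Step 3: original room -> first room; step 4: first room -> second room.
interpret(server, "original", "first", "zh", lambda c: f"zh({c})")
interpret(server, "first", "second", "en", lambda c: f"en({c})")
# Step 5: a listener picks whichever room they need.
heard = server.read("second")
```

In this sketch the second room is produced purely from the first room, which mirrors how step 4 takes the first live room, not the original audio, as its input.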
In other embodiments of the invention, based on the foregoing scheme, each live room stores both an audio stream and a video stream, available for retrieval by interpreters whose input language is the language of that audio stream.
In some embodiments of the invention, based on the foregoing scheme, the method further includes, after step 4, an interpreter at the interpreter end retrieving the audio stream from the original-audio live room, the first live room, or the second live room and translating it into a third language different from the original language, the first language, and the second language; the result is sent to the cloud server and stored, forming a third live room.
In some embodiments of the invention, an audience member at the venue can choose not to listen through the cloud server and instead listen directly to the speaker in the venue. In that case the listener must be seated where the speaker can be heard, and the listener's language must be the same as the speaker's, or the listener must be able to understand the speaker's language.
According to a second aspect of the invention, a device for remote simultaneous interpretation based on audio/video communication is provided, characterized in that the device includes:
An audio capture microphone for capturing the speaker's audio stream at the venue; the microphones may be arranged in an array so that the speaker's audio is captured clearly and accurately;
A video capture camera for capturing the speaker's video stream at the venue; the cameras may also be arranged in an array, capturing the speaker's video stream from different angles;
A venue PC for storing the above audio stream and video stream;
An audio/video cloud server, in which an original-audio live room, a first live room, and a second live room are formed;
An interpreter-end device, through which an interpreter extracts an input audio stream and an input video stream from the audio/video cloud server and, after processing, outputs an output audio stream and an output video stream to the audio/video cloud server;
A venue listening device, through which an audience member extracts the required audio stream from a live room on the audio/video cloud server.
The venue PC and the interpreter end are connected to the audio/video cloud server over a network, and the venue listening devices are connected to the audio/video cloud server over a network.
In some embodiments of the invention, an interpreter extracts the audio stream in the original-audio live room from the cloud server, translates it into a first language different from the original language, and outputs it to the cloud server, where it is stored in the first live room. Another interpreter can extract the first-language audio stream in the first live room from the cloud server, translate it into a second language different from both the original language and the first language, and output it to the cloud server, where it is stored in the second live room.
In other embodiments of the invention, each live room on the audio/video cloud server stores both the audio stream and the video stream for the corresponding language.
In other embodiments of the invention, each live room on the audio/video cloud server stores only the audio stream of the corresponding language, and the video stream is stored in a separate storage area on the cloud server.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the invention.
Brief description of the drawings
The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the invention and, together with the specification, serve to explain its principles. The drawings described below are obviously only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 shows a remote simultaneous interpretation workflow based on the audio/video cloud server;
Fig. 2 schematically shows the working principle of remote simultaneous interpretation according to an embodiment of the invention;
Fig. 3 schematically shows a remote simultaneous interpretation working system according to an embodiment of the invention.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that the invention will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of embodiments of the invention. However, those skilled in the art will appreciate that the technical solution of the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are merely illustrative; they need not include all contents and operations/steps, nor must the steps be executed in the order described. For example, some operations/steps can be split and others can be merged or partly merged, so the actual order of execution may change according to the actual situation.
Fig. 1 shows a method for remote simultaneous interpretation based on audio/video communication. The audio stream and video stream of the venue speaker are captured by a microphone array and a camera array and input to the venue PC system for storage. The venue PC system transmits these audio and video streams one way over the network to the audio/video cloud server (the server end); the speaker's audio stream captured by the microphone array forms the original-audio live room on the cloud server, and the video stream forms a common video stream on the cloud server. An interpreter at the interpreter end uses the interpreter PC software to select the original-audio live room as input, translates the original audio stream in that room into a first-language audio stream, and outputs it to the audio/video cloud server, forming the first live room. Another interpreter at the interpreter end selects the first live room as input, translates the first-language audio stream in it into a second-language audio stream, and outputs it to the audio/video cloud server, forming the second live room. Throughout this process, interpreters can retrieve the video stream data stored on the cloud server at any time, so that they can observe the speaker's gestures and expressions and provide a better interpretation service. Audience members at the venue use the venue audience app or other listening devices to select and listen to the audio stream of the original-audio live room, the first live room, or the second live room as needed. Of course, if the speaker is clearly audible on site and a listener understands the speaker's language, the listener can choose to listen to the speech directly rather than through a device connected to the cloud server.
In this scheme, before the meeting starts, the meeting host creates an online meeting room; each meeting corresponds to one online meeting room. An online meeting room contains one video feed and multiple one-way voice live rooms. The video feed is the speaker's video stream captured by the camera array; the venue host can use the PC software to point the camera at the current speaker, and the speaker's video image is sent to the interpreters through the remote simultaneous interpretation audio/video software system. Each language involved in the meeting corresponds to one one-way voice live room, with the speaker's audio stream serving as the original-audio live room by default. In the interpreter software, each interpreter selects the corresponding input live room and output live room according to their own source and target languages.
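The room setup described above can be illustrated with a minimal sketch. The class `OnlineMeetingRoom`, the helper `select_rooms`, the meeting id, and the room naming scheme are all hypothetical names chosen for this example; the patent does not specify any data structure or API.

```python
from dataclasses import dataclass, field


@dataclass
class OnlineMeetingRoom:
    """One meeting = one video feed plus one voice room per language."""
    meeting_id: str
    original_language: str
    video_feed: str = "speaker-video"
    voice_rooms: dict = field(default_factory=dict)  # language -> room name

    def add_language(self, language):
        # The speaker's language maps to the original-audio room by default.
        if language == self.original_language:
            name = "original"
        else:
            name = f"room-{language}"
        self.voice_rooms[language] = name
        return name


def select_rooms(meeting, source_language, target_language):
    """Return the (input, output) rooms for an interpreter's language pair."""
    return (meeting.voice_rooms[source_language],
            meeting.voice_rooms[target_language])


meeting = OnlineMeetingRoom("conf-001", original_language="fr")
for lang in ("fr", "zh", "en"):
    meeting.add_language(lang)

fr_zh = select_rooms(meeting, "fr", "zh")  # French-Chinese interpreter
zh_en = select_rooms(meeting, "zh", "en")  # Chinese-English interpreter
```

Here the French-Chinese interpreter reads from the original-audio room, while the Chinese-English interpreter reads from the Chinese room, as in the scheme's description of choosing rooms by source and target language.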
In the foregoing scheme, the video stream can also be packaged with an audio stream and stored in the different live rooms. For example, the speaker's audio stream is packaged with the speaker's video stream and stored in the original-audio live room; the first-language audio stream is packaged with the speaker's video stream and stored in the first live room; and so on. An interpreter then extracts the audio stream and video stream together from a live room on the cloud server. Audience members can choose different devices: a device with a display and earphones obtains both the audio stream and the video stream from a live room on the cloud server, while a device with only a listening function obtains just the audio stream.
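As a rough illustration of the packaged variant, the sketch below models a live room as a plain dictionary holding one audio stream and the speaker's video stream, and lets the listening device's capabilities decide what is fetched. The function `fetch_streams` and the dictionary keys are assumptions made for this example only.

```python
def fetch_streams(room, device_has_display):
    """Return what a listening device pulls from a packaged live room."""
    streams = {"audio": room["audio"]}
    if device_has_display:
        # Only devices with a screen also pull the speaker's video.
        streams["video"] = room["video"]
    return streams


# A packaged room: first-language audio stored together with the speaker video.
packaged_room = {"language": "zh", "audio": "zh-audio", "video": "speaker-video"}

headset = fetch_streams(packaged_room, device_has_display=False)
tablet = fetch_streams(packaged_room, device_has_display=True)
```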
Fig. 2 shows the working principle of remote simultaneous interpretation using three languages (Chinese, English, and French) as an example. Those skilled in the art will understand that other languages, or more languages, can be handled by analogy.
The venue speaker (French) inputs the audio stream to the venue PC system through the microphone array and the video stream through the camera array. Over the internet, the venue PC transmits these streams one way to the audio/video cloud server, where the audio stream is stored in the original-audio live room and the video stream is stored in a separate location. The interpreter whose source language is the original language (French) uses the interpreter software to choose the original-audio live room as the input live room and the Chinese live room as the output live room. This interpreter obtains the French audio stream from the original-audio live room and the video stream from the cloud server, translates the French into Chinese (the target language) with the help of both streams, and outputs the result to the Chinese live room for storage, completing the translation from French to Chinese. The Chinese-English interpreter in turn selects the Chinese live room as the input live room and the English live room as the output live room.
It follows that if a meeting is held in China and a French speaker speaks in French, only a French-Chinese interpreter is needed to translate the speech into Chinese. Interpreters for the other target languages, such as Chinese-English, Chinese-German, Chinese-Portuguese, and Chinese-Spanish interpreters, can take the Chinese live room as input and output their target languages, so there is no need to find French-English, French-German, or French-Portuguese interpreters.
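The relay arrangement in this example can be checked with a short sketch that walks the interpreter language pairs outward from the speaker's language. The function `relay_depths` and the two-letter language codes are hypothetical; the breadth-first search is simply one convenient way to show which languages are reachable and through how many interpreters, not something the patent prescribes.

```python
from collections import deque


def relay_depths(original, interpreter_pairs):
    """Breadth-first search from the speaker's language over interpreter pairs.

    Returns a dict mapping each reachable language to the number of
    interpreting hops between it and the original audio.
    """
    depth = {original: 0}
    queue = deque([original])
    while queue:
        src = queue.popleft()
        for a, b in interpreter_pairs:
            if a == src and b not in depth:
                depth[b] = depth[src] + 1
                queue.append(b)
    return depth


# One French-Chinese interpreter, then interpreters working from Chinese.
pairs = [("fr", "zh"), ("zh", "en"), ("zh", "de"), ("zh", "pt"), ("zh", "es")]
depths = relay_depths("fr", pairs)
```

With these pairs, only one interpreter works from French; every other target language is reached through the Chinese live room.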
Using their listening devices, audience members select the corresponding live room to hear the corresponding language: Chinese listeners select the Chinese live room and English listeners select the English live room. French listeners can listen to the speaker directly on site, and those seated far from the speaker can also select the original-audio live room.
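The listener's choice can be expressed as a small decision function. This is a sketch under assumptions: `choose_source`, the `"on-site"` marker, and the room names are invented for illustration and are not part of the patent.

```python
def choose_source(listener_language, original_language, rooms,
                  can_hear_speaker=False):
    """Pick where a listener gets audio: the hall itself or a live room."""
    if listener_language == original_language and can_hear_speaker:
        return "on-site"
    return rooms[listener_language]


rooms = {"fr": "original", "zh": "room-zh", "en": "room-en"}  # language -> room

near_french = choose_source("fr", "fr", rooms, can_hear_speaker=True)
far_french = choose_source("fr", "fr", rooms, can_hear_speaker=False)
chinese = choose_source("zh", "fr", rooms)
```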
Fig. 3 shows a device for remote simultaneous interpretation based on audio/video communication, characterized in that the device includes:
An audio capture device for capturing the speaker's audio stream at the venue; the audio capture device may be microphones arranged in an array, so that the speaker's audio is captured clearly and accurately;
A video capture device for capturing the speaker's video stream at the venue; the video capture device may be cameras arranged in an array, capturing the speaker's video stream from different angles;
A venue PC for storing the above audio stream and video stream;
An audio/video cloud server, in which an original-audio live room, a first live room, and a second live room are formed, the original-audio live room storing the speaker's speech audio stream, the first live room storing the first-language audio stream, and the second live room storing the second-language audio stream;
An interpreter-end device, through which an interpreter extracts an input audio stream and an input video stream from the corresponding live room on the audio/video cloud server and, after processing, outputs an output audio stream and an output video stream to the audio/video cloud server for storage in the corresponding live room;
A venue listening device, through which an audience member extracts the required audio stream from the corresponding live room on the audio/video cloud server.
The venue PC and the interpreter end are connected to the audio/video cloud server over a network, and the venue listening devices are connected to the audio/video cloud server over a network.
In one specific embodiment, the speaker speaks French. The audio capture device captures the French audio stream as the original audio and sends it to the venue PC, and the video capture device captures the speaker's video and sends it to the venue PC. The audio capture device may be a microphone or, preferably, a microphone array; the video capture device may be a camera or, preferably, a camera array.
The venue PC sends the original audio and the video over the network to the audio/video cloud server; the original audio stream is stored in the original-audio live room (French), and the video stream is stored in the video storage area.
At the interpreter end, the Chinese-French interpreter uses the Chinese-French interpreter terminal to extract the French audio stream from the original-audio live room of the audio/video cloud server and the speaker's video stream from the video storage area as input, processes them, and sends the translated Chinese audio stream as output to the Chinese live room on the audio/video cloud server. The Chinese-English interpreter uses the Chinese-English interpreter terminal to extract, from the Chinese live room, the Chinese audio stream that the Chinese-French interpreter output to the audio/video cloud server, together with the speaker's video stream stored in the video storage area of the cloud server, processes them, and sends the translated English audio stream as output to the English live room of the audio/video cloud server.
Audience members listen through the corresponding listening devices. For example, a Chinese listener uses a Chinese listening device to extract the Chinese audio stream from the Chinese live room on the audio/video cloud server, and an English listener uses an English listening device to extract the English audio stream from the English live room. A French listener can either listen directly to the speaker on site or use an original-audio or French listening device to extract the audio stream from the original-audio (French) live room on the audio/video cloud server.
In particular, on the audio/video cloud server, each live room can store both the corresponding audio stream and the speaker's video stream, so that an interpreter extracts the input audio stream and video stream together directly from a live room on the cloud server.
Audience members can choose different listening devices. For example, a listening device with a display screen extracts both the required audio stream and the video stream from the audio/video cloud server.
It should be noted that although several modules or units of the equipment for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the invention, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
Other embodiments of the invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or conventional techniques in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
Claims (10)
1. a kind of method of the remote synchronous translation based on audio/video communication, which is characterized in that this method includes:
Meeting-place spokesman's audio stream is input to meeting-place PC end systems by audio collecting device, passes through video acquisition by step 1
Meeting-place spokesman's video flowing is input to meeting-place PC end systems by equipment;
Step 2, meeting-place PC end systems by network by above-mentioned audio stream and video flowing one-way transmission to audio and video Cloud Server,
Above-mentioned audio stream is formed as primary sound direct broadcasting room in Cloud Server, and above-mentioned video flowing is formed as public video in Cloud Server
Stream;
Step 3, interpreter end interpreter select primary sound direct broadcasting room as input terminal, and primary sound direct broadcasting room sound intermediate frequency stream is translated into first
Languages audio stream is output to audio and video Cloud Server, is stored in the first direct broadcasting room;
Step 4, interpreter end interpreter selects the first direct broadcasting room as input terminal, by the first languages audio stream in the first direct broadcasting room
It translates into the second languages audio stream and is output to audio and video Cloud Server, be stored in the second direct broadcasting room;
Step 5, meeting-place audience select primary sound direct broadcasting room or the audio stream of the first direct broadcasting room or the second direct broadcasting room to carry out as needed
It listens to.
2. The method according to claim 1, characterized in that in step 2 the audio stream and the video stream are packaged together and stored in the original-sound live broadcast room in the audio/video cloud server; in step 3 the first-language audio stream and the video stream are packaged together and stored in the first live broadcast room in the audio/video cloud server; and in step 4 the second-language audio stream and the video stream are packaged together and stored in the second live broadcast room in the audio/video cloud server.
3. The method according to claim 1 or 2, characterized in that the audio capture device is a microphone or a microphone array, and the video capture device is a camera or a camera array.
4. The method according to any one of claims 1-3, characterized in that the method further comprises, after step 4, forming a third live broadcast room that stores a third language different from the first language and the second language.
5. The method according to any one of claims 1-4, characterized in that in step 5 the audience may choose to listen directly to the speech of the speaker at the venue, without listening through the audio/video cloud server.
6. The method according to any one of claims 1-4, wherein the venue audience listens only to the audio streams of the live broadcast rooms in the cloud server.
7. A device for remote synchronous translation based on audio/video communication, characterized in that the device comprises:
an audio capture device for capturing the audio stream of the speaker at the venue;
a video capture device for capturing the video stream of the speaker at the venue;
a venue PC terminal for storing the audio stream and the video stream;
an audio/video cloud server in which an original-sound live broadcast room, a first live broadcast room, and a second live broadcast room are formed;
an interpreter terminal device, through which an interpreter extracts an input audio stream and an input video stream from the audio/video cloud server and, after processing, outputs an output audio stream and an output video stream to the audio/video cloud server;
venue audience listening equipment, through which audience members extract the required audio stream from a live broadcast room in the audio/video cloud server.
8. The device according to claim 7, characterized in that the original-sound live broadcast room, the first live broadcast room, and the second live broadcast room each contain the audio stream of the corresponding language and the speaker's video stream.
9. The device according to claim 7, characterized in that the original-sound live broadcast room, the first live broadcast room, and the second live broadcast room contain only the audio stream of the corresponding language, and the video stream is stored separately in the audio/video cloud server.
10. The device according to any one of claims 7-9, wherein the audio/video cloud server is connected over a network to the interpreter terminal, the venue PC terminal, and the venue audience listening equipment for data transmission.
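The relay chain recited in claims 1 and 7 (original-sound room, then first room, then second room) can be illustrated with a minimal in-memory sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `CloudServer` and `Room` classes, the room names `"original"`, `"first"`, and `"second"`, and the string-tagging lambdas that stand in for human interpreters. Audio chunks are modeled as plain strings rather than real media data, and the video stream is kept separately from the rooms, as in claim 9.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Room:
    """One live broadcast room holding the audio chunks of a single language."""
    name: str
    chunks: List[str] = field(default_factory=list)


class CloudServer:
    """Hypothetical in-memory stand-in for the audio/video cloud server."""

    def __init__(self) -> None:
        self.rooms: Dict[str, Room] = {}
        self.public_video: List[str] = []  # video kept apart from the rooms

    def publish_audio(self, room_name: str, chunk: str) -> None:
        # Create the room on first use, then append the chunk to it.
        self.rooms.setdefault(room_name, Room(room_name)).chunks.append(chunk)

    def publish_video(self, frame: str) -> None:
        self.public_video.append(frame)

    def listen(self, room_name: str) -> List[str]:
        # Return a copy so that listeners cannot mutate the stored stream.
        return list(self.rooms[room_name].chunks)


def relay(server: CloudServer, source: str, target: str,
          interpret: Callable[[str], str]) -> None:
    """Interpreter step: read the source room, write the result to the target room."""
    for chunk in server.listen(source):
        server.publish_audio(target, interpret(chunk))


server = CloudServer()

# Steps 1-2: the venue PC terminal uploads the speaker's audio and video one-way.
for i, utterance in enumerate(["hello", "thank you"]):
    server.publish_audio("original", utterance)
    server.publish_video(f"frame-{i}")

# Step 3: the first interpreter takes the original-sound room as input.
relay(server, "original", "first", lambda s: f"L1({s})")

# Step 4: the second interpreter takes the first room as input (relay interpreting).
relay(server, "first", "second", lambda s: f"L2({s})")

# Step 5: an audience member selects whichever room is needed.
print(server.listen("second"))
```

In this sketch a third room (claim 4) would be one more `relay` call with a different target name. A real system would process streams continuously instead of replaying a finished list, which this batch-style model does not attempt to show.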
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810694423.6A CN108650484A (en) | 2018-06-29 | 2018-06-29 | A kind of method and device of the remote synchronous translation based on audio/video communication |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810694423.6A CN108650484A (en) | 2018-06-29 | 2018-06-29 | A kind of method and device of the remote synchronous translation based on audio/video communication |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108650484A (en) | 2018-10-12 |
Family
ID=63749969
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810694423.6A Pending CN108650484A (en) | 2018-06-29 | 2018-06-29 | A kind of method and device of the remote synchronous translation based on audio/video communication |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108650484A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1604573A (en) * | 2004-11-09 | 2005-04-06 | 北京中星微电子有限公司 | A multipath audio frequency buffering method under IP network environment |
US20080300860A1 (en) * | 2007-06-01 | 2008-12-04 | Rgb Translation, Llc | Language translation for customers at retail locations or branches |
CN101631032A (en) * | 2009-08-27 | 2010-01-20 | 深圳华为通信技术有限公司 | Method, device and system for realizing multilingual meetings |
CN202838331U (en) * | 2012-09-14 | 2013-03-27 | 谭建中 | Long-distance synchrony translation system |
US20130342632A1 (en) * | 2012-06-25 | 2013-12-26 | Chi-Chung Su | Video conference apparatus and method for audio-video synchronization |
CN107079069A (en) * | 2014-10-19 | 2017-08-18 | Televic会议股份有限公司 | Interpreter's table of conference system |
KR20170111905A (en) * | 2016-03-30 | 2017-10-12 | 주식회사 플렉싱크 | A Conference Contents Providing System Using the Simultaneous Interpretation Sound |
CN208675397U (en) * | 2018-06-29 | 2019-03-29 | 中译语通科技股份有限公司 | A kind of device of the remote synchronous translation based on audio/video communication |
2018
- 2018-06-29 CN CN201810694423.6A patent/CN108650484A/en active Pending
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110166729B (en) * | 2019-05-30 | 2021-03-02 | 上海赛连信息科技有限公司 | Cloud video conference method, device, system, medium and computing equipment |
CN110677406A (en) * | 2019-09-26 | 2020-01-10 | 上海译牛科技有限公司 | Simultaneous interpretation method and system based on network |
EP4013043A1 (en) * | 2020-12-09 | 2022-06-15 | Alfaview Video Conferencing Systems GmbH & Co. KG | Video conferencing system, information transmission method and computer program product |
US11825238B2 (en) | 2020-12-09 | 2023-11-21 | alfaview Video Conferencing Systems GmbH & Co. KG | Videoconference system, method for transmitting information and computer program product |
WO2022127826A1 (en) * | 2020-12-15 | 2022-06-23 | 华为云计算技术有限公司 | Simultaneous interpretation method, apparatus and system |
CN112738446A (en) * | 2020-12-28 | 2021-04-30 | 传神语联网网络科技股份有限公司 | Simultaneous interpretation method and system based on online conference |
CN112735430A (en) * | 2020-12-28 | 2021-04-30 | 传神语联网网络科技股份有限公司 | Multilingual online simultaneous interpretation system |
CN115460371A (en) * | 2021-06-09 | 2022-12-09 | 苏州译牛智能科技有限公司 | Simultaneous interpretation method in video conference, server and readable storage medium |
CN114584735A (en) * | 2022-01-12 | 2022-06-03 | 甲骨易(北京)语言科技股份有限公司 | Online conference simultaneous transmission live broadcast method and system |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108650484A (en) | A kind of method and device of the remote synchronous translation based on audio/video communication | |
CN208675397U (en) | A kind of device of the remote synchronous translation based on audio/video communication | |
CN107708006B (en) | Computer-readable storage medium, real-time translation system | |
CN110166729B (en) | Cloud video conference method, device, system, medium and computing equipment | |
CN102005142B (en) | Information interaction method for teaching | |
KR101970731B1 (en) | Artificial intelligent speaker and its control method | |
CN111447397B (en) | Video conference based translation method, video conference system and translation device | |
CN111739553A (en) | Conference sound acquisition method, conference recording method, conference record presentation method and device | |
WO2019000515A1 (en) | Voice call method and device | |
CN105933738B (en) | Net cast methods, devices and systems | |
JP7448672B2 (en) | Information processing methods, systems, devices, electronic devices and storage media | |
EP2924985A1 (en) | Low-bit-rate video conference system and method, sending end device, and receiving end device | |
CN103763627A (en) | Method and system for realizing real-time video conference | |
CN108337556B (en) | Method and device for playing audio-video file | |
CN110971685B (en) | Content processing method, content processing device, computer equipment and storage medium | |
CN114339302B (en) | Method, device, equipment and computer storage medium for guiding broadcast | |
CN114979545A (en) | Multi-terminal call method, storage medium and electronic device | |
CN108320331B (en) | Method and equipment for generating augmented reality video information of user scene | |
CN112839192A (en) | Audio and video communication system and method based on browser | |
CN105450970A (en) | Information processing method and electronic equipment | |
CN112735430A (en) | Multilingual online simultaneous interpretation system | |
KR102042247B1 (en) | Wireless transceiver for Real-time multi-user multi-language interpretation and the method thereof | |
CN106911901A (en) | A kind of data processing method and system for intelligent robot | |
CN110335610A (en) | The control method and display of multimedia translation | |
CN110111768A (en) | Audio synchronous transmission method, system and computer equipment, computer readable storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||