WO2009043275A1 - A method and a system of video communication and a device for video communication - Google Patents

A method and a system of video communication and a device for video communication

Info

Publication number
WO2009043275A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio stream
terminal
picture
sub
orientation information
Prior art date
Application number
PCT/CN2008/072483
Other languages
English (en)
French (fr)
Inventor
Wuzhou Zhan
Original Assignee
Shenzhen Huawei Telecommunication Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huawei Telecommunication Technologies Co., Ltd. filed Critical Shenzhen Huawei Telecommunication Technologies Co., Ltd.
Priority to JP2010526135A priority Critical patent/JP5198567B2/ja
Priority to EP08800966A priority patent/EP2202970A4/en
Publication of WO2009043275A1 publication Critical patent/WO2009043275A1/zh
Priority to US12/732,999 priority patent/US8259625B2/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • H04N7/152Multipoint control units therefor

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a video communication method, system, and apparatus for video communication.
  • a video conferencing system includes a multipoint control unit (MCU, Micro Controller Unit), a mono terminal, multichannel terminals with at least two channels, and other devices.
  • after a terminal establishes a connection with the MCU, the terminal reports the position and number of its speakers to the MCU, and the MCU assigns a number of channels to each terminal according to that speaker configuration. For example, if a terminal has only one speaker, only mono is assigned; if it has two speakers, two channels are assigned; and if it has four speakers, four channels are assigned.
  • during the conference, the MCU receives the video stream and the audio stream of each endpoint, combines the video streams into one multi-picture, and sends it to the terminals, while the audio streams are generated according to each terminal's channel configuration.
  • for example, if terminal one has four channels, four audio streams are generated for terminal one, each corresponding to one of its speakers.
  • the audio streams are generally generated by adjusting amplitude and delay. After such processing, terminal one perceives the sound as coming from the speaker's position in the picture, so the sound carries a sense of orientation.
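The amplitude-and-delay adjustment described above can be sketched as follows. This is a minimal illustration, not the patent's actual procedure: the linear pan law and the 0.5 ms maximum delay are assumptions chosen for the example.

```python
import numpy as np

def pan_amplitude_delay(signal, w, sample_rate=48000, max_delay_ms=0.5):
    """Render a mono signal (1-D float array) to a stereo speaker pair so
    that the sound appears to come from horizontal position w in
    [-1 (left), +1 (right)].

    Amplitude: simple linear pan between the two speakers.
    Delay: the speaker farther from the source is delayed by up to
    max_delay_ms (an illustrative value, not from the patent)."""
    g_left = (1.0 - w) / 2.0
    g_right = (1.0 + w) / 2.0
    delay = int(round(abs(w) * max_delay_ms * 1e-3 * sample_rate))
    pad = np.zeros(delay)
    if w >= 0:  # source toward the right: delay the left channel
        left = np.concatenate([pad, g_left * signal])
        right = np.concatenate([g_right * signal, pad])
    else:       # source toward the left: delay the right channel
        left = np.concatenate([g_left * signal, pad])
        right = np.concatenate([pad, g_right * signal])
    return left, right
```

A centered source (w = 0) yields equal gains and no delay; a hard-right source (w = 1) silences the left channel and delays it by 24 samples at 48 kHz.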
  • in researching and practicing the prior art, the inventor found the following problem: the MCU must know the speaker configuration in advance in order to generate the corresponding number of audio streams, which makes the coupling between the MCU and the terminal too tight and insufficiently flexible.
  • the technical problem to be solved by the embodiments of the present invention is to provide a video communication method and system, and an apparatus for video communication, which can loosen the coupling between the multipoint control unit and the terminal and improve flexibility.
  • An embodiment of the present invention provides a video communication method, including:
  • another embodiment of the present invention further provides a computer program product comprising computer program code which, when executed by a computer, causes the computer to perform any of the steps of the video communication method.
  • yet another embodiment of the present invention further provides a computer readable storage medium storing computer program code which, when executed by a computer, causes the computer to perform any of the steps of the video communication method.
  • Another embodiment of the present invention further provides a video communication system, including:
  • an identifying unit, configured to identify the sub-picture in the synthesized picture corresponding to each received audio stream;
  • an acquiring unit, configured to acquire the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture;
  • a sending unit, configured to send the audio streams and the corresponding orientation information; and
  • a terminal unit, configured to process the audio signal according to the received orientation information, so that the audio stream carries orientation information.
  • a further embodiment of the present invention provides an apparatus for video communication, including: an identifying unit, configured to identify the sub-picture in the synthesized picture corresponding to each received audio stream; an acquiring unit, configured to acquire the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture; and
  • a sending unit, configured to send the audio streams and the corresponding orientation information.
  • it can be seen from the above technical solutions that, because the sub-picture in the synthesized picture corresponding to each received audio stream is identified and the orientation information of each audio stream is obtained, and the audio streams and the corresponding orientation information are sent to the terminal, the configuration of the terminal's speakers does not need to be known; the terminal processes the audio signal according to the orientation information of the received audio streams so that the audio streams carry orientation information. This loosens the coupling between the multipoint control unit and the terminal and improves flexibility.
  • FIG. 1 is a schematic diagram of a video system according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of video processing according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of performing audio processing according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram showing relative positions according to an embodiment of the present invention
  • FIG. 5 is a flowchart of a method according to Embodiment 1 of the present invention
  • FIG. 6 is a flowchart of a method according to Embodiment 2 of the present invention
  • FIG. 7 is a flowchart of a method according to Embodiment 3 of the present invention
  • Figure 8 is a schematic diagram of a system according to an embodiment of the present invention
  • Figure 9 is a schematic diagram of an apparatus according to an embodiment of the present invention.
  • the embodiments of the invention provide a video communication method and system, and an apparatus for video communication, which improve the flexibility of the system in video communication.
  • the first terminal 101, the second terminal 102, and the third terminal 103 each send their video stream and audio stream to the multipoint control unit 104.
  • the multipoint control unit processes the received video and audio streams and sends the processed audio and video streams to the first terminal 101, the second terminal 102, and the third terminal 103.
  • FIG. 2 is a schematic diagram of performing video processing according to an embodiment of the present invention.
  • for example, the first terminal 101 requests to view a picture synthesized from the second terminal 102 and the third terminal 103,
  • the second terminal 102 requests to view a picture synthesized from the second terminal 102 and the third terminal 103, and
  • the third terminal 103 requests to view the video of the second terminal 102.
  • according to these requests, the multipoint control unit 104 forwards the video stream of the second terminal 102 directly to the third terminal 103, and also decodes the video streams of the second terminal 102 and the third terminal 103, synthesizes the multi-picture, and, after encoding, sends it to the first terminal 101 and the second terminal 102.
  • when synthesizing the multi-picture, the resolution of each terminal's video signal in the multi-picture can be adjusted as needed. For example, for a multi-picture with the second terminal 102 in the left sub-picture and the third terminal 103 in the right sub-picture, the horizontal resolution of the second terminal 102 and the third terminal 103 can be halved so that the resolution of the synthesized multi-picture stays unchanged; for a virtual conference system or other demanding occasions, the resolutions of the second terminal 102 and the third terminal 103 may instead be left unchanged and the two video signals simply spliced together horizontally, so that the resolution of the synthesized multi-picture signal is doubled.
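The two composition strategies just described (halve each source horizontally so the composite keeps the original width, or splice the sources at full width and double it) can be sketched with plain arrays. The column-decimation stand-in for proper scaling is an assumption of this sketch, not the patent's scaling method.

```python
import numpy as np

def compose_side_by_side(frame_a, frame_b, keep_resolution=True):
    """Synthesize a two-sub-picture frame: frame_a on the left,
    frame_b on the right (H x W x 3 uint8 arrays of equal shape).

    keep_resolution=True halves each frame horizontally (simple column
    decimation stands in for proper downscaling) so the composite keeps
    the original width; False splices the frames at full width, doubling
    the composite's horizontal resolution."""
    if keep_resolution:
        frame_a = frame_a[:, ::2]   # drop every other column
        frame_b = frame_b[:, ::2]
    return np.concatenate([frame_a, frame_b], axis=1)
```

With two 4x8 frames, the default mode yields a 4x8 composite (left half from frame_a, right half from frame_b), while `keep_resolution=False` yields a 4x16 composite.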
  • the multipoint control unit 104 decodes the audio streams of the terminals, mixes them, encodes the mixed sound, and sends the encoded audio signal to the terminals. When mixing, a terminal's own sound is normally not mixed in. For example, the multipoint control unit 104 mixes the audio streams of the second terminal 102 and the third terminal 103, then encodes the mix and sends it to the first terminal 101; mixes the audio streams of the first terminal 101 and the third terminal 103, then encodes the mix and sends it to the second terminal 102; and mixes the audio streams of the first terminal 101 and the second terminal 102, then encodes the mix and sends it to the third terminal 103.
  • FIG. 3 is a schematic diagram of performing audio processing according to an embodiment of the present invention.
  • the first terminal 101, the second terminal 102, and the third terminal 103 send the audio stream to the multipoint control unit 104.
  • the multipoint control unit 104 receives and decodes the audio stream of each terminal, mixes the decoded streams, encodes the mixed audio streams, and sends them to the terminals. For example, the mixed stream of the second terminal and the third terminal is sent to the first terminal, the mixed stream of the first terminal and the third terminal is sent to the second terminal, and the mixed stream of the first terminal and the second terminal is sent to the third terminal.
  • the video stream sent by the multi-point control unit to the first terminal is a composite picture of the second terminal and the third terminal
  • the second terminal is in the left sub-picture
  • the third terminal is in the right sub-picture
  • the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal
  • the audio stream of the second terminal is identified as corresponding to the left sub-picture,
  • and the audio stream of the third terminal as corresponding to the right sub-picture.
  • the video stream sent by the multipoint control unit to the second terminal is a composite picture of the second terminal and the third terminal
  • the audio stream sent by the multipoint control unit to the second terminal includes the audio streams of the first terminal and the third terminal.
  • the audio stream of the third terminal is identified as corresponding to the right sub-picture, but the audio stream of the first terminal has no corresponding sub-picture and is identified as a voice-over; an identifier other than voice-over may also be used.
  • the video stream sent by the multipoint control unit to the third terminal is the video stream of the second terminal
  • the audio stream sent by the multipoint control unit to the third terminal includes the audio stream of the first terminal and the audio stream of the second terminal.
  • the third terminal sees a single picture of the second terminal; the single picture is treated as a special case of the composite picture.
  • the audio stream of the second terminal is identified as corresponding to the single picture, and the audio stream of the first terminal is identified as a voice-over.
  • the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal; the audio stream of the second terminal is placed on the first channel and the audio stream of the third terminal on the second channel.
  • if the multipoint control unit sends many audio streams to a terminal, then to reduce the bit rate, the stream with the largest energy can be placed on the first channel and the second largest on the second channel, and the remaining audio streams can be decoded, mixed, and encoded into one audio stream placed on the third channel.
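The energy-ranked channel assignment described above can be sketched as follows. Treating stream energy as the sum of squared samples and averaging the remainder into the third channel are assumptions of this sketch, since the patent does not specify the mixing arithmetic.

```python
import numpy as np

def assign_channels(streams):
    """Given decoded audio streams (equal-length float arrays), place
    the highest-energy stream on the first channel and the second
    highest on the second channel, and mix all remaining streams into
    a third channel, as the rate-reduction scheme above describes."""
    order = sorted(range(len(streams)),
                   key=lambda i: float(np.sum(streams[i] ** 2)),
                   reverse=True)
    ch1 = streams[order[0]]
    ch2 = streams[order[1]]
    rest = order[2:]
    ch3 = (sum(streams[i] for i in rest) / len(rest)
           if rest else np.zeros_like(ch1))
    return ch1, ch2, ch3
```

In a real MCU the third channel would be re-encoded after mixing; here the mix is simply the average of the leftover streams.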
  • the orientation information may be sent directly to the terminal, or it may be passed to the audio stream combining unit, which embeds the orientation information into the audio stream and sends it to the terminal together with the audio stream.
  • the terminal applies HRTF (Head Related Transfer Function) filtering to the audio signal according to the orientation information of the received audio streams, so that the audio streams carry orientation information.
  • in this embodiment, the orientation information is represented by angles in the horizontal and vertical directions, and the filtering uses the head related transfer function HRTF.
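Embodiment 1 represents orientation as horizontal and vertical angles but does not spell out how a sub-picture position maps to an angle. Below is a plausible sketch, assuming a viewer centered in front of a display with a known field of view; the function name and the FOV values are invented for illustration.

```python
import math

def subpicture_angles(cx, cy, width, height,
                      fov_h_deg=60.0, fov_v_deg=34.0):
    """Map a sub-picture center (cx, cy), in pixels from the top-left
    of a width x height composite picture, to (azimuth, elevation)
    angles in degrees. The viewer is assumed centered in front of the
    picture, with the display spanning the given horizontal/vertical
    fields of view; both FOV values are illustrative assumptions, not
    values from the patent."""
    rel_w = (cx - width / 2) / (width / 2)    # [-1, 1], positive = right
    rel_h = (height / 2 - cy) / (height / 2)  # [-1, 1], positive = up
    return rel_w * fov_h_deg / 2, rel_h * fov_v_deg / 2
```

A left sub-picture centered a quarter of the way across a 1920-wide composite then maps to an azimuth of -15 degrees under the assumed 60-degree horizontal FOV.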
  • referring to FIG. 6, which is a flowchart of the method provided by Embodiment 2 of the present invention:
  • 301: identify the sub-picture in the synthesized picture corresponding to each received audio stream. The following examples illustrate the received audio streams and the synthesized picture:
  • the video stream sent by the multi-point control unit to the first terminal is a composite picture of the second terminal and the third terminal
  • the second terminal is in the left sub-picture
  • the third terminal is in the right sub-picture
  • the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal
  • the audio stream of the second terminal is identified as corresponding to the left sub-picture,
  • and the audio stream of the third terminal as corresponding to the right sub-picture.
  • the video stream sent by the multipoint control unit to the second terminal is a composite picture of the second terminal and the third terminal
  • the audio stream sent by the multipoint control unit to the second terminal includes the audio streams of the first terminal and the third terminal.
  • the audio stream of the third terminal is identified as corresponding to the right sub-picture, but the audio stream of the first terminal has no corresponding sub-picture and is identified as a voice-over; an identifier other than voice-over may also be used, for example, identifying the stream as a no-picture audio stream.
  • the video stream sent by the multipoint control unit to the third terminal is the video stream of the second terminal
  • the audio stream sent by the multipoint control unit to the third terminal includes the audio stream of the first terminal and the audio stream of the second terminal.
  • the third terminal sees a single picture of the second terminal; the single picture is treated as a special case of the composite picture.
  • the audio stream of the second terminal is identified as corresponding to the single picture, and the audio stream of the first terminal is identified as a voice-over.
  • the audio stream sent to terminal 1 is a mix of terminal 2 and terminal 3, where the audio stream of terminal 2 participating in the mix corresponds to the left sub-picture and the audio stream of terminal 3 corresponds to the right sub-picture; the center point of the left sub-picture is C1 and the center point of the right sub-picture is C2, so the orientation information of the audio streams of terminal 2 and terminal 3 can be represented by the relative distances of points C1 and C2 in the horizontal and vertical directions: the orientation information of the terminal 2 audio stream is (-0.5, 0), and that of the terminal 3 audio stream is (0.5, 0).
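The (-0.5, 0) and (0.5, 0) values above follow from measuring each sub-picture center relative to the center of the composite picture. A sketch using the relative-distance convention of this embodiment (the formulas w' = w0/(w/2) and h' = h0/(h/2) from the description):

```python
def orientation_from_center(cx, cy, width, height):
    """Relative-distance orientation of a sub-picture whose center is
    (cx, cy) in pixels, measured from the top-left of a width x height
    composite picture. Follows the patent's formulas w' = w0/(w/2) and
    h' = h0/(h/2), where (w0, h0) is the offset of the point from the
    center of the picture."""
    w0 = cx - width / 2
    h0 = cy - height / 2
    return w0 / (width / 2), h0 / (height / 2)

# Two equal sub-pictures side by side in a 1280x720 composite:
# left sub-picture center C1 = (320, 360), right center C2 = (960, 360)
```

For those two centers the function reproduces the orientation values quoted in the text.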
  • the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal; the multipoint control unit places the audio stream of the second terminal on the first channel and the audio stream of the third terminal on the second channel.
  • in addition, if many audio streams are sent to a terminal, then to reduce the bit rate, the stream with the largest energy can be placed on the first channel and the second largest on the second channel, and the remaining audio streams can be decoded, mixed, and encoded into one audio stream placed on the third channel.
  • the orientation information can be sent directly to the terminal, or it can be embedded into the audio stream and sent to the terminal together with the audio stream.
  • the terminal applies HRTF (Head Related Transfer Function) filtering to the audio signal according to the orientation information of the received audio streams, so that the audio streams carry orientation information.
  • in this embodiment, the orientation information is represented by the relative distance in the horizontal direction and the relative distance in the vertical direction, and the filtering uses the head related transfer function HRTF.
  • FIG. 7 is a flowchart of a method according to Embodiment 3 of the present invention:
  • the video stream sent by the multipoint control unit to the first terminal is a composite picture of the second terminal and the third terminal,
  • with the second terminal in the left sub-picture
  • and the third terminal in the right sub-picture.
  • the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal; it is identified that the audio stream of the second terminal corresponds to the left sub-picture
  • and the audio stream of the third terminal corresponds to the right sub-picture.
  • the video stream sent to the second terminal is a composite picture of the second terminal and the third terminal
  • the audio stream sent by the multi-point control unit to the second terminal includes the audio stream of the first terminal and the audio stream of the third terminal.
  • the audio stream of the third terminal is identified as corresponding to the right sub-picture, but the audio stream of the first terminal has no corresponding sub-picture and is identified as a voice-over; an identifier other than voice-over may also be used, for example, identifying the stream as a no-picture audio stream.
  • the video stream sent by the multipoint control unit to the third terminal is the video stream of the second terminal
  • the audio stream sent by the multipoint control unit to the third terminal includes the audio stream of the first terminal and the audio stream of the second terminal.
  • the third terminal sees a single picture of the second terminal; the single picture is treated as a special case of the composite picture.
  • the audio stream of the second terminal is identified as corresponding to the single picture, and the audio stream of the first terminal is identified as a voice-over.
  • according to the position of each sub-picture in the synthesized picture, obtain orientation information such as the relative distance of each audio stream in the horizontal direction and the relative distance in the vertical direction.
  • the relative-distance representation is shown in Figure 4: the audio streams participating in the mix carry no orientation information themselves; point O is the center point of the video image, w is the width of the image, and h is the height of the image.
  • taking point O as the origin of a coordinate system, the coordinates of point M in the image are (w0, h0), and its relative distances in the horizontal and vertical directions are w' = w0 / (w/2) and h' = h0 / (h/2).
  • the audio stream sent to terminal 1 is a mix of terminal 2 and terminal 3, where the audio stream of terminal 2 participating in the mix corresponds to the left sub-picture and the audio stream of terminal 3 corresponds to the right sub-picture; the center point of the left sub-picture is C1 and the center point of the right sub-picture is C2, so the orientation information of the audio streams of terminal 2 and terminal 3 can be represented by the relative distances of points C1 and C2 in the horizontal and vertical directions: the orientation information of the terminal 2 audio stream is (-0.5, 0), and that of the terminal 3 audio stream is (0.5, 0).
  • the previous step also mentioned voice-overs: for an audio stream that is a voice-over, the orientation information can be set to (-1, 0) or (1, 0), and for an audio stream corresponding to a single picture, the orientation information is (0, 0). If an audio stream participating in the mix already carries orientation information, the new orientation information is calculated as follows: for example, when the audio of terminal 2 and terminal 3, corresponding to the left and right sub-pictures respectively, is mixed and their own orientation information is (w'2, h'2) and (w'3, h'3) respectively, the new orientation information should be (-0.5 + (w'2/2), h'2) and (0.5 + (w'3/2), h'3).
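The rules above (the sub-picture center for plain streams, (-1, 0) or (1, 0) for voice-overs, (0, 0) for a single picture, and the folded-in formula for streams that already carry their own orientation) can be collected into one helper. The function name and signature are invented for this sketch.

```python
def mixed_stream_orientation(sub_center_w, own=None):
    """Orientation for a stream mixed into a sub-picture whose center
    has horizontal relative distance sub_center_w (e.g. -0.5 for the
    left sub-picture, +0.5 for the right, 0.0 for a single picture;
    a voice-over would simply be assigned (-1, 0) or (1, 0)).

    If the stream already carries its own orientation (w', h'), it is
    folded in as in the example above: (sub_center_w + w'/2, h');
    otherwise the sub-picture center itself is used."""
    if own is None:
        return (sub_center_w, 0.0)
    w_own, h_own = own
    return (sub_center_w + w_own / 2, h_own)
```

For terminal 2 in the left sub-picture with its own orientation (w'2, h'2) = (0.4, 0.2), the result is (-0.5 + 0.2, 0.2) = (-0.3, 0.2), matching the formula in the text.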
  • the audio stream sent to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal, the audio stream of the second terminal is placed in the first channel, and the audio stream of the third terminal is placed in the second channel.
  • if more audio streams are sent to a terminal, then to reduce the bit rate, the stream with the largest energy can be placed on the first channel and the second largest on the second channel, and the remaining audio streams can be decoded, mixed, and encoded into one audio stream placed on the third channel.
  • the terminal filters the audio signal by adjusting the sound intensity of the left and right channels according to the received orientation information of the audio streams, so that the audio streams carry orientation information.
  • the following two formulas can be used to describe the specific adjustment:
  • c is a fixed value
  • g1 is the left channel sound intensity gain
  • g2 is the right channel sound intensity gain
  • w is the relative distance in the horizontal direction calculated in the preceding step.
  • in this embodiment, the orientation information is represented by the relative distance in the horizontal direction and the relative distance in the vertical direction, and the filtering is performed by adjusting the amplitudes of the left and right channels.
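The two gain formulas referenced above are not reproduced in this text, so the sketch below substitutes a common constant-power pan law: c sets the overall level, and the gains g1 and g2 depend on the horizontal relative distance w. Treat the exact law as an assumption of this sketch, not the patent's formula.

```python
import math

def channel_gains(w, c=1.0):
    """Left/right sound-intensity gains (g1, g2) for a stream whose
    horizontal relative distance is w in [-1 (left), +1 (right)].

    Uses a constant-power pan law as a stand-in for the patent's
    unpublished formulas: c is the fixed overall level, and
    g1^2 + g2^2 == c^2 for every w, so perceived loudness stays
    constant as the source moves across the picture."""
    theta = (w + 1) * math.pi / 4     # map [-1, 1] -> [0, pi/2]
    g1 = c * math.cos(theta)          # left channel gain
    g2 = c * math.sin(theta)          # right channel gain
    return g1, g2
```

A centered source (w = 0) gets equal gains; a hard-right source (w = 1) puts nearly all the intensity on the right channel.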
  • the identifying unit 501 is configured to identify the sub-picture in the synthesized picture corresponding to each received audio stream; for example, the input audio stream interface of the multipoint control unit 104 receives the audio stream from each terminal and passes it to the identifying unit 501 corresponding to each receiving terminal.
  • the acquiring unit 502 is configured to acquire the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture; for example, by acquiring the horizontal and vertical angles of each audio stream, or the relative distances of each audio stream in the horizontal and vertical directions.
  • the terminal unit 504 is configured to process the audio signal according to the received orientation information so that the audio stream carries orientation information, for example, by adjusting the sound intensity of the left and right channels, or by filtering with HRTF technology.
  • the system further includes:
  • the audio stream combining unit 505 is configured to embed the orientation information into the audio stream and send it to the sending unit 503.
  • FIG. 9 is a schematic diagram of an apparatus according to an embodiment of the present invention, including:
  • the identifying unit 501 is configured to identify the sub-picture in the synthesized picture corresponding to each received audio stream; for example, the input audio stream interface of the multipoint control unit 104 receives the audio stream from each terminal and passes it to the identifying unit 501 corresponding to each receiving terminal.
  • the acquiring unit 502 is configured to acquire the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture; for example, by acquiring the horizontal and vertical angles of each audio stream, or the relative distances of each audio stream in the horizontal and vertical directions.
  • a sending unit 503, configured to send the audio streams and the corresponding orientation information; for example, the audio stream sent to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal, with the audio stream of the second terminal placed on the first channel and the audio stream of the third terminal on the second channel.
  • if many audio streams are sent to a terminal, the stream with the largest energy can be placed on the first channel and the second largest on the second channel, and the remaining audio streams can be decoded, mixed, and encoded into one audio stream placed on the third channel.
  • the device may further include an audio stream combining unit, configured to embed the orientation information into the audio stream and send it to the sending unit 503.
  • it can be seen from the above embodiments that, because the sub-picture in the synthesized picture corresponding to each received audio stream is identified and the orientation information of each audio stream is obtained, and the audio streams and the corresponding orientation information are sent to the terminal, the configuration of the terminal's speakers does not need to be known; the terminal processes the audio signal according to the orientation information of the received audio streams so that the audio streams carry orientation information. This loosens the coupling between the multipoint control unit and the terminal and improves flexibility.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

Description

A method and a system of video communication and a device for video communication. This application claims priority to Chinese Patent Application No. 200710151406.X, filed with the Chinese Patent Office on September 28, 2007 and entitled "A method and a system of video communication and a device for video communication".
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a video communication method and system and an apparatus for video communication.
Background
With the wide adoption of television sets, users demand ever larger screens, and some video communication systems even use a projector or a TV wall for display. In that case, if the picture is synthesized from at least two sub-pictures, the positions of the speakers in the different sub-pictures differ far more than they do when screen-size requirements are low, yet the position from which current multimedia communication systems emit sound does not change with the position of the person speaking. The orientation of the sound therefore does not match the sub-picture, which harms the realism of the video communication.
In the prior art, a video conferencing system includes a multipoint control unit (MCU, Micro Controller Unit), mono terminals, multichannel terminals with at least two channels, and other devices. After a terminal establishes a connection with the MCU, the terminal reports the position and number of its speakers to the MCU, and the MCU assigns a number of channels to each terminal according to that speaker configuration: for example, if a terminal has only one speaker, only mono is assigned; if it has two speakers, two channels are assigned; and if it has four speakers, four channels are assigned. During the conference, the MCU receives the video stream and the audio stream of each endpoint, combines the video streams into one multi-picture, and sends it to the terminals, while the audio streams are generated according to each terminal's channel configuration: for example, if terminal one has four channels, four audio streams are generated for terminal one, each corresponding to one of its speakers. The audio streams are generally generated by adjusting amplitude and delay; after such processing, terminal one perceives the sound as coming from the position of the person speaking in the picture, so the sound carries a sense of orientation.
In researching and practicing the prior art, the inventor found the following problem: the MCU must know the speaker configuration in advance in order to generate the corresponding number of audio streams, which makes the coupling between the MCU and the terminal too tight and insufficiently flexible.
Summary
The technical problem to be solved by the embodiments of the present invention is to provide a video communication method and system and an apparatus for video communication that can loosen the coupling between the multipoint control unit and the terminal and improve flexibility.
To solve the above technical problem, the embodiments provided by the present invention are realized through the following technical solutions:
An embodiment of the present invention provides a video communication method, including:
identifying the sub-picture in the synthesized picture corresponding to each received audio stream;
obtaining the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture; and sending the audio streams and the corresponding orientation information to the terminal, where the terminal processes the audio signal according to the orientation information of the received audio streams so that the audio streams carry orientation information.
Another embodiment of the present invention further provides a computer program product comprising computer program code which, when executed by a computer, causes the computer to perform any of the steps of the video communication method.
Yet another embodiment of the present invention further provides a computer readable storage medium storing computer program code which, when executed by a computer, causes the computer to perform any of the steps of the video communication method.
A further embodiment of the present invention provides a video communication system, including:
an identifying unit, configured to identify the sub-picture in the synthesized picture corresponding to each received audio stream; an acquiring unit, configured to acquire the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture;
a sending unit, configured to send the audio streams and the corresponding orientation information; and
a terminal unit, configured to process the audio signal according to the received orientation information so that the audio streams carry orientation information.
A further embodiment of the present invention provides an apparatus for video communication, including: an identifying unit, configured to identify the sub-picture in the synthesized picture corresponding to each received audio stream; an acquiring unit, configured to acquire the orientation information of each audio stream according to the position of each sub-picture in the synthesized picture; and
a sending unit, configured to send the audio streams and the corresponding orientation information.
It can be seen from the above technical solutions that, because the sub-picture in the synthesized picture corresponding to each received audio stream is identified and the orientation information of each audio stream is obtained, and the audio streams and the corresponding orientation information are sent to the terminal, the configuration of the terminal's speakers does not need to be known; the terminal processes the audio signal according to the orientation information of the received audio streams so that the audio streams carry orientation information. This loosens the coupling between the multipoint control unit and the terminal and improves flexibility.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a video conference system according to an embodiment of the present invention; FIG. 2 is a schematic diagram of video processing according to an embodiment of the present invention; FIG. 3 is a schematic diagram of audio processing according to an embodiment of the present invention; FIG. 4 is a schematic diagram showing relative positions according to an embodiment of the present invention; FIG. 5 is a flowchart of the method provided by Embodiment 1 of the present invention; FIG. 6 is a flowchart of the method provided by Embodiment 2 of the present invention; FIG. 7 is a flowchart of the method provided by Embodiment 3 of the present invention; FIG. 8 is a schematic diagram of a system according to an embodiment of the present invention; FIG. 9 is a schematic diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention provide a video communication method and system and an apparatus for video communication that improve the flexibility of the system in video communication. To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. Referring to FIG. 1, a schematic diagram of a video conference system according to an embodiment of the present invention: the first terminal 101, the second terminal 102, and the third terminal 103 each send their video stream and audio stream to the multipoint control unit 104; the multipoint control unit processes the received video and audio streams and sends the processed audio and video streams to the first terminal 101, the second terminal 102, and the third terminal 103.
The process by which the multipoint control unit processes the received video and audio streams is described below:
Referring to FIG. 2, a schematic diagram of video processing according to an embodiment of the present invention: for example, the first terminal 101 requests to view a picture synthesized from the second terminal 102 and the third terminal 103, the second terminal 102 requests to view a picture synthesized from the second terminal 102 and the third terminal 103, and the third terminal 103 requests to view the video of the second terminal 102. According to these requests, the multipoint control unit 104 forwards the video stream of the second terminal 102 directly to the third terminal 103, and also decodes the video streams of the second terminal 102 and the third terminal 103, synthesizes the multi-picture, and, after encoding, sends it to the first terminal 101 and the second terminal 102. When synthesizing the multi-picture, the resolution of each terminal's video signal in the multi-picture can be adjusted as needed: for example, for a multi-picture with the second terminal 102 in the left sub-picture and the third terminal 103 in the right sub-picture, the horizontal resolution of the second terminal 102 and the third terminal 103 can be halved so that the resolution of the synthesized multi-picture stays unchanged, while for a virtual conference system or other demanding occasions, the resolutions of the second terminal 102 and the third terminal 103 may instead be left unchanged and the two video signals simply spliced together horizontally, so that the resolution of the synthesized multi-picture signal is doubled.
The multipoint control unit 104 decodes the audio streams of the terminals, mixes them, encodes the mixed sound, and sends the encoded audio signal to the terminals. When mixing, a terminal's own sound is normally not mixed in: for example, the multipoint control unit 104 mixes the audio streams of the second terminal 102 and the third terminal 103, then encodes the mix and sends it to the first terminal 101; mixes the audio streams of the first terminal 101 and the third terminal 103, then encodes the mix and sends it to the second terminal 102; and mixes the audio streams of the first terminal 101 and the second terminal 102, then encodes the mix and sends it to the third terminal 103.
FIG. 3 is a schematic diagram of audio processing according to an embodiment of the present invention. The first terminal 101, the second terminal 102, and the third terminal 103 send their audio streams to the multipoint control unit 104; the multipoint control unit 104 receives and decodes the audio stream of each terminal, mixes the decoded streams, encodes the mixed audio streams, and sends them to the terminals: for example, the mixed stream of the second terminal and the third terminal is sent to the first terminal, the mixed stream of the first terminal and the third terminal is sent to the second terminal, and the mixed stream of the first terminal and the second terminal is sent to the third terminal. The method provided by the present invention is described in detail below with reference to the above schematic diagrams. Referring to FIG. 5, a flowchart of the method provided by Embodiment 1 of the present invention:
201: Identify the sub-picture in the synthesized picture corresponding to each received audio stream. The following examples illustrate the received audio streams and the synthesized picture:
Example one: the video stream sent by the multipoint control unit to the first terminal is a composite picture of the second terminal and the third terminal, with the second terminal in the left sub-picture and the third terminal in the right sub-picture; the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal, and it is identified that the audio stream of the second terminal corresponds to the left sub-picture and the audio stream of the third terminal corresponds to the right sub-picture. Example two: the video stream sent by the multipoint control unit to the second terminal is a composite picture of the second terminal and the third terminal; the audio stream sent by the multipoint control unit to the second terminal includes the audio stream of the first terminal and the audio stream of the third terminal, and it is identified that the audio stream of the third terminal corresponds to the right sub-picture, but the audio stream of the first terminal has no corresponding sub-picture and is identified as a voice-over; an identifier other than voice-over may also be used.
Example three: the video stream sent by the multipoint control unit to the third terminal is the video stream of the second terminal, and the audio stream sent by the multipoint control unit to the third terminal includes the audio stream of the first terminal and the audio stream of the second terminal; the third terminal sees a single picture of the second terminal, the single picture being treated as a special case of the composite picture, and it is identified that the audio stream of the second terminal corresponds to the single picture while the audio stream of the first terminal is identified as a voice-over.
202: According to the position of each sub-picture in the synthesized picture, obtain orientation information such as the horizontal and vertical angles of each audio stream.
203: Send the audio streams and the corresponding orientation information to the terminal. For example, the audio stream sent by the multipoint control unit to the first terminal includes the audio stream of the second terminal and the audio stream of the third terminal; the audio stream of the second terminal is placed on the first channel and the audio stream of the third terminal on the second channel. In addition, if the multipoint control unit sends many audio streams to a terminal, then to reduce the bit rate, the stream with the largest energy can be placed on the first channel and the second largest on the second channel, and the remaining audio streams can be decoded, mixed, and encoded into one audio stream placed on the third channel.
The orientation information may be sent directly to the terminal, or it may be passed to the audio stream combining unit, which embeds the orientation information into the audio stream and sends it to the terminal together with the audio stream.
204: The terminal applies HRTF (Head Related Transfer Function) filtering to the audio signal according to the orientation information of the received audio streams, so that the audio streams carry orientation information. In this embodiment, the orientation information is represented by angles in the horizontal and vertical directions, and the filtering uses the head related transfer function HRTF.
参见图 6, 为本发明实施例二提供的方法流程图:
301 : 标识接收到的各路音频流对应的合成画面中的子画面。 下面针对接 收到的音频流和合成画面进行举例说明:
例一,多点控制单元发送给第一终端视频流是第二终端和第三终端的合成 画面, 第二终端在左画面, 第三终端在右画面, 多点控制单元发送给第一终端 的音频流包括第二终端的音频流和第三终端的音频流,标识第二终端的音频流 和左子画面对应, 第三终端的音频流和右子画面对应。 例二, 多点控制单元发 送给第二终端的视频流是第二终端和第三终端的合成画面,多点控制单元发送 给第二终端的音频流包括第一终端的音频流和第三终端的音频流,标识第三终 端的音频流和右子画面对应,但第一终端的音频流没有相应的子画面,标识第 一终端的音频流为画外音, 也可以作除画外音以外的其它标识, 例如, 标识该 音频流为无画面音频流。
例三, 多点控制单元发送给第三终端的视频流是第二终端的视频流, 多点 控制单元发送给第三终端的音频流包括第一终端的音频流和第二终端的音频 流,第三终端看到的是第二终端的单画面,单画面看作合成画面中的一个特例, 标识第二终端的音频流和单画面对应, 将第一终端的音频流标识为画外音。
302: Obtain, from the position of each sub-picture in the composed picture, orientation information for each audio stream, such as its relative distance in the horizontal direction and its relative distance in the vertical direction. The representation of relative distance is shown in FIG. 4; here the audio streams participating in the mix carry no orientation information of their own. Point O is the center of the video image, w is the image width, and h is the image height. Taking point O as the origin of a coordinate system, a point M in the image has coordinates (w0, h0). Let w' and h' denote the relative distances of point M in the horizontal and vertical directions; they are computed by the following formulas:

w' = w0 / (w/2) (1)

h' = h0 / (h/2) (2)

The audio stream sent to terminal 1 is a mix of terminals 2 and 3, where terminal 2's audio stream corresponds to the left sub-picture and terminal 3's to the right sub-picture. The center of the left sub-picture is C1 and the center of the right sub-picture is C2, so the orientation information of the audio streams of terminals 2 and 3 can be represented by the relative distances of points C1 and C2 in the horizontal and vertical directions: terminal 2's orientation information is (-0.5, 0) and terminal 3's is (0.5, 0). The previous step also mentioned voice-overs; for a voice-over audio stream, the orientation information may be set to (-1, 0) or (1, 0), and for an audio stream corresponding to a single picture the orientation information is (0, 0). If the audio streams participating in the mix carry their own orientation information, the orientation information is computed as follows: for example, when mixing the audio of terminals 2 and 3, corresponding to the left and right sub-pictures respectively, with their own orientation information (w'2, h'2) and (w'3, h'3), the new orientation information is (-0.5 + w'2/2, h'2) and (0.5 + w'3/2, h'3).
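Formulas (1) and (2), and the composition of nested orientation information, can be sketched as follows. Illustrative only: the 1280x720 picture size is an assumed example, and the function names are ours.

```python
def relative_position(w0, h0, w, h):
    """Formulas (1) and (2): w' = w0/(w/2), h' = h0/(h/2)."""
    return w0 / (w / 2), h0 / (h / 2)

def compose_orientation(centre_w, own_w, own_h):
    """Orientation for a stream mixed into a sub-picture whose centre has
    horizontal relative distance centre_w, when the stream carries its own
    orientation (own_w, own_h): (centre_w + own_w/2, own_h)."""
    return centre_w + own_w / 2, own_h

# Centres of the left and right sub-pictures of a 1280x720 composed picture
# lie a quarter of the width to each side of the image centre:
c1 = relative_position(-320, 0, 1280, 720)  # left sub-picture centre
c2 = relative_position(320, 0, 1280, 720)   # right sub-picture centre
```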
303: Send the audio streams and the corresponding orientation information to the terminal. For example, the audio streams that the multipoint control unit sends to the first terminal comprise those of the second and third terminals; the multipoint control unit places the second terminal's audio stream on the first channel and the third terminal's on the second channel. If many audio streams are to be sent to a terminal, the bit rate can be reduced by placing the highest-energy stream on the first channel and the second-highest on the second channel, then decoding, mixing, and re-encoding the remaining streams into a single stream placed on the third channel.

The orientation information may be sent to the terminal directly, or embedded into the audio stream and sent to the terminal together with the audio stream.
304: According to the orientation information of the received audio streams, the terminal filters the audio signal with an HRTF (Head Related Transfer Function), so that the audio stream carries the orientation information.

In this embodiment, the orientation information is expressed as relative distances in the horizontal and vertical directions, and filtering uses the head-related transfer function (HRTF).
Referring to FIG. 7, a flowchart of the method according to Embodiment 3 of the present invention:
401: Identify the sub-picture, in the composed picture, corresponding to each received audio stream. Examples of received audio streams and composed pictures follow:

Example 1: the video stream that the multipoint control unit sends to the first terminal is a picture composed from the second terminal and the third terminal, with the second terminal in the left sub-picture and the third terminal in the right sub-picture; the audio streams sent to the first terminal comprise those of the second and third terminals. The second terminal's audio stream is identified as corresponding to the left sub-picture and the third terminal's as corresponding to the right sub-picture. Example 2: the video stream sent to the second terminal is a picture composed from the second and third terminals, while the audio streams that the multipoint control unit sends to the second terminal comprise those of the first and third terminals. The third terminal's audio stream is identified as corresponding to the right sub-picture, but the first terminal's audio stream has no corresponding sub-picture and is identified as a voice-over; another identifier may also be used, for example identifying the stream as a no-picture audio stream.

Example 3: the video stream that the multipoint control unit sends to the third terminal is the second terminal's video stream, and the audio streams sent to the third terminal comprise those of the first and second terminals. The third terminal sees a single picture of the second terminal; a single picture is treated as a special case of a composed picture, so the second terminal's audio stream is identified as corresponding to the single picture, and the first terminal's audio stream is identified as a voice-over.
402: Obtain, from the position of each sub-picture in the composed picture, orientation information for each audio stream, such as its relative distance in the horizontal direction and its relative distance in the vertical direction. The representation of relative distance is shown in FIG. 4; here the audio streams participating in the mix carry no orientation information of their own. Point O is the center of the video image, w is the image width, and h is the image height. Taking point O as the origin of a coordinate system, a point M in the image has coordinates (w0, h0). Let w' and h' denote the relative distances of point M in the horizontal and vertical directions; they are computed by the following formulas:

w' = w0 / (w/2) (1)

h' = h0 / (h/2) (2)

The audio stream sent to terminal 1 is a mix of terminals 2 and 3, where terminal 2's audio stream corresponds to the left sub-picture and terminal 3's to the right sub-picture. The center of the left sub-picture is C1 and the center of the right sub-picture is C2, so the orientation information of the audio streams of terminals 2 and 3 can be represented by the relative distances of points C1 and C2 in the horizontal and vertical directions: terminal 2's orientation information is (-0.5, 0) and terminal 3's is (0.5, 0). The previous step also mentioned voice-overs; for a voice-over audio stream, the orientation information may be set to (-1, 0) or (1, 0), and for an audio stream corresponding to a single picture the orientation information is (0, 0). If the audio streams participating in the mix carry their own orientation information, the orientation information is computed as follows: for example, when mixing the audio of terminals 2 and 3, corresponding to the left and right sub-pictures respectively, with their own orientation information (w'2, h'2) and (w'3, h'3), the new orientation information is (-0.5 + w'2/2, h'2) and (0.5 + w'3/2, h'3).
403: Send the audio streams and the corresponding orientation information to the terminal. For example, the audio streams sent to the first terminal comprise those of the second and third terminals; the second terminal's audio stream is placed on the first channel and the third terminal's on the second channel. If many audio streams are to be sent to a terminal, the bit rate can be reduced by placing the highest-energy stream on the first channel and the second-highest on the second channel, then decoding, mixing, and re-encoding the remaining streams into a single stream placed on the third channel.

The orientation information may be sent to the terminal directly, or passed to the audio stream combining unit, which embeds the orientation information into the audio stream and sends it to the terminal together with the audio stream.
404: According to the orientation information of the received audio streams, the terminal filters the audio signal by adjusting the sound intensity of the left and right channels, so that the audio stream carries the orientation information. For example, the specific adjustment can be described by the following two formulas:

w' = (g1 - g2) / (g1 + g2) (1)

c = g1*g1 + g2*g2 (2)

In formulas (1) and (2), c is a fixed value, g1 is the left-channel sound intensity gain, g2 is the right-channel sound intensity gain, and w' is the relative distance in the horizontal direction computed in step 402.
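Formulas (1) and (2) fix the two gains once w' and c are known. Solving them in closed form (our derivation; the patent only states the two constraints): with s = g1 + g2, formula (1) gives g1 = s(1 + w')/2 and g2 = s(1 - w')/2, and substituting into (2) yields s = sqrt(2c / (1 + w'^2)). A sketch:

```python
import math

def channel_gains(w_rel, c=1.0):
    """Solve formulas (1) and (2) for the channel gains g1, g2.

    s = g1 + g2 = sqrt(2*c / (1 + w'**2)), then
    g1 = s*(1 + w')/2 and g2 = s*(1 - w')/2, so that
    (g1 - g2)/(g1 + g2) == w' and g1**2 + g2**2 == c.
    """
    s = math.sqrt(2 * c / (1 + w_rel * w_rel))
    return s * (1 + w_rel) / 2, s * (1 - w_rel) / 2

g1, g2 = channel_gains(-0.5)  # a source half-way toward one side
```

Keeping c fixed means the total energy of the two channels stays constant while the left/right balance tracks the horizontal relative distance.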
In this embodiment, the orientation information is expressed as relative distances in the horizontal and vertical directions, and filtering is performed by adjusting the amplitudes of the left and right channels.
The above describes the method flowcharts provided by the embodiments of the present invention. The system schematic diagrams provided by the embodiments of the present invention are described in detail below.

Referring to FIG. 8, a schematic diagram of the system according to an embodiment of the present invention, comprising:
An identifying unit 501, configured to identify the sub-picture, in the composed picture, corresponding to each received audio stream. For example, the input audio stream interface of the multipoint control unit 104 receives the audio streams from the terminals and passes them to the identifying unit 501 corresponding to each receiving terminal.

An obtaining unit 502, configured to obtain the orientation information of each audio stream from the position of each sub-picture in the composed picture; for example, the angles of each audio stream in the horizontal and vertical directions, or the relative distances of each audio stream in the horizontal and vertical directions.

A sending unit 503, configured to send the audio streams and the corresponding orientation information. For example, the audio streams sent to the first terminal comprise those of the second and third terminals; the second terminal's audio stream is placed on the first channel and the third terminal's on the second channel. If many audio streams are to be sent to a terminal, the bit rate can be reduced by placing the highest-energy stream on the first channel and the second-highest on the second channel, then decoding, mixing, and re-encoding the remaining streams into a single stream placed on the third channel.

A terminal unit 504, configured to process the audio signal according to the received orientation information so that the audio stream carries the orientation information, for example by adjusting the sound intensity of the left and right channels or by HRTF filtering.
The system further comprises:

An audio stream combining unit 505, configured to embed the orientation information into the audio stream and send it to the sending unit 503.
Referring to FIG. 9, a schematic diagram of the apparatus according to an embodiment of the present invention, comprising:

An identifying unit 501, configured to identify the sub-picture, in the composed picture, corresponding to each received audio stream. For example, the input audio stream interface of the multipoint control unit 104 receives the audio streams from the terminals and passes them to the identifying unit 501 corresponding to each receiving terminal.

An obtaining unit 502, configured to obtain the orientation information of each audio stream from the position of each sub-picture in the composed picture; for example, the angles of each audio stream in the horizontal and vertical directions, or the relative distances of each audio stream in the horizontal and vertical directions.

A sending unit 503, configured to send the audio streams and the corresponding orientation information. For example, the audio streams sent to the first terminal comprise those of the second and third terminals; the second terminal's audio stream is placed on the first channel and the third terminal's on the second channel. If many audio streams are to be sent to a terminal, the bit rate can be reduced by placing the highest-energy stream on the first channel and the second-highest on the second channel, then decoding, mixing, and re-encoding the remaining streams into a single stream placed on the third channel. The apparatus further comprises:

An audio stream combining unit 505, configured to embed the orientation information into the audio stream and send it to the sending unit 503.
As the above embodiments show, because the sub-picture in the composed picture corresponding to each received audio stream is identified, and the audio streams together with their orientation information are sent to the terminal once the orientation information of each audio stream has been obtained, there is no need to know the configuration of the terminal's speakers: the terminal itself processes the audio signal according to the orientation information of the received audio streams, so that the audio streams carry the orientation information. This loosens the coupling between the multipoint control unit and the terminals and improves flexibility.
Persons of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

The method and system of video communication and the apparatus for video communication provided by the present invention have been described in detail above. Persons of ordinary skill in the art may make changes to the specific implementations and the scope of application in accordance with the idea of the embodiments of the present invention. In summary, the content of this specification shall not be construed as limiting the present invention.

Claims

1. A video communication method, comprising:

identifying the sub-picture, in a composed picture, corresponding to each received audio stream;

obtaining orientation information of each audio stream according to the position of each sub-picture in the composed picture; and

sending the audio streams and the corresponding orientation information to a terminal, whereby the terminal processes the audio signal according to the orientation information of the received audio streams, so that the audio streams carry the orientation information.
2. The method according to claim 1, wherein identifying the sub-picture in the composed picture corresponding to each received audio stream comprises:

when any one of the audio streams has no corresponding sub-picture in the composed picture, identifying that audio stream as a voice-over.
3. The method according to claim 1, wherein identifying the sub-picture in the composed picture corresponding to each received audio stream comprises:

when any one of the audio streams has a corresponding sub-picture in the composed picture, identifying that audio stream as corresponding to that sub-picture.
4. The method according to claim 1, wherein identifying the sub-picture in the composed picture corresponding to each received audio stream comprises:

when a single picture corresponding to any one audio stream is received, identifying that audio stream as corresponding to the single picture, and identifying the remaining audio streams as voice-overs.
5. The method according to claim 1, wherein the orientation information of an audio stream comprises:

an angle in the horizontal direction and an angle in the vertical direction.
6. The method according to claim 1, wherein the orientation information of an audio stream comprises:

a relative distance in the horizontal direction and a relative distance in the vertical direction.
7. The method according to claim 1, wherein processing the audio signal comprises: processing by adjusting the sound intensity of the left and right channels.
8. The method according to claim 1, wherein processing the audio signal comprises:

filtering using a head-related transfer function (HRTF).
9. A computer program product, comprising computer program code which, when executed by a computer, causes the computer to perform the steps of any one of claims 1 to 8.
10. A computer-readable storage medium, storing computer program code which, when executed by a computer, causes the computer to perform the steps of any one of claims 1 to 8.
11. A video communication system, comprising:

an identifying unit, configured to identify the sub-picture, in a composed picture, corresponding to each received audio stream; an obtaining unit, configured to obtain orientation information of each audio stream according to the position of each sub-picture in the composed picture;

a sending unit, configured to send the audio streams and the corresponding orientation information; and

a terminal unit, configured to process the audio signal according to the received orientation information, so that the audio streams carry the orientation information.
12. The system according to claim 11, further comprising: an audio stream combining unit, configured to embed the orientation information into the audio stream and send it to the sending unit.
13. An apparatus for video communication, comprising:

an identifying unit, configured to identify the sub-picture, in a composed picture, corresponding to each received audio stream; an obtaining unit, configured to obtain orientation information of each audio stream according to the position of each sub-picture in the composed picture; and

a sending unit, configured to send the audio streams and the corresponding orientation information.
14. The apparatus according to claim 13, further comprising: an audio stream combining unit, configured to embed the orientation information into the audio stream and send it to the sending unit.
PCT/CN2008/072483 2007-09-28 2008-09-24 A method and a system of video communication and a device for video communication WO2009043275A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2010526135A JP5198567B2 (ja) 2007-09-28 2008-09-24 ビデオ通信方法、システムおよび装置
EP08800966A EP2202970A4 (en) 2007-09-28 2008-09-24 METHOD, SYSTEM AND VIDEO COMMUNICATION DEVICE
US12/732,999 US8259625B2 (en) 2007-09-28 2010-03-26 Method, system, and device of video communication

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200710151406XA CN101132516B (zh) 2007-09-28 2007-09-28 一种视频通讯的方法、系统及用于视频通讯的装置
CN200710151406.X 2007-09-28

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/732,999 Continuation US8259625B2 (en) 2007-09-28 2010-03-26 Method, system, and device of video communication

Publications (1)

Publication Number Publication Date
WO2009043275A1 true WO2009043275A1 (en) 2009-04-09

Family

ID=39129613

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2008/072483 WO2009043275A1 (en) 2007-09-28 2008-09-24 A method and a system of video communication and a device for video communication

Country Status (5)

Country Link
US (1) US8259625B2 (zh)
EP (1) EP2202970A4 (zh)
JP (1) JP5198567B2 (zh)
CN (1) CN101132516B (zh)
WO (1) WO2009043275A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2015202914B2 (en) * 2012-11-15 2017-03-09 Cervical Chinup Pty Ltd Cervical Brace
US10226374B2 (en) 2012-11-15 2019-03-12 Cervical Chinup Pty Ltd Cervical brace

Families Citing this family (26)

Publication number Priority date Publication date Assignee Title
CN101132516B (zh) * 2007-09-28 2010-07-28 华为终端有限公司 一种视频通讯的方法、系统及用于视频通讯的装置
JP5200928B2 (ja) * 2008-12-29 2013-06-05 ブラザー工業株式会社 テレビ会議システム、帯域制御方法、会議制御装置、テレビ会議端末装置及びプログラム
CN102209225B (zh) * 2010-03-30 2013-04-17 华为终端有限公司 视频通信的实现方法及装置
CN102222503B (zh) * 2010-04-14 2013-08-28 华为终端有限公司 一种音频信号的混音处理方法、装置及系统
US8446455B2 (en) 2010-12-08 2013-05-21 Cisco Technology, Inc. System and method for exchanging information in a video conference environment
US8553064B2 (en) * 2010-12-08 2013-10-08 Cisco Technology, Inc. System and method for controlling video data to be rendered in a video conference environment
CN102547210B (zh) * 2010-12-24 2014-09-17 华为终端有限公司 级联会议中级联会场的处理方法、装置及系统
CN102655584B (zh) 2011-03-04 2017-11-24 中兴通讯股份有限公司 一种远程呈现技术中媒体数据发送和播放的方法及系统
CN102186049B (zh) * 2011-04-22 2013-03-20 华为终端有限公司 会场终端音频信号处理方法及会场终端和视讯会议系统
CN102223515B (zh) * 2011-06-21 2017-12-05 中兴通讯股份有限公司 远程呈现会议系统、远程呈现会议的录制与回放方法
WO2013019259A1 (en) * 2011-08-01 2013-02-07 Thomson Licensing Telepresence communications system and method
EP2823642B1 (en) 2012-03-09 2024-04-24 InterDigital Madison Patent Holdings, SAS Distributed control of synchronized content
US9179095B2 (en) * 2012-09-27 2015-11-03 Avaya Inc. Scalable multi-videoconferencing system
US10044975B2 (en) * 2016-12-09 2018-08-07 NetTalk.com, Inc. Method and apparatus for coviewing video
US10972521B2 (en) 2012-10-18 2021-04-06 NetTalk.com, Inc. Method and apparatus for coviewing video
CN103780868A (zh) * 2012-10-23 2014-05-07 中兴通讯股份有限公司 基于空间位置的数据传输方法、控制器及设备
US9756288B2 (en) 2013-04-10 2017-09-05 Thomson Licensing Tiering and manipulation of peer's heads in a telepresence system
US9407862B1 (en) * 2013-05-14 2016-08-02 Google Inc. Initiating a video conferencing session
KR20160022307A (ko) 2013-06-20 2016-02-29 톰슨 라이센싱 콘텐츠의 분산 재생의 동기화를 지원하기 위한 시스템 및 방법
TW201517631A (zh) * 2013-08-29 2015-05-01 Vid Scale Inc 使用者適應視訊電話
CN104580993A (zh) * 2015-01-15 2015-04-29 深圳市捷视飞通科技有限公司 一种无线数字视频多点通讯方法
US10771508B2 (en) 2016-01-19 2020-09-08 Nadejda Sarmova Systems and methods for establishing a virtual shared experience for media playback
CN116795464A (zh) * 2018-12-29 2023-09-22 中兴通讯股份有限公司 一种实现远程协助的方法及相关设备
CN111292773A (zh) * 2020-01-13 2020-06-16 北京大米未来科技有限公司 音视频合成的方法、装置、电子设备及介质
CN112532913A (zh) * 2020-11-30 2021-03-19 广州虎牙科技有限公司 一种视频混流方法、视频系统及服务器
DE102020132775A1 (de) * 2020-12-09 2022-06-09 alfaview Video Conferencing Systems GmbH & Co. KG Videokonferenzsystem, Verfahren zum Übertragen von Informationen und Computerprogrammprodukt

Citations (5)

Publication number Priority date Publication date Assignee Title
CN1257631A (zh) * 1997-03-27 2000-06-21 法国电讯公司 视频会议系统
US20030048353A1 (en) * 2001-08-07 2003-03-13 Michael Kenoyer System and method for high resolution videoconferencing
CN2805011Y (zh) * 2004-11-09 2006-08-09 道宏无线系统股份有限公司 多频道实时影音输出显示控制装置
CN1901663A (zh) * 2006-07-25 2007-01-24 华为技术有限公司 一种具有声音位置信息的视频通讯系统及其获取方法
CN101132516A (zh) * 2007-09-28 2008-02-27 深圳华为通信技术有限公司 一种视频通讯的方法、系统及用于视频通讯的装置

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US5335011A (en) * 1993-01-12 1994-08-02 Bell Communications Research, Inc. Sound localization system for teleconferencing using self-steering microphone arrays
JPH09162995A (ja) * 1995-12-08 1997-06-20 Nec Corp 遠隔会議方式
US6593956B1 (en) * 1998-05-15 2003-07-15 Polycom, Inc. Locating an audio source
JP3910537B2 (ja) * 2001-03-26 2007-04-25 富士通株式会社 マルチチャネル情報処理装置
US7667728B2 (en) * 2004-10-15 2010-02-23 Lifesize Communications, Inc. Video and audio conferencing system with spatial audio
US7612793B2 (en) 2005-09-07 2009-11-03 Polycom, Inc. Spatially correlated audio in multipoint videoconferencing
US7864210B2 (en) * 2005-11-18 2011-01-04 International Business Machines Corporation System and methods for video conferencing
CN100556151C (zh) * 2006-12-30 2009-10-28 华为技术有限公司 一种视频终端以及一种音频码流处理方法


Non-Patent Citations (1)

Title
See also references of EP2202970A4 *


Also Published As

Publication number Publication date
CN101132516B (zh) 2010-07-28
CN101132516A (zh) 2008-02-27
US20100182394A1 (en) 2010-07-22
JP5198567B2 (ja) 2013-05-15
EP2202970A1 (en) 2010-06-30
EP2202970A4 (en) 2010-12-08
US8259625B2 (en) 2012-09-04
JP2010541343A (ja) 2010-12-24

Similar Documents

Publication Publication Date Title
WO2009043275A1 (en) A method and a system of video communication and a device for video communication
US9843455B2 (en) Conferencing system with spatial rendering of audio data
US20100103244A1 (en) device for and method of processing image data representative of an object
EP2352290B1 (en) Method and apparatus for matching audio and video signals during a videoconference
US9497390B2 (en) Video processing method, apparatus, and system
WO2011140812A1 (zh) 多画面合成方法、系统及媒体处理装置
US8749611B2 (en) Video conference system
US20050280701A1 (en) Method and system for associating positional audio to positional video
WO2011153905A1 (zh) 一种音频信号的混音处理方法及装置
WO2011057511A1 (zh) 实现混音的方法、装置和系统
WO2010094219A1 (zh) 一种语音信号的处理、播放方法和装置
WO2014173091A1 (zh) 一种视频会议中会议材料的显示方法及装置
WO2015127799A1 (zh) 协商媒体能力的方法和设备
US9088690B2 (en) Video conference system
WO2015003532A1 (zh) 多媒体会议的建立方法、装置及系统
WO2012175025A1 (zh) 远程呈现会议系统、远程呈现会议的录制与回放方法
US7453829B2 (en) Method for conducting a video conference
CN112970270B (zh) 沉浸式音频服务中的音频处理
EP4085661A1 (en) Audio representation and associated rendering
CN113726534A (zh) 会议控制方法、装置、电子设备及存储介质
WO2011153926A1 (zh) 会场图像广播方法及多点控制单元
Romanow et al. Requirements for Telepresence Multistreams
JP3189869B2 (ja) 多地点テレビ会議システム
RU2810920C2 (ru) Обработка звука в звуковых услугах с эффектом присутствия
WO2012079493A1 (zh) 发送、接收视频信息的方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08800966

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2010526135

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2008800966

Country of ref document: EP