WO2010022658A1 - Method, apparatus and system for transmitting and playing multi-view media content - Google Patents

Method, apparatus and system for transmitting and playing multi-view media content

Info

Publication number
WO2010022658A1
WO2010022658A1 PCT/CN2009/073547 CN2009073547W WO2010022658A1 WO 2010022658 A1 WO2010022658 A1 WO 2010022658A1 CN 2009073547 W CN2009073547 W CN 2009073547W WO 2010022658 A1 WO2010022658 A1 WO 2010022658A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
media content
view
audio
viewpoint
Prior art date
Application number
PCT/CN2009/073547
Other languages
English (en)
French (fr)
Inventor
詹五洲
王东琦
刘源
Original Assignee
深圳华为通信技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华为通信技术有限公司
Publication of WO2010022658A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44016: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106: Processing image signals
    • H04N13/172: Processing image signals comprising non-image signal components, e.g. headers or format information
    • H04N13/178: Metadata, e.g. disparity information
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/21805: Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23424: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439: Processing of audio elementary streams

Definitions

  • The present invention relates to the field of communications, and in particular to a method, apparatus and system for transmitting and playing multi-view media content.
  • Background
  • Multi-view media content refers to media content composed of multi-view video information and audio information.
  • Multi-view video information refers to video information composed of multiple video streams obtained by synchronously shooting the same scene from different angles with multiple cameras.
  • At the playing end of multi-view media content, the viewer can watch the content from different angles by selecting different viewpoints.
  • At the playing end, however, the playback direction of the sound source is fixed. After the viewpoint is switched, the directions of the video signal and the audio signal may no longer match, so there may be an angular difference between the video signal the viewer sees and the audio signal the viewer hears, which degrades the realism and sense of presence of the multi-view media content.
  • For example, as shown in FIG. 1, viewer P watches the same person in the same scene from three different viewpoints, each at a different angle, and obtains the corresponding video signals A, B and C. The sound source is located at S, at the same angle as the viewpoint of video signal A.
  • When the viewer selects the viewpoint of video signal A, the video signal and the audio signal (emitted from sound source S) have the same angle and therefore match. When the viewer selects the viewpoint of video signal B or C, there is an angular difference between the video signal and the audio signal, so the two do not match and the user experience is poor.
  • Embodiments of the present invention provide a method, apparatus and system for transmitting and playing multi-view media content that make the playback directions of the video signal and the audio signal match after the viewpoint is switched.
  • A method for playing multi-view media content, comprising: receiving multi-view media content; when viewpoint switching is performed, generating switched viewpoint information; generating, according to the viewpoint information and the multi-view media content, a video signal and an audio signal corresponding to the viewpoint information; and synchronously outputting the video signal and the audio signal.
  • A method for transmitting multi-view media content, comprising: acquiring three-dimensional information of multi-view video information according to the multi-view video information; obtaining sound source location information of audio information according to the audio information received from a plurality of different locations; and encoding the multi-view video information with its three-dimensional information, together with the audio information and its sound source location information, to generate multi-view media content, and then transmitting it.
  • A multi-view media content playing device, comprising:
  • a media content receiving unit, configured to receive multi-view media content;
  • a viewpoint information generating unit, configured to generate switched viewpoint information when viewpoint switching is performed;
  • a signal generating unit, configured to generate, according to the viewpoint information generated by the viewpoint information generating unit and the multi-view media content received by the media content receiving unit, a video signal and an audio signal corresponding to the viewpoint information;
  • a synchronization output unit, configured to synchronously output the video signal and the audio signal generated by the signal generating unit.
  • A device for transmitting multi-view media content, comprising:
  • a video information processing unit, configured to acquire three-dimensional information of multi-view video information according to the multi-view video information;
  • an audio information processing unit, configured to obtain sound source location information of audio information according to the audio information received from a plurality of different locations;
  • a multi-view media content generating unit, configured to encode the multi-view video information with the three-dimensional information obtained by the video information processing unit, together with the audio information and the sound source location information obtained by the audio information processing unit, to generate multi-view media content and send it out.
  • A multi-view media content playing system, comprising:
  • a multi-view media content transmitting device, configured to process received multi-view video information and audio information received from a plurality of different locations, acquire the three-dimensional information of the video information and the sound source location information of the audio information, encode the multi-view video information with its three-dimensional information together with the audio information and its sound source location information, generate multi-view media content, and transmit it;
  • a multi-view media content playing device, configured to receive the multi-view media content sent by the multi-view media content transmitting device, generate switched viewpoint information when viewpoint switching is performed, generate the corresponding video signal and audio signal according to the viewpoint information and the received multi-view media content, and synchronously output the video signal and the audio signal.
  • In the method, apparatus and system provided by the embodiments of the present invention, the multi-view media content sent by the transmitting end contains the three-dimensional information of the multi-view video information and the sound source location information of the audio information. The playing end can therefore generate the video signal and the audio signal corresponding to the switched viewpoint information from that viewpoint information and the received multi-view media content. This solves the prior-art problem that, because the audio signal is fixed, the audio signal no longer matches the video signal of the switched viewpoint after viewpoint switching, which degrades the realism and sense of presence for the user.
  • FIG. 1 is a schematic diagram of a viewer watching multi-view media content from three different viewpoints in the prior art;
  • FIG. 2 is a first flowchart of a method for transmitting multi-view media content according to an embodiment of the present invention;
  • FIG. 3 is a second flowchart of a method for transmitting multi-view media content according to an embodiment of the present invention;
  • FIG. 4 is a flowchart of a method for playing multi-view media content according to an embodiment of the present invention;
  • FIG. 5 is a first schematic structural diagram of a device for transmitting multi-view media content according to an embodiment of the present invention;
  • FIG. 6 is a second schematic structural diagram of a device for transmitting multi-view media content according to an embodiment of the present invention;
  • FIG. 7 is a first schematic structural diagram of a device for playing multi-view media content according to an embodiment of the present invention;
  • FIG. 8 is a second schematic structural diagram of a device for playing multi-view media content according to an embodiment of the present invention;
  • FIG. 9 is a first schematic structural diagram of a multi-view media content playing system according to an embodiment of the present invention;
  • FIG. 10 is a second schematic structural diagram of a multi-view media content playing system according to an embodiment of the present invention.
  • Detailed description
  • As shown in FIG. 2, the method for transmitting multi-view media content provided by an embodiment of the present invention includes:
  • Step 201: Acquire three-dimensional information of the video information according to the multi-view video information.
  • In this embodiment, the multi-view video information is captured by a camera group that includes one or more cameras located at different viewpoints. Step 201 may perform three-dimensional information processing on the multi-view video information to obtain its three-dimensional information, which may include depth information of the multi-view video information and disparity information between the video information of adjacent viewpoints.
  • Step 202: Obtain sound source location information of the audio information according to the audio information received from a plurality of different locations.
  • In this embodiment, the audio information received from a plurality of different locations is obtained through a microphone array that includes multiple microphones located at different positions. Step 202 may process the audio information obtained through the microphone array with an array signal processing technique such as beamforming to obtain the sound source location information of the audio information.
  • The audio information may include more than one sound source signal. In that case, the sound source location information obtained in step 202 is the sound source location information corresponding to each sound source signal.
  • Step 203: Encode the multi-view video information with its three-dimensional information, together with the audio information and its sound source location information, to generate multi-view media content, and send it.
  • Because the transmitted multi-view media content contains the three-dimensional information of the multi-view video information and the sound source location information of the audio information, this transmitting method solves the prior-art problem that, because the audio signal is fixed, an angular difference exists after viewpoint switching between the audio signal and the position of the video signal corresponding to the switched viewpoint, so that the played audio signal does not match the video signal. The audio signal and the video signal are switched together, which improves the realism and sense of presence for the user watching the multi-view media content.
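The sound source localization of step 202 can be illustrated with a small delay-and-sum beamforming sketch. This is only an illustration under stated assumptions, not the implementation described in the patent: it assumes a far-field source, a linear microphone array, integer-sample delays and a fixed speed of sound. The function names `steering_delays`, `delay_and_sum_power` and `estimate_direction` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(mic_x, angle_deg, fs):
    """Arrival delay (in samples) at each microphone of a linear array for a
    far-field source at angle_deg (0 = broadside). mic_x holds positions in metres."""
    s = math.sin(math.radians(angle_deg))
    return [round(x * s / SPEED_OF_SOUND * fs) for x in mic_x]


def delay_and_sum_power(channels, delays):
    """Undo the assumed delays, sum the channels, and return the mean output power."""
    n = len(channels[0])
    start = max(0, -min(delays))
    stop = min(n, n - max(delays))
    if stop <= start:
        return 0.0
    total = 0.0
    for t in range(start, stop):
        s = sum(ch[t + d] for ch, d in zip(channels, delays))
        total += s * s
    return total / (stop - start)


def estimate_direction(channels, mic_x, fs, step_deg=5):
    """Scan candidate angles and return the one with the highest beam power."""
    best_angle, best_power = None, -1.0
    for angle in range(-90, 91, step_deg):
        power = delay_and_sum_power(channels, steering_delays(mic_x, angle, fs))
        if power > best_power:
            best_angle, best_power = angle, power
    return best_angle
```

When the audio information contains several sound sources, a practical system would need to separate them or look for several peaks in the scanned power; this sketch returns a single direction only.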
  • When the transmitting method is applied in a two-way system, for example in a conference venue, then as shown in FIG. 3 the method may further include, before step 202 shown in FIG. 2:
  • Step 200a: Acquire the audio signal of the multi-view media content being played.
  • Step 200b: Perform echo cancellation processing on the audio information received from the plurality of different locations according to the acquired audio signal of the multi-view media content being played.
  • Echo cancellation removes the far-end audio signal played by the loudspeaker from the local audio signal picked up by the microphone, and is usually implemented by adaptive filtering.
  • Adaptive filtering models the spatial propagation path from the loudspeaker to the microphone. Because sound is reflected and diffracted, there are multiple propagation paths. For microphones at different positions, the propagation path from the loudspeaker to the microphone differs, so echo cancellation has to be performed separately for each microphone. A preferred embodiment of the present invention therefore applies adaptive filtering separately to each of the audio signals picked up at different positions.
  • Besides adaptive filtering, echo cancellation can also be implemented by other methods, such as blind source separation or auditory scene analysis, which are not described one by one here.
  • Steps 200a and 200b may be performed before or after step 201; the present invention does not limit this. In this embodiment, as shown in FIG. 3, steps 200a and 200b are performed before step 201.
  • Because the received audio information is subjected to echo cancellation processing, the audio signal played by the playing end in a two-way system does not interfere with the audio information received by the transmitting end.
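The per-microphone adaptive filtering described above can be sketched with a normalized least-mean-squares (NLMS) filter. NLMS is one common adaptive filtering algorithm chosen here for illustration; the patent does not name a specific algorithm. The function names, the number of taps and the step size are assumptions.

```python
def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Remove the echo of far_end (the played signal) from mic (the picked-up
    signal) with an NLMS adaptive filter.
    Returns (residual, weights): the echo-reduced signal and the estimated
    loudspeaker-to-microphone path."""
    w = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, history))
        e = d - echo_estimate  # what remains after removing the estimated echo
        norm = sum(xi * xi for xi in history) + eps
        step = mu * e / norm
        w = [wi + step * xi for wi, xi in zip(w, history)]
        residual.append(e)
    return residual, w


def cancel_all_microphones(far_end, mic_channels, **kwargs):
    """Run an independent canceller per microphone, since each microphone has
    its own propagation path from the loudspeaker."""
    return [nlms_echo_cancel(far_end, ch, **kwargs)[0] for ch in mic_channels]
```

Keeping one filter per microphone mirrors the point made in the text: the propagation path differs for each microphone position, so a single shared filter would not fit all channels.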
  • As shown in FIG. 4, the method for playing multi-view media content provided by an embodiment of the present invention includes:
  • Step 401: Receive multi-view media content.
  • In this embodiment, step 401 may receive, over a network, the multi-view media content sent by the transmitting end.
  • The multi-view media content may include video information and three-dimensional information of the video information (for example, depth information or disparity information), as well as audio information and sound source location information of the audio information. The video information is composed of video streams captured from one or more viewpoints, the audio information includes at least one sound source, and the sound source location information of the audio information refers to the location information of each sound source.
  • Step 402: When viewpoint switching is performed, generate the switched viewpoint information. This includes receiving viewpoint switching information sent by the user through a remote controller or another input device, and generating the switched viewpoint information according to the viewpoint switching information and the three-dimensional information of the video information in the multi-view media content.
  • In this embodiment, the viewpoint switching information may be, for example, the angle by which the viewpoint is switched.
  • In this embodiment, the switched viewpoint information may be, for example, the position coordinates of the switched viewpoint.
  • Step 403: Generate, according to the viewpoint information and the multi-view media content, a video signal and an audio signal corresponding to the viewpoint information.
  • In theory, the video information included in the multi-view media content should be composed of video streams captured from all viewpoints. In practice, to limit shooting cost, it is composed only of video streams captured from several key viewpoints, for example from the front, the left side, the right side and the back of the scene.
  • Step 403 therefore uses the video information of the two key viewpoints adjacent to the switched viewpoint and the disparity information between them, and applies a virtual view synthesis algorithm to synthesize the video signal corresponding to the switched viewpoint.
  • Generating the audio signal corresponding to the viewpoint information may include: first, generating the sound source location information corresponding to the switched viewpoint according to the switched viewpoint information obtained in step 402 and the sound source location information of the audio information in the multi-view media content; then, generating the audio signal corresponding to the switched viewpoint according to the audio information and that sound source location information, for example by wavefront synthesis.
  • Step 403 may also generate the audio signal corresponding to the switched viewpoint with another three-dimensional audio playback technique similar to wavefront synthesis; these alternatives are not described one by one here.
  • When the audio information includes more than one sound source, step 403 needs to generate the sound source location information corresponding to the switched viewpoint separately for each sound source.
  • Step 404: Synchronously output the video signal and the audio signal generated in step 403.
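The video part of step 403 can be pictured with a deliberately simplified view interpolation on a single image row. The patent does not specify which virtual view synthesis algorithm is used, so the forward-warping scheme below is a stand-in, and the names `warp_row` and `synthesize_row` are hypothetical. It assumes rectified views in which a pixel at column x of the left key view appears at column x - d in the right key view, with d its non-negative disparity.

```python
def warp_row(row, disparity, shift_scale):
    """Forward-warp one image row: move each pixel by shift_scale * disparity.
    When two pixels land on the same column, the one with the larger disparity
    (the closer object) wins. Unfilled columns stay None."""
    n = len(row)
    out = [None] * n
    best = [-1] * n
    for x, (value, d) in enumerate(zip(row, disparity)):
        tx = x + round(shift_scale * d)
        if 0 <= tx < n and d > best[tx]:
            out[tx] = value
            best[tx] = d
    return out


def synthesize_row(left, right, disp_left, disp_right, alpha):
    """Synthesize a row of the virtual view at position alpha between the left
    key view (alpha = 0) and the right key view (alpha = 1)."""
    from_left = warp_row(left, disp_left, -alpha)
    from_right = warp_row(right, disp_right, 1 - alpha)
    out = []
    for a, b in zip(from_left, from_right):
        if a is not None and b is not None:
            out.append((1 - alpha) * a + alpha * b)  # blend both key views
        elif a is not None:
            out.append(a)  # visible only from the left key view
        elif b is not None:
            out.append(b)  # visible only from the right key view
        else:
            out.append(0)  # hole that neither view can fill
    return out
```

A real synthesizer would work on full frames, use sub-pixel disparities and fill holes by inpainting; the sketch only shows why both key views and the disparity between them are needed.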
  • The method for playing multi-view media content may further include performing echo cancellation processing on the audio signal corresponding to the switched viewpoint.
  • The playing method can generate the video signal and the audio signal corresponding to the switched viewpoint information from that viewpoint information and the received multi-view media content. It thereby solves the prior-art problem that, because the audio signal is fixed, an angular difference exists after viewpoint switching between the audio signal and the position of the video signal corresponding to the switched viewpoint, so that the played audio signal does not match the video signal. The audio signal and the video signal are switched together, which improves the realism and sense of presence for the user watching the multi-view media content.
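The audio part of the playing method can be sketched as a coordinate transform followed by a rendering step. The sketch below is an assumption-laden illustration: it works in a 2-D scene, treats the viewpoint information as a position plus a heading angle, and replaces wavefront synthesis with simple constant-power stereo panning, which is a much cruder technique. All function names are hypothetical.

```python
import math


def source_relative_to_viewpoint(source_xy, view_xy, view_heading_deg):
    """Return (azimuth_deg, distance) of a sound source as seen from the
    switched viewpoint. Headings are measured counter-clockwise from the +x
    axis; a positive azimuth means the source is to the viewer's left."""
    dx = source_xy[0] - view_xy[0]
    dy = source_xy[1] - view_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))
    azimuth = (bearing - view_heading_deg + 180.0) % 360.0 - 180.0
    return azimuth, distance


def stereo_gains(azimuth_deg):
    """Constant-power (left, right) gains; sources beyond 90 degrees are clamped."""
    pan = max(-1.0, min(1.0, -azimuth_deg / 90.0))  # -1 = full left, +1 = full right
    theta = (pan + 1.0) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)


def render_for_viewpoint(sources, view_xy, view_heading_deg):
    """sources is a list of (samples, (x, y)) pairs, one per sound source.
    Each source is positioned separately, then all are mixed to stereo."""
    n = max(len(samples) for samples, _ in sources)
    left, right = [0.0] * n, [0.0] * n
    for samples, position in sources:
        azimuth, distance = source_relative_to_viewpoint(position, view_xy, view_heading_deg)
        gain_l, gain_r = stereo_gains(azimuth)
        attenuation = 1.0 / max(distance, 1.0)  # crude distance roll-off
        for i, v in enumerate(samples):
            left[i] += gain_l * attenuation * v
            right[i] += gain_r * attenuation * v
    return left, right
```

Handling each source separately matches the requirement that the sound source location corresponding to the switched viewpoint be generated per sound source.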
  • An embodiment of the present invention further provides a device for transmitting multi-view media content, including:
  • a video information processing unit 501, configured to acquire three-dimensional information of the video information according to the multi-view video information;
  • an audio information processing unit 502, configured to obtain sound source location information of the audio information according to the audio information received from a plurality of different locations;
  • a multi-view media content generating unit 503, configured to encode the multi-view video information with the three-dimensional information obtained by the video information processing unit 501, together with the audio information and the sound source location information obtained by the audio information processing unit 502, and to generate and transmit the multi-view media content.
  • The device for transmitting multi-view media content provided by the embodiment of the present invention may further include:
  • an audio signal acquiring unit 504, configured to acquire the audio signal of the multi-view media content being played;
  • an echo cancellation processing unit 505, configured to perform echo cancellation processing on the audio information received from the plurality of different locations according to the played audio signal acquired by the audio signal acquiring unit 504.
  • In this case, the audio information processing unit 502 is further configured to process the audio information output by the echo cancellation processing unit 505 to obtain three-dimensional information of the audio information.
  • Because the multi-view media content it transmits contains the three-dimensional information of the multi-view video information and the sound source location information of the audio information, the transmitting device solves the prior-art problem that, because the audio signal is fixed, an angular difference exists after viewpoint switching between the audio signal and the position of the video signal corresponding to the switched viewpoint, so that the played audio signal does not match the video signal. The audio signal and the video signal are switched together, which improves the realism and sense of presence for the user watching the multi-view media content.
  • The device for playing multi-view media content includes:
  • a media content receiving unit 701, configured to receive multi-view media content;
  • In this embodiment, the media content receiving unit 701 may receive, through a network interface, the multi-view media content processed by the transmitting end. The multi-view media content may include video information and three-dimensional information of the video information (for example, depth information or disparity information), as well as audio information and sound source location information of the audio information. The video information is composed of video streams captured from one or more viewpoints, the audio information includes at least one sound source, and the sound source location information of the audio information refers to the location information of each sound source.
  • a viewpoint information generating unit 702, configured to generate the switched viewpoint information when viewpoint switching is performed;
  • a signal generating unit 703, configured to generate, according to the viewpoint information generated by the viewpoint information generating unit 702 and the multi-view media content received by the media content receiving unit 701, a video signal and an audio signal corresponding to the viewpoint information;
  • a synchronization output unit 704, configured to synchronously output the video signal and the audio signal generated by the signal generating unit 703.
  • The viewpoint information generating unit 702 may include:
  • a switching information acquiring unit 7021, configured to acquire viewpoint switching information;
  • a first generating unit 7022, configured to generate the switched viewpoint information according to the viewpoint switching information acquired by the switching information acquiring unit 7021 and the three-dimensional information of the video information included in the multi-view media content.
  • The signal generating unit 703 includes an audio signal generating unit 7031, which may include:
  • a location information generating unit 70311, configured to generate, according to the viewpoint information generated by the viewpoint information generating unit 702 and the sound source location information of the audio information included in the multi-view media content, the sound source location information of the audio information corresponding to the viewpoint information;
  • a second generating unit 70312, configured to generate the audio signal corresponding to the viewpoint information according to the audio information included in the multi-view media content and the sound source location information generated by the location information generating unit 70311.
  • The multi-view media content playing device may further include an echo cancellation processing unit 705, configured to perform echo cancellation processing on the audio signal corresponding to the viewpoint information.
  • The playing device can generate the video signal and the audio signal corresponding to the switched viewpoint information from that viewpoint information and the received multi-view media content. It thereby solves the prior-art problem that, because the audio signal is fixed, an angular difference exists after viewpoint switching between the audio signal and the position of the video signal corresponding to the switched viewpoint, so that the audio signal does not match the video signal. Audio and video are switched together, which improves the realism and sense of presence for the user watching the multi-view media content.
  • The multi-view media content playing system includes:
  • a multi-view media content transmitting device 901, configured to process received multi-view video information and audio information received from a plurality of different locations, acquire the three-dimensional information of the video information and the sound source location information of the audio information, encode the multi-view video information with its three-dimensional information together with the audio information and its sound source location information, and generate and send the multi-view media content;
  • a multi-view media content playing device 902, configured to receive the multi-view media content sent by the multi-view media content transmitting device 901, generate the switched viewpoint information when viewpoint switching is performed, generate the corresponding video signal and audio signal according to the viewpoint information and the received multi-view media content, and synchronously output the video signal and the audio signal.
  • The multi-view media content playing system may further include:
  • an echo canceling device 903, configured to receive the audio signal generated by the multi-view media content playing device 902 and send it to the multi-view media content transmitting device 901.
  • In this case, the multi-view media content transmitting device 901 is further configured to perform echo cancellation processing on the audio information received from the plurality of different locations according to the audio signal sent by the echo canceling device 903.
  • The playing system can generate the video signal and the audio signal corresponding to the switched viewpoint information from that viewpoint information and the received multi-view media content. It thereby solves the prior-art problem that, because the audio signal is fixed, an angular difference exists after viewpoint switching between the audio signal and the position of the video signal corresponding to the switched viewpoint, so that the audio signal does not match the video signal. Audio and video are switched together, which improves the realism and sense of presence for the user watching the multi-view media content.

Description

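Method, Apparatus and System for Transmitting and Playing Multi-View Media Content
This application claims priority to Chinese Patent Application No. 200810146721.8, filed with the Chinese Patent Office on August 27, 2008 and entitled "Method, Apparatus and System for Transmitting and Playing Multi-View Media Content", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications, and in particular to a method, apparatus and system for transmitting and playing multi-view media content.
Background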
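Multi-view media content refers to media content composed of multi-view video information and audio information. Multi-view video information refers to video information composed of multiple video streams obtained by synchronously shooting the same scene from different angles with multiple cameras. At the playing end of multi-view media content, the viewer can watch the content from different angles by selecting different viewpoints.
At the playing end, the playback direction of the sound source is fixed. After the viewpoint of the multi-view media content is switched, the directions of the video signal and the audio signal may not match, so there may be an angular difference between the video signal the viewer sees and the audio signal the viewer hears, which degrades the realism and sense of presence. For example, as shown in FIG. 1, viewer P watches the same person in the same scene from three different viewpoints, each at a different angle, and obtains the corresponding video signals A, B and C. In FIG. 1 the sound source is located at S, at the same angle as the viewpoint of video signal A. When the viewer selects that viewpoint, video signal A and the audio signal (emitted from sound source S) have the same angle and therefore match. When the viewer selects the viewpoint of video signal B or C, there is an angular difference between the video signal and the audio signal (emitted from sound source S), so the video signal and the audio signal do not match and the user experience is poor.
Summary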
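Embodiments of the present invention provide a method, apparatus and system for transmitting and playing multi-view media content that make the playback directions of the video signal and the audio signal match after the viewpoint is switched.
To achieve this objective, the embodiments of the present invention adopt the following technical solutions:
A method for playing multi-view media content, comprising: receiving multi-view media content; when viewpoint switching is performed, generating switched viewpoint information; generating, according to the viewpoint information and the multi-view media content, a video signal and an audio signal corresponding to the viewpoint information; and synchronously outputting the video signal and the audio signal.
A method for transmitting multi-view media content, comprising: acquiring three-dimensional information of multi-view video information according to the multi-view video information; obtaining sound source location information of audio information according to the audio information received from a plurality of different locations; and encoding the multi-view video information with its three-dimensional information, together with the audio information and its sound source location information, to generate multi-view media content, and then transmitting it.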
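A device for playing multi-view media content, comprising:
a media content receiving unit, configured to receive multi-view media content;
a viewpoint information generating unit, configured to generate switched viewpoint information when viewpoint switching is performed;
a signal generating unit, configured to generate, according to the viewpoint information generated by the viewpoint information generating unit and the multi-view media content received by the media content receiving unit, a video signal and an audio signal corresponding to the viewpoint information;
a synchronization output unit, configured to synchronously output the video signal and the audio signal generated by the signal generating unit.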
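A device for transmitting multi-view media content, comprising:
a video information processing unit, configured to acquire three-dimensional information of multi-view video information according to the multi-view video information;
an audio information processing unit, configured to obtain sound source location information of audio information according to the audio information received from a plurality of different locations;
a multi-view media content generating unit, configured to encode the multi-view video information with the three-dimensional information obtained by the video information processing unit, together with the audio information and the sound source location information obtained by the audio information processing unit, to generate multi-view media content and send it out.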
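A system for playing multi-view media content, comprising:
a multi-view media content transmitting device, configured to process received multi-view video information and audio information received from a plurality of different locations, acquire the three-dimensional information of the video information and the sound source location information of the audio information, encode the multi-view video information with its three-dimensional information together with the audio information and its sound source location information, generate multi-view media content, and transmit it;
a multi-view media content playing device, configured to receive the multi-view media content sent by the multi-view media content transmitting device, generate switched viewpoint information when viewpoint switching is performed, generate the corresponding video signal and audio signal according to the viewpoint information and the received multi-view media content, and synchronously output the video signal and the audio signal.
In the method, apparatus and system provided by the embodiments of the present invention, the multi-view media content sent by the transmitting end contains the three-dimensional information of the multi-view video information and the sound source location information of the audio information. The playing end can therefore generate the video signal and the audio signal corresponding to the switched viewpoint information from that viewpoint information and the received multi-view media content. This solves the prior-art problem that, because the audio signal is fixed, the audio signal does not match the video signal of the switched viewpoint after viewpoint switching, which degrades the realism and sense of presence for the user.
Brief Description of the Drawings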
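FIG. 1 is a schematic diagram of a viewer watching multi-view media content from three different viewpoints in the prior art;
FIG. 2 is a first flowchart of a method for transmitting multi-view media content according to an embodiment of the present invention;
FIG. 3 is a second flowchart of a method for transmitting multi-view media content according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for playing multi-view media content according to an embodiment of the present invention;
FIG. 5 is a first schematic structural diagram of a device for transmitting multi-view media content according to an embodiment of the present invention;
FIG. 6 is a second schematic structural diagram of a device for transmitting multi-view media content according to an embodiment of the present invention;
FIG. 7 is a first schematic structural diagram of a device for playing multi-view media content according to an embodiment of the present invention;
FIG. 8 is a second schematic structural diagram of a device for playing multi-view media content according to an embodiment of the present invention;
FIG. 9 is a first schematic structural diagram of a system for playing multi-view media content according to an embodiment of the present invention;
FIG. 10 is a second schematic structural diagram of a system for playing multi-view media content according to an embodiment of the present invention.
Detailed Description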
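As shown in FIG. 2, the method for sending multi-view media content provided by an embodiment of the present invention comprises the following steps.
Step 201: Obtain three-dimensional information of multi-view video information according to the video information.
In this embodiment, the multi-view video information is captured by a camera group that includes more than one camera, each located at a different viewpoint. Step 201 may perform three-dimensional information processing on the multi-view video information to obtain its three-dimensional information, which may include depth information of the multi-view video information, disparity information between the video information of adjacent viewpoints, and the like.
Step 202: Obtain sound source position information of audio information according to the audio information received from a plurality of different positions.
In this embodiment, the audio information received from a plurality of different positions is obtained by a microphone array that includes a plurality of microphones located at different positions. Step 202 may process the audio information obtained by the microphone array using array signal processing techniques such as beamforming to obtain the sound source position information of the audio information.
In this embodiment, the audio information may include more than one sound source signal. In that case, the sound source position information obtained in step 202 is the sound source position information corresponding to each sound source signal.
Step 203: Encode the multi-view video information together with its three-dimensional information, and the audio information together with its sound source position information, to generate multi-view media content, and then send the multi-view media content.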
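In the method for sending multi-view media content provided by this embodiment of the present invention, the multi-view media content that is sent contains the three-dimensional information of the multi-view video information and the sound source position information of the audio information. This solves the problem in the prior art that, because the audio signal is fixed, an angular difference arises after a viewpoint switch between the audio signal and the position of the video signal corresponding to the new viewpoint, so that the played audio signal does not match the video signal. The audio signal and the video signal can thus be switched together, which improves the realism and sense of presence for the user watching the multi-view media content.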
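Step 202 leaves the array processing technique open and names beamforming only as an example. As one illustration, the following minimal Python sketch estimates the direction of a single source from two microphones by cross-correlating their signals. This is a time-difference-of-arrival approach swapped in for brevity, not the method specified by the application. The function names, the far-field assumption, the speed-of-sound value, and the sign convention are all assumptions made for this sketch.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag (in samples) of sig_b relative to sig_a that maximizes
    their cross-correlation. A positive lag means sig_b arrives later."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_of_arrival(sig_a, sig_b, mic_spacing_m, sample_rate_hz):
    """Far-field source angle in degrees, measured from the broadside of a
    two-microphone pair. Positive angles lean toward microphone A
    (hypothetical convention chosen for this sketch)."""
    # The physical delay can never exceed the travel time between the mics.
    max_lag = int(math.ceil(mic_spacing_m / SPEED_OF_SOUND * sample_rate_hz))
    lag = estimate_delay(sig_a, sig_b, max_lag)
    # sin(theta) = c * tau / d, clamped to the valid range of asin.
    s = SPEED_OF_SOUND * (lag / sample_rate_hz) / mic_spacing_m
    s = max(-1.0, min(1.0, s))
    return math.degrees(math.asin(s))
```

With more than two microphones, several such pairwise angles could be intersected to obtain a position rather than a direction. When several sources are active, each would need its own estimate, as the step description notes.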
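When the method for sending multi-view media content provided by this embodiment of the present invention is applied in a two-way system, for example in a conference venue, the method may further include the following steps before step 202 of FIG. 2, as shown in FIG. 3:
Step 200a: Obtain the audio signal of the multi-view media content being played.
Step 200b: Perform echo cancellation on the audio information received from a plurality of different positions, according to the obtained audio signal of the multi-view media content being played.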
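Echo cancellation removes the far-end audio signal played by the loudspeaker from the local audio signal picked up by the microphone, and is usually implemented by adaptive filtering. The adaptive filter models the spatial propagation path from the loudspeaker to the microphone. Because sound is reflected and diffracted, there are multiple such paths. The propagation path from the loudspeaker to the microphone differs for microphones at different positions, so echo cancellation needs to be performed separately for each microphone. In a preferred embodiment of the present invention, therefore, echo cancellation is performed separately, using adaptive filtering, on each of the audio signals picked up at different positions. Besides adaptive filtering, echo cancellation may also be implemented by other methods, such as blind source separation or auditory scene analysis, which are not described one by one here.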
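The per-microphone adaptive filtering described above can be sketched as follows. The application does not name a specific adaptive algorithm; this sketch assumes a normalized least-mean-squares (NLMS) filter, and the function names, tap count, and step size are hypothetical choices for illustration.

```python
def nlms_echo_cancel(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from one
    microphone signal and return the residual.

    far_end: samples sent to the loudspeaker (the reference signal)
    mic:     samples picked up by one microphone
    """
    weights = [0.0] * num_taps   # estimate of the loudspeaker-to-mic path
    history = [0.0] * num_taps   # most recent far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalize the update by the reference energy (the "N" in NLMS).
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual


def cancel_all(far_end, mic_signals, **kwargs):
    """Run an independent canceller per microphone, because each microphone
    sees a different propagation path from the loudspeaker."""
    return [nlms_echo_cancel(far_end, m, **kwargs) for m in mic_signals]
```

Each microphone gets its own weight vector because the weights approximate that microphone's own loudspeaker-to-microphone path. A practical canceller would also need to handle near-end speech (double talk), which this sketch ignores.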
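Steps 200a and 200b may be performed before or after step 201; the present invention does not limit this. In this embodiment, as shown in FIG. 3, steps 200a and 200b are performed before step 201.
In the method for sending multi-view media content provided by this embodiment of the present invention, echo cancellation is performed on the received audio information, so that in a two-way system the audio signal played at the playback end does not interfere with the audio information received at the sending end.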
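As shown in FIG. 4, the method for playing multi-view media content provided by an embodiment of the present invention comprises the following steps.
Step 401: Receive multi-view media content.
In this embodiment, step 401 may receive, over a network, the multi-view media content sent by a multi-view media content sending end. The multi-view media content may include video information and three-dimensional information of the video information (for example, depth information or disparity information), and audio information and sound source position information of the audio information. The video information consists of video streams captured from more than one viewpoint, the audio information includes at least one sound source, and the sound source position information of the audio information refers to the position information of each sound source.
Step 402: When a viewpoint switch is performed, generate post-switch viewpoint information. This includes: receiving viewpoint switching information sent by the user through a remote control or another input device, where in this embodiment the viewpoint switching information may be, for example, angle information of the viewpoint switch; and generating the post-switch viewpoint information according to the viewpoint switching information and the three-dimensional information of the video information in the multi-view media content, where in this embodiment the post-switch viewpoint information may be, for example, position coordinate information of the post-switch viewpoint.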
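Step 403: Generate, according to the viewpoint information and the multi-view media content, a video signal and an audio signal corresponding to the viewpoint information.
In theory, the video information contained in the multi-view media content should consist of video streams captured from all viewpoints. In practice, however, for reasons of shooting cost, the video information consists of video streams captured from only a few key viewpoints, for example video streams captured from the front, left side, right side, and back of the scene.
Accordingly, in this embodiment, step 403 uses the video information of the two key viewpoints adjacent to the post-switch viewpoint, together with the disparity information between them, and applies a virtual view synthesis algorithm to synthesize the video signal corresponding to the post-switch viewpoint.
In this embodiment, generating the audio signal corresponding to the viewpoint information in step 403 may include: first, generating sound source position information of the audio information corresponding to the viewpoint information, according to the post-switch viewpoint information obtained in step 402 and the sound source position information of the audio information in the multi-view media content; and then generating the audio signal corresponding to the viewpoint information by wave field synthesis, according to the generated sound source position information and the audio information contained in the multi-view media content. Step 403 may of course also use other three-dimensional audio playback techniques similar to wave field synthesis to generate the audio signal corresponding to the post-switch viewpoint; these other cases are not described here.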
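When the audio information includes more than one sound source, step 403 needs to generate, for each sound source separately, the sound source position information corresponding to the post-switch viewpoint.
Step 404: Synchronously output the video signal and the audio signal generated in step 403.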
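The audio half of step 403 first re-expresses each sound source position relative to the post-switch viewpoint and then renders it. The sketch below illustrates that two-stage idea under strong simplifying assumptions: positions are 2D coordinates, and rendering uses constant-power stereo panning rather than the wave field synthesis named in the text. All function names and conventions are hypothetical.

```python
import math


def source_azimuth_from_viewpoint(source_xy, viewpoint_xy, look_at_xy):
    """Signed angle (radians) of the source relative to the viewing direction.
    Positive means the source lies to the left (counter-clockwise)."""
    view_dir = math.atan2(look_at_xy[1] - viewpoint_xy[1],
                          look_at_xy[0] - viewpoint_xy[0])
    src_dir = math.atan2(source_xy[1] - viewpoint_xy[1],
                         source_xy[0] - viewpoint_xy[0])
    diff = src_dir - view_dir
    # Wrap the difference into the range [-pi, pi].
    return math.atan2(math.sin(diff), math.cos(diff))


def stereo_gains(azimuth_rad, max_angle_rad=math.pi / 2):
    """Constant-power pan gains (left, right) for a source at the given
    azimuth; azimuths beyond +/- max_angle_rad are clamped."""
    a = max(-max_angle_rad, min(max_angle_rad, azimuth_rad))
    # a = +max (far left) -> pan 0; a = -max (far right) -> pan pi/2.
    pan = (1.0 - a / max_angle_rad) * math.pi / 4
    return math.cos(pan), math.sin(pan)
```

For example, a source at (-3, 0) is heard to the left from a viewpoint at (0, -5) looking at the origin, and moves to the center when the viewpoint switches to (5, 0) looking at the origin. With several sources, the two functions would be applied to each source separately and the resulting channel signals summed.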
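Further, the method for playing multi-view media content provided by this embodiment of the present invention may also include, after step 403, a step of performing echo cancellation on the audio signal corresponding to the post-switch viewpoint.
The method for playing multi-view media content provided by this embodiment of the present invention can generate a video signal and an audio signal corresponding to the post-switch viewpoint information according to that viewpoint information and the received multi-view media content. This solves the problem in the prior art that, because the audio signal is fixed, an angular difference arises after a viewpoint switch between the audio signal and the position of the video signal corresponding to the new viewpoint, so that the played audio signal does not match the video signal. The audio signal and the video signal can thus be switched together, which improves the realism and sense of presence for the user watching the multi-view media content.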
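For the video half of step 403, the text relies on the two key viewpoints adjacent to the post-switch viewpoint and on the disparity between them. The sketch below shows only the bookkeeping around that idea, not a full virtual view synthesis algorithm: it picks the two neighbouring key viewpoints on a circle and computes where a pixel would land under simple linear disparity interpolation. The angle-based viewpoint representation, the function names, and the disparity sign convention are assumptions.

```python
import bisect


def adjacent_key_views(key_angles_deg, target_deg):
    """Given key viewpoint angles sorted ascending in [0, 360), return
    (first_angle, second_angle, w) where w in [0, 1) is the fractional
    position of the target between the two neighbouring key viewpoints."""
    target = target_deg % 360.0
    idx = bisect.bisect_right(key_angles_deg, target)
    first = key_angles_deg[idx - 1]                 # idx == 0 wraps to the last key
    second = key_angles_deg[idx % len(key_angles_deg)]
    span = (second - first) % 360.0
    if span == 0.0:                                 # only one key viewpoint
        return first, second, 0.0
    w = ((target - first) % 360.0) / span
    return first, second, w


def interpolated_x(x_first, disparity, w):
    """Horizontal pixel position in the virtual view, assuming disparity is
    defined as (x in first view) - (x in second view)."""
    return x_first - w * disparity
```

At w = 0 the pixel sits where the first key view shows it, and at w = 1 where the second key view shows it. A real synthesis step would also have to fill occlusions and blend colours, which this sketch leaves out.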
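Corresponding to the method for sending multi-view media content provided by the above embodiment of the present invention, and as shown in FIG. 5, an embodiment of the present invention further provides an apparatus for sending multi-view media content, comprising:
a video information processing unit 501, configured to obtain three-dimensional information of multi-view video information according to the video information;
an audio information processing unit 502, configured to obtain sound source position information of audio information according to the audio information received from a plurality of different positions; and
a multi-view media content generating unit 503, configured to encode the multi-view video information together with the three-dimensional information obtained by the video information processing unit 501, and the audio information together with the sound source position information obtained by the audio information processing unit 502, to generate multi-view media content, and then send the multi-view media content.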
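Further, as shown in FIG. 6, the apparatus for sending multi-view media content provided by this embodiment of the present invention may also include:
an audio signal acquisition unit 504, configured to obtain the audio signal of the multi-view media content being played; and
an echo cancellation processing unit 505, configured to perform echo cancellation on the audio information received from a plurality of different positions, according to the played audio signal obtained by the audio signal acquisition unit 504.
The audio information processing unit 502 is further configured to process the audio information output by the echo cancellation processing unit 505 to obtain three-dimensional information of the audio information.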
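In the apparatus for sending multi-view media content provided by this embodiment of the present invention, the multi-view media content that is sent contains the three-dimensional information of the multi-view video information and the sound source position information of the audio information. This solves the problem in the prior art that, because the audio signal is fixed, an angular difference arises after a viewpoint switch between the audio signal and the position of the video signal corresponding to the new viewpoint, so that the played audio signal does not match the video signal. The audio signal and the video signal can thus be switched together, which improves the realism and sense of presence for the user watching the multi-view media content.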
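To make the output of the multi-view media content generating unit 503 more concrete, the sketch below models only the side information that travels with the media: per-view identifiers, disparity between adjacent views, and per-source positions. The application does not specify an encoding; the JSON layout, the class name, and the field names here are hypothetical, and the actual video and audio payloads, which a real system would compress with a codec, are omitted.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class MultiViewContent:
    view_ids: List[str]                        # one entry per camera viewpoint
    disparity: Dict[str, float]                # "viewA|viewB" -> disparity (3D info)
    source_positions: Dict[str, List[float]]   # source id -> [x, y, z]


def encode_content(content: MultiViewContent) -> bytes:
    """Serialize the side information so it can be sent with the media."""
    return json.dumps(asdict(content), sort_keys=True).encode("utf-8")


def decode_content(payload: bytes) -> MultiViewContent:
    """Inverse of encode_content, as a playback apparatus might apply it."""
    return MultiViewContent(**json.loads(payload.decode("utf-8")))
```

Keeping the source positions as explicit metadata, rather than baking them into a fixed channel mix, is what lets the playback side re-render the audio for whichever viewpoint the user selects.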
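As shown in FIG. 7, the apparatus for playing multi-view media content provided by an embodiment of the present invention comprises:
a media content receiving unit 701, configured to receive multi-view media content.
In this embodiment, the media content receiving unit 701 may receive, through a network interface, the multi-view media content processed by the sending end. The multi-view media content may include video information and three-dimensional information of the video information (for example, depth information or disparity information), and audio information and sound source position information of the audio information. The video information consists of video streams captured from more than one viewpoint, the audio information includes at least one sound source, and the sound source position information of the audio information refers to the position information of each sound source.
a viewpoint information generating unit 702, configured to generate post-switch viewpoint information when a viewpoint switch is performed;
a signal generating unit 703, configured to generate, according to the viewpoint information generated by the viewpoint information generating unit 702 and the multi-view media content received by the media content receiving unit 701, a video signal and an audio signal corresponding to the viewpoint information; and
a synchronous output unit 704, configured to synchronously output the video signal and the audio signal generated by the signal generating unit 703.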
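Further, as shown in FIG. 8, the viewpoint information generating unit 702 may include:
a switching information acquisition unit 7021, configured to obtain viewpoint switching information; and
a first generating unit 7022, configured to generate the post-switch viewpoint information according to the viewpoint switching information obtained by the switching information acquisition unit 7021 and the three-dimensional information of the video information contained in the multi-view media content.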
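Further, as shown in FIG. 8, the signal generating unit 703 includes an audio signal generating unit 7031, which may include:
a position information generating unit 70311, configured to generate sound source position information of the audio information corresponding to the viewpoint information, according to the viewpoint information generated by the viewpoint information generating unit 702 and the sound source position information of the audio information contained in the multi-view media content; and
a second generating unit 70312, configured to generate the audio signal corresponding to the viewpoint information, according to the audio information contained in the multi-view media content and the sound source position information, corresponding to the viewpoint information, generated by the position information generating unit 70311.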
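Further, as shown in FIG. 8, the apparatus for playing multi-view media content may also include:
an echo cancellation processing unit 705, configured to perform echo cancellation on the audio signal corresponding to the viewpoint information.
The apparatus for playing multi-view media content provided by this embodiment of the present invention can generate a video signal and an audio signal corresponding to the post-switch viewpoint information according to that viewpoint information and the received multi-view media content. This solves the problem in the prior art that, because the audio signal is fixed, an angular difference arises after a viewpoint switch between the audio signal and the position of the video signal corresponding to the new viewpoint, so that the audio signal does not match the video signal. Audio and video can thus be switched together, which improves the realism and sense of presence for the user watching the multi-view media content.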
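As shown in FIG. 9, the system for playing multi-view media content provided by an embodiment of the present invention comprises:
a multi-view media content sending apparatus 901, configured to process received multi-view video information and audio information received from a plurality of different positions, obtain three-dimensional information of the video information and sound source position information of the audio information, encode the multi-view video information together with its three-dimensional information, and the audio information together with its sound source position information, to generate multi-view media content, and then send the multi-view media content; and
a multi-view media content playback apparatus 902, configured to receive the multi-view media content sent by the multi-view media content sending apparatus 901, generate post-switch viewpoint information when a viewpoint switch is performed, generate a corresponding video signal and audio signal according to the viewpoint information and the received multi-view media content, and synchronously output the video signal and the audio signal.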
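Further, when the system for playing multi-view media content provided by this embodiment of the present invention is a two-way communication system, for example in a conference venue, the system may also include, as shown in FIG. 10:
an echo cancellation apparatus 903, configured to receive the audio signal generated by the multi-view media content playback apparatus 902 and send that audio signal to the multi-view media content sending apparatus 901.
The multi-view media content sending apparatus 901 is further configured to perform echo cancellation on the audio information received from a plurality of different positions, according to the audio signal sent by the echo cancellation apparatus 903.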
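The system for playing multi-view media content provided by this embodiment of the present invention can generate a video signal and an audio signal corresponding to the post-switch viewpoint information according to that viewpoint information and the received multi-view media content. This solves the problem in the prior art that, because the audio signal is fixed, an angular difference arises after a viewpoint switch between the audio signal and the position of the video signal corresponding to the new viewpoint, so that the audio signal does not match the video signal. Audio and video can thus be switched together, which improves the realism and sense of presence for the user watching the multi-view media content.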
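A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc.
The foregoing describes only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims.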

Claims

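1. A method for playing multi-view media content, characterized by comprising:
receiving multi-view media content;
generating post-switch viewpoint information when a viewpoint switch is performed;
generating, according to the post-switch viewpoint information and the multi-view media content, a video signal and an audio signal corresponding to the viewpoint information; and
synchronously outputting the video signal and the audio signal.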
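2. The method for playing multi-view media content according to claim 1, characterized in that the multi-view media content comprises: multi-view video information and three-dimensional information of the video information, and audio information and sound source position information of the audio information.
3. The method for playing multi-view media content according to claim 2, characterized in that generating the post-switch viewpoint information comprises:
obtaining viewpoint switching information; and
generating the post-switch viewpoint information according to the viewpoint switching information and the three-dimensional information of the video information.
4. The method for playing multi-view media content according to claim 2, characterized in that generating, according to the viewpoint information and the multi-view media content, the audio signal corresponding to the viewpoint information comprises:
generating sound source position information of the audio information corresponding to the viewpoint information, according to the viewpoint information and the sound source position information of the audio information; and
generating the audio signal corresponding to the viewpoint information, according to the audio information and the sound source position information of the audio information corresponding to the viewpoint information.
5. The method for playing multi-view media content according to claim 1, characterized in that, after generating, according to the viewpoint information and the multi-view media content, the video signal and the audio signal corresponding to the viewpoint information, the method further comprises: performing echo cancellation on the audio signal corresponding to the viewpoint information.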
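6. A method for sending multi-view media content, characterized by comprising:
obtaining three-dimensional information of multi-view video information according to the video information;
obtaining sound source position information of audio information according to the audio information received from a plurality of different positions; and
encoding the multi-view video information together with its three-dimensional information, and the audio information together with its sound source position information, to generate multi-view media content, and then sending the multi-view media content.
7. The method for sending multi-view media content according to claim 6, characterized in that the method further comprises:
obtaining the audio signal of the multi-view media content being played; and
performing echo cancellation on the audio information received from a plurality of different positions, according to the obtained audio signal of the multi-view media content being played.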
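8. An apparatus for playing multi-view media content, characterized by comprising:
a media content receiving unit, configured to receive multi-view media content;
a viewpoint information generating unit, configured to generate post-switch viewpoint information when a viewpoint switch is performed;
a signal generating unit, configured to generate, according to the viewpoint information generated by the viewpoint information generating unit and the multi-view media content received by the media content receiving unit, a video signal and an audio signal corresponding to the viewpoint information; and
a synchronous output unit, configured to synchronously output the video signal and the audio signal generated by the signal generating unit.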
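9. The apparatus for playing multi-view media content according to claim 8, characterized in that the viewpoint information generating unit comprises:
a switching information acquisition unit, configured to obtain viewpoint switching information; and
a first generating unit, configured to generate the post-switch viewpoint information according to the viewpoint switching information obtained by the switching information acquisition unit and the three-dimensional information of the video information contained in the multi-view media content.
10. The apparatus for playing multi-view media content according to claim 8, characterized in that the signal generating unit comprises an audio signal generating unit, and the audio signal generating unit comprises:
a position information generating unit, configured to generate sound source position information of the audio information corresponding to the viewpoint information, according to the viewpoint information generated by the viewpoint information generating unit and the sound source position information of the audio information contained in the multi-view media content; and
a second generating unit, configured to generate the audio signal corresponding to the viewpoint information, according to the audio information contained in the multi-view media content and the sound source position information, corresponding to the viewpoint information, generated by the position information generating unit.
11. The apparatus for playing multi-view media content according to claim 8 or 10, characterized by further comprising:
an echo cancellation processing unit, configured to perform echo cancellation on the audio signal corresponding to the viewpoint information.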
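12. An apparatus for sending multi-view media content, characterized by comprising:
a video information processing unit, configured to obtain three-dimensional information of multi-view video information according to the video information;
an audio information processing unit, configured to obtain sound source position information of audio information according to the audio information received from a plurality of different positions; and
a multi-view media content generating unit, configured to encode the multi-view video information together with the three-dimensional information obtained by the video information processing unit, and the audio information together with the sound source position information obtained by the audio information processing unit, to generate multi-view media content, and then send the multi-view media content.
13. The apparatus for sending multi-view media content according to claim 12, characterized by further comprising:
an audio signal acquisition unit, configured to obtain the audio signal of the multi-view media content being played; and
an echo cancellation processing unit, configured to perform echo cancellation on the audio information received from a plurality of different positions, according to the played audio signal obtained by the audio signal acquisition unit;
wherein the audio information processing unit is further configured to process the audio information output by the echo cancellation processing unit to obtain three-dimensional information of the audio information.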
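14. A system for playing multi-view media content, characterized by comprising:
a multi-view media content sending apparatus, configured to process multi-view video information and audio information received from a plurality of different positions, obtain three-dimensional information of the video information and sound source position information of the audio information, encode the multi-view video information together with its three-dimensional information, and the audio information together with its sound source position information, to generate multi-view media content, and then send the multi-view media content; and
a multi-view media content playback apparatus, configured to receive the multi-view media content sent by the multi-view media content sending apparatus, generate post-switch viewpoint information when a viewpoint switch is performed, generate a corresponding video signal and audio signal according to the viewpoint information and the received multi-view media content, and synchronously output the video signal and the audio signal.
15. The system for playing multi-view media content according to claim 14, characterized by further comprising:
an echo cancellation apparatus, configured to receive the audio signal generated by the multi-view media content playback apparatus and send that audio signal to the multi-view media content sending apparatus;
wherein the multi-view media content sending apparatus is further configured to perform echo cancellation on the audio information received from a plurality of different positions, according to the audio signal sent by the echo cancellation apparatus.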
PCT/CN2009/073547 2008-08-27 2009-08-26 Method, apparatus, and system for sending and playing multi-view media content WO2010022658A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810146721.8 2008-08-27
CN200810146721.8A CN101662693B (zh) 2008-08-27 2008-08-27 Method, apparatus, and system for sending and playing multi-view media content

Publications (1)

Publication Number Publication Date
WO2010022658A1 true WO2010022658A1 (zh) 2010-03-04

Family

ID=41720839

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/073547 WO2010022658A1 (zh) 2008-08-27 2009-08-26 Method, apparatus, and system for sending and playing multi-view media content

Country Status (2)

Country Link
CN (1) CN101662693B (zh)
WO (1) WO2010022658A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI425498B (zh) * 2011-05-04 2014-02-01 Au Optronics Corp 關聯於雙影像應用的影音播放系統及其方法
CN102984560B (zh) * 2011-09-07 2017-06-20 华为技术有限公司 从断点处播放视频的方法和设备
JP2014127987A (ja) * 2012-12-27 2014-07-07 Sony Corp 情報処理装置および記録媒体
US9462301B2 (en) * 2013-03-15 2016-10-04 Google Inc. Generating videos with multiple viewpoints
CN104994369B (zh) * 2013-12-04 2018-08-21 南京中兴软件有限责任公司 一种图像处理方法、用户终端、图像处理终端及系统
CN103873846B (zh) * 2014-03-24 2015-09-23 中国人民解放军国防科学技术大学 基于滑窗的众视点真三维显示系统视频同步播放方法
CN106792142B (zh) * 2016-12-23 2021-03-23 惠州Tcl移动通信有限公司 一种移动终端的音频播放方法及系统
CN108566514A (zh) * 2018-04-20 2018-09-21 Oppo广东移动通信有限公司 图像合成方法和装置、设备、计算机可读存储介质
CN111866525A (zh) * 2020-09-23 2020-10-30 腾讯科技(深圳)有限公司 多视点视频的播放控制方法及装置、电子设备、存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR19990079486A (ko) * 1998-04-06 1999-11-05 윤종용 다 화면 비디오 디스플레이 시스템에서의 오디오 데이타 처리장치
JP2005159592A (ja) * 2003-11-25 2005-06-16 Nippon Hoso Kyokai <Nhk> コンテンツ送信装置およびコンテンツ受信装置
WO2007088730A1 (ja) * 2006-01-31 2007-08-09 Yamaha Corporation 音声会議装置
KR20080065766A (ko) * 2007-01-10 2008-07-15 광주과학기술원 다시점 화상 및 3차원 오디오 송수신 장치 및 이를 이용한송수신 방법
KR20080098819A (ko) * 2007-05-07 2008-11-12 광주과학기술원 다시점 화상 시스템에서 시점 종속 다채널 오디오 처리방법 및 장치

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ILKWON PARK ET AL.: "3DTV Conference, 2007, 07-09 May 2007", May 2007, article ILKWON PARK ET AL.: "Interactive Multi-view Video and View-dependent Audio under MPEG-21 DIA(Digital Item Adaptation)", pages: 1 - 4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3276982A1 (en) * 2016-07-28 2018-01-31 Canon Kabushiki Kaisha Information processing apparatus, information processing method, and program
US10235010B2 (en) 2016-07-28 2019-03-19 Canon Kabushiki Kaisha Information processing apparatus configured to generate an audio signal corresponding to a virtual viewpoint image, information processing system, information processing method, and non-transitory computer-readable storage medium
US10664128B2 (en) 2016-07-28 2020-05-26 Canon Kabushiki Kaisha Information processing apparatus, configured to generate an audio signal corresponding to a virtual viewpoint image, information processing system, information processing method, and non-transitory computer-readable storage medium
CN114390354A (zh) * 2020-10-21 2022-04-22 西安诺瓦星云科技股份有限公司 节目制作方法、装置和系统及计算机可读存储介质
CN114390354B (zh) * 2020-10-21 2024-05-10 西安诺瓦星云科技股份有限公司 节目制作方法、装置和系统及计算机可读存储介质

Also Published As

Publication number Publication date
CN101662693A (zh) 2010-03-03
CN101662693B (zh) 2014-03-12


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09809243

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09809243

Country of ref document: EP

Kind code of ref document: A1