KR20040027015A - New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming - Google Patents

Info

Publication number
KR20040027015A
Authority
KR
South Korea
Prior art keywords
channels
bandwidth
audio
video
streaming
Prior art date
Application number
KR1020020058711A
Other languages
Korean (ko)
Inventor
최두현
이규은
Original Assignee
(주)엑스파미디어
이규은
Priority date
Filing date
Publication date
Application filed by (주)엑스파미디어, 이규은 filed Critical (주)엑스파미디어
Priority to KR1020020058711A priority Critical patent/KR20040027015A/en
Publication of KR20040027015A publication Critical patent/KR20040027015A/en

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10009 Improvement or modification of read or write signals
    • G11B20/10046 Improvement or modification of read or write signals filtering or equalising, e.g. setting the tap weights of an FIR filter
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/24 Signal processing not specific to the method of recording or reproducing; Circuits therefor for reducing noise
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/10527 Audio or video recording; Data buffering arrangements
    • G11B2020/10537 Audio or video recording
    • G11B2020/10592 Audio or video recording specifically adapted for recording or reproducing multichannel signals
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 Record carriers by type
    • G11B2220/20 Disc-shaped record carriers
    • G11B2220/25 Disc-shaped record carriers characterised in that the disc is based on a specific recording technology
    • G11B2220/2537 Optical discs
    • G11B2220/2562 DVDs [digital versatile discs]; Digital video discs; MMCDs; HDCDs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

PURPOSE: A down-mixing technique that uses immersive audio to reduce audio bandwidth for streaming is provided. When video is streamed over a restricted bandwidth, reducing the share taken by sound leaves more bandwidth for video, so that high-quality video and surround sound can be delivered together. CONSTITUTION: Each of the 5 channels is given directionality by binaural synthesis using an HRTF (Head-Related Transfer Function), and the 5 channels are then down-mixed to 2 channels, so that a listener using headphones perceives spatial sound. Passing the down-mixed result through a cross-talk cancellation filter lets the listener perceive the same spatial effect through two speakers.

Description

New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming

In conventional streaming, either all five channels are transmitted or they are simply down-mixed to two channels. Transmitting five channels requires a large amount of data, so real-time transmission may not be possible; simple down-mixing to two channels makes it difficult to preserve the spatial impression of the sound.

High-definition video titles such as DVDs use the 5.1-channel audio provided by Dolby Digital, DTS and similar formats to convey a spatial sound impression. Typically, 5.1-channel audio takes 448 kbps for Dolby Digital and 1411 kbps for DTS. When DVD-quality video is streamed, the Dolby Digital or DTS stream occupies a considerable share of the limited bandwidth even on a broadband connection such as ADSL, and so eats into the bandwidth available for video. The present invention first preserves the directionality of each channel using binaural synthesis and then down-mixes the five channels to two, so that the original spatial impression of the five channels is retained while the reduced channel count saves bandwidth. As a result, when video is streamed over limited bandwidth, audio takes a smaller share of the total and relatively more can be allocated to video, so high-quality video and spatial sound can be provided together.
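The bandwidth argument can be made concrete with a small calculation. The sketch below uses the 448 kbps and 1411 kbps figures from the paragraph above and the 128 kbps two-channel MP3 rate that the description gives as an example; the 2000 kbps link rate and the function name `video_budget_kbps` are assumptions chosen only for illustration.

```python
# Audio bit rates quoted in the description (kbps).
DOLBY_DIGITAL_KBPS = 448
DTS_KBPS = 1411
MP3_STEREO_KBPS = 128  # two-channel example rate from the description

# Hypothetical downstream link rate; not a figure from the source.
LINK_KBPS = 2000


def video_budget_kbps(link_kbps, audio_kbps):
    """Return the bandwidth left for video once audio is carried on the link."""
    if audio_kbps > link_kbps:
        raise ValueError("audio alone exceeds the link rate")
    return link_kbps - audio_kbps


if __name__ == "__main__":
    for name, rate in [
        ("Dolby Digital 5.1", DOLBY_DIGITAL_KBPS),
        ("DTS 5.1", DTS_KBPS),
        ("Binaural 2-channel MP3", MP3_STEREO_KBPS),
    ]:
        left = video_budget_kbps(LINK_KBPS, rate)
        share = 100.0 * rate / LINK_KBPS
        print(f"{name}: audio {rate} kbps ({share:.1f}% of link), video {left} kbps")
```

On this assumed link, replacing the Dolby Digital stream with the two-channel stream frees 320 kbps for video.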

When streaming over limited bandwidth, multichannel audio such as 5.1 is down-mixed to two channels while keeping its directionality and spatial impression, so that more bandwidth can be allocated to video and both audio and video can be streamed at good quality.

(FIG. 1) The overall process of down-mixing five channels into two channels

(FIG. 2) The detailed process of binaural synthesis and down-mixing to two channels

In general, when 5.1 channels (left, right, center, surround left, surround right and the low-frequency effects channel) are used to reproduce spatial sound, what matters is the placement of the speakers for the five channels other than the non-directional low-frequency effects channel. In other words, the spatial impression is created by the positions of the five speakers. In the present invention, each of the five channels is binaurally synthesized using an HRTF to give it directionality, and the five channels are then down-mixed to two channels so that a spatial impression is perceived when listening with headphones. In addition, passing the two-channel down-mix through a cross-talk cancellation filter lets the listener perceive the spatial impression through two speakers as well.

In the first step, as shown in FIG. 1, the five compressed channels (left, right, center, surround left and surround right) are decoded into an uncompressed form such as PCM (Pulse Code Modulation) so that binaural synthesis can be applied to them. In the second step, binaural synthesis is performed on each of the five channels. Binaural synthesis converts a mono channel with no directionality into a directional signal by passing it through the HRTF filter for a specific angle. The second step is shown in detail in FIG. 2. As FIG. 2 shows, each channel is synthesized with the HRTF for its speaker position, all at 0 degrees elevation: the left channel at 30 degrees azimuth to the left, the right channel at 30 degrees to the right, the center channel at 0 degrees, surround left at 120 degrees to the left, and surround right at 120 degrees to the right. In the third step, the five binaurally synthesized channels are down-mixed into two channels as shown in FIG. 2.
For headphone listening, the down-mix is complete once the two-channel PCM signal has been compressed with MP3 or a similar codec. If MP3 at 128 kbps is used for compression, the 448 kbps of Dolby Digital is reduced to 128 kbps.
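The second and third steps can be sketched in a few lines of Python. The channel angles come from the description; everything else here is an assumption for illustration: the sign convention (negative azimuth means left), the helper names, and above all `toy_hrir`, which is a crude placeholder (a delayed, attenuated far-ear impulse) rather than measured HRTF data. A real system would load measured head-related impulse responses for each angle.

```python
# Speaker positions (azimuth, elevation) in degrees, as listed in the description.
# Negative azimuth = listener's left; this sign convention is an assumption.
SPEAKER_ANGLES = {
    "L": (-30, 0),
    "R": (30, 0),
    "C": (0, 0),
    "Ls": (-120, 0),
    "Rs": (120, 0),
}


def convolve(x, h):
    """Direct-form FIR convolution; output length is len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def toy_hrir(azimuth_deg):
    """Placeholder (left_ear, right_ear) impulse responses; NOT measured data.

    The ear farther from the source gets a small delay and attenuation.
    """
    delay = abs(azimuth_deg) // 30  # whole samples, arbitrary scale
    far_gain = 1.0 - 0.5 * min(abs(azimuth_deg), 90) / 90
    near = [1.0]
    far = [0.0] * delay + [far_gain]
    return (near, far) if azimuth_deg <= 0 else (far, near)


def binaural_downmix(channels):
    """Filter each PCM channel with its HRIR pair and sum into two channels."""
    irs = {name: toy_hrir(SPEAKER_ANGLES[name][0]) for name in channels}
    longest_sig = max(len(sig) for sig in channels.values())
    longest_ir = max(len(ir) for pair in irs.values() for ir in pair)
    out_len = longest_sig + longest_ir - 1
    left = [0.0] * out_len
    right = [0.0] * out_len
    for name, sig in channels.items():
        h_left, h_right = irs[name]
        for k, v in enumerate(convolve(sig, h_left)):
            left[k] += v
        for k, v in enumerate(convolve(sig, h_right)):
            right[k] += v
    return left, right


if __name__ == "__main__":
    # A single impulse on the left channel: the right ear hears it later and quieter.
    left, right = binaural_downmix({"L": [1.0, 0.0, 0.0]})
    print("left ear: ", left)
    print("right ear:", right)
```

The resulting two-channel signal is what would then be handed to a stereo encoder such as MP3. The description does not say how the low-frequency effects channel is handled in the down-mix, so it is left out here.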

When the listener uses two speakers, a spatial impression is perceived only if, of the two channels produced by binaural synthesis and down-mixing, the left channel reaches the left ear and the right channel reaches the right ear. With speakers, however, part of the left channel unavoidably reaches the right ear and part of the right channel reaches the left ear. This is called cross-talk, and when it is present the listener cannot perceive the spatial impression in the binaurally synthesized signal. With headphones, the left channel goes to the left ear and the right channel to the right ear, so no cross-talk between the channels occurs. Speakers are different: for listening through two speakers, the two channels produced by binaural synthesis and down-mixing must pass through the cross-talk cancellation filter, shown as the dotted portion of FIG. 1. That is, for two-speaker listening, the two channels that have passed through the cross-talk cancellation filter are compressed with MP3. In this way, spatial sound can be enjoyed even with two speakers.
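The description does not specify how the cross-talk cancellation filter is designed. One common textbook formulation, shown here only as a sketch, models the speaker-to-ear paths at each frequency as a 2x2 matrix and uses its inverse as the canceller. The symmetric path model in `acoustic_matrix` (unit-gain direct path, attenuated and delayed cross path with made-up gain and delay values) and all function names are assumptions, not part of the source.

```python
import cmath


def acoustic_matrix(freq_hz, cross_gain=0.7, cross_delay_s=0.0002):
    """Toy speaker-to-ear transfer matrix at one frequency.

    Rows are ears (left, right), columns are speakers (left, right).
    The direct path has unit gain; the cross path is attenuated and delayed.
    """
    direct = 1 + 0j
    cross = cross_gain * cmath.exp(-2j * cmath.pi * freq_hz * cross_delay_s)
    return [[direct, cross], [cross, direct]]


def invert_2x2(m):
    """Invert a 2x2 complex matrix; raises if it is close to singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("transfer matrix is not invertible at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_2x2(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def ears_after_cancellation(binaural, freq_hz):
    """Pre-filter the binaural pair, then pass it through the speaker paths."""
    h = acoustic_matrix(freq_hz)
    canceller = invert_2x2(h)
    speaker_feeds = apply_2x2(canceller, binaural)
    return apply_2x2(h, speaker_feeds)


if __name__ == "__main__":
    # A component meant for the left ear only, checked at a few frequencies.
    for f in (100.0, 1000.0, 5000.0):
        ears = ears_after_cancellation([1 + 0j, 0j], f)
        print(f, [round(abs(e), 6) for e in ears])
```

In this idealized model the ears receive exactly the intended binaural pair. A practical filter would work on sampled signals across all frequencies and would need regularization where the matrix is poorly conditioned.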

The present invention relates to a method of preserving the spatial impression of sound over headphones or two speakers while reducing the bandwidth required for audio streaming. Because spatial sound can be transmitted with little bandwidth, sound such as an orchestral performance can be enjoyed through headphones even in a data-transmission environment such as a mobile phone. Moreover, when the proposed technique is applied to video titles such as DVDs, the spatial impression of the audio is preserved while its bandwidth is reduced. This means that better video can be expected for a given limited transmission bandwidth, and that less bandwidth is needed for the same video quality (so more subscribers can be served).

Claims (1)

A technique in which multichannel audio provided by Dolby Digital, DTS or the like is binaurally synthesized on the basis of HRTFs, then down-mixed to two channels and streamed.
KR1020020058711A 2002-09-27 2002-09-27 New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming KR20040027015A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020020058711A KR20040027015A (en) 2002-09-27 2002-09-27 New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020020058711A KR20040027015A (en) 2002-09-27 2002-09-27 New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming

Publications (1)

Publication Number Publication Date
KR20040027015A true KR20040027015A (en) 2004-04-01

Family

ID=37329613

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020020058711A KR20040027015A (en) 2002-09-27 2002-09-27 New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming

Country Status (1)

Country Link
KR (1) KR20040027015A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100644617B1 (en) * 2004-06-16 2006-11-10 삼성전자주식회사 Apparatus and method for reproducing 7.1 channel audio
US20070297616A1 (en) * 2005-03-04 2007-12-27 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Device and method for generating an encoded stereo signal of an audio piece or audio datastream
KR200449588Y1 (en) * 2008-07-30 2010-07-22 오세원 A supporting device for branches of fruit tree
KR100974158B1 (en) * 2010-03-09 2010-08-04 박상훈 Form correction structure of tree

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR19980069336A (en) * 1997-02-27 1998-10-26 김영귀 Telephone number memory and auto dial device of car navigation system
KR100206333B1 (en) * 1996-10-08 1999-07-01 윤종용 Device and method for the reproduction of multichannel audio using two speakers
US6009179A (en) * 1997-01-24 1999-12-28 Sony Corporation Method and apparatus for electronically embedding directional cues in two channels of sound
KR20000026251A (en) * 1998-10-19 2000-05-15 윤종용 System and method for converting 5-channel audio data into 2-channel audio data and playing 2-channel audio data through headphone
KR20000053152A (en) * 1996-11-07 2000-08-25 스티븐 브이, 시드마크 Multi-channel audio enhancement system for use in recording and playback and methods for providing same
KR20010016598A (en) * 2000-12-26 2001-03-05 이원돈 Apparatus for recovering of 3D sound and method thereof

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100644617B1 (en) * 2004-06-16 2006-11-10 삼성전자주식회사 Apparatus and method for reproducing 7.1 channel audio
US20070297616A1 (en) * 2005-03-04 2007-12-27 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Device and method for generating an encoded stereo signal of an audio piece or audio datastream
EP2094031A2 (en) * 2005-03-04 2009-08-26 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. Device and method for creating an encoding stereo signal of an audio section or audio data stream
US8553895B2 (en) * 2005-03-04 2013-10-08 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Device and method for generating an encoded stereo signal of an audio piece or audio datastream
KR200449588Y1 (en) * 2008-07-30 2010-07-22 오세원 A supporting device for branches of fruit tree
KR100974158B1 (en) * 2010-03-09 2010-08-04 박상훈 Form correction structure of tree

Legal Events

Date Code Title Description
A201 Request for examination
N231 Notification of change of applicant
E902 Notification of reason for refusal
E601 Decision to refuse application