KR20040027015A - New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming - Google Patents
New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming
- Publication number
- KR20040027015A
- Authority
- KR
- South Korea
- Prior art keywords
- channels
- bandwidth
- audio
- video
- streaming
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10009—Improvement or modification of read or write signals
- G11B20/10046—Improvement or modification of read or write signals filtering or equalising, e.g. setting the tap weights of an FIR filter
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/24—Signal processing not specific to the method of recording or reproducing; Circuits therefor for reducing noise
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10527—Audio or video recording; Data buffering arrangements
- G11B2020/10537—Audio or video recording
- G11B2020/10592—Audio or video recording specifically adapted for recording or reproducing multichannel signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B2220/00—Record carriers by type
- G11B2220/20—Disc-shaped record carriers
- G11B2220/25—Disc-shaped record carriers characterised in that the disc is based on a specific recording technology
- G11B2220/2537—Optical discs
- G11B2220/2562—DVDs [digital versatile discs]; Digital video discs; MMCDs; HDCDs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Abstract
Description
In conventional streaming, either all five channels are transmitted or they are simply downmixed to two channels. Transmitting five channels requires sending a large amount of data, so real-time delivery may not be possible; simple downmixing to two channels makes it difficult to preserve the spatial impression of the sound.
High-quality video content such as DVD uses the 5.1-channel formats provided by Dolby Digital, DTS, and others to convey a spatial sound image. Typically, 5.1-channel audio has a bit rate of 448 kbps with Dolby Digital and 1411 kbps with DTS. When DVD-quality video is streamed, the Dolby Digital or DTS audio takes up a considerable share of the limited bandwidth, even on a high-speed Internet connection such as ADSL, and so reduces the bandwidth left for video. The present invention uses binaural synthesis to preserve the directionality of each channel and then downmixes the five channels to two, so that the spatial impression of the original five channels is maintained while the reduced channel count saves bandwidth. As a result, when content is streamed over a limited-bandwidth link, audio takes a smaller share of the total and more bandwidth can be allocated to video, so high-quality video and spatial sound can be delivered together.
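The bandwidth argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the 448 kbps and 1411 kbps figures and the 128 kbps two-channel MP3 rate come from the description, while `LINK_KBPS` is a hypothetical link capacity chosen for the example and the function names are assumptions, not part of the patent.

```python
DOLBY_DIGITAL_KBPS = 448  # 5.1 Dolby Digital rate quoted in the description
DTS_KBPS = 1411           # 5.1 DTS rate quoted in the description
STEREO_MP3_KBPS = 128     # two-channel MP3 rate used as the example in the description
LINK_KBPS = 2000          # hypothetical link capacity, NOT taken from the source


def video_budget_kbps(link_kbps, audio_kbps):
    """Bandwidth left for video when audio shares the same link."""
    if audio_kbps > link_kbps:
        raise ValueError("audio alone exceeds the link capacity")
    return link_kbps - audio_kbps


def audio_share(link_kbps, audio_kbps):
    """Fraction of the link consumed by audio."""
    return audio_kbps / link_kbps


for label, rate in (("Dolby Digital 5.1", DOLBY_DIGITAL_KBPS),
                    ("DTS 5.1", DTS_KBPS),
                    ("binaural 2-channel MP3", STEREO_MP3_KBPS)):
    budget = video_budget_kbps(LINK_KBPS, rate)
    share = audio_share(LINK_KBPS, rate)
    print(f"{label}: audio {share:.1%} of link, {budget} kbps left for video")
```

Under the assumed 2000 kbps link, moving from 448 kbps audio to 128 kbps audio frees 320 kbps for video; a different link capacity changes the shares but not that difference.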
When streaming over limited bandwidth, multichannel audio such as 5.1 is downmixed to two channels while preserving its directionality and spatial impression, which allows more bandwidth to be allocated to video so that both audio and video can be streamed at good quality.
(FIG. 1) Overall process of downmixing five channels to two channels
(FIG. 2) Detailed process of binaural synthesis and downmixing to two channels
In general, to hear spatial sound from 5.1-channel audio (left, right, center, surround left, surround right, and the low-frequency effects channel), what matters is the placement of the speakers for the five channels other than the non-directional low-frequency effects channel; that is, the spatial impression is created by the positions of the five speakers. In the present invention, each of the five channels is binaurally synthesized using an HRTF (head-related transfer function) to give it directionality, and the five channels are then downmixed to two so that a listener on headphones perceives the spatial image. In addition, passing the two-channel downmix through a cross-talk cancellation filter lets a listener perceive the spatial image through two loudspeakers as well.
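As a minimal sketch of what binaural synthesis of a single channel involves, the code below filters one mono signal with a left-ear and a right-ear impulse response (the time-domain form of an HRTF pair). The patent does not give filter coefficients, so the `TOY_HRIR_*` values are made-up placeholders that only mimic "earlier and louder at the near ear"; a real system would use measured HRTF data.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution of two sample lists."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_synthesis(mono, hrir_left, hrir_right):
    """Give a mono channel a direction by filtering it once per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Placeholder impulse responses for a source on the listener's left
# (hypothetical values, not measured HRTFs): the left ear hears the
# sound immediately, the right ear hears it two samples later and quieter.
TOY_HRIR_LEFT = [1.0, 0.0, 0.0]
TOY_HRIR_RIGHT = [0.0, 0.0, 0.6]

mono = [1.0, 0.5, -0.25]
left_ear, right_ear = binaural_synthesis(mono, TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
print(left_ear)
print(right_ear)
```

The delay and level difference between the two outputs are the cues that make the source appear to come from one side on headphones.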
In the first step, as shown in FIG. 1, the five compressed channels (left, right, center, surround left, and surround right) are decoded to an uncompressed form such as PCM (Pulse Code Modulation) so that binaural synthesis can be applied to them. In the second step, binaural synthesis is applied to each of the five channels. Binaural synthesis converts a non-directional mono channel into a directional signal by passing it through the HRTF filter for a specific angle. The second step is shown in detail in FIG. 2: the left channel is synthesized with the HRTF for azimuth 30 degrees left, elevation 0 degrees; the right channel with the HRTF for azimuth 30 degrees right, elevation 0 degrees; the center channel with the HRTF for azimuth 0 degrees, elevation 0 degrees; the surround left channel with the HRTF for azimuth 120 degrees left, elevation 0 degrees; and the surround right channel with the HRTF for azimuth 120 degrees right, elevation 0 degrees. In the third step, the five binaurally synthesized channels are downmixed to two channels as shown in FIG. 2. For headphone listening, the downmix is complete once the two-channel PCM signal is compressed with MP3 or a similar codec. If MP3 at 128 kbps is used, for example, the 448 kbps of Dolby Digital is reduced to 128 kbps.
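The third step and the bit-rate comparison can be sketched as follows. The azimuth table restates the angles given in the description; the sign convention (negative means left), the unweighted sum used for the downmix, and the function names are assumptions made for illustration, since the description does not specify mixing gains.

```python
# Azimuths from the description, in degrees at elevation 0.
# Sign convention (negative = left of the listener) is an assumption.
SPEAKER_AZIMUTH_DEG = {"L": -30, "R": 30, "C": 0, "SL": -120, "SR": 120}


def downmix_to_stereo(rendered):
    """Sum five binaurally rendered channels into one left/right pair.

    rendered maps a channel name to (left_ear_samples, right_ear_samples),
    i.e. the output of the per-channel HRTF filtering step.
    The plain unweighted sum is an assumed mixing rule.
    """
    missing = set(SPEAKER_AZIMUTH_DEG) - set(rendered)
    if missing:
        raise ValueError(f"missing channels: {sorted(missing)}")
    length = max(len(sig) for pair in rendered.values() for sig in pair)
    left = [0.0] * length
    right = [0.0] * length
    for name in SPEAKER_AZIMUTH_DEG:
        ear_l, ear_r = rendered[name]
        for i, v in enumerate(ear_l):
            left[i] += v
        for i, v in enumerate(ear_r):
            right[i] += v
    return left, right


def bitrate_saving(original_kbps, downmixed_kbps):
    """Fraction of the original audio bit rate that is saved."""
    return 1.0 - downmixed_kbps / original_kbps


# Tiny hand-made example: every channel contributes one sample per ear.
example = {name: ([1.0], [0.5]) for name in SPEAKER_AZIMUTH_DEG}
mix_left, mix_right = downmix_to_stereo(example)
print(mix_left, mix_right)
print(f"saving vs. Dolby Digital: {bitrate_saving(448, 128):.1%}")
```

In practice the summed signal would also need level control to avoid clipping before it is passed to the MP3 encoder; that detail is outside what the description covers.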
When the listener uses two loudspeakers, perceiving the spatial image requires that, of the two channels produced by binaural synthesis and downmixing, the left channel reach only the left ear and the right channel reach only the right ear. With loudspeakers, however, part of the left channel inevitably reaches the right ear and part of the right channel reaches the left ear. This is called cross-talk, and when it is present the listener cannot perceive the spatial image in the binaurally synthesized signal. With headphones, the left channel goes to the left ear and the right channel to the right ear, so no cross-talk between the channels occurs. Loudspeakers are different: for playback over two loudspeakers, the two channels produced by binaural synthesis and downmixing must be passed through the cross-talk cancellation filter shown by the dotted line in FIG. 1. That is, for two-loudspeaker listening, the two channels that have passed through the cross-talk cancellation filter are the ones compressed to MP3. In this way, spatial sound can be enjoyed even with only two loudspeakers.
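The description does not say how the cross-talk cancellation filter is designed. One textbook approach, shown here purely as an assumed illustration, treats each frequency bin as a 2x2 system (same-side path `h_ipsi`, opposite-side path `h_contra`, symmetric setup) and pre-filters the binaural pair with the inverse of that matrix so that each ear receives only its intended channel. All numeric values below are hypothetical.

```python
import cmath


def cancel_crosstalk(b_left, b_right, h_ipsi, h_contra):
    """Speaker feeds for one frequency bin in a symmetric two-speaker setup.

    Solves [[h_ipsi, h_contra], [h_contra, h_ipsi]] @ [s_left, s_right]
    = [b_left, b_right] by inverting the 2x2 acoustic path matrix.
    """
    det = h_ipsi * h_ipsi - h_contra * h_contra
    if abs(det) < 1e-12:
        raise ValueError("path matrix is singular at this frequency")
    s_left = (h_ipsi * b_left - h_contra * b_right) / det
    s_right = (h_ipsi * b_right - h_contra * b_left) / det
    return s_left, s_right


def at_ears(s_left, s_right, h_ipsi, h_contra):
    """What each ear receives when the speakers play s_left and s_right."""
    e_left = h_ipsi * s_left + h_contra * s_right
    e_right = h_contra * s_left + h_ipsi * s_right
    return e_left, e_right


# Hypothetical values for a single frequency bin.
H_IPSI = 1.0 + 0.0j                  # speaker to same-side ear
H_CONTRA = 0.4 * cmath.exp(-0.5j)    # speaker to opposite ear: weaker, delayed
B_LEFT, B_RIGHT = 0.8 + 0.1j, -0.3 + 0.2j  # binaural downmix in this bin

S_LEFT, S_RIGHT = cancel_crosstalk(B_LEFT, B_RIGHT, H_IPSI, H_CONTRA)
E_LEFT, E_RIGHT = at_ears(S_LEFT, S_RIGHT, H_IPSI, H_CONTRA)
print(abs(E_LEFT - B_LEFT), abs(E_RIGHT - B_RIGHT))
```

With the filter applied, the signals arriving at the ears match the binaural pair up to rounding error. Real designs also have to handle frequencies where the matrix is nearly singular and listeners who are not at the assumed position.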
The present invention relates to a method of preserving the spatial impression of sound over headphones or two loudspeakers while reducing the bandwidth required for audio streaming. Because spatial sound can be transmitted with little bandwidth, a listener can enjoy sound such as an orchestral performance on headphones even in a constrained data environment such as a mobile phone. Furthermore, when the proposed technique is applied to video content such as DVD, the spatial impression of the audio is maintained while its bandwidth is reduced. This means that better video quality can be expected for a given transmission bandwidth, and that less bandwidth is needed for the same video quality, so more subscribers can be served.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020020058711A KR20040027015A (en) | 2002-09-27 | 2002-09-27 | New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020020058711A KR20040027015A (en) | 2002-09-27 | 2002-09-27 | New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20040027015A (en) | 2004-04-01 |
Family
ID=37329613
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020020058711A KR20040027015A (en) | 2002-09-27 | 2002-09-27 | New Down-Mixing Technique to Reduce Audio Bandwidth using Immersive Audio for Streaming |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20040027015A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR19980069336A (en) * | 1997-02-27 | 1998-10-26 | 김영귀 | Telephone number memory and auto dial device of car navigation system |
KR100206333B1 (en) * | 1996-10-08 | 1999-07-01 | 윤종용 | Device and method for the reproduction of multichannel audio using two speakers |
US6009179A (en) * | 1997-01-24 | 1999-12-28 | Sony Corporation | Method and apparatus for electronically embedding directional cues in two channels of sound |
KR20000026251A (en) * | 1998-10-19 | 2000-05-15 | 윤종용 | System and method for converting 5-channel audio data into 2-channel audio data and playing 2-channel audio data through headphone |
KR20000053152A (en) * | 1996-11-07 | 2000-08-25 | 스티븐 브이, 시드마크 | Multi-channel audio enhancement system for use in recording and playback and methods for providing same |
KR20010016598A (en) * | 2000-12-26 | 2001-03-05 | 이원돈 | Apparatus for recovering of 3D sound and method thereof |
- 2002-09-27: KR application KR1020020058711A (KR20040027015A), not active, Application Discontinuation
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100644617B1 (en) * | 2004-06-16 | 2006-11-10 | 삼성전자주식회사 | Apparatus and method for reproducing 7.1 channel audio |
US20070297616A1 (en) * | 2005-03-04 | 2007-12-27 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Device and method for generating an encoded stereo signal of an audio piece or audio datastream |
EP2094031A2 (en) * | 2005-03-04 | 2009-08-26 | Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. | Device and method for creating an encoding stereo signal of an audio section or audio data stream |
US8553895B2 (en) * | 2005-03-04 | 2013-10-08 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Device and method for generating an encoded stereo signal of an audio piece or audio datastream |
KR200449588Y1 (en) * | 2008-07-30 | 2010-07-22 | 오세원 | A supporting device for branches of fruit tree |
KR100974158B1 (en) * | 2010-03-09 | 2010-08-04 | 박상훈 | Form correction structure of tree |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
N231 | Notification of change of applicant | ||
E902 | Notification of reason for refusal | ||
E601 | Decision to refuse application |