KR20100084332A

KR20100084332A - 3D audio localization method and device, and recording medium storing a program performing the method

Info

Publication number: KR20100084332A
Application number: KR1020090003761A
Authority: KR
Inventors: 박영철; 윤대희; 최택성; 이석필; 현동일
Original assignee: 연세대학교 산학협력단; 전자부품연구원
Priority date: 2009-01-16
Filing date: 2009-01-16
Publication date: 2010-07-26
Also published as: KR101038574B1

Abstract

PURPOSE: A 3D audio sound image localization method, an apparatus, and a recording medium are provided in which panning values are calculated with VBCAP for the bumping frequency band and with VBAP for the non-bumping frequency band and are applied to the respective loudspeaker input signals, so that accurate sound image localization can be achieved. CONSTITUTION: A frequency extraction unit (200) extracts the bumping frequency band and the non-bumping frequency band from the frequency-domain signal of the input sound source signal. A panning calculation unit (300) calculates the inter-channel time difference and the inter-channel level difference so that sound image localization is achieved for the sound source signal. The panning calculation unit calculates the vector base amplitude panning value for the signal in the non-bumping frequency band.

Description

3D audio sound image localization method and device, and recording medium on which a program implementing the method is recorded {3D Audio localization method and device and the recording media storing the program performing the said method}

The present invention relates to a three-dimensional (3D) audio sound image localization method and apparatus, and to a recording medium on which a program implementing the method is recorded. More specifically, it relates to a 3D audio sound image localization method and apparatus that achieve more accurate sound image localization by applying different panning methods to the bumping frequency region and the non-bumping frequency region, and to a recording medium on which a program implementing the method is recorded.

Audio-visual technology is rapidly converging into integrated media technology, and research on it has grown quickly in recent years. Advances in multimedia have opened new applications and research topics in audio and acoustics, with virtual reality and virtual acoustic environments as representative examples.

Accurate 3D sound image localization of virtual sound sources is one of the core research areas in audio-visual technology. The purpose of such a 3D audio system is to reproduce the intended sound around the listener's ears.

Vector base amplitude panning (VBAP), which is commonly used for such 3D sound image localization, delivers the same sound source, with appropriate amplitudes, to two or more loudspeakers placed equidistant from the listener. The listener then perceives an auditory object at some point; the perceived auditory object is called a virtual source or a phantom source.

In more detail, assuming that two or more loudspeakers are located equidistant from the listener in different directions, the signal input to each loudspeaker under VBAP can be expressed as xi(t) = gi·x(t), i = 1, 2, ..., N.

Here, xi(t) is the signal input to the i-th loudspeaker at time t, x(t) is the sound signal, of identical magnitude for all channels, at time t, gi is the amplitude gain of each loudspeaker channel, and N is the number of loudspeakers.
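The relation xi(t) = gi·x(t) can be illustrated with a minimal Python sketch. The function name `speaker_feeds` and the sample values are hypothetical and chosen only for illustration; the patent does not prescribe an implementation.

```python
from typing import List, Sequence


def speaker_feeds(x: Sequence[float], gains: Sequence[float]) -> List[List[float]]:
    """Return one feed per loudspeaker: feeds[i][n] = gains[i] * x[n]."""
    return [[g * sample for sample in x] for g in gains]


# Hypothetical example: one short mono signal sent to N = 2 loudspeakers.
x = [0.0, 0.5, 1.0, 0.5]
gains = [0.8, 0.6]  # illustrative g1, g2
feeds = speaker_feeds(x, gains)
```

Each loudspeaker receives the same waveform; only the scalar gain differs, which is why plain amplitude panning cannot control inter-channel phase.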

When the signals xi(t) input to the loudspeakers are amplified by the loudspeakers and reach the listener's ears, they sum at the ears and form a new signal. The properties of this signal determine the perceived sound image localization, which is called summing localization.

However, with sound image localization methods and apparatus based on conventional VBAP, bumping in a specific frequency band prevents the desired interaural time difference (ITD) and interaural level difference (ILD) information from being produced at the listener's ears.

For example, as shown in FIG. 1, when two loudspeakers (10, 20) are placed symmetrically on either side of the listener in the frontal horizontal plane and the angle spanned by the loudspeakers (θ₀) is 60 degrees, severe bumping in the frequency band around 1700 Hz prevents the ITD and ILD from conveying proper information, so the listener cannot clearly perceive θ_T, the angle of the virtual sound source (30).

This is because, when the sound source signal is at 1700 Hz, its wavelength is approximately 190 mm, while in a stereo listening environment the difference in propagation path length from one loudspeaker to the two ears is 80 to 100 mm. In this frequency region, the signal arriving at the listener's left ear from the loudspeaker (20) on the other side is delayed by roughly half a period relative to the signal arriving from the loudspeaker (10) on the near side. The resulting signal cancellation causes bumping in the ILD and also in the ITD.
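The half-period argument can be checked with simple arithmetic. The sketch below assumes a speed of sound of 343 m/s (air at roughly 20 °C), a value not given in the patent; with it the 1700 Hz wavelength comes out near 200 mm, close to the approximate 190 mm quoted above. The function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: not stated in the patent


def wavelength_m(freq_hz: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Wavelength of a tone of the given frequency."""
    return c / freq_hz


def half_period_frequency_hz(path_diff_m: float, c: float = SPEED_OF_SOUND_M_S) -> float:
    """Frequency at which a path-length difference equals half a wavelength."""
    return c / (2.0 * path_diff_m)


lam = wavelength_m(1700.0)             # about 0.20 m
f_lo = half_period_frequency_hz(0.100)  # path difference of 100 mm
f_hi = half_period_frequency_hz(0.080)  # path difference of 80 mm
```

Under this assumption, path differences of 80 to 100 mm correspond to half-period cancellation at roughly 1.7 to 2.1 kHz, which falls inside the 1.1 kHz to 2.6 kHz bumping band discussed in the description.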

An object of the present invention is to provide a 3D audio sound image localization method and apparatus, and a recording medium on which a program implementing the method is recorded, that achieve more accurate 3D sound image localization by computing panning values for the sound source signal in the bumping frequency region and in the non-bumping frequency region with different panning methods and applying them to the loudspeaker input signals.

Other objects and advantages of the present invention can be understood from the following description and will become more apparent from the embodiments of the present invention. It will also be readily appreciated that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

A 3D sound image localization processing method according to an example of the present invention includes: a bumping frequency extraction step of extracting, from the frequency-domain signal of a sound source signal, a bumping frequency band in which bumping occurs; and a vector base complex amplitude panning (VBCAP) step of calculating, so that sound image localization is achieved for the signal in the bumping frequency band, an inter-channel time difference (ICTD), which is a phase modulation value for reproducing the interaural time difference (ITD), and an inter-channel level difference (ICLD), which is a magnitude modulation value for reproducing the interaural level difference (ILD).

The 3D sound image localization processing method may further include a bumping frequency panning value application step of applying the ICLD and the ICTD to each of the signals input to a plurality of loudspeakers so that sound image localization is achieved.

In the bumping frequency panning value application step, when there are two loudspeakers, the magnitudes of the signals input to the two loudspeakers are the ICLD value and 1-ICLD, respectively, and the signal input to the loudspeaker that requires a time delay for sound image localization may be multiplied by the ICTD value.
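A minimal sketch of this two-loudspeaker application step for a single frequency bin follows. The exact form of the ICTD factor is given by an equation that is not reproduced in this text, so the sketch assumes a pure delay, that is, a unit-magnitude phase factor exp(-j·2π·f·τ). The function name, the choice of which loudspeaker is delayed, and the range check on the ICLD are likewise assumptions.

```python
import cmath
import math
from typing import Tuple


def apply_bumping_band_panning(
    X: complex, freq_hz: float, icld: float, delay_s: float
) -> Tuple[complex, complex]:
    """Return (undelayed feed, delayed feed) for one frequency bin X."""
    if not 0.0 <= icld <= 1.0:  # assumption: ICLD behaves like a weight in [0, 1]
        raise ValueError("icld is expected in [0, 1]")
    # Assumption: the ICTD acts as a pure delay (unit-magnitude phase factor).
    phase = cmath.exp(-2j * math.pi * freq_hz * delay_s)
    undelayed = icld * X
    delayed = (1.0 - icld) * X * phase
    return undelayed, delayed


# Hypothetical example: a half-period delay at 1700 Hz flips the sign of the delayed feed.
s1, s2 = apply_bumping_band_panning(1 + 0j, 1700.0, 0.7, 1.0 / (2 * 1700.0))
```

The magnitudes of the two feeds are ICLD and 1-ICLD as described; only the phase of the delayed feed changes.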

Here, the bumping frequency band may be 1.1 kHz to 2.6 kHz.

The bumping frequency band may also be N multiples of 1.5 kHz to 1.9 kHz.

The ICLD, a(θ, m, k), may be expressed by the following equation.

[Equation]

Here, the subscript R denotes the listener's right ear and L the listener's left ear, θ is the angle between the listener's front and the loudspeaker, k is the critical band index, m is the time index, A(k) is a preset loudspeaker magnitude correction value, and the remaining terms are the loudspeaker's phase response value and the head-related transfer functions (HRTFs) at the listener's two ears corresponding to the desired angle θ.

The ICTD, b*, may be expressed by the following equation.

[Equation]

Here, the ICTD is multiplied into the input signal of the loudspeaker that requires a time delay.

The 3D sound image localization processing method may further include a Fourier analysis step of converting the sound source signal from a time-domain signal to a frequency-domain signal.

Here, the frequency-domain signal includes the bumping frequency band and a non-bumping frequency band that excludes the bumping frequency band, and the method may further include: a non-bumping frequency extraction step of extracting the non-bumping frequency band from the frequency-domain signal of the sound source signal; a vector base amplitude panning (VBAP) step of calculating a vector base amplitude panning value so that sound image localization is achieved for the signal in the non-bumping frequency band; and a non-bumping frequency panning value application step of applying the amplitude panning value for the non-bumping band signal to each of the signals input to the plurality of loudspeakers so that sound image localization is achieved.

The method may further include a summing step of summing the frequency-band signal to which the bumping frequency panning values have been applied and the frequency-band signal to which the non-bumping frequency panning values have been applied.

A 3D sound image localization processing apparatus according to an example of the present invention may include: a frequency extraction unit that extracts, from the frequency-domain signal of a sound source signal, a bumping frequency band in which bumping occurs and a non-bumping frequency band that excludes the bumping frequency band; a panning calculation unit that, so that sound image localization is achieved for the sound source signal, obtains an inter-channel time difference (ICTD), which is a phase modulation value, and an inter-channel level difference (ICLD), which is a magnitude modulation value, for the signal in the bumping frequency band, and calculates a vector base amplitude panning (VBAP) value, which is a magnitude modulation value, for the signal in the non-bumping frequency band; and a panning value application unit that applies the ICLD and the ICTD for the bumping band signal, and the amplitude panning value for the non-bumping band signal, to each of the signals input to a plurality of loudspeakers so that sound image localization is achieved.

The apparatus may further include a Fourier analysis unit that converts the sound source signal from a time-domain signal to a frequency-domain signal.

The apparatus may further include a summing unit that sums the frequency-band signal to which the bumping frequency panning values have been applied and the frequency-band signal to which the non-bumping frequency panning values have been applied.

The frequency extraction unit may include a bumping frequency extraction unit that extracts the bumping frequency band and a non-bumping frequency extraction unit that extracts the non-bumping frequency band.

Here, the bumping frequency band may be 1.1 kHz to 2.6 kHz.

The bumping frequency band may also be N multiples of 1.5 kHz to 1.9 kHz.

The panning calculation unit includes a vector base complex amplitude panning (VBCAP) unit and a vector base amplitude panning (VBAP) unit. The VBCAP unit calculates, so that sound image localization is achieved for the signal in the bumping frequency band, the ICTD, which is a phase modulation value for reproducing the interaural time difference (ITD), and the ICLD, which is a magnitude modulation value for reproducing the interaural level difference (ILD). The VBAP unit may calculate a vector base amplitude panning value so that sound image localization is achieved for the signal in the non-bumping frequency band.

The panning value application unit includes a bumping frequency panning value application unit and a non-bumping frequency panning value application unit. The bumping frequency panning value application unit applies the ICLD and the ICTD to each of the signals input to the plurality of loudspeakers so that sound image localization is achieved, and the non-bumping frequency panning value application unit may apply the amplitude panning value for the non-bumping band signal to each of the signals input to the plurality of loudspeakers so that sound image localization is achieved.

When there are two loudspeakers, the bumping frequency panning value application unit sets the magnitudes of the signals input to the two loudspeakers to the ICLD value and 1-ICLD, respectively, and may multiply the signal input to the loudspeaker that requires a time delay for sound image localization by the ICTD value.

A computer-readable recording medium according to an example of the present invention includes a recording medium on which a program implementing the above method is recorded.

The 3D audio sound image localization method and apparatus according to the present invention, and the recording medium on which a program implementing the method is recorded, obtain panning values with vector base complex amplitude panning (VBCAP) for the sound source signal in the bumping frequency region and with vector base amplitude panning (VBAP) for the sound source signal in the non-bumping frequency region, and apply them to the respective loudspeaker input signals, thereby achieving improved and more accurate sound image localization.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

In assigning reference numerals to the components in the drawings, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they could obscure the gist of the invention. Preferred embodiments of the present invention are described below, but the technical idea of the invention is not limited to them and may be modified and practiced in various ways by those skilled in the art.

FIGS. 2A and 2B are diagrams for explaining an example of the 3D sound image localization processing apparatus according to the present invention.

As shown in FIG. 2A, the 3D sound image localization processing apparatus, which localizes the position at which the listener perceives a specific sound source signal to a virtual position, may include a Fourier analysis unit (100), a frequency extraction unit (200), a panning calculation unit (300), a panning value application unit (400), and a summing unit (500).

The Fourier analysis unit (100) converts the sound source signal from a time-domain signal to a frequency-domain signal. The conversion can be performed in predetermined time units: for example, frame by frame, or by dividing one frame into several time segments and converting each segment to the frequency domain.
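Frame-by-frame conversion to the frequency domain can be sketched as follows. The patent does not specify a transform size, window, or sampling rate, so the naive DFT, the 8 kHz sampling rate, and the 80-sample non-overlapping frames below are illustrative assumptions, and the function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def dft(frame: Sequence[float]) -> List[complex]:
    """Naive discrete Fourier transform of one frame (O(n^2), fine for a demo)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def framewise_spectra(signal: Sequence[float], frame_len: int) -> List[List[complex]]:
    """Split the signal into non-overlapping frames and transform each one."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    return [dft(signal[s:s + frame_len]) for s in starts]


# Hypothetical example: a 1700 Hz tone sampled at 8 kHz, 80-sample frames (100 Hz per bin).
fs = 8000
frame_len = 80
signal = [math.sin(2 * math.pi * 1700 * n / fs) for n in range(2 * frame_len)]
spectra = framewise_spectra(signal, frame_len)
peak_bin = max(range(frame_len // 2), key=lambda k: abs(spectra[0][k]))
```

With these values each bin spans 100 Hz, so the 1700 Hz tone concentrates its energy in bin 17 of every frame.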

When the time-domain sound source signal is converted to the frequency domain in this way, the frequency-domain signal includes, as shown in FIG. 2B, a bumping frequency band (f1 to f2) and a non-bumping frequency band that excludes f1 to f2.

To describe the bumping frequency band in more detail: when only vector base amplitude panning (VBAP) is used for sound image localization, there is a specific frequency band in which the desired interaural time difference (ITD) and interaural level difference (ILD) information cannot be properly produced at the listener's ears, so the desired localization is not achieved. This specific frequency band is called the bumping frequency band.

The bumping frequency band may vary with the angle between the listener and the loudspeakers and with the distance between the listener and the loudspeakers.

Accordingly, when the angle between the listener and the loudspeakers is narrow, for example when the loudspeakers are placed skewed toward the listener's left ear, the bumping frequency band may hardly appear; when the loudspeakers are placed symmetrically at suitable angles on both sides of the listener, the bumping frequency band appears more clearly.

For example, with two stereo loudspeakers placed symmetrically to the listener's left and right, the bumping frequency band usually lies between 1.1 kHz and 2.6 kHz; in a typical stereo listening environment where the left and right loudspeakers span 60 degrees in front of the listener, bumping occurs at around 1.7 kHz.

The frequency extraction unit (200) extracts, from the frequency-domain signal of the sound source signal, the bumping frequency band in which bumping occurs and the non-bumping frequency band that excludes it.

This separates the bumping-band signal from the non-bumping-band signal so that a different panning method can be applied according to each band's frequency characteristics, in order to achieve proper sound image localization.

The frequency extraction unit (200) may include a bumping frequency extraction unit (210) that extracts the bumping frequency band and a non-bumping frequency extraction unit (220) that extracts the non-bumping frequency band.

The bumping frequency band extracted by the bumping frequency extraction unit (210) may be from 1.1 kHz to 2.6 kHz inclusive.
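The band split performed by the extraction units (210, 220) can be sketched as a complementary pair of masks over DFT bins. The function name, the zero-filling of out-of-band bins, and the handling of mirrored bins for real-valued input are assumptions made for illustration; only the 1.1 kHz to 2.6 kHz limits come from the text.

```python
from typing import List, Sequence, Tuple


def split_bands(
    spectrum: Sequence[complex], fs: float, lo_hz: float = 1100.0, hi_hz: float = 2600.0
) -> Tuple[List[complex], List[complex]]:
    """Return (bumping, non_bumping) spectra of the same length as the input."""
    n = len(spectrum)
    bumping: List[complex] = []
    non_bumping: List[complex] = []
    for k, value in enumerate(spectrum):
        # Bins k and n - k carry the same physical frequency for real-valued input.
        freq = min(k, n - k) * fs / n
        if lo_hz <= freq <= hi_hz:
            bumping.append(value)
            non_bumping.append(0j)
        else:
            bumping.append(0j)
            non_bumping.append(value)
    return bumping, non_bumping


# Hypothetical example: 80 bins at fs = 8 kHz (100 Hz per bin), bin k holds the value k.
spectrum = [complex(k, 0) for k in range(80)]
bumping, non_bumping = split_bands(spectrum, fs=8000.0)
recombined = [b + nb for b, nb in zip(bumping, non_bumping)]
```

Because the two masks are complementary, summing the two outputs bin by bin recovers the input spectrum, which mirrors the role of the summing unit (500) once each band has been panned separately.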

This is because, when only VBAP is used for sound image localization, the ITD of the panned virtual source is consistent in the low-frequency band below 1100 Hz, and at high frequencies above 2600 Hz the ILD information nearly agrees with the ITD information at low frequencies.

However, in the band between 1100 Hz and 2600 Hz where bumping occurs, signal deviations disturb the direction of the virtual source and cause the listener to perceive the overall direction of the sound source incorrectly; both the ILD and ITD information deviate from the interaural time difference angle (ITDA), and accurate sound image localization cannot be achieved.

Therefore, for this bumping frequency region, panning values are calculated with the vector base complex amplitude panning (VBCAP) method described below in order to achieve more accurate sound image localization.

The bumping frequency band extracted by the bumping frequency extraction unit (210) may also be N multiples of 1.5 kHz to 1.9 kHz.

This is because, in a typical stereo listening environment where the loudspeakers are placed at about 60 degrees to the left and right with respect to the listener, the bumping frequency band forms at N multiples of approximately 1.5 kHz to 1.9 kHz.

So that sound image localization is achieved for the sound source signal, the panning calculation unit (300) obtains the ICTD, which is a phase modulation value, and the ICLD, which is a magnitude modulation value, for the signal in the bumping frequency band, and calculates the vector base amplitude panning (VBAP) value, which is a magnitude modulation value, for the signal in the non-bumping frequency band.

In more detail, the panning calculation unit (300) may include a vector base complex amplitude panning (VBCAP) unit (310) and a vector base amplitude panning (VBAP) unit (320).

The VBCAP unit (310) calculates, so that sound image localization is achieved for the signal in the bumping frequency band, the ICTD, which is a phase modulation value for reproducing the interaural time difference (ITD), and the ICLD, which is a magnitude modulation value for reproducing the interaural level difference (ILD).

The ICLD, a(θ, m, k), may be expressed by Equation 1 below.

Here, θ is the angle between the listener's front and the loudspeaker, k is the critical band index, m is the time index, A(k) is a preset loudspeaker magnitude correction value, the subscripts R and L denote the listener's right and left ears, and the remaining terms are the loudspeaker's phase response value and the head-related transfer functions (HRTFs) at the two ears corresponding to the desired angle θ.

The head-related transfer function (HRTF) is a frequency-domain transfer function that describes how sound propagates from a source at a specific position in a free field to the ear canal of a human ear; it includes the linear distortion caused by the head, pinna, and torso.

A(k), the preset loudspeaker magnitude correction value, may vary with the angle and distance between the listener and the loudspeaker positions.

The ICTD, b*, may be expressed by Equation 2 below.

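Here, when the ICTD is multiplied by the input signal of the speaker that requires the time delay, it is multiplied in the form of the expression accompanying Equation 2.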

The amplitude panning calculation (VBAP) unit 320 calculates vector-based amplitude panning values, which are the speaker gains, so that sound image localization is achieved for signals in the non-bumping frequency band.

When there are two speakers, the vector-based amplitude panning (VBAP) values may be computed as in Equation 3.

Here, g1 and g2 are the gains delivered to the respective speakers, θ_T is the angle between the listener's front and the virtual sound source, and θ₀ is the angle between the listener's front and each of the symmetrically placed left and right speakers (see FIG. 1).
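Equation 3 is not reproduced in this text. As a hedged illustration only, the Python sketch below uses the standard stereophonic tangent law, tan θ_T / tan θ₀ = (g1 − g2) / (g1 + g2), to which two-speaker VBAP is commonly reduced. The constant-power normalization g1² + g2² = 1, the function name `vbap_two_speaker_gains`, and the sign convention (a positive θ_T points toward speaker 1) are assumptions and are not taken from the patent.

```python
import math


def vbap_two_speaker_gains(theta_t_deg, theta_0_deg):
    """Gains (g1, g2) for two speakers placed symmetrically at +/- theta_0.

    Assumption: a positive theta_t points toward speaker 1.
    """
    if not 0.0 < theta_0_deg < 90.0:
        raise ValueError("theta_0 must lie strictly between 0 and 90 degrees")
    if abs(theta_t_deg) > theta_0_deg:
        raise ValueError("the virtual source must lie between the two speakers")
    # Tangent law: tan(theta_T) / tan(theta_0) = (g1 - g2) / (g1 + g2)
    ratio = math.tan(math.radians(theta_t_deg)) / math.tan(math.radians(theta_0_deg))
    # Solve with g1 + g2 = 1, then rescale so that g1^2 + g2^2 = 1.
    g1 = (1.0 + ratio) / 2.0
    g2 = (1.0 - ratio) / 2.0
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

With this sketch, a centered source (θ_T = 0) yields equal gains on both speakers, and a source at θ_T = θ₀ is fed to speaker 1 only.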

The panning value applying unit 400 applies the inter-channel level difference (ICLD) and the inter-channel time difference (ICTD) to each of the signals input to the plurality of speakers for signals in the bumping frequency band, and applies the amplitude panning values to each of the signals input to the plurality of speakers for signals in the non-bumping frequency band, so that sound image localization is achieved in both cases.

More specifically, the panning value applying unit 400 may include a bumping frequency panning value applying unit 410 and a non-bumping frequency panning value applying unit 420.

Here, the bumping frequency panning value applying unit 410 may apply the ICLD and the ICTD to each of the signals input to the plurality of speakers so that sound image localization is achieved.

For example, when there are two speakers, the bumping frequency panning value applying unit 410 may apply the complex panning values for the bumping frequency band such that the magnitudes of the signals input to the two speakers are the ICLD value and the 1-ICLD value, respectively, and the signal input to the speaker that requires a time delay for sound image localization is multiplied by the ICTD value.

An example of how these complex panning values are applied to each speaker may be expressed as in Equation 4.

Here, Y_L(θ, m, k) is the signal, with the complex panning values applied, that is input to the speaker on the listener's left, and Y_R(θ, m, k) is the signal, with the complex panning values applied, that is input to the speaker on the listener's right.
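Equation 4 is not reproduced in this text, so the following Python sketch only illustrates the two-speaker rule stated above: one speaker signal is scaled by the ICLD value a, the other by 1 − a, and the speaker that needs the delay is additionally multiplied by the ICTD value. The sketch assumes, hypothetically, that the ICTD value is a phase angle b in radians applied as the factor e^(−jb); the function name `apply_complex_panning` and its parameters are illustrative and do not come from the patent.

```python
import cmath


def apply_complex_panning(band_bins, a, b, delayed="right"):
    """Split complex frequency bins of one band across two speakers.

    band_bins: complex bins of the bumping band for one time frame.
    a:         ICLD value in [0, 1]; the other speaker gets 1 - a.
    b:         assumed ICTD phase angle in radians (hypothetical form).
    delayed:   which speaker receives the phase factor.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    if delayed not in ("left", "right"):
        raise ValueError("delayed must be 'left' or 'right'")
    phase = cmath.exp(-1j * b)  # assumed form of the ICTD multiplier
    left = [a * s for s in band_bins]
    right = [(1.0 - a) * s for s in band_bins]
    if delayed == "right":
        right = [phase * s for s in right]
    else:
        left = [phase * s for s in left]
    return left, right
```

Keeping the level split and the phase factor as separate steps mirrors the description, in which the ICLD sets the magnitudes and the ICTD is applied only to the speaker that requires the delay.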

When the complex panning values are applied to each speaker in this way, the signals arriving at the listener's ears may be expressed as in Equation 5.

Here, X_L(θ, m, k) is the sound source signal arriving at the listener's left ear, and X_R(θ, m, k) is the sound source signal arriving at the listener's right ear.

In this way, the sound source signal arriving at one ear of the listener is expressed as the sum of two terms: the input signal of the speaker on the listener's opposite side, multiplied by the preset speaker magnitude correction value and by the phase modulation value for the inter-channel time difference (ICTD), and the input signal of the speaker on the same side as that ear. This resolves the bumping problem caused by the phase difference between the sound source signals.

In addition, the non-bumping frequency panning value applying unit 420 applies the amplitude panning values for signals in the non-bumping frequency band to each of the signals input to the plurality of speakers so that sound image localization is achieved. The way it does so is the same as the conventional frequency-domain method of applying vector base amplitude panning (VBAP) values to each speaker input signal.

The adder 500 sums the signal of the frequency band to which the bumping frequency panning values have been applied and the signal of the frequency band to which the non-bumping frequency panning values have been applied.
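The split-then-sum structure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses the 1.1 kHz to 2.6 kHz bumping band named in the claims as default band edges, treats the spectrum as a plain list of complex FFT bins, and uses the hypothetical helper names `split_bands` and `sum_bands`. It is not the patent's implementation, and the per-band panning itself is left out.

```python
def split_bands(spectrum, sample_rate, fft_size, low_hz=1100.0, high_hz=2600.0):
    """Separate FFT bins into a bumping band and a non-bumping band.

    Bins outside a band are zeroed so both outputs keep the input length.
    """
    bumping, non_bumping = [], []
    for index, value in enumerate(spectrum):
        frequency = index * sample_rate / fft_size  # bin center in Hz
        if low_hz <= frequency <= high_hz:
            bumping.append(value)
            non_bumping.append(0j)
        else:
            bumping.append(0j)
            non_bumping.append(value)
    return bumping, non_bumping


def sum_bands(bumping, non_bumping):
    """Recombine the two separately processed bands bin by bin."""
    return [x + y for x, y in zip(bumping, non_bumping)]
```

Because the two bands occupy disjoint bins, summing them without any panning applied returns the original spectrum, which is a convenient sanity check before the per-band processing is inserted between the two calls.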

Thereafter, accurate sound image localization can be achieved through conventional 3D sound image localization processing.

As described above, the 3D sound image localization processing apparatus according to the present invention resolves the bumping problem in a specific frequency band that arose in conventional methods, which achieve sound image localization by adjusting only the magnitudes of the speaker signals. It does so by using the vector base amplitude panning (VBAP) method for the non-bumping frequency band and, only for the bumping frequency band, the vector base complex amplitude panning (VBCAP) method, in which the inter-channel time difference (ICTD) and the inter-channel level difference (ICLD) are multiplied into the speaker signals.

Meanwhile, the embodiments of the present invention described above can be written as a computer-executable program and implemented on a general-purpose digital computer that runs the program from a computer-readable recording medium. Computer-readable recording media include storage media such as magnetic storage media (e.g., ROM, floppy disks, hard disks, magnetic tape), optically readable media (e.g., CD-ROM, DVD, optical data storage devices), and carrier waves (e.g., transmission over the Internet).

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications, changes, and substitutions without departing from its essential characteristics. Accordingly, the embodiments disclosed herein and the accompanying drawings are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments and drawings. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within their equivalent scope should be interpreted as falling within the scope of the present invention.

FIG. 1 is a diagram for explaining a problem of sound image localization using the conventional vector base amplitude panning (VBAP) method.

<Description of reference numerals for the main parts of the drawings>

100: Fourier analysis unit

200: frequency extraction unit

300: panning calculation unit

400: panning value applying unit

500: adder

Claims

A 3D sound image localization processing method comprising:

a bumping frequency extraction step of extracting, from a frequency-domain signal of an input sound source signal, a bumping frequency band in which bumping occurs that prevents interaural time difference (ITD) and interaural level difference (ILD) information from being reproduced; and

a vector base complex amplitude panning (VBCAP) step of calculating an inter-channel time difference (ICTD), which is a phase modulation value for reproducing the interaural time difference (ITD), and an inter-channel level difference (ICLD), which is a magnitude modulation value for reproducing the interaural level difference (ILD).

The method of claim 1, further comprising:

a bumping frequency panning value applying step of applying the inter-channel level difference (ICLD) and the inter-channel time difference (ICTD) to each of the signals input to a plurality of speakers so that sound image localization is achieved.

The method of claim 2, wherein, in the bumping frequency panning value applying step, when the plurality of speakers is two, the magnitudes of the signals input to the two speakers are the ICLD value and the 1-ICLD value, respectively, and the signal input to the speaker that requires a time delay for sound image localization is multiplied by the ICTD value.

The method of claim 1, wherein the bumping frequency band is 1.1 kHz to 2.6 kHz.

The method of claim 1, wherein the bumping frequency band is N multiples of 1.5 kHz to 1.9 kHz.

The method of claim 1, wherein a(θ, m, k), the inter-channel level difference (ICLD), is expressed by the following equation.

[Equation]

Here, the subscript R denotes the listener's right ear and the subscript L the listener's left ear, θ is the angle between the listener's front and the speaker, k is the critical band index, m is the time index, and A(k) is a preset magnitude correction value for the speaker; the equation also includes the phase response value of the speaker.

The method of claim 1, wherein b*, the inter-channel time difference (ICTD), is expressed by the following equation.

Here, when the ICTD is multiplied by the input signal of the speaker that requires the time delay, it is multiplied in the form of the accompanying expression.

The method of claim 1, further comprising:

a Fourier analysis step of converting the sound source signal from a time-domain signal into a frequency-domain signal.

The method of claim 8, wherein the frequency-domain signal comprises the bumping frequency band and a non-bumping frequency band that excludes the bumping frequency band, the method further comprising:

a non-bumping frequency extraction step of extracting the non-bumping frequency band from the frequency-domain signal of the sound source signal;

a vector base amplitude panning (VBAP) step of calculating vector-based amplitude panning values so that sound image localization is achieved for the signal of the non-bumping frequency band; and

a non-bumping frequency panning value applying step of applying the amplitude panning values for the signal of the non-bumping frequency band to each of the signals input to a plurality of speakers so that sound image localization is achieved.

The method of claim 9, further comprising:

a summing step of summing the signal of the frequency band to which the bumping frequency panning values have been applied and the signal of the frequency band to which the non-bumping frequency panning values have been applied.

A 3D sound image localization processing apparatus comprising:

a frequency extracting unit that extracts, from a frequency-domain signal of an input sound source signal, a bumping frequency band in which bumping occurs and a non-bumping frequency band that excludes the bumping frequency band; and

a panning calculation unit that, so that sound image localization is achieved for the sound source signal, calculates an inter-channel time difference (ICTD), which is a phase modulation value, and an inter-channel level difference (ICLD), which is a magnitude modulation value, for the signal of the bumping frequency band, and calculates a vector base amplitude panning (VBAP) value, which is a magnitude modulation value, for the signal of the non-bumping frequency band.

The apparatus of claim 11, further comprising:

a Fourier analysis unit that converts the sound source signal from a time-domain signal into a frequency-domain signal.

The apparatus of claim 11, further comprising:

an adder that sums the signal of the frequency band to which the bumping frequency panning values have been applied and the signal of the frequency band to which the non-bumping frequency panning values have been applied.

The apparatus of claim 11, wherein the frequency extracting unit comprises a bumping frequency extracting unit that extracts the bumping frequency band and a non-bumping frequency extracting unit that extracts the non-bumping frequency band.

The apparatus of claim 14, wherein the bumping frequency band is 1.1 kHz to 2.6 kHz.

The apparatus of claim 14, wherein the bumping frequency band is N multiples of 1.5 kHz to 1.9 kHz.

The apparatus of claim 11, wherein the panning calculation unit comprises a vector base complex amplitude panning (VBCAP) unit and a vector base amplitude panning (VBAP) unit,

the VBCAP unit calculates, so that sound image localization is achieved for the signal of the bumping frequency band, an inter-channel time difference (ICTD), which is a phase modulation value for reproducing the interaural time difference (ITD), and an inter-channel level difference (ICLD), which is a magnitude modulation value for reproducing the interaural level difference (ILD), and

the VBAP unit calculates vector-based amplitude panning values so that sound image localization is achieved for the signal of the non-bumping frequency band.

The apparatus of claim 17, wherein a(θ, m, k), the inter-channel level difference (ICLD), is expressed by the following equation.

[Equation]

Here, the equation includes the phase response value of the speaker.

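The apparatus of claim 17, wherein the inter-channel time difference (ICTD) is multiplied, in the form of the accompanying expression, by the input signal of the speaker that requires the time delay.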

The apparatus of claim 11, further comprising:

a panning value applying unit that, for the signal of the non-bumping frequency band, applies the amplitude panning values to each of the signals input to a plurality of speakers.

The apparatus of claim 20, wherein the panning value applying unit comprises a bumping frequency panning value applying unit and a non-bumping frequency panning value applying unit,

the bumping frequency panning value applying unit applies the inter-channel level difference (ICLD) and the inter-channel time difference (ICTD) to each of the signals input to the plurality of speakers so that sound image localization is achieved, and

the non-bumping frequency panning value applying unit applies the amplitude panning values for the signal of the non-bumping frequency band to each of the signals input to the plurality of speakers so that sound image localization is achieved.

The apparatus of claim 21,

The bumping panning value applying unit

If the plurality of speakers are two,

Three-dimensional sound image processing device characterized in that.

A computer-readable recording medium on which a program for implementing the method according to any one of claims 1 to 10 is recorded.