KR100673288B1

KR100673288B1 - System for providing audio data and providing method thereof

Info

Publication number: KR100673288B1
Application number: KR1020040043626A
Authority: KR
Inventors: 이규은
Original assignee: (주)엑스파미디어
Priority date: 2004-06-14
Filing date: 2004-06-14
Publication date: 2007-01-24
Also published as: KR20050118495A

Abstract

The present invention relates to an audio data providing system and an audio data providing method. The audio data providing system according to the present invention is connected to a user terminal through a network, provides audio data to the user terminal, and includes a condition setting unit and an audio data processing unit.

When the user sets audio data providing conditions, including a virtual sound source position condition, in the condition setting unit, the audio data processing unit convolves the input audio signal with a predetermined transfer function according to the set virtual sound source position condition, giving the input audio signal a three-dimensional effect in the direction the user wants, and outputs the result.

Therefore, the user can enjoy sound with a three-dimensional effect in the desired direction even over a network.

HRTF, crosstalk, downmixing, head-related transfer function

Description

Audio data providing system and method for providing audio data {SYSTEM FOR PROVIDING AUDIO DATA AND PROVIDING METHOD THEREOF}

FIG. 1 is an overall configuration diagram of an audio data providing system according to an embodiment of the present invention.

FIG. 2 is a detailed configuration diagram of the service providing server of FIG. 1.

FIG. 3 is a diagram showing the relationship between the input signal and the signal heard at the user's ears.

FIG. 4 is an overall flowchart of an audio providing method according to an embodiment of the present invention.

FIG. 5 shows the two-channel downmix process when the input audio signal has three or more channels.

FIG. 6 shows the process of expanding the number of channels and then downmixing to two channels when the input audio signal has two or fewer channels.

FIG. 7 shows the process by which audio data that has gone through the process of FIG. 5 or FIG. 6 is finally output.

* Explanation of reference numerals for the main parts of the drawings *

100: user terminal
200: network
300: interface
400: service providing server
410: audio source providing unit
420: channel identification unit
430: upmix processing unit
433: channel expansion unit
440: downmix processing unit
450: crosstalk processing unit
460: condition setting unit
470: output unit

The present invention relates to an audio data providing system and an audio data providing method, and more particularly to an audio data providing system that is connected to a user terminal through a network and provides audio data to the user terminal.

Audio data provided to users uses 5.1 channels to convey the three-dimensional quality of sound. Here, 5.1 channels refers to the left, right, center, surround left, surround right, and low-frequency effects channels.

In general, the center channel carries a directional component directly in front of the listener, the left and right channels at 30 degrees to the listener's left and right, and the left and right surround channels at 120 degrees to the left and right. The low-frequency effects channel has a bandwidth of 120 Hz or less and therefore carries no directional component.

Consequently, to hear three-dimensional sound, what matters is the placement of the speakers for the five channels other than the non-directional low-frequency effects channel. That is, the three-dimensional quality of the sound is formed by the positions of the five speakers.

Meanwhile, video and audio data is now mainly provided and played over networks by streaming, that is, played in real time without downloading, rather than by the conventional approach of downloading the data to a hard disk drive and then playing it.

However, when audio data is provided by streaming, it is difficult to convey the three-dimensional quality of sound by transmitting five-channel audio data and relying on the speaker positions for the five channels. If five-channel audio data is transmitted over the network as is, a large amount of data must be sent over limited bandwidth, so real-time transmission may not be possible. For this reason, five-channel audio data has sometimes been downmixed to two-channel (left and right) audio data before streaming.

However, if each of the five channels is simply split into a left component and a right component and downmixed to two-channel audio data, it is difficult to preserve the three-dimensional quality of the sound. Therefore, to preserve it even when downmixing five-channel audio data to two channels, binaural synthesis has been applied to each of the five channels.

The binaural effect refers to the fact that a person hearing a sound with one ear can judge only its loudness, whereas hearing it with both ears allows the localization of the sound, that is, the direction and distance of the source, to be perceived. It is the basic principle of stereo reproduction. Binaural synthesis therefore means passing a non-directional mono channel through a head-related transfer function (HRTF) filter that carries the directionality of a specific angle, converting it into a directional signal.
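Passing a mono channel through an HRTF filter can be sketched as convolving the channel with a pair of impulse responses, one per ear. The following is a minimal illustration, not the patent's implementation: the function names and the toy impulse-response values are assumptions made for the example, and real head-related impulse responses would come from measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_synthesis(mono, hrir_left, hrir_right):
    """Return (left_ear, right_ear) signals for one virtual source direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's left (made-up values):
# the left ear hears the sound earlier and louder than the right ear.
HRIR_LEFT = [1.0, 0.0, 0.0]
HRIR_RIGHT = [0.0, 0.0, 0.5]

left_ear, right_ear = binaural_synthesis([1.0, 2.0], HRIR_LEFT, HRIR_RIGHT)
```

With these toy filters the right-ear signal is a delayed, attenuated copy of the left-ear signal, which is the kind of interaural difference that gives a listener a sense of direction.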

In other words, by giving each of the five channels a direction and then downmixing to two channels for streaming, a streaming service could reduce the amount of data while preserving the three-dimensional quality of the sound.

However, the five sound source positions officially recommended for three-dimensional listening are the center of the listener (0 degrees from the listener) for the center channel, 30 degrees to the left and right for the left and right channels, and 120 degrees to the left and right for the surround channels, and most two-channel downmixing methods followed this recommendation unchanged. That is, although the head-related transfer function makes a wide range of virtual sound source positions possible, the five virtual source positions were fixed at the listener's center, 30 degrees left and right, and 120 degrees left and right, regardless of the user's individual preference.

Meanwhile, when the audio data downmixed to two channels is heard through headphones, the left channel enters the left ear and the right channel the right ear, so the cancellation effect caused by signal interference between the channels (hereinafter, crosstalk) does not occur. When two-channel audio data is heard through speakers, however, part of the left channel enters the right ear and part of the right channel enters the left ear, so crosstalk occurs. A separate process for removing the crosstalk is therefore needed.

In audio data streaming services over a network, however, crosstalk has generally either been ignored, or a crosstalk removal process has been added regardless of whether the sound output means is a speaker or headphones.

Moreover, the crosstalk removal process has been applied uniformly using a pre-assumed function (generally one for speakers at 30 degrees to the left and right of the user), regardless of where the speakers are actually placed relative to the user.

Accordingly, a technical object of the present invention is to provide, in an audio data streaming service, audio data with a three-dimensional effect in the direction the user wants.

Another technical object of the present invention is to have the decision on whether to perform crosstalk processing depend on the output means condition for the provided audio data.

Yet another technical object of the present invention is to have the crosstalk processing performed on the basis of the user's speaker placement.

To achieve these technical objects, an audio data providing system according to one aspect of the present invention is connected to a user terminal through a network and provides audio data to the user terminal, and includes: a condition setting unit in which audio data providing conditions, including a virtual sound source position condition designated by the user, are set; and an audio data processing unit that applies a predetermined transfer function to the input audio signal according to the virtual sound source position condition set in the condition setting unit, giving it a three-dimensional effect in the direction the user wants, and outputs it.

The condition setting unit may be configured so that the user can change the condition settings in real time.

The system may further include a crosstalk processing unit that removes the cancellation effect caused by mutual interference of the output audio signals, and the crosstalk processing unit may be configured to run according to the output means condition set in the condition setting unit.

In particular, the output means condition includes the type of output means, such as headphones or a speaker, and the crosstalk processing unit may be configured to perform the cancellation-effect removal when the output means condition is a speaker. The output means condition may further include the position of the speaker relative to the user, and the crosstalk processing unit may be configured to perform the cancellation-effect removal on the basis of the set speaker position.

An audio data providing method according to one aspect of the present invention is a method for a system that is connected to a user terminal through a network and provides audio data to the user terminal, and includes: a) setting audio data providing conditions including a virtual sound source position condition designated by the user; b) applying a predetermined transfer function to the input audio signal according to the set audio data providing conditions to give it a three-dimensional effect in the direction the user wants; and c) outputting the audio signal.

Step a) is characterized in that the settings can be changed by the user in real time. The method may further include: d) selecting the type of output means; and e) when the type of output means is a speaker, removing the cancellation effect caused by mutual interference of the audio signals.

When the type of output means is a speaker, the method may further include setting the position of the speaker relative to the user, so that the cancellation effect is removed on the basis of the set speaker position.

An audio data providing system according to another aspect of the present invention is connected to a user terminal through a network and provides audio data to the user terminal, and includes: a condition setting unit in which an output means condition, including the type of output means for the audio data designated by the user, is set; and a crosstalk processing unit that removes the cancellation effect caused by mutual interference of the output audio signals, wherein the crosstalk processing unit may be configured to run according to the output means condition set in the condition setting unit.

An audio data providing method according to another aspect of the present invention is a method for a system that is connected to a user terminal through a network and provides audio data to the user terminal, and may include: a) setting an output means condition including the type of output means for the audio data designated by the user; and b) when the type of output means is a speaker, removing the cancellation effect caused by mutual interference of the audio signals.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

As shown in FIG. 1, the audio data providing system according to an embodiment of the present invention is connected to a plurality of user terminals 100 through a network 200 (which includes networks of every form, such as the telephone network, the Internet, and wireless communication networks), and includes an interface 300 and a service providing server 400.

The user terminal 100 is a communication device that can access the system through the network 200. In general this means any of various communication devices such as wired telephones, wireless communication terminals, computers, and Internet-capable TVs; in the present invention it means in particular a communication device that can receive and output audio data.

The interface 300 includes, among other things, a database gateway (CGI) for exchanging information with a web server or other systems, and allows a plurality of user terminals 100 to connect through the network 200, in particular the wired or wireless Internet. It converts the various information received from the service providing server 400, which runs the audio data providing service, to conform to the communication standard and provides it to the plurality of user terminals 100, and it receives information sent from the user terminals 100 through the network 200 and passes it to the service providing server 400.

The service providing server 400 provides the audio data providing service to the user terminal 100, and includes an audio source providing unit 410 that supplies the sound source, an audio data processing unit, a crosstalk processing unit 450, a condition setting unit 460, and an output unit 470. The audio data processing unit includes a channel identification unit 420, an upmix processing unit 430, and a downmix processing unit 440.

The output unit 470 outputs the final result of the audio data processed by the audio data providing system so that it is delivered to the user terminal 100.

The crosstalk processing unit 450 removes the crosstalk that arises when, unlike with headphones, two-channel audio data is heard through speakers and part of the left channel enters the right ear while part of the right channel enters the left ear.

The audio data providing conditions designated by the user are set in the condition setting unit 460. These conditions include the virtual sound source positions, the type of output means, and the position of the speakers relative to the user. The condition settings may also be configured so that the user can change them in real time, which has the advantage that the system can respond immediately to the conditions the user wants.
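The conditions held by the condition setting unit can be pictured as a small settings record that the user may update at any time. The sketch below is only an illustration under assumed names: `ProvisionConditions`, its field names, and the sign convention (negative angles to the listener's left) are hypothetical and do not appear in the patent. The default angles follow the 0, 30, and 120 degree layout described in the text.

```python
from dataclasses import dataclass, field


def _default_angles():
    # Negative angles are to the listener's left (an assumed convention).
    return {"C": 0, "L": -30, "R": 30, "Ls": -120, "Rs": 120}


@dataclass
class ProvisionConditions:
    """Hypothetical container for the user-designated providing conditions."""
    virtual_source_angles: dict = field(default_factory=_default_angles)
    output_type: str = "headphone"   # "headphone" or "speaker"
    speaker_angle: float = 30.0      # speaker angle from the user, in degrees

    def update(self, **changes):
        """Apply a settings change, e.g. one made by the user during playback."""
        for key, value in changes.items():
            if not hasattr(self, key):
                raise AttributeError("unknown condition: " + key)
            setattr(self, key, value)


conditions = ProvisionConditions()
conditions.update(output_type="speaker", speaker_angle=45.0)
```

Keeping all conditions in one record means the processing stages can read the current values on each block of audio, which is one simple way to honor changes made in real time.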

Here, designating the virtual sound source positions means that the user specifies the direction from which each of the five channels of audio data should be heard. For example, the center, left and right, and surround left and right channels can be placed at 0 degrees, 30 degrees left and right, and 120 degrees left and right of the user to follow the standard recommendation, and many other settings are possible.

Designating the type of output means is, for example, setting whether the user will listen to the audio data received through the network 200 through speakers or through headphones (hereinafter, this includes every sound output device that delivers sound directly to the ear, whatever its actual name), and the system may be configured to allow other types of output means to be set as well. Whether the crosstalk processing unit 450 runs is determined by the type of output means that has been set.

The position of the speakers relative to the user is the angle at which each speaker sits as seen from the user, and the crosstalk removal function of the crosstalk processing unit 450 changes with this position.

The audio data processing unit gives a three-dimensional effect to the sound source received from the audio source providing unit 410 and downmixes it to two channels.

As shown in FIG. 2, the audio data processing unit includes the channel identification unit 420; the upmix processing unit 430, which, when the sound source has two or fewer channels, expands the number of channels, gives the sound a three-dimensional effect, and downmixes it to two channels; and the downmix processing unit 440, which, when the sound source has three or more channels, gives the sound a three-dimensional effect and downmixes it to two channels.

The channel identification unit 420 checks the number of channels of the sound source received from the audio source and, depending on that number, passes the sound source to the upmix processing unit 430 or the downmix processing unit 440.
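The routing rule applied by the channel identification unit can be written in a few lines. This is a sketch with a hypothetical function name; the only facts taken from the text are the thresholds (two or fewer channels go to the upmix path, three or more to the downmix path).

```python
def route_by_channel_count(num_channels):
    """Return which processing path handles a source with this many channels."""
    if num_channels < 1:
        raise ValueError("a sound source needs at least one channel")
    # Two or fewer channels are expanded first; three or more are downmixed directly.
    return "upmix" if num_channels <= 2 else "downmix"


mono_path = route_by_channel_count(1)
stereo_path = route_by_channel_count(2)
surround_path = route_by_channel_count(6)  # a 5.1 source decodes to six channels
```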

The upmix processing unit 430 processes the audio data when the sound source has two or fewer channels, and includes a decoder, a channel expansion unit 433, a binaural synthesis unit, a downmixing unit, and an encoder.

The decoder separates the received sound source into left and right channels.

The channel expansion unit 433 expands the two-channel audio data into four-channel audio data. It passes the left and right channels received from the decoder through unchanged and, at the same time, generates the surround left and right signals by multiplying the left and right channel signals by appropriate gain values and adding the products, as follows.

Left Surround = (gain1 * left) + (gain2 * right)

Right Surround = (gain2 * left) + (gain1 * right)

That is, the channel expansion unit 433 expands the two-channel audio data (left and right) into four-channel audio data (left, right, surround left, and surround right).
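The channel expansion step can be sketched directly from the two gain formulas. The text only says the gains are "appropriate", so the default values of `gain1` and `gain2` below are placeholders chosen for illustration, and the function and key names are hypothetical.

```python
def expand_channels(left, right, gain1=0.7, gain2=0.3):
    """Expand stereo samples into L, R, Ls, Rs using the two gain formulas.

    gain1 and gain2 are illustrative placeholders, not values from the patent.
    """
    left_surround = [gain1 * l + gain2 * r for l, r in zip(left, right)]
    right_surround = [gain2 * l + gain1 * r for l, r in zip(left, right)]
    return {
        "L": list(left),    # bypassed unchanged
        "R": list(right),   # bypassed unchanged
        "Ls": left_surround,
        "Rs": right_surround,
    }


four_channel = expand_channels([1.0, 0.0], [0.0, 2.0], gain1=1.0, gain2=0.5)
```

Because the two surround signals swap the gains, a sound that is mostly in the left input ends up mostly in the left surround, which keeps the left and right sides of the expanded image distinct.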

Next, the binaural synthesis unit gives each channel a direction through binaural synthesis based on the head-related transfer function, separating the four-channel audio data into eight signal components. The head-related transfer functions used are those corresponding to the virtual sound source position condition set in the condition setting unit 460; in general, those for ±60 degrees are used for the left and right signals and those for ±120 degrees for the surround left and right signals.

The downmixing unit gathers the eight signal components into left components and right components and adds each group at an appropriate level to make two channels, and the two-channel audio data is then encoded by the encoder.

The downmix processing unit 440 processes the audio data when the sound source has three or more channels, and includes a decoder, a binaural synthesis unit, a downmixing unit, and an encoder. The description below assumes a sound source with 5.1 channels.

The decoder separates the received sound source into six channels: center, left, right, surround left, surround right, and low-frequency effects.

Next, the binaural synthesis unit gives each channel other than the low-frequency effects channel a direction through binaural synthesis based on the head-related transfer function, separating them into ten signal components. The low-frequency effects channel is simply split into left and right signal components without regard to direction.

The head-related transfer functions used are those corresponding to the virtual sound source position condition set in the condition setting unit 460; in general, those for 0 degrees are used for the center signal, ±30 degrees for the left and right signals, and ±120 degrees for the surround left and right signals.

The downmixing unit gathers the ten signal components and the left and right components of the low-frequency effects channel into left components and right components and adds each group at an appropriate level to make two channels. The two-channel audio data is then encoded by the encoder.
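The 5.1 downmix path described above (five directional channels filtered into ten ear components, the low-frequency effects channel split without filtering, then everything summed per ear) can be sketched as follows. This is an illustration under assumptions: the function names, the channel keys, the gain parameters, and the unit-impulse filters in the example are all hypothetical, and real filters would be the head-related impulse responses for the configured virtual source angles.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mix(components, gain=1.0):
    """Sum components sample by sample, scaling each by the same gain."""
    out = [0.0] * max(len(c) for c in components)
    for component in components:
        for i, value in enumerate(component):
            out[i] += gain * value
    return out


def downmix_5_1(channels, hrirs, lfe_gain=0.5, mix_gain=1.0):
    """Binaural two-channel downmix of a decoded 5.1 source.

    channels: dict with keys C, L, R, Ls, Rs, LFE (lists of samples).
    hrirs: dict mapping C, L, R, Ls, Rs to (left_ear_ir, right_ear_ir).
    """
    left_parts, right_parts = [], []
    for name in ("C", "L", "R", "Ls", "Rs"):
        ir_left, ir_right = hrirs[name]
        # Each directional channel yields two components: ten in total.
        left_parts.append(convolve(channels[name], ir_left))
        right_parts.append(convolve(channels[name], ir_right))
    # The LFE channel has no direction, so it is split equally to both ears.
    lfe = [lfe_gain * v for v in channels["LFE"]]
    left_parts.append(lfe)
    right_parts.append(lfe)
    return mix(left_parts, mix_gain), mix(right_parts, mix_gain)


# Toy example with unit-impulse "filters" (made-up, not measured HRIRs).
unit, silent = [1.0], [0.0]
toy_hrirs = {"C": (unit, unit), "L": (unit, silent), "R": (silent, unit),
             "Ls": (unit, silent), "Rs": (silent, unit)}
toy_channels = {name: [1.0] for name in ("C", "L", "R", "Ls", "Rs")}
toy_channels["LFE"] = [2.0]
left_out, right_out = downmix_5_1(toy_channels, toy_hrirs)
```

Changing the virtual source position condition would amount to swapping the entries of `hrirs` for filters measured at the new angles; the rest of the pipeline stays the same.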

When the output means is headphones, the audio data downmixed to two channels by the audio data processing unit is output as is, divided into left and right channels. When the output means is a speaker, it goes through the cancellation-effect removal process performed by the crosstalk processing unit 450.

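As shown in FIG. 3, let the left and right sound signals finally heard at the listener's ears be

$$Y = \begin{pmatrix} y_L \\ y_R \end{pmatrix}$$

and the left and right input signals be

$$X = \begin{pmatrix} x_L \\ x_R \end{pmatrix}.$$

The relationship between these variables and the head-related transfer function (HRTF, written $H$ in the equations) is then as follows:

$$Y = HX, \quad \text{i.e.,} \quad \begin{pmatrix} y_L \\ y_R \end{pmatrix} = \begin{pmatrix} H_{LL} & H_{RL} \\ H_{LR} & H_{RR} \end{pmatrix} \begin{pmatrix} x_L \\ x_R \end{pmatrix},$$

where $H_{ab}$ denotes the transfer function from input channel $a$ to ear $b$.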

Here, crosstalk removal means making the sound signal that reaches the user's ears through speakers nearly the same as the signal that reaches them through headphones. Because sound output through headphones is delivered directly to the user's ears, the headphone output sound and the sound signal delivered to the user's ears through the headphones are nearly identical.

In short, crosstalk removal comes down to making the sound signal delivered to the user's ears through the speakers equal to the headphone output sound signal.

Therefore, if the sound signal output from the headphones is written $B = (b_L, b_R)^T$, it suffices to find the input signal $X$ that makes the sound signal $Y$ delivered to the user's ears equal to $B$. That is, when the signal input to the speakers is

$$X = H^{-1}B,$$

we have

$$Y = HX = HH^{-1}B = B,$$

so the same signal as the headphone output sound signal is delivered to the user's ears, and the signal cancellation effect is removed.
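Solving the relation Y = HX for the speaker input that makes Y equal to the headphone signal B gives X = H⁻¹B, provided H is invertible. The sketch below checks this numerically for a single frequency, treating H as a 2×2 matrix of gains. The matrix values and function names are made up for illustration; a practical canceller would apply such an inverse per frequency bin (or as time-domain filters) and would usually need some regularization where H is close to singular.

```python
def invert_2x2(h):
    """Invert a 2x2 matrix given as ((a, b), (c, d)); entries may be complex."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    return ((d / det, -b / det), (-c / det, a / det))


def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def crosstalk_cancel(h, b):
    """Speaker input X = H^-1 B that reproduces headphone signal B at the ears."""
    return matvec(invert_2x2(h), b)


# Toy acoustic paths: each speaker reaches the far ear at half amplitude.
H = ((1.0, 0.5), (0.5, 1.0))
B = (1.0, 0.0)            # desired: sound in the left ear only
X = crosstalk_cancel(H, B)
Y = matvec(H, X)          # what actually arrives at the ears
```

In this toy case the right speaker emits an inverted, scaled copy of the left signal so that the leakage into the right ear cancels, and `Y` matches `B` up to floating-point rounding.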

In summary, when the user's sound output means is headphones, the two-channel audio data processed by the audio data processing unit is divided into left and right channels and output as is. When the user's sound output means is a speaker, the two-channel audio data processed by the audio data processing unit is divided into left and right channels, and each channel signal is then convolved with $H^{-1}$ to remove the crosstalk.

The head-related transfer function used here is likewise determined by the speaker position set in the condition setting unit 460; the head-related transfer function for a position 30 degrees from the user is commonly used.

Hereinafter, an audio providing method of the audio providing system according to an embodiment of the present invention will be described in detail with reference to FIG. 4.

사용자가 특정 가상음원위치조건을 지정하는 경우, 상기 조건설정부(460)에는 사용자 지정 가상음원위치조건이 설정된다(S10). 또한 사용자는 오디오 데이타 출력수단이 스피커인지 혹은 헤드폰인지 지정하고, 이 조건 역시 상기 조건설정부(460)에 설정된다(S20). When the user designates a specific virtual sound source position condition, the condition setting unit 460 sets a user specified virtual sound source position condition (S10). Also, the user specifies whether the audio data output means is a speaker or a headphone, and this condition is also set in the condition setting unit 460 (S20).

설정된 출력수단 종류가 헤드폰인 경우, 상기 오디오 소스제공부로부터 전달 받은 음원은 상기 오디오 데이타처리부에 2채널 다운믹스 처리된 후(S70), 헤드폰을 통해 사용자의 귀에 전달된다(S80).When the set type of output means is a headphone, the sound source received from the audio source provider is 2-channel downmixed to the audio data processor (S70) and then delivered to the user's ear through the headphone (S80).

반면, 출력수단 종류가 스피커인 경우, 사용자로부터 스피커의 위치가 상기 조건설정부(460)에 추가 설정된다(S30). 이 때 상기 가상음원위치조건, 상기 스피커 위치 조건등은 사용자에 의해 실시간 설정 변경가능하다.On the other hand, if the type of output means is a speaker, the position of the speaker from the user is additionally set in the condition setting unit 460 (S30). At this time, the virtual sound source position condition, the speaker position condition, etc. can be changed by the user in real time setting.

다음 출력수단이 헤드폰인 경우와 마찬가지로, 상기 오디오 소스제공부로부터 전달받은 음원은 상기 오디오 데이타처리부에 의해 2채널 다운믹스 처리되고(S40), 단, 크로스토크제거과정을 추가적으로 갖는다(S50). Similarly to the case where the next output means is a headphone, the sound source received from the audio source providing unit is two-channel downmixed by the audio data processing unit (S40), but has an additional crosstalk removing process (S50).

Finally, the sound signal that has undergone the crosstalk removal process is output through the speaker (S60).
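The branching of FIG. 4 can be summarized in a minimal Python sketch. The function name `playback_steps` and the string labels `"headphone"` and `"speaker"` are hypothetical and chosen only for illustration; the step labels themselves are those of the description.

```python
def playback_steps(output_type):
    """Return the FIG. 4 step labels, in order, for the chosen output means."""
    # S10: set virtual sound source position; S20: set output means type.
    steps = ["S10", "S20"]
    if output_type == "headphone":
        # S70: two-channel downmix; S80: output through the headphones.
        steps += ["S70", "S80"]
    elif output_type == "speaker":
        # S30: set speaker position; S40: two-channel downmix;
        # S50: crosstalk removal; S60: output through the speaker.
        steps += ["S30", "S40", "S50", "S60"]
    else:
        raise ValueError(f"unknown output type: {output_type!r}")
    return steps


print(playback_steps("headphone"))
print(playback_steps("speaker"))
```

The only difference between the two paths is the extra speaker-position setting and the crosstalk removal step on the speaker path.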

As shown in FIG. 5, the two-channel downmix of the audio data by the audio data processing unit is performed in the same way regardless of the type of output means that has been set.

That is, in step S100, when a sound signal is input from the audio data providing unit, the number of channels of the input audio signal is checked. If the number of channels of the input audio signal is 3 or more, the signal is decoded into multiple channels (S110); this figure is described using the example of a 5.1-channel input audio signal decoded into 6 channels.

That is, when the input audio signal is decoded into center, left/right, surround left/right, and low-frequency effect channels (S110), directionality is imparted, using head transfer functions, to the five channels other than the non-directional low-frequency effect channel, and these channels are separated into 10 signal components. The head transfer functions used here are those corresponding to the virtual sound source position condition designated by the user; typically, head transfer functions corresponding to 0 degrees for the center signal, ±30 degrees for the left/right signals, and ±120 degrees for the surround left/right signals are used (S111 to S117, S120 to S125, S130 to S135, S140 to S145, S150 to S155, S160 to S165).

The low-frequency effect channel is simply split into left and right signal components without regard to directionality (S160 to S165).

Next, the 10 signal components and the left/right signal components of the low-frequency effect channel are summed at appropriate levels, left components with left and right components with right, to form two channels (S200, S210). The two-channel audio data is then encoded (S220, S230).
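The multichannel path of FIG. 5 can be sketched in Python as follows. This is a minimal illustration, assuming each head transfer function is available as a pair of time-domain impulse responses (one per ear). The names `convolve`, `binaural_downmix`, the `hrirs` dictionary, and the `lfe_gain` value are hypothetical; the description only says the components are added "at appropriate levels".

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binaural_downmix(channels, hrirs, lfe=None, lfe_gain=0.5):
    """Mix directional channels (and an optional LFE channel) down to two.

    channels: dict mapping a channel name to its samples.
    hrirs: dict mapping the same names to (left_ear_ir, right_ear_ir).
    lfe_gain: assumed level for the non-directional LFE split.
    """
    length = max(
        len(samples) + max(len(hrirs[name][0]), len(hrirs[name][1])) - 1
        for name, samples in channels.items()
    )
    if lfe is not None:
        length = max(length, len(lfe))
    left = [0.0] * length
    right = [0.0] * length
    # Each directional channel yields two components (left ear, right ear),
    # so five channels give the 10 components mentioned in the text.
    for name, samples in channels.items():
        left_ir, right_ir = hrirs[name]
        for i, v in enumerate(convolve(samples, left_ir)):
            left[i] += v
        for i, v in enumerate(convolve(samples, right_ir)):
            right[i] += v
    # The LFE channel is split equally, with no directional filtering.
    if lfe is not None:
        for i, v in enumerate(lfe):
            left[i] += lfe_gain * v
            right[i] += lfe_gain * v
    return left, right
```

In practice the impulse responses would be selected from the set virtual sound source positions (for example 0, ±30, and ±120 degrees), and a faster FFT-based convolution would normally replace the direct form used here.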

Meanwhile, the case in which the number of channels of the input audio signal checked in step S100 is 2 or less (S300) is as follows; the description below assumes a two-channel input audio signal.

As shown in FIG. 6, the input audio signal is decoded into two left/right channels (S310), and the two-channel audio data is expanded into four-channel audio data. That is, the decoded left/right channels are passed through unchanged while, in parallel, appropriate gain values are multiplied by the left and right channel signals and the products are summed to generate surround left/right signals (S311 to S313, S320 to S321, S330 to S335, S340 to S345).
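A minimal Python sketch of this two-to-four channel expansion is given below. The function name `upmix_stereo_to_quad` and the gain values `g_same` and `g_cross` are hypothetical; the description states only that "appropriate" gains are applied and the products summed.

```python
def upmix_stereo_to_quad(left, right, g_same=0.5, g_cross=0.25):
    """Expand two channels to four: L and R bypassed, SL and SR synthesized.

    g_same weights the same-side channel and g_cross the opposite-side
    channel; both values are assumptions, not taken from the patent.
    """
    surround_left = [g_same * l + g_cross * r for l, r in zip(left, right)]
    surround_right = [g_cross * l + g_same * r for l, r in zip(left, right)]
    return {
        "L": list(left),   # bypassed unchanged
        "R": list(right),  # bypassed unchanged
        "SL": surround_left,
        "SR": surround_right,
    }


quad = upmix_stereo_to_quad([1.0, 0.0], [0.0, 1.0])
print(quad["SL"], quad["SR"])
```

The four resulting channels can then be given directionality with head transfer functions and summed back to two channels, as in the multichannel case.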

Next, directionality based on head transfer functions is imparted to each channel of the four-channel audio data, separating the four-channel audio data into eight signal components (S315 to S319, S323 to S327, S337 to S339, S347 to S349). The head transfer functions used here are those corresponding to the virtual sound source position condition designated by the user; typically, head transfer functions corresponding to ±60 degrees for the left/right signals and ±120 degrees for the surround left/right signals are used.

Then, in steps S350 and S351, the eight signal components are summed, left components with left and right components with right, to form two channels, which are then encoded (S353 to S357).

As shown in FIG. 7, the two-channel audio data generated and encoded through the above process is decoded into two left/right channels (S400 to S420).

If the user's sound output means is headphones (S430), the left/right two-channel audio data is output as is through the output unit 470 (S440 to S443). If the user's sound output means is a speaker (S430), the left/right two-channel audio data undergoes a crosstalk removal process (S450 to S467).

That is, crosstalk is removed by convolving each channel signal with the inverse function of the head transfer function. The head transfer function used here is determined by the speaker position set by the user; the head transfer function corresponding to a position 30 degrees from the user is commonly used.
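One common way to realize this kind of inverse filtering is to invert, at each frequency, the 2x2 matrix of speaker-to-ear transfer functions. The Python sketch below shows this for a single frequency bin. It assumes a symmetric speaker pair, so that one same-side response `h_ipsi` and one opposite-side response `h_contra` describe the whole setup; the function name and this matrix formulation are illustrative assumptions, not details given in the description.

```python
def crosstalk_cancel_bin(left, right, h_ipsi, h_contra):
    """Pre-filter one frequency bin so each ear receives only its own channel.

    The acoustic path is modeled as the matrix [[h_ipsi, h_contra],
    [h_contra, h_ipsi]]; this function applies its inverse. Inputs may be
    real or complex numbers.
    """
    det = h_ipsi * h_ipsi - h_contra * h_contra
    if abs(det) < 1e-12:
        raise ZeroDivisionError("transfer matrix is not invertible at this bin")
    out_left = (h_ipsi * left - h_contra * right) / det
    out_right = (h_ipsi * right - h_contra * left) / det
    return out_left, out_right


# Check: after the acoustic path, the left ear gets the left signal only.
spk_l, spk_r = crosstalk_cancel_bin(1.0, 0.0, h_ipsi=1.0, h_contra=0.5)
ear_left = 1.0 * spk_l + 0.5 * spk_r
ear_right = 0.5 * spk_l + 1.0 * spk_r
print(round(ear_left, 9), round(abs(ear_right), 9))
```

In a full implementation the responses for the set speaker position (for example 30 degrees) would be evaluated over all frequency bins, and near-singular bins would typically be regularized rather than rejected.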

The sound signal that has undergone the crosstalk removal process is then delivered to the user's speaker through the output unit 470 (S469).

The foregoing is merely an embodiment of the present invention; the scope of the present invention is not limited thereto, and various modifications obvious to those skilled in the art are of course possible.

As described above, according to the present invention, a three-dimensional effect is imparted to the input audio signal using a predetermined transfer function based on the virtual sound source position condition designated by the user, so the user can enjoy sound with a three-dimensional effect in the desired direction.

In addition, whether to execute the crosstalk removal function can be determined according to the type of output means. Unnecessary execution is thus avoided when the function is not needed, as when listening to music through headphones, while the function can be executed when it is needed, as when listening to music through speakers.

Moreover, when the output means is a speaker, the crosstalk removal function is executed based on the relative positions of the pair of speakers that the user has set and is using, so crosstalk removal can be performed more effectively.

Claims

An audio data providing system connected to a user terminal through a network and providing audio data to the user terminal,

A condition setting unit for setting audio data providing conditions including a user specified virtual sound source position condition, an output means type, and a speaker position;

An audio data processing unit that, when the input audio signal has 2 channels or fewer, increases the number of channels to 4 or more and outputs audio data obtained by imparting to the input audio signal a three-dimensional effect in the direction desired by the user, using a predetermined head transfer function based on the virtual sound source position condition set in the condition setting unit, and that, when the input audio signal has 3 or more channels, outputs audio data obtained by imparting to the input audio signal a three-dimensional effect in the desired direction, using a predetermined head transfer function based on the virtual sound source position condition set in the condition setting unit; And

A crosstalk processing unit that, if the type of output means set in the condition setting unit is a speaker, removes the offset effect due to mutual interference by convolving each channel signal of the audio data output from the audio data processing unit with the inverse function of the head transfer function, based on the speaker position set in the condition setting unit,

Wherein the audio data processing unit includes: a channel checking unit that checks the number of channels of the input audio signal; an upmix processing unit that, when the number of channels identified by the channel checking unit is 2 or less, increases the number of channels to 4 or more, imparts a three-dimensional effect to the input audio signal, and outputs the result as a stereo channel; and a downmix processing unit that, when the number of channels identified by the channel checking unit is 3 or more, imparts a three-dimensional effect to the input audio signal and outputs the result as a stereo channel,

The downmix processing unit includes: a decoder that decodes the input audio signal; a pair effect synthesizing unit that imparts a three-dimensional effect to the decoded audio signal using the predetermined transfer function; a downmixing unit that reduces the audio signal to which the three-dimensional effect has been imparted to two channels; and an encoder that encodes the audio signal reduced to two channels,

The upmix processing unit includes: a decoder that decodes the input audio signal; a channel expansion unit that expands the number of channels of the decoded audio signal; a pair effect synthesizing unit that imparts a three-dimensional effect to each channel of the audio signal using a head transfer function; a downmixing unit that reduces the audio signal to which the three-dimensional effect has been imparted to two channels; and an encoder that encodes the audio signal reduced to two channels.

delete

An audio data providing method of a system that is connected to a user terminal through a network and provides audio data to the user terminal, the method comprising:

a) setting audio data providing conditions including a user specified virtual sound source position condition, an output means type and a speaker position;

b) obtaining audio data by, when the input audio signal has 2 channels or fewer, expanding the number of channels to 4 or more and imparting to the input audio signal a three-dimensional effect in the direction desired by the user using a head transfer function according to the set audio data providing conditions, or, when the input audio signal has 3 or more channels, imparting to the input audio signal a three-dimensional effect in the direction desired by the user using a head transfer function according to the set audio data providing conditions; And

c) a crosstalk removing step of, if the output means type set in step a) is a speaker, convolving each channel signal of the audio data with the inverse function of the head transfer function to remove the offset effect due to mutual interference, the offset effect being removed based on the speaker position set in step a);

Wherein step b) includes: b-1) checking the number of channels of the input audio signal; b-2) when the number of channels is 2 or less, expanding the number of audio signal channels to 4 or more, imparting to each channel a three-dimensional effect in the desired direction using a head transfer function, and then outputting the result reduced to two channels; And b-3) when the number of channels is 3 or more, imparting to each channel a three-dimensional effect in the desired direction using a head transfer function, and then outputting the result reduced to two channels,

Step b-2) includes: b-2-1) decoding the audio signal when the number of audio signal channels is 2 or less; b-2-2) expanding the number of channels of the decoded audio signal; b-2-3) imparting a three-dimensional effect to each of the audio signal channels using a head transfer function; b-2-4) reducing the audio signal to which the three-dimensional effect has been imparted to two channels; And b-2-5) encoding the audio signal reduced to two channels into a stereo channel,

Step b-3) includes: b-3-1) decoding the input audio signal when the number of audio channels is 3 or more; b-3-2) imparting a three-dimensional effect to the decoded audio signal using a head transfer function; b-3-3) reducing the audio signal to which the three-dimensional effect has been imparted to two channels; And b-3-4) encoding the audio signal reduced to two channels.

delete