KR100998913B1

KR100998913B1 - A method and an apparatus for processing an audio signal

Info

Publication number: KR100998913B1
Application number: KR1020090005507A
Authority: KR
Inventors: 오현오; 정양원
Original assignee: 엘지전자 주식회사
Priority date: 2008-01-23
Filing date: 2009-01-22
Publication date: 2010-12-08
Also published as: CN101926094B; AU2009206856B2; RU2450440C1; CN101926181A; KR20090081341A; DE602009000167D1; CN101926094A; CA2712941A1; JP5319704B2; ATE481829T1; AU2009206856A1; CA2712941C; KR20090081342A; ATE481830T1; MX2010007997A; JP5249354B2; DE602009000166D1; JP2011510589A; RU2010134915A; CN101926181B

Abstract

An apparatus for processing an audio signal and a method thereof are disclosed. The method includes receiving the audio signal and preset information; obtaining a preset matrix from the preset information, wherein the preset matrix indicates the degree to which an object contributes to each output channel; and adjusting the output level of the object by using the preset matrix. Accordingly, without the user having to configure each object, selecting one item of previously set preset metadata allows the levels of the objects included in the audio signal to be adjusted easily, using the preset rendering data that corresponds to the selected preset metadata.

Description

A method of processing an audio signal and a device therefor {A METHOD AND AN APPARATUS FOR PROCESSING AN AUDIO SIGNAL}

The present invention relates to a method and apparatus for processing an audio signal and, more particularly, to a method and apparatus capable of processing an audio signal received via a digital medium, a broadcast signal, or the like.

When an audio signal including a plurality of objects is downmixed into a mono or stereo signal to generate a downmix signal, parameters are extracted from the objects. These parameters are used when decoding the downmix signal, and the panning and gain of the objects can be controlled by the user's selection.

The objects included in the downmix signal should be adjusted appropriately according to the user's selection. However, when the user controls the objects, having to control each object directly is cumbersome, and, compared with control by an expert, it may be difficult to reproduce an audio signal including a plurality of objects in the optimal state for a given environment.

The present invention has been devised to solve the above problem, and its object is to provide an audio signal processing method and apparatus that can adjust the objects included in an audio signal by using preset information including preset metadata and preset rendering parameters.

Another object of the present invention is to provide an audio signal processing method and apparatus that, when the preset data type is a matrix, adjusts the level of an object in the output channels by determining the preset matrix corresponding to the preset metadata based on output channel information of the audio signal and applying that matrix to the audio signal.

Still another object of the present invention is to provide an audio signal processing method and apparatus that generates the preset rendering matrix for adjusting the objects step by step from a mono-type preset matrix or gain information transmitted from the encoder.

The present invention provides the following effects and advantages.

First, the output-channel level of each object can be adjusted easily by selecting one item of previously set preset information, without the user having to configure the objects.

Second, unnecessary coding can be reduced by expressing the preset metadata, which describes the preset information, as text based on preset length information indicating the length of the metadata.

Third, when the type of the preset rendering data is a matrix, the output-channel level of each object can be adjusted more accurately and efficiently by determining the preset matrix that represents the preset rendering data based on the output channel information of the audio signal.

Fourth, generating the preset matrix step by step can reduce the bit rate transmitted from the encoder.

Fifth, unnecessary coding can be reduced by using a preset matrix that adjusts only some of the plurality of objects.

To achieve the above objects, an audio signal processing method according to the present invention includes: receiving preset information and an audio signal including at least one object; obtaining a preset matrix from the preset information, the preset matrix indicating the degree to which the object is included in each output channel; adjusting the output level of the object per output channel by using the preset matrix; and outputting an audio signal including the object whose output level has been adjusted, wherein the preset information is obtained based on preset presence information indicating whether the preset information is included and preset number information indicating the number of items of preset information, and the preset matrix is obtained based on preset type information indicating whether the preset information is expressed as a matrix.

According to the present invention, the preset matrix is obtained based on output channel information indicating that the output channel is one of mono, stereo, and multichannel.

According to the present invention, the preset type information is represented by 1 bit.

According to the present invention, the dimensions of the preset matrix are determined based on the number of objects and the number of output channels.

An audio signal processing apparatus according to another aspect of the present invention includes: an audio signal receiver that receives an audio signal including at least one object; a preset metadata receiver that obtains preset metadata of preset information; a preset rendering data receiver that obtains a preset matrix indicating the degree to which the object is included in each output channel, the preset matrix corresponding to the preset metadata; a display unit that displays the preset metadata; an input unit that receives a signal selecting one item of the preset metadata; an object controller that adjusts the output level of the object per output channel by using the preset matrix corresponding to the selected preset metadata; and an output unit that outputs an audio signal including the object whose output level has been adjusted.

According to the present invention, when the output unit outputs the audio signal, the display unit displays the selected preset metadata.

According to the present invention, the display unit further displays the output level of the object.

According to the present invention, the preset information is obtained based on preset number information indicating the number of items of preset information, and the preset matrix is obtained based on preset type information indicating whether the preset information is expressed as a matrix.

According to the present invention, the preset information further includes preset object application information indicating whether a preset matrix applied to the object exists.

According to the present invention, the display unit further displays, based on the preset object application information, whether a preset matrix applied to the object exists.

According to the present invention, the display unit presents the preset metadata as text.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Before that, the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings; rather, on the principle that an inventor may appropriately define terms in order to describe the invention in the best way, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiment of the present invention and do not represent its entire technical idea, so it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing.

In particular, in this specification, "information" is a term covering values, parameters, coefficients, elements, and the like, and "object" denotes each source signal that makes up an audio signal, such as a guitar, a vocal, or a piano. These meanings may be interpreted differently in some cases, and the present invention is not limited thereto.

The present invention provides a method of decoding an audio signal including a plurality of objects, in which the audio signal is decoded effectively by using one item of predetermined information for adjusting the objects.

FIG. 1 is a conceptual diagram of preset information applied to objects included in an audio signal according to an embodiment of the present invention. The predetermined information for adjusting the objects is referred to in this specification as preset information. The preset information may represent various modes that can be selected according to the characteristics of the audio signal or the listening environment, and there may be a plurality of items of preset information. The preset information includes metadata for expressing attributes of the preset information and rendering data applied to adjust the objects. The metadata may be displayed as text and may not only indicate an attribute of the preset information (for example, concert hall mode, karaoke mode, or news mode) but also include related descriptive information such as the author of the preset information, its creation date, and the names of the objects to which it applies. The rendering data is the data actually applied to the objects; it may take various forms and, in particular, may be a matrix.

Referring to FIG. 1, preset information 1 (preset 1) may be a concert hall mode that gives the impression of listening to a music signal in a concert hall, preset information 2 (preset 2) may be a karaoke mode in which the level of the vocal object in the audio signal is reduced, and preset information n (preset n) may be a news mode in which the level of the speech object is raised. Preset information 2 includes metadata 2 and rendering data 2. If the user selects preset information 2, metadata 2, the karaoke mode, is shown on the display unit, and rendering data 2, which is associated with metadata 2, is applied to the objects to adjust their levels.

If the rendering data is in matrix form, it may include a mono matrix, a stereo matrix, and a multichannel matrix. The mono matrix is the rendering data applied when the output channel of the objects is mono, the stereo matrix when it is stereo, and the multichannel matrix when it is multichannel. Once the output channel of the objects is determined, the corresponding matrix is selected and applied to the objects to adjust their levels.
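The selection of a matrix by output channel can be sketched as follows. This is a minimal illustration only: the dictionary layout, the `select_matrix` helper, the configuration names, and the example gains are all assumptions and do not come from the specification.

```python
from typing import Dict, List

Matrix = List[List[float]]  # one row per object, one column per output channel

# Assumed channel counts; 5.1 playback is treated as six channels.
CHANNEL_COUNT = {"mono": 1, "stereo": 2, "multichannel": 6}


def select_matrix(rendering_data: Dict[str, Matrix], output_config: str) -> Matrix:
    """Return the rendering matrix that matches the playback configuration."""
    if output_config not in CHANNEL_COUNT:
        raise ValueError(f"unknown output configuration: {output_config}")
    matrix = rendering_data[output_config]
    expected = CHANNEL_COUNT[output_config]
    if any(len(row) != expected for row in matrix):
        raise ValueError("matrix width does not match the output channel count")
    return matrix


# Hypothetical karaoke preset for two objects (row 0: vocal, row 1: guitar).
karaoke_rendering_data = {
    "mono": [[0.0], [1.0]],
    "stereo": [[0.0, 0.0], [0.7, 0.7]],
}
stereo_matrix = select_matrix(karaoke_rendering_data, "stereo")
```

Keeping one matrix per configuration means the decoder only has to know the playback setup to pick the right rendering data.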

In this way, by adjusting the objects using the metadata and rendering data included in the preset information and by presenting the attributes or characteristics of the applied preset information, an audio signal with the effect desired by the user can be provided efficiently.

FIG. 2 shows an audio signal processing apparatus 200 according to an embodiment of the present invention. Referring to FIG. 2, the audio signal processing apparatus 200 may include a preset information generator 210, a preset information receiver 220, and an object controller 230.

The preset information generator 210 generates preset information for adjusting the objects included in the audio signal, and may include a metadata generator 212 and a preset rendering data generator 214. The metadata generator 212 may receive text information describing the preset information and generate preset metadata. As described above, the preset metadata may be information expressing a characteristic or attribute of the preset information. The metadata generator 212 may further generate preset length information indicating the character length of the preset metadata. The preset length information may be expressed in bytes, but is not limited thereto.

Meanwhile, when the gain for adjusting the level of an object and information for panning the object are input to the preset rendering data generator 214, it may generate the preset rendering data applied to that object. The preset rendering data may be generated for each object and may be implemented in various types; for example, it may be a preset matrix. The preset rendering data generator 214 may further generate preset type information (preset_type_flag) indicating whether the preset rendering data is implemented as a matrix. It may also generate output channel information indicating how many output channels the objects have. The preset length information and preset metadata generated by the metadata generator 212, and the preset type information, output channel information, and preset rendering data generated by the preset rendering data generator 214, may be transmitted in a single bitstream and, more specifically, in an ancillary region of the bitstream that contains the audio signal.
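The specification does not state how gain and panning information are turned into matrix entries. As one possible illustration, the sketch below builds a stereo preset matrix row per object using a constant-power pan law; the pan law, the `stereo_row` helper, and the pan range of -1.0 to 1.0 are assumptions introduced here.

```python
import math
from typing import List


def stereo_row(gain: float, pan: float) -> List[float]:
    """Build one object's row of a stereo preset matrix.

    pan runs from -1.0 (full left) to 1.0 (full right). The constant-power
    law used here is an assumption, not something the source specifies.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie in [-1.0, 1.0]")
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return [gain * math.cos(angle), gain * math.sin(angle)]


# Hypothetical two-object preset: vocal centred, guitar panned fully left.
stereo_preset_matrix = [stereo_row(1.0, 0.0), stereo_row(0.5, -1.0)]
```

Each row holds the left and right gains of one object, so the result has the (number of objects) * 2 shape expected of a stereo matrix.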

The preset information generator 210 may further generate preset presence information (preset exist information) indicating that the preset length information, the preset metadata, the preset type information, the output channel information, and the preset rendering data are included in the bitstream. The preset presence information may be of a container type, indicating which region contains the information related to the preset information, or of a flag type, but is not limited thereto.

The preset information generator 210 may also generate a plurality of items of preset information, each including the preset length information, the preset metadata, the preset type information, the output channel information, and the preset rendering data. In that case, the preset information generator 210 may further generate preset number information indicating the number of items of preset information.
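The fields listed above can be pictured as a small data model. The sketch below is only an illustration: the class names `PresetInfo` and `PresetSet`, the field names, and the choice to count the metadata length in UTF-8 bytes are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PresetInfo:
    metadata: str                      # preset metadata, e.g. "Karaoke mode"
    is_matrix: bool                    # preset type information
    num_output_channels: int           # output channel information
    rendering_data: List[List[float]]  # preset matrix: objects x channels

    @property
    def metadata_length(self) -> int:
        """Preset length information, counted here in UTF-8 bytes."""
        return len(self.metadata.encode("utf-8"))


@dataclass
class PresetSet:
    presets: List[PresetInfo] = field(default_factory=list)

    @property
    def preset_exists(self) -> bool:
        """Preset presence information."""
        return len(self.presets) > 0

    @property
    def num_presets(self) -> int:
        """Preset number information."""
        return len(self.presets)


karaoke = PresetInfo("Karaoke mode", True, 2, [[0.0, 0.0], [0.7, 0.7]])
preset_set = PresetSet([karaoke])
```

Deriving the length, presence, and count from the stored presets keeps those signalling fields consistent with the payload they describe.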

The preset information receiver 220 receives the preset information generated and transmitted by the preset information generator 210, and may include a metadata receiver 222 and a preset rendering data receiver 224. The metadata receiver 222 receives and outputs the preset metadata, and the preset rendering data receiver 224 receives the preset rendering data (for example, a preset matrix); both are described in detail below with reference to FIGS. 3 and 4.

The object controller 230 receives the audio signal including a plurality of objects and the preset rendering data provided by the preset rendering data receiver 224. The preset rendering data is applied to the objects to adjust their levels or positions.

FIG. 3 shows the schematic configuration of the metadata receiver 310 and the preset rendering data receiver 320 included in the preset information receiver 220 of the audio signal processing apparatus 200 of the present invention.

The metadata receiver 310 includes a preset length information receiver 312 and a preset metadata receiver 314. The preset length information receiver 312 receives the preset length information, which indicates the length of the preset metadata describing the preset information, and thereby obtains the length of the preset metadata. The preset metadata receiver 314 then reads the bitstream for the length indicated by the preset length information to receive the preset metadata. The preset metadata receiver 314 also converts the preset metadata, from which the kind or attribute of the preset information can be identified, into text form and outputs it.
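Reading the length first and then exactly that much metadata is a length-prefixed read. The sketch below illustrates the idea under assumed conditions: a one-byte length field, a length counted in bytes, and UTF-8 text. None of these details are fixed by the specification, and `read_preset_metadata` is a hypothetical helper.

```python
import io
from typing import BinaryIO


def read_preset_metadata(stream: BinaryIO) -> str:
    """Read a one-byte length, then that many bytes of UTF-8 metadata."""
    length_field = stream.read(1)
    if len(length_field) != 1:
        raise EOFError("missing preset length information")
    length = length_field[0]
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("preset metadata is shorter than its declared length")
    return payload.decode("utf-8")


# Hypothetical payload: length 7, the text "Karaoke", then unrelated data.
stream = io.BytesIO(bytes([7]) + b"Karaoke" + b"\x01rest")
label = read_preset_metadata(stream)
```

Because the reader consumes only the declared number of bytes, whatever follows the metadata is left untouched for the next parsing step.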

The preset rendering data receiver 320 includes a preset type flag receiver 322, an output channel information receiver 324, and a preset matrix receiver 326. The preset type flag receiver 322 receives a preset type flag (preset_type_flag) indicating whether the preset rendering data is in matrix form; the meaning of the flag is given in Table 1 below.

[Table 1]
preset_type_flag | Meaning
0 | The preset rendering data is not of matrix type
1 | The preset rendering data is of matrix type

When the preset type flag indicates that the preset rendering data is of matrix type, the output channel information receiver 324 receives output channel information indicating on how many output channels the objects included in the audio signal are to be reproduced. The output channel information may indicate a mono channel, a stereo channel, or multichannel (5.1 channels), but is not limited thereto.

The preset matrix receiver 326 uses the output channel information to receive and output the preset matrix to be applied to the objects. The preset matrix may be a mono preset matrix, a stereo preset matrix, or a multichannel preset matrix. Its dimensions may be determined based on the number of objects and the number of output channels, and it may have the form (number of objects) * (number of output channels). For example, when the audio signal includes n objects and the output channel information receiver 324 indicates 5.1 output channels, that is, six channels, the preset matrix receiver 326 may output the preset multichannel matrix of Equation 1 below, which has the form n * 6.

[Equation 1]

$$M = \begin{bmatrix} m_{0,0} & m_{0,1} & \cdots & m_{0,5} \\ m_{1,0} & m_{1,1} & \cdots & m_{1,5} \\ \vdots & \vdots & \ddots & \vdots \\ m_{n-1,0} & m_{n-1,1} & \cdots & m_{n-1,5} \end{bmatrix}$$

Here, the matrix element $m_{a,b}$ is a gain value indicating the degree to which the a-th object is included in the b-th channel. The preset multichannel matrix is then applied to the audio signal to adjust the level of the corresponding object.

In this way, the preset information receiver 220 of the present invention uses the preset length information to read only as much of the bitstream as necessary, representing the preset metadata efficiently, and obtains the preset matrix based on the output channel information, so that the gain and other properties of the objects included in the audio signal can be adjusted effectively.
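Applying a preset matrix amounts to a weighted sum: each output channel is the sum of every object scaled by its gain for that channel. The sketch below assumes, for illustration only, that the object signals are available as separate sample lists; a real decoder would typically work on a downmix with object parameters. The `render` helper is hypothetical.

```python
from typing import List


def render(objects: List[List[float]], preset_matrix: List[List[float]]) -> List[List[float]]:
    """Mix n object signals into c channels using an n x c gain matrix."""
    if len(preset_matrix) != len(objects):
        raise ValueError("the matrix needs one row per object")
    num_channels = len(preset_matrix[0])
    num_samples = len(objects[0])
    channels = [[0.0] * num_samples for _ in range(num_channels)]
    for a, samples in enumerate(objects):      # a-th object
        for b in range(num_channels):          # b-th output channel
            gain = preset_matrix[a][b]
            for t, sample in enumerate(samples):
                channels[b][t] += gain * sample
    return channels


# Hypothetical karaoke-style stereo preset: the vocal (row 0) is muted.
vocal = [1.0, 2.0]
guitar = [0.5, 0.5]
output = render([vocal, guitar], [[0.0, 0.0], [1.0, 0.5]])
```

With the vocal row set to zero, only the guitar reaches the two output channels, at full level on the first and half level on the second.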

FIG. 4 is a flowchart of an audio signal processing method according to an embodiment of the present invention. First, an audio signal including a plurality of objects is received (S410). Preset presence information, indicating whether preset information predetermined for adjusting the gain, panning, or the like of the objects exists, is received (S415), and, if preset information exists, preset number information indicating how many items (n) of preset information there are is received (S420). Because the preset number information presupposes that preset information exists, it may be expressed as (actual number of presets) - 1. Next, preset length information indicating how many bits (or bytes) the metadata describing the preset information occupies is received (S430). The preset metadata is received based on the preset length information (S435) and output (S437), for example as karaoke mode, concert hall mode, or news mode. The preset metadata may be text and, as described above, may be not only metadata expressing the sound field effect of the preset information but also metadata stating the preset author, the creation date, the names of the objects adjusted by the preset information, and so on, without being limited thereto.

Next, preset type information indicating the type of the preset rendering data included in the preset information is received (S440). Based on the preset type information, it is determined whether the preset data is of matrix type (S445); if so (Yes in S445), output channel information indicating how many output channels the objects have is received (S450). Among the encoded preset matrices, the one corresponding to the output channel information is received (S455). For example, if the output channel of the objects is stereo, the received preset matrix is a stereo preset matrix of the form (number of objects) * 2.

It is then determined whether the index i of the preset information received in the above steps, which includes the preset length information, preset metadata, preset type information, output channel information, and preset matrix, is smaller than the number n of items of preset information indicated by the preset number information (S460). If it is smaller (Yes in S460), the process returns to step S430 and receives the preset length information of the next, (i+1)-th, item of preset information. If it equals that number (No in S460), the preset matrix is applied to the audio signal to adjust the levels of the objects (S465). If the preset type is not a matrix (No in S445), preset data implemented in a non-matrix format set by the encoder is received (S457), and the received preset data is applied to the audio signal to adjust the levels of the objects (S468). The audio signal including the adjusted objects may then be output (S470).
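The receiving loop of FIG. 4 can be sketched as a small parser. The byte layout below is entirely hypothetical: every field is one byte, gains are stored as byte / 128, metadata is UTF-8, and non-matrix presets carry no payload because the specification does not describe that format. The step labels in the comments map the code to the flowchart.

```python
import io
from typing import BinaryIO, List, Optional, Tuple

Matrix = List[List[float]]


def _read_u8(stream: BinaryIO) -> int:
    data = stream.read(1)
    if len(data) != 1:
        raise EOFError("unexpected end of preset data")
    return data[0]


def parse_presets(stream: BinaryIO, num_objects: int) -> List[Tuple[str, Optional[Matrix]]]:
    """Return (metadata, matrix or None) for every preset in the stream."""
    presets: List[Tuple[str, Optional[Matrix]]] = []
    if _read_u8(stream) == 0:                # S415: preset presence information
        return presets
    num_presets = _read_u8(stream) + 1       # S420: count is stored minus one
    for _ in range(num_presets):             # S460: repeat for each preset
        length = _read_u8(stream)            # S430: preset length information
        raw = stream.read(length)            # S435: preset metadata
        if len(raw) != length:
            raise EOFError("truncated preset metadata")
        metadata = raw.decode("utf-8")
        matrix: Optional[Matrix] = None
        if _read_u8(stream) == 1:            # S440/S445: preset type flag
            channels = _read_u8(stream)      # S450: output channel information
            matrix = [[_read_u8(stream) / 128.0 for _ in range(channels)]
                      for _ in range(num_objects)]  # S455: preset matrix
        presets.append((metadata, matrix))
    return presets


# Hypothetical stream: two presets for two objects.
data = (bytes([1, 1, 7]) + b"Karaoke" + bytes([1, 2, 0, 0, 128, 64])
        + bytes([4]) + b"News" + bytes([0]))
parsed = parse_presets(io.BytesIO(data), num_objects=2)
```

The loop runs for the decoded count rather than comparing an index against the raw field, which keeps the minus-one offset in a single place.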

The step of adjusting the object by applying the preset matrix (S465) may use a preset matrix determined by the user's selection (not shown). The user may select the desired preset information by referring to the preset metadata output in the preset metadata output step (S437). For example, when the user selects the metadata labeled karaoke mode, the preset matrix corresponding to the karaoke-mode preset metadata is selected from among the preset matrices received based on the output channel information (S455). The selected preset matrix is then applied to the audio signal to adjust the level of the object, and the audio signal including the adjusted object is output.
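The selection and application steps can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the individual object signals are available as sample lists and that each preset matrix is stored under its metadata label; an actual decoder would work on the downmix signal and object parameters instead. The names `select_preset` and `apply_preset` and the example gains are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]  # rows: objects, columns: output channels


def select_preset(presets: Dict[str, Matrix], chosen_metadata: str) -> Matrix:
    """Return the preset matrix whose metadata label the user picked."""
    return presets[chosen_metadata]


def apply_preset(objects: List[List[float]], matrix: Matrix) -> List[List[float]]:
    """Mix object signals into channels: out[j][t] = sum_i m[i][j] * obj[i][t]."""
    n_ch = len(matrix[0])
    n_samples = len(objects[0])
    out = [[0.0] * n_samples for _ in range(n_ch)]
    for i, obj in enumerate(objects):
        for j in range(n_ch):
            gain = matrix[i][j]
            for t, sample in enumerate(obj):
                out[j][t] += gain * sample
    return out


# Hypothetical two-object signal (vocal, accompaniment) rendered to stereo.
presets = {
    "Karaoke mode": [[0.0, 0.0], [1.0, 1.0]],  # vocal muted
    "News mode": [[1.0, 1.0], [0.25, 0.25]],   # accompaniment attenuated
}
vocal = [0.5, -0.5]
accompaniment = [0.25, 0.25]
stereo = apply_preset([vocal, accompaniment],
                      select_preset(presets, "Karaoke mode"))
```

With the karaoke-mode matrix the vocal row is zero, so both output channels contain only the accompaniment.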

FIG. 5 shows a syntax representing an audio signal processing method according to an embodiment of the present invention. Referring to FIG. 5, information related to the preset information may reside in the header region of the bitstream. Preset number information (bsNumPresets) is first obtained from the header region. Then, when the preset number information is present (if (bsNumPresets)), the number of preset information items it indicates is obtained (numPresets = bsNumPresets + 1). For example, when exactly one preset information item exists, bsNumPresets may be set to 0, in which case the actual number of preset information items is taken to be (preset number information) + 1.

In addition, based on the preset number information, information indicating the type of the preset rendering data may be obtained for each (i-th) preset information item (bsPresetType[i]). If transmission of the preset rendering data as a matrix is defined as one specific preset type (that is, bsPresetType[i] is transmitted for the matrix type), this information may be the preset type information (preset_type_flag) described above, which indicates whether the preset rendering data was generated and transmitted as a matrix. The preset type information may be represented by 1 bit.

When the preset rendering data included in the i-th preset information is of matrix type (bsPresetType[i]), output channel information (bsPresetCh[i]) indicating the number of output channels is obtained, and, based on the output channel information, a preset matrix for adjusting the level of the objects included in the audio signal is obtained (getRenderingMatrix()).
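The header syntax described for FIG. 5 can be sketched in Python as follows. Only the element names, the relation numPresets = bsNumPresets + 1, and the 1-bit width of bsPresetType[i] come from the text. The 4-bit width of bsNumPresets, the 2-bit width of bsPresetCh[i] and its mapping to 1, 2, or 6 channels, and the raw 4-bit matrix entries standing in for getRenderingMatrix() are assumptions made only so that the sketch runs; the sketch also assumes the preset number field is always present.

```python
from typing import List, Optional


class BitReader:
    """Reads unsigned integers MSB-first from a bytes object."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current bit position

    def read(self, n_bits: int) -> int:
        value = 0
        for _ in range(n_bits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


CHANNELS = {0: 1, 1: 2, 2: 6}  # assumed mapping of bsPresetCh to channel count


def parse_preset_header(reader: BitReader,
                        num_objects: int) -> List[Optional[List[List[int]]]]:
    num_presets = reader.read(4) + 1  # numPresets = bsNumPresets + 1 (width assumed)
    matrices: List[Optional[List[List[int]]]] = []
    for _ in range(num_presets):
        is_matrix = reader.read(1)  # bsPresetType[i], 1 bit per the text
        if not is_matrix:
            matrices.append(None)  # non-matrix preset data is not modelled here
            continue
        n_ch = CHANNELS[reader.read(2)]  # bsPresetCh[i] (width assumed)
        # Stand-in for getRenderingMatrix(): raw 4-bit entries (assumed coding).
        matrix = [[reader.read(4) for _ in range(n_ch)]
                  for _ in range(num_objects)]
        matrices.append(matrix)
    return matrices


# One stereo matrix preset for two objects, padded to whole bytes.
header = int("0000" "1" "01" "1111" "0000" "1000" "1000" "0", 2).to_bytes(3, "big")
parsed = parse_preset_header(BitReader(header), num_objects=2)
```

Here `parsed` holds a single 2 x 2 matrix of quantized entries; mapping those entries to actual gains is left out because the text does not specify the quantization.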

FIG. 6 shows a syntax representing an audio signal processing method according to another embodiment of the present invention. Preset information may be placed in the header region and applied identically to all frames, but it may also be applied in a time-variable manner so that object levels are adjusted more effectively. When the preset information is time-variable, information related to the preset information must be included in each frame. The bitstream can therefore be constructed efficiently by placing in the header information that indicates whether preset information is included per frame.

FIG. 6 shows a syntax that signals whether preset information is included in each frame. It is similar to the syntax of FIG. 5, but after the output channel information (bsPresetCh[i]) is obtained it may additionally include preset time-varying flag information (bsPresetTimeVarying[i]), which indicates whether the preset information varies over time, that is, whether it is included in each frame. When the preset time-varying flag information is included in the header region of the bitstream, the level of the object is adjusted using the preset matrix and preset metadata contained in the frame region of the bitstream. When the flag is present in the header, the bitstream can also be constructed efficiently by adding a separate per-frame flag that indicates whether the preset information has been updated: if it has not, the previous preset information is used as is (keep); if it has, the new preset information is read (read).
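The keep/read behaviour can be sketched as below. The sketch models each frame as either carrying a new matrix ("read") or nothing ("keep") rather than parsing actual bits; the function name `track_preset` and this frame representation are assumptions for illustration only.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def track_preset(header_matrix: Matrix,
                 frame_updates: Sequence[Optional[Matrix]]) -> List[Matrix]:
    """Return the preset matrix in force for each frame.

    frame_updates[k] is None when frame k signals "keep" and a new matrix
    when frame k signals "read".
    """
    current = header_matrix
    in_force = []
    for update in frame_updates:
        if update is not None:  # "read": the frame carries fresh preset data
            current = update
        in_force.append(current)  # "keep": the previous matrix stays valid
    return in_force


initial = [[1.0, 1.0]]
louder_left = [[1.0, 0.5]]
history = track_preset(initial, [None, louder_left, None])
```

Only the second frame costs matrix bits in this example; the first and third frames reuse what the decoder already holds.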

The syntax may also include preset presence information (bsPresetExtsts) indicating whether the bitstream contains preset information. If it indicates that the bitstream contains no preset information, the loop that obtains the preset number information (bsNumPresets), preset type information (bsPresetType[i]), output channel information (bsPresetCh[i]), and preset time-varying flag information (bsPresetTimeVarying[i]) may be skipped. The preset presence information may be omitted from the syntax in some cases.

FIG. 7 shows a syntax representing an audio signal processing method according to yet another embodiment of the present invention. The preset matrix described above is a matrix of the form (number of objects) * (number of output channels) and indicates how the levels of all objects in the audio signal are adjusted and distributed to the output channels. However, receiving and using information on only some of the objects can be more efficient because it reduces the number of transmitted bits. This embodiment therefore proposes a syntax for an audio signal processing method that uses preset information to adjust only the desired objects.

Referring to FIG. 7, the syntax may further include preset object application information (bsPresetObject[i][j]) indicating, for each object, whether preset information for adjusting that object's level is applied. This information makes it possible to signal whether the preset information contains information on the corresponding object. The preset object application information may reside in the header region of the bitstream or, when the preset information is time-variable as in FIG. 6, in the frames. As shown in FIG. 7, the syntax may signal for every object whether the preset information contains information on it; alternatively, the bitstream may carry the indices of the objects that are included. When object indices are used, an exit character allows the bitstream to be constructed more conveniently.

When lossless coding is performed with a Huffman table or the like, the exit character can be realized by designing the table with one more entry than the actual number of parameters and defining the additional entry as an exit parameter. When the exit parameter is obtained from the bitstream, all of the corresponding information is regarded as received. For example, when the preset information contains information on only two of ten objects (object 3 and object 8), the bitstream can be constructed efficiently by transmitting, in order, the Huffman indices for object 3 and object 8 followed by the Huffman index for the exit parameter.
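The exit-parameter idea can be sketched at the symbol level in Python. The actual Huffman coding of the symbols is omitted, objects are assumed to be indexed from 0, and the exit parameter is assumed to be the extra table entry with index equal to the number of objects; both function names are hypothetical.

```python
from typing import Iterable, List


def encode_object_indices(indices: List[int], num_objects: int) -> List[int]:
    """Return the symbol sequence: the object indices, then the exit symbol."""
    exit_symbol = num_objects  # the one extra entry added to the table
    return sorted(indices) + [exit_symbol]


def decode_object_indices(symbols: Iterable[int], num_objects: int) -> List[int]:
    """Collect object indices until the exit symbol is seen."""
    exit_symbol = num_objects
    indices = []
    for symbol in symbols:
        if symbol == exit_symbol:
            break  # everything for this preset has been received
        indices.append(symbol)
    return indices


symbols = encode_object_indices([3, 8], num_objects=10)
received = decode_object_indices(symbols, num_objects=10)
```

Three symbols are sent instead of ten per-object flags, which pays off when a preset touches only a few objects.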

FIG. 8 schematically shows a preset rendering data receiving unit that generates the preset matrix step by step, according to yet another embodiment of the present invention. Referring to FIG. 8, the preset rendering data receiving unit 320 includes a preset type flag receiving unit 322, an output channel information receiving unit 324, and a preset matrix determining unit 326. The other components have the same configuration and effects as those of the preset rendering data receiving units 224 and 320 of FIGS. 2 and 3, so their detailed description is omitted. As shown in FIG. 8, the preset matrix determining unit 326 includes a mono type preset matrix receiving unit 810, a stereo type preset matrix generating unit 820, and a multichannel type preset matrix generating unit 830.

The mono type preset matrix receiving unit 810 receives, from a preset information generating unit (not shown), a mono preset matrix expressed as a matrix of the form (number of objects). If the output channel information received from the output channel information receiving unit 324 indicates mono, the mono preset matrix is output unchanged and applied to the audio signal to adjust the level of the object.

If the output channel information indicates stereo, the mono preset matrix is input to the stereo type preset matrix generating unit 820, which additionally receives channel extension information and generates a stereo preset matrix of the form (number of objects) * 2. If the output channel information indicates multichannel, the stereo preset matrix and multichannel extension information are input to the multichannel type preset matrix generating unit 830, which generates a multichannel preset matrix of the form (number of objects) * 6. Because the encoder generates only the mono preset matrix and the preset matrix determining unit 326 builds the other matrices step by step from the channel extension information, bits are saved when the playback environment is limited to stereo, and preset matrices for stereo or multichannel need not be transmitted redundantly.
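The stepwise generation can be sketched as follows. The text does not specify the form of the channel extension information, so this sketch assumes a per-object left share in [0, 1] for the stereo step and, for the multichannel step, per-object weights that spread each stereo side over three channels. These representations, the channel ordering, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = Tuple[float, float, float]


def to_stereo(mono: Sequence[float],
              left_share: Sequence[float]) -> List[List[float]]:
    """Expand (number of objects) gains into a (number of objects) x 2 matrix."""
    return [[g * s, g * (1.0 - s)] for g, s in zip(mono, left_share)]


def to_multichannel(stereo: Sequence[Sequence[float]],
                    spread: Sequence[Tuple[Weights, Weights]]) -> List[List[float]]:
    """Expand a (number of objects) x 2 matrix into (number of objects) x 6."""
    rows = []
    for (left, right), (left_w, right_w) in zip(stereo, spread):
        rows.append([left * w for w in left_w] + [right * w for w in right_w])
    return rows


mono_matrix = [1.0, 0.5]        # one gain per object
stereo_ext = [0.5, 1.0]         # assumed: share of each object sent left
multi_ext = [
    ((1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.5, 0.5, 0.0), (0.0, 0.0, 0.0)),
]
stereo_matrix = to_stereo(mono_matrix, stereo_ext)
multi_matrix = to_multichannel(stereo_matrix, multi_ext)
```

A stereo-only player stops after `to_stereo` and never needs the multichannel extension data, which is the bit saving the paragraph above describes.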

An audio signal processing method according to yet another embodiment of the present invention transmits a gain value as the preset information and, when needed, additionally transmits a normalized preset matrix. When only gain is needed to adjust the objects in the audio signal, only the gain values are transmitted, and the scheme extends easily to transmitting the entire preset matrix. For example, to transmit a preset matrix such as that of Equation 1 above, n*6 items of gain information would first have to be transmitted. The gain information may be calculated as in Equation 2 below.

Here, i denotes the object, j denotes the output channel, and nCH denotes the number of output channels. Since one G_i exists per object, n values are required for each preset information item.

When panning information is needed in addition to the gain information, a normalized preset matrix is additionally used. The normalized preset matrix may be defined as in Equation 3 below.

When the gain information and the normalized preset matrix are used as described above, n*6 items of gain information must be transmitted. However, owing to the normalization, each normalized value is bounded so that its log10 value is always less than or equal to 0. Consequently, when a channel level difference information table is used to quantize the gain information, only half of the table is needed compared with the conventional case. This saves both the transmitted bit rate and the amount of data used, compared with receiving and using an unnormalized preset matrix without separately transmitted gain information. Moreover, because the preset information may contain the gain information alone, the preset information can be used in a scalable manner.
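The split into a per-object gain and a normalized matrix can be illustrated with a short Python sketch. Because the equations themselves are not reproduced in this text, the sketch assumes one plausible normalization, namely that the gain of object i is the root of the summed squares of its row and that each entry is divided by that gain. This choice is an assumption made to demonstrate the stated property (non-positive log10 values), not the patent's definition.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def split_gain(matrix: Matrix) -> Tuple[List[float], Matrix]:
    """Split a non-negative preset matrix into row gains and a normalized matrix."""
    gains, normalized = [], []
    for row in matrix:
        # Assumed normalization: root of the summed squared row entries.
        gain = math.sqrt(sum(m * m for m in row))
        gains.append(gain)
        normalized.append([m / gain if gain > 0.0 else 0.0 for m in row])
    return gains, normalized


preset = [[3.0, 4.0], [0.0, 2.0]]
gains, normalized = split_gain(preset)

# Every non-zero normalized entry is at most 1, so its log10 is at most 0.
logs = [math.log10(m) for row in normalized for m in row if m > 0.0]
```

A decoder that needs level control only can stop after reading `gains`; one that also needs panning multiplies each normalized row by its gain to recover the original matrix.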

FIG. 9 shows the syntax used when the gain information and the panning-related information are carried separately in the preset information described above. The gain information and panning information may be included in the header region or the frame region. In FIG. 9, the italicized parts are where actual preset information values are received from the bitstream. Various noiseless coding schemes may be used for them; FIG. 9 represents them as functions. For example, when the information resides in the frame region, the preset number information is received if preset information is present. Gain information is then received first; it indicates the gain with which the corresponding object is to be reproduced. The gain information may be the G_i described above, or it may be an arbitrary downmix gain (ADG) generated when the level of the audio signal has been adjusted by an external input.

The panning information obtained in addition may take various forms. It may be the normalized preset matrix described above, or it may be divided into stereo panning information and multichannel panning information.

FIG. 10 shows an audio signal processing apparatus according to yet another embodiment of the present invention. The apparatus mainly includes a downmixing unit 1010, an object information generating unit 1020, a preset information generating unit 1030, a downmix signal processing unit 1040, an information processing unit 1050, and a multichannel decoding unit 1060.

A plurality of objects are input to the downmixing unit 1010, which generates a mono or stereo downmix signal. The objects are also input to the object information generating unit 1020, which generates object information comprising object level information indicating the level of each object; object gain information indicating the gain of each object in the downmix signal and/or, for a stereo downmix signal, the degree to which each object is included in each downmix channel; and object correlation information indicating whether objects are correlated. The downmix signal and the object information are then input to the preset information generating unit 1030, which generates preset information comprising preset rendering data for adjusting the level of the objects and preset metadata describing the preset information. The process of generating the preset rendering data and the preset metadata is as described for the audio signal processing apparatus and method of FIGS. 1 to 9, so its detailed description is omitted. The object information generated by the object information generating unit 1020 and the preset information generated by the preset information generating unit 1030 may be transmitted in an SAOC bitstream.

The information processing unit 1050 includes an object information processing unit 1051 and a preset information receiving unit 1052, and receives the SAOC bitstream. The preset information receiving unit 1052 receives from the SAOC bitstream the preset presence information, preset number information, preset length information, preset metadata, preset type information, output channel information, and preset matrix described above, and otherwise uses the methods of the various embodiments described for FIGS. 1 to 9. The preset information receiving unit 1052 outputs the preset metadata and the preset matrix. The object information processing unit 1051 receives them and uses them together with the object information in the SAOC bitstream to generate downmix processing information for pre-processing the downmix signal and multichannel information for upmixing the downmix signal.

The downmix processing information is then input to the downmix signal processing unit 1040, which can pan the objects included in the downmix signal. The pre-processed downmix signal is input, together with the multichannel information output from the information processing unit 1050, to the multichannel decoding unit 1060 and upmixed to generate a multichannel audio signal.

Thus, when decoding an audio signal containing a plurality of objects into a multichannel signal using object information, the audio signal processing apparatus of the present invention can easily adjust object levels by using previously set preset information. The preset matrix applied to the objects is matrix-type data received based on the output channel information, which makes the level adjustment effective, and the preset metadata describing the preset information is output based on the preset length information transmitted from the encoder, which improves coding efficiency.

FIG. 11 schematically shows the configuration of a product implementing a preset information receiving unit that includes a metadata receiving unit and a preset rendering data receiving unit according to an embodiment of the present invention, and FIG. 12 shows the relationship between products implementing the preset information receiving unit according to an embodiment of the present invention.

Referring to FIG. 11, the wired/wireless communication unit 1110 receives a bitstream over a wired or wireless link. Specifically, the wired/wireless communication unit 1110 may include one or more of a wired communication unit 1111, an infrared communication unit 1112, a Bluetooth unit 1113, and a wireless LAN communication unit 1114.

The user authentication unit 1120 receives user information and performs user authentication. It may include one or more of a fingerprint recognition unit 1121, an iris recognition unit 1122, a face recognition unit 1123, and a voice recognition unit 1124, which respectively receive fingerprint, iris, facial contour, and voice information, convert it into user information, and authenticate the user by determining whether the user information matches previously registered user data.

The input unit 1130 is an input device through which the user enters various commands. It may include one or more of a keypad unit 1131, a touchpad unit 1132, and a remote control unit 1133, but the present invention is not limited thereto. When the preset metadata for the plurality of preset information items output from the metadata receiving unit 1141, described below, is displayed on the screen through the display unit 1162, the user can select preset metadata through the input unit 1130, and information on the selected preset metadata is input to the control unit 1150.

The signal decoding unit 1140 includes a metadata receiving unit 1141 and a preset rendering data receiving unit 1142. The metadata receiving unit 1141 receives the preset length information and, based on it, receives the preset metadata. When the preset type information indicates that the preset information is expressed as a matrix, the preset rendering data receiving unit 1142 receives the output channel information and, based on it, receives the preset matrix, which is the preset rendering data. The signal decoding unit 1140 decodes the audio signal using the received bitstream, preset metadata, and preset matrix to generate an output signal, and outputs the preset metadata in text form.

The control unit 1150 receives input signals from the input devices and controls all processes of the signal decoding unit 1140 and the output unit 1160. As described above, when information on the preset metadata selected through the input unit 1130 is input to the control unit 1150, the preset rendering data receiving unit 1142 receives the preset matrix corresponding to the selected preset metadata, and the audio signal is decoded using it.

The output unit 1160 outputs the output signal generated by the signal decoding unit 1140 and the like, and may include a speaker unit 1161 and a display unit 1162. An audio output signal is output through the speaker unit 1161, and a video output signal is output through the display unit 1162. The output unit also displays on the screen, through the display unit 1162, the preset metadata input from the control unit 1150.

FIG. 12 shows the relationship between terminals and a server corresponding to the product shown in FIG. 11. Referring to FIG. 12(A), the first terminal 1210 and the second terminal 1220 can exchange data or bitstreams bidirectionally through their wired/wireless communication units. Referring to FIG. 12(B), the server 1230 and the first terminal 1240 can likewise communicate with each other over a wired or wireless link.

FIG. 13 schematically shows the configuration of a broadcast signal decoding apparatus 1300 implementing a preset information receiving unit that includes a metadata receiving unit and a preset rendering data receiving unit according to an embodiment of the present invention.

Referring to FIG. 13, the demultiplexer 1320 receives data related to TV broadcasting from the tuner 1310. The received data is separated by the demultiplexer 1320 and decoded by the data decoder 1330. The data separated by the demultiplexer 1320 may also be stored in a storage medium 1350 such as an HDD. The separated data is input to a decoder 1340, which includes an audio decoder 1341 and a video decoder 1342 and decodes the audio signal and the video signal. The audio decoder 1341 includes a metadata receiving unit 1341A and a preset rendering data receiving unit 1341B according to an embodiment of the present invention. The metadata receiving unit 1341A receives the preset length information and, based on it, receives the preset metadata. When the preset type information indicates that the preset information is expressed as a matrix, the preset rendering data receiving unit 1341B receives the output channel information and, based on it, receives the preset matrix, which is the preset rendering data. The audio decoder 1341 decodes the audio signal using the received bitstream, preset metadata, and preset matrix to generate an output signal, and outputs the preset metadata in text form.

The display unit 1370 displays on the screen the video signal output from the video decoder 1342 and the preset metadata output from the audio decoder 1341. The display unit 1370 also includes a speaker unit (not shown), through which it outputs the audio signal from the audio decoder 1341 whose object levels have been adjusted using the preset matrix. The data decoded by the decoder 1340 may also be stored in a storage medium 1350 such as an HDD.

Meanwhile, the signal decoding apparatus 1300 may further include an application manager 1360 that receives information from a user and controls the received data. The application manager 1360 includes a user interface manager 1361 and a service manager 1362. The user interface manager 1361 controls the interface for receiving information from the user; for example, it may control the font of the text displayed on the display unit 1370, the screen brightness, and the menu configuration. When the decoder 1340 and the display unit 1370 decode and output a broadcast signal, the service manager 1362 may control the received broadcast signal using information input by the user; for example, it may provide broadcast channel setting, alarm setting, and adult authentication functions. Data output from the application manager 1360 may be transmitted to, and used by, the display unit 1370 as well as the decoder 1340.

FIG. 14 illustrates a display unit of a product including a preset information receiver according to an embodiment of the present invention. The display unit may display all of the preset metadata included in the bitstream. For example, as shown in FIG. 14, the karaoke mode, the concert hall mode, and the news mode, which are the preset metadata corresponding to the audio signal, are all displayed on the screen.

When the user selects one of the preset metadata, the display unit displays the objects whose levels have been adjusted by applying the preset matrix corresponding to the selected preset metadata to the plurality of objects. For example, when the user selects the karaoke mode, the display may show the level of the vocal object set to the minimum. When the user selects the news mode, the preset matrix applied to the audio signal reduces the levels of the objects other than the vocal object. Referring to FIG. 14, when the news mode is selected, the display unit may show the level of the vocal object raised above its level in the karaoke mode and the levels of the remaining objects set to the minimum. The names of the objects included in the preset metadata may also be shown on the display unit, so that the user can tell which objects have had their levels adjusted.
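The effect of the two modes can be illustrated with a small numeric sketch. The object names, the gain values, and the helper names `render` and `object_levels` are assumptions chosen for illustration; the actual preset matrices are whatever the bitstream carries.

```python
# Objects in a fixed order; the names are illustrative only.
OBJECTS = ["vocal", "guitar", "drums"]

# Hypothetical preset matrices: one row per output channel (left, right),
# one column per object. Each entry is that object's gain in that channel.
PRESETS = {
    "Karaoke": [[0.0, 1.0, 1.0],
                [0.0, 1.0, 1.0]],   # vocal at minimum
    "News":    [[1.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]],   # everything except vocal at minimum
}


def render(matrix, object_samples):
    """Mix one sample per object into one sample per output channel."""
    return [sum(g * s for g, s in zip(row, object_samples)) for row in matrix]


def object_levels(matrix):
    """Per-object level for display: the largest gain across the channels."""
    return [max(column) for column in zip(*matrix)]


samples = [0.5, 0.25, 0.125]                   # vocal, guitar, drums
print(render(PRESETS["Karaoke"], samples))     # [0.375, 0.375]
print(render(PRESETS["News"], samples))        # [0.5, 0.5]
print(object_levels(PRESETS["Karaoke"]))       # [0.0, 1.0, 1.0]
```

With these assumed gains, selecting the news mode leaves only the vocal sample in each channel, while the karaoke mode removes it and keeps the accompaniment.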

Accordingly, by displaying not only the preset metadata that represents the preset information but also the object levels adjusted by the preset matrix, the display unit lets the user select the desired preset mode and listen to an audio signal with the desired sound field.
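A text-only rendering of such a screen might look like the sketch below. The layout, the bar width, and the function names `level_bar` and `screen` are hypothetical; the description only requires that the preset metadata and the adjusted object levels be shown.

```python
def level_bar(name, level, width=10):
    """Draw one object's level (0.0 to 1.0) as a fixed-width text bar."""
    filled = round(max(0.0, min(1.0, level)) * width)
    return f"{name:<8}|{'#' * filled}{'.' * (width - filled)}|"


def screen(preset_names, selected, object_names, levels):
    """List every preset, mark the selected one, then show the object levels."""
    lines = [("> " if p == selected else "  ") + p for p in preset_names]
    lines += [level_bar(n, v) for n, v in zip(object_names, levels)]
    return "\n".join(lines)


print(screen(["Karaoke", "Concert hall", "News"], "News",
             ["vocal", "guitar", "drums"], [1.0, 0.0, 0.0]))
```

Showing the object names next to the bars corresponds to the option, mentioned above, of displaying the names carried in the preset metadata so the user can see which objects were adjusted.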

As described above, although the present invention has been described with reference to limited embodiments and drawings, it is not limited thereto. Those of ordinary skill in the art to which the present invention pertains may make various modifications and variations within the technical idea of the present invention and the scope of equivalents of the claims set forth below.

FIG. 1 is a conceptual diagram of preset information applied to an object included in an audio signal according to an embodiment of the present invention.

FIG. 2 illustrates an audio signal processing apparatus according to an embodiment of the present invention.

FIG. 3 illustrates a preset information receiver of an audio signal processing apparatus according to an embodiment of the present invention.

FIG. 4 is a flowchart illustrating an audio signal processing method according to an embodiment of the present invention.

FIG. 5 illustrates a syntax according to an embodiment of the present invention.

FIG. 6 illustrates a syntax according to another embodiment of the present invention.

FIG. 7 illustrates a syntax according to yet another embodiment of the present invention.

FIG. 8 illustrates a preset rendering data receiver according to yet another embodiment of the present invention.

FIG. 9 illustrates a syntax according to yet another embodiment of the present invention.

FIG. 10 illustrates an audio signal processing apparatus according to yet another embodiment of the present invention.

FIG. 11 illustrates a schematic configuration of a product in which a preset information receiver according to an embodiment of the present invention is implemented.

FIG. 12 illustrates the relationship between a terminal and a server corresponding to the product shown in FIG. 11.

FIG. 13 illustrates a schematic configuration of a digital TV in which a preset information receiver according to an embodiment of the present invention is implemented.

FIG. 14 illustrates a display unit of a product including a preset information receiver according to an embodiment of the present invention.

Claims

An audio signal receiver configured to receive an audio signal including at least one object;

A preset metadata receiver for obtaining preset metadata of preset information;

A preset rendering data receiver for obtaining a preset matrix indicating the degree to which the object is included in an output channel, wherein the preset rendering data receiver obtains the preset matrix corresponding to the preset metadata;

A display unit displaying the preset metadata;

An input unit configured to receive a signal for selecting one of the preset metadata;

An object controller which adjusts an output level of the object for each output channel by using the preset matrix corresponding to the selected preset metadata; and

An output unit for outputting an audio signal including the object whose output level is adjusted.

The apparatus of claim 1,

wherein, when the output unit outputs the audio signal, the display unit displays the selected preset metadata.

The apparatus of claim 2,

wherein the display unit further displays an output level of the object.

The apparatus of claim 1,

wherein the preset matrix is obtained based on output channel information indicating that the output channel is one of mono, stereo, and multichannel.

The apparatus of claim 1,

wherein the preset information is obtained based on preset number information indicating the number of pieces of preset information, and the preset matrix is obtained based on preset type information indicating whether the preset information is expressed as a matrix.

The apparatus of claim 1,

wherein the preset information further includes preset object application information indicating whether the preset matrix is applied to the object.

The apparatus of claim 6,

wherein the display unit further displays whether the preset matrix applied to the object exists, based on the preset object application information.

The apparatus of claim 1,

wherein the display unit presents the preset metadata in text form.

Receiving an audio signal comprising at least one object;

Receiving at least one piece of preset information, each piece including preset metadata and a preset matrix;

Displaying the preset metadata and selecting one of the preset metadata;

Obtaining the preset matrix corresponding to the selected preset metadata from the preset information, wherein the preset matrix indicates the degree to which the object is included in an output channel;

Adjusting an output level of the object for each output channel using the preset matrix; and

Outputting an audio signal including the object whose output level is adjusted.

The method of claim 9,

further comprising displaying the selected preset metadata.

The method of claim 10,

further comprising, after adjusting the output level of the object,

displaying the output level of the object.

The method of claim 9,

wherein the preset information is obtained based on preset presence information indicating whether the preset information is included and preset number information indicating the number of pieces of preset information, and

the preset matrix is obtained based on preset type information indicating whether the preset information is expressed as a matrix.

The method of claim 9,

wherein the preset information further includes preset object application information indicating whether the preset matrix applied to the object exists, and

the displaying of the preset metadata and selecting of one of the preset metadata further displays whether the preset matrix applied to the object exists, based on the preset object application information.