KR102049603B1

KR102049603B1 - Apparatus and method for providing the audio metadata, apparatus and method for providing the audio data, apparatus and method for playing the audio data

Info

Publication number: KR102049603B1
Application number: KR1020180130922A
Authority: KR
Inventors: 유재현; 서정일; 강경옥; 이태진
Original assignee: 한국전자통신연구원
Priority date: 2018-10-30
Filing date: 2018-10-30
Publication date: 2019-11-27
Also published as: KR20180121452A

Abstract

Disclosed is an audio metadata providing apparatus comprising: an audio metadata generator that generates audio metadata including channel information of audio raw data; and an audio metadata transmitter that transmits the generated audio metadata to an audio data reproducing apparatus.

Description

Apparatus and method for providing audio metadata, Apparatus and method for providing audio data, Apparatus and method for reproducing audio data {APPARATUS AND METHOD FOR PROVIDING THE AUDIO METADATA, APPARATUS AND METHOD FOR PROVIDING THE AUDIO DATA, APPARATUS AND METHOD FOR PLAYING THE AUDIO DATA}

The present invention relates to a method of generating multichannel audio data and providing it to an audio data reproducing apparatus, and more particularly to a method of representing multichannel audio data using metadata that includes channel information of the audio signals.

Implementing audio data in multiple channels requires information on how many signals make up the audio content and where in space each channel should be placed. Conventional 5.1-channel audio data assumes, as a basic condition, that a total of six signals are produced and reproduced at positions of 0, +30, +110, +250, +330, and null degrees.

With the development of UHDTV technology, research on more realistic audio reproduction methods that use more speakers than the 5.1 channels offered by HDTV has attracted considerable attention. In practice, various reproduction formats with more than 5.1 channels already exist.

Regarding conventional methods of representing multichannel audio data, Korean Patent No. 10-0522593 proposes a multichannel audio implementation method in which, when a plurality of outputs are added to a multichannel audio system, the added outputs are generated so as to harmonize with the existing multichannel audio, thereby producing natural stereophonic sound.

Specifically, that multichannel audio implementation method includes receiving and decoding an encoded audio stream, and generating multichannel stereophonic sound from the decoded audio stream. It then includes generating a television left speaker output and a television right speaker output from the left stereo channel signal, the right stereo channel signal, and the center channel signal of the generated multichannel sound output, thereby providing multichannel audio with added channels.

However, the conventional multichannel audio implementation method describes only a configuration that converts 5.1 channels to 7.1 channels, and has the drawback that audio channel systems of other formats are difficult to configure.

In addition, several current formats such as Auro 3D and T.Holman 12.2 (10.2) channel place six or more speakers at positions outside the horizontal plane and represent audio channels in their own proprietary ways, but they have the drawback that a separate additional system such as an A/V receiver is required.

Therefore, a technology capable of representing multichannel audio data in various ways is required.

The present invention provides an audio metadata providing apparatus capable of generating audio metadata that includes channel information of audio raw data and transmitting it to an audio reproducing apparatus.

The present invention provides an audio data providing apparatus that combines audio raw data with audio metadata including channel information of the audio raw data and transmits the result to an audio reproducing apparatus, so that multichannel audio can be output in various ways.

The present invention provides an audio data reproducing apparatus capable of demultiplexing or decoding audio data received from an audio data providing apparatus and reproducing the audio raw data according to the audio metadata.

The present invention provides an audio data reproducing apparatus that achieves channel compatibility by combining the signals of the audio channels included in the audio raw data when the channel configuration in the reproduction environment information is lower than the channel configuration of the audio raw data.

An audio metadata providing apparatus according to an embodiment of the present invention may include: an audio metadata generator that generates audio metadata including channel information of audio raw data; and an audio metadata transmitter that transmits the generated audio metadata to an audio data reproducing apparatus.

An audio data providing apparatus according to an embodiment of the present invention may include: an audio metadata generator that generates audio metadata including channel information of audio raw data; an audio data combiner that combines the audio raw data and the generated audio metadata into a single piece of audio data; and an audio data transmitter that transmits the combined audio data to an audio data reproducing apparatus.

An audio data reproducing apparatus according to an embodiment of the present invention may include: an audio data receiver that receives audio data from an audio data providing apparatus; an audio data analyzer that analyzes the audio raw data and the audio metadata in the received audio data; and an audio data reproducing unit that reproduces the audio raw data based on the analyzed audio metadata and preset reproduction environment information.

According to an embodiment of the present invention, audio metadata including channel information of audio raw data may be generated and transmitted to an audio reproducing apparatus.

According to an embodiment of the present invention, audio raw data may be combined with audio metadata including channel information of the audio raw data and transmitted to an audio reproducing apparatus, so that multichannel audio can be output in various ways.

According to an embodiment of the present invention, audio data received from an audio data providing apparatus may be demultiplexed or decoded so that the audio raw data is reproduced according to the audio metadata.

According to an embodiment of the present invention, when the channel configuration in the reproduction environment information is lower than the channel configuration of the audio raw data, channel compatibility is achieved by combining the signals of the audio channels included in the audio raw data.

FIG. 1 is a diagram illustrating the structure of audio data according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an operation in which an audio metadata providing apparatus according to an embodiment of the present invention transmits audio metadata to an audio data reproducing apparatus.
FIG. 3 is a diagram illustrating an operation in which an audio data providing apparatus according to an embodiment of the present invention transmits audio data to an audio data reproducing apparatus.
FIG. 4 is a diagram showing the structure of audio metadata according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating one example of an audio metadata structure according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the reference for audio channel position information according to an embodiment of the present invention.
FIG. 7 is a diagram in which speakers are arranged in space based on audio channel position information according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating another example of the audio metadata structure of the present invention.
FIG. 9 is a flowchart showing an operation in which an audio metadata providing apparatus according to an embodiment of the present invention provides audio metadata.
FIG. 10 is a flowchart showing an operation in which an audio data providing apparatus according to an embodiment of the present invention provides audio data.
FIG. 11 is a flowchart showing an operation in which an audio data reproducing apparatus according to an embodiment of the present invention reproduces audio data.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The audio metadata providing method according to an embodiment of the present invention may be performed by the audio metadata providing apparatus, and the audio data providing method according to an embodiment of the present invention may be performed by the audio data providing apparatus. Likewise, the audio data reproducing method according to an embodiment of the present invention may be performed by the audio data reproducing apparatus. Like reference numerals in the drawings denote like elements.

FIG. 1 is a diagram illustrating the structure of audio data according to an embodiment of the present invention.

Referring to FIG. 1, audio data 110 may consist of audio raw data 120 and audio metadata 130. The audio raw data is the audio signal itself and may contain as many audio signals as there are channels. The audio raw data 120 may store the produced audio signals in order. For example, for 5.1-channel audio data 110, the audio raw data 120 may contain the respective signals 140 for the L (Left), R (Right), C (Center), LFE (Low Frequency Effect), LS (Left Side), and RS (Right Side) channels.

The audio metadata 130 is descriptive information about the audio signals and may include channel information of the audio raw data 120. It is described in detail below with reference to FIG. 4.

FIG. 2 is a diagram illustrating an operation in which an audio metadata providing apparatus according to an embodiment of the present invention transmits audio metadata to an audio data reproducing apparatus.

Referring to FIG. 2, the audio metadata providing apparatus 210 may include an audio metadata generator 220 and an audio metadata transmitter 230.

The audio metadata generator 220 may generate audio metadata including channel information of the audio raw data. The generated audio metadata may be managed separately from the audio raw data and may be coded by the audio metadata providing apparatus 210.

The audio metadata transmitter 230 may transmit the audio metadata to the audio data reproducing apparatus 250. In another embodiment, the audio metadata transmitter 230 may multiplex the audio raw data received from the audio raw data providing apparatus 240 with the audio metadata and transmit the result to the audio data reproducing apparatus 250. Here, multiplexing refers to combining several signals and handling them as a single signal.

The audio raw data providing apparatus 240 may transmit, to the audio data reproducing apparatus 250, audio raw data in which the audio signals are grouped by channel. In some cases, the audio metadata providing apparatus 210 may operate in conjunction with the audio raw data providing apparatus 240 to generate the audio metadata based on the audio raw data.

The audio data reproducing apparatus 250 may analyze the received audio raw data and audio metadata and reproduce the audio raw data. In doing so, the audio data reproducing apparatus 250 may reproduce the audio raw data in accordance with the channel information, based on the audio metadata and on the audio environment setting information stored in the audio data reproducing apparatus 250.

FIG. 3 is a diagram illustrating an operation in which an audio data providing apparatus according to an embodiment of the present invention transmits audio data to an audio data reproducing apparatus.

Referring to FIG. 3, the audio data providing apparatus 310 may include an audio metadata generator 320, an audio data combiner 330, and an audio data transmitter 340.

The audio metadata generator 320 may generate audio metadata including channel information of the audio raw data. That is, the audio metadata generator 320 may analyze the audio raw data and express the channel information of the audio signals as audio metadata.

The audio data combiner 330 may combine the audio raw data and the audio metadata into a single piece of audio data. In doing so, the audio data combiner 330 may code only the audio raw data or only the audio metadata, or may code the audio raw data and the audio metadata together.

The audio data transmitter 340 may transmit the combined audio data to the audio data reproducing apparatus 350. In other words, the audio data transmitter 340 may multiplex the coded audio data and transmit it to the audio data reproducing apparatus 350.

The audio data reproducing apparatus 350 may include an audio data receiver 360, an audio data analyzer 370, and an audio data reproducing unit 380.

The audio data receiver 360 may receive audio data from the audio data providing apparatus 310.

The audio data analyzer 370 may analyze the audio raw data and the audio metadata in the received audio data. The audio data analyzer 370 may demultiplex or decode the audio data. Through demultiplexing and decoding, the audio data analyzer 370 may separate the audio data into audio raw data and audio metadata, and may extract the audio signal of each channel from the audio raw data.
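As a non-limiting illustration, the combining and separating operations described above can be sketched as a round trip in Python. The container layout used here (a 4-byte big-endian length prefix, a JSON-encoded metadata header, then the raw bytes) and the names `mux`, `demux`, and `split_channels` are assumptions made for this sketch only; the specification does not define a byte format. The sketch also assumes that the channel signals are stored one after another with equal length.

```python
import json
import struct
from typing import List, Tuple


def mux(metadata: dict, raw: bytes) -> bytes:
    """Combine metadata and raw audio into one stream (hypothetical layout)."""
    header = json.dumps(metadata).encode("utf-8")
    # 4-byte big-endian header length, then the header, then the raw signals.
    return struct.pack(">I", len(header)) + header + raw


def demux(stream: bytes) -> Tuple[dict, bytes]:
    """Separate a stream produced by mux() back into metadata and raw audio."""
    (header_len,) = struct.unpack(">I", stream[:4])
    header = stream[4:4 + header_len]
    return json.loads(header.decode("utf-8")), stream[4 + header_len:]


def split_channels(raw: bytes, channel_count: int) -> List[bytes]:
    """Extract per-channel signals, assuming equal-length sequential storage."""
    if channel_count <= 0 or len(raw) % channel_count != 0:
        raise ValueError("raw data cannot be divided evenly among the channels")
    size = len(raw) // channel_count
    return [raw[i * size:(i + 1) * size] for i in range(channel_count)]


meta = {"channel_count": 2, "channel_names": ["L", "R"]}
stream = mux(meta, b"\x01\x02\x03\x04")
recovered_meta, recovered_raw = demux(stream)
signals = split_channels(recovered_raw, recovered_meta["channel_count"])
```

A length-prefixed header is used only because it keeps the separation step unambiguous; any container that lets the reproducing side locate the metadata would serve the same purpose.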

According to another embodiment of the present invention, the audio data analyzer 370 separates the received audio data into audio raw data and audio metadata, and a user may use the separated audio raw data and audio metadata to further edit or reconstruct the audio signals.

The audio data reproducing unit 380 may reproduce the audio raw data based on the analyzed audio metadata and preset reproduction environment information. Here, the preset reproduction environment refers to the audio environment setting information of the audio reproducing apparatus, such as the speaker arrangement of a home TV system or an A/V receiver.

When the channel configuration in the reproduction environment information is lower than the channel configuration of the audio raw data, the audio data reproducing unit 380 may combine the signals of the audio channels included in the audio raw data, according to the backward compatibility information of the audio channels included in the audio metadata, and thereby convert them to the lower channel configuration.

For example, when the audio raw data contains six channels and the number of channels set in the reproduction environment information of the audio reproducing apparatus is two, the audio data reproducing unit 380 may downmix the received audio data to the lower two-channel configuration. Here, downmixing refers to converting audio data that combines several channels into audio data with fewer channels. In this way, the audio data reproducing apparatus 350 can reproduce audio data through channel compatibility even when it receives audio data with more channels than are included in its audio environment setting information.

When the audio data reproducing apparatus 350 receives audio data with fewer channels than are included in its audio environment setting information, it does not drive the channels for which no audio signal is present, and can thus reproduce the audio data regardless of the number of channels.
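The two cases above (more source channels than the reproduction environment supports, and fewer) can be sketched as a simple decision step. The function names `plan_playback` and `route_to_outputs`, the returned labels, and the idea of matching signals to outputs by channel name are hypothetical choices for this sketch and are not taken from the specification.

```python
from typing import Dict, List, Optional


def plan_playback(source_channels: int, device_channels: int) -> str:
    """Pick a strategy by comparing the channel counts (labels are illustrative)."""
    if source_channels > device_channels:
        return "downmix"                    # combine signals into fewer channels
    if source_channels < device_channels:
        return "play_with_silent_outputs"   # leave unused outputs undriven
    return "play_direct"


def route_to_outputs(
    signals: Dict[str, List[float]], device_outputs: List[str]
) -> Dict[str, Optional[List[float]]]:
    """Map each device output to its signal; None marks an output left silent."""
    return {name: signals.get(name) for name in device_outputs}


strategy_six_to_two = plan_playback(6, 2)
strategy_two_to_six = plan_playback(2, 6)
routed = route_to_outputs({"L": [0.1], "R": [0.2]}, ["L", "R", "C"])
```

In this sketch a stereo source played on a system that also has a C output leaves C mapped to `None`, which corresponds to not outputting a channel that has no audio signal.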

FIG. 4 is a diagram showing the structure of audio metadata according to an embodiment of the present invention.

Referring to FIG. 4, the audio metadata 410 may include at least one of number information 420 of the audio channels, name information 430 of the audio channels, position information 440 of the audio channels, and backward compatibility information 450 of the audio channels.

The number information 420 of the audio channels indicates the number of audio signals included in the audio raw data. For example, if a 5.1-channel signal is included in the audio raw data, the number information 420 has a value of 6. If a 7.1-channel signal is included in the audio raw data, the number information 420 has a value of 8. In some cases, the number information 420 may instead indicate the number of objects in an audio signal stored on an object basis.

The name information 430 of the audio channels indicates the channel name of each audio signal included in the audio raw data. For example, for a 5.1-channel signal, the name information 430 may contain (L, R, C, LFE, LS, RS). This means that the audio signals stored in the audio raw data correspond, in order, to the names L, R, C, LFE, LS, and RS. The channel names and the order of the audio signals are not mandatory and may be set freely by the producer. However, for compatibility with existing systems, it is preferable that 5.1-channel and 2.0-channel signals uniformly use the forms (L, R, C, LFE, LS, RS) and (L, R), respectively.

The position information 440 of the audio channels indicates where in space each audio signal channel should be placed. The position information 440 may consist of horizontal azimuth information and vertical azimuth information.

As shown in FIG. 6, the horizontal azimuth may be defined such that, when the user 610 faces forward, the frontal direction in the plane parallel to the ground is the 0-degree reference point 620 and the clockwise direction is positive (+). Likewise, as shown in FIG. 6, the vertical azimuth may be defined such that, when the user 630 faces forward, the frontal direction in the plane perpendicular to the ground is the 0-degree reference point 640 and the upward direction is positive (+).

For example, for 2.0 channels, if the name information 430 is (L, R) and the position information 440 is [(330, 0), (30, 0)], the L channel is placed at horizontal 330 degrees and vertical 0 degrees, and the R channel is placed at horizontal 30 degrees and vertical 0 degrees. When a channel may be placed at any position, this can be indicated with the null symbol, and its azimuths are written as (null, null).
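To make the angle convention concrete, the following sketch converts a (horizontal, vertical) pair into a unit direction vector. The Cartesian axes (x toward the listener's right, y toward the front, z upward) and the function name `to_unit_vector` are assumptions introduced for this illustration; the specification defines only the angles. A (null, null) position is represented here as `(None, None)` and yields no direction.

```python
import math
from typing import Optional, Tuple

Position = Tuple[Optional[float], Optional[float]]


def to_unit_vector(position: Position) -> Optional[Tuple[float, float, float]]:
    """Convert (horizontal, vertical) degrees to (x, y, z); None if unplaced."""
    horizontal, vertical = position
    if horizontal is None or vertical is None:
        return None  # (null, null): the channel may be placed anywhere
    az = math.radians(horizontal)  # 0 = front, positive = clockwise from above
    el = math.radians(vertical)    # 0 = front, positive = upward
    x = math.cos(el) * math.sin(az)  # assumed axis: listener's right
    y = math.cos(el) * math.cos(az)  # assumed axis: listener's front
    z = math.sin(el)                 # assumed axis: up
    return (x, y, z)


left = to_unit_vector((330, 0))    # front left
right = to_unit_vector((30, 0))    # front right
anywhere = to_unit_vector((None, None))
```

With a clockwise-positive horizontal angle, 330 degrees falls to the listener's left (negative x) and 30 degrees to the right (positive x), which matches the L and R placement in the example above.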

The backward compatibility information 450 of the audio channels describes how the signals of the audio channels included in the audio raw data are combined to convert them to a lower channel configuration. For example, the backward compatibility information 450 may indicate how audio raw data with 7.1 channels can be made compatible with 5.1 channels or 2.0 channels. Audio data with more than 5.1 channels may include information on conversion to 5.1 channels or 2.0 channels, whereas 5.1-channel audio data need only include information on compatibility with 2.0 channels.
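One possible in-memory representation of the four metadata fields is sketched below. The class name `AudioMetadata`, the field names, and the choice to store backward compatibility information as a mapping from a target layout to formula strings are all hypothetical; the specification also allows metadata that carries only some of these fields, which this simplified sketch does not model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (horizontal, vertical) angles in degrees; (None, None) stands for (null, null).
Position = Tuple[Optional[int], Optional[int]]


@dataclass
class AudioMetadata:
    channel_count: int                  # number information
    channel_names: List[str]            # name information, in signal order
    channel_positions: List[Position]   # position information, in signal order
    # Hypothetical encoding: target layout -> list of downmix formula strings.
    backward_compat: Dict[str, List[str]] = field(default_factory=dict)

    def validate(self) -> None:
        """Check that names and positions agree with the channel count."""
        if len(self.channel_names) != self.channel_count:
            raise ValueError("one name is expected per channel")
        if len(self.channel_positions) != self.channel_count:
            raise ValueError("one position is expected per channel")


# The 2.0-channel example from the description.
STEREO = AudioMetadata(2, ["L", "R"], [(330, 0), (30, 0)])
STEREO.validate()
```

Keeping names and positions as parallel lists mirrors the description, in which the n-th name and the n-th position both refer to the n-th signal stored in the audio raw data.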

FIG. 5 is a diagram illustrating one example of an audio metadata structure according to an embodiment of the present invention.

According to FIG. 5, the audio metadata 510 includes number information 520 of the audio channels, name information 530 of the audio channels, position information 540 of the audio channels, and backward compatibility information 550 of the audio channels.

The value of the channel number information 520 is 6, which shows that the audio raw data has six channels, one of which is an LFE channel, that is, 5.1 channels. The name information 530 of the audio channels further shows that the channel names of the audio signals included in the audio raw data are, in order, L, R, C, LFE, LS, and RS.

According to the audio channel position information 540, the L channel is (330, 0) and is placed at horizontal 330 degrees, vertical 0 degrees, and the R channel is (30, 0) and is placed at horizontal 30 degrees, vertical 0 degrees. The C channel is (0, 0) and is placed at horizontal 0 degrees, vertical 0 degrees, and the LS channel is (250, 0) and is placed at horizontal 250 degrees, vertical 0 degrees. The RS channel is (110, 0) and is placed at horizontal 110 degrees, vertical 0 degrees, and the LFE channel is (null, null), indicating that it may be placed at any position. For compatibility with existing systems, the audio channel position information of 5.1-channel and 2.0-channel signals preferably has the values [(330, 0), (30, 0), (0, 0), (null, null), (250, 0), (110, 0)] and [(330, 0), (30, 0)], respectively.

FIG. 7 shows this arrangement realized in actual space. FIG. 7 is a diagram in which speakers are arranged based on the audio channel position information 540 of FIG. 5.

Referring to FIG. 7, speakers are located around the user 710 at the positions where the respective channels are to be placed. This represents multichannel audio data based on the information included in the audio metadata, and the same approach makes it possible to implement multichannel audio data with more than 5.1 channels in various ways.

In FIG. 5, the backward compatibility information 550 of the audio channels gives the constants 560 and the formulas 570 for downmixing from 5.1 channels to 2.0 channels. The constants (a, k) 560 are arbitrary constants chosen by the audio data producer, and the formulas 570 may be stored in the order of the lower channels. In the formulas 570, D(1) denotes a, the first element of (D), and D(2) denotes k, the second element of (D). N(1) denotes the L channel signal as the first element of (N), and N(2) denotes the R channel signal as the second element of (N). Similarly, N(3) denotes the C channel signal, N(4) the LFE channel signal, N(5) the LS channel signal, and N(6) the RS channel signal. That is, the backward compatibility information 550 of the audio channels may store downmixing information such as that shown in <Table 1> below. The embodiment described above is merely one example of downmixing from 5.1 channels to 2.0 channels, and various modifications and arbitrary settings are possible.

<Table 1>
Channel layout | Channel matrixing (a and k are arbitrary constants)
5.1 -> 2.0 |
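As a minimal sketch of how a reproducing apparatus might apply such downmixing information, the Python below mixes one 5.1 sample frame down to stereo using the indexing N(1)..N(6) and D(1), D(2) described above. The matrixing used here (L' = N(1) + a·N(3) + k·N(5), R' = N(2) + a·N(3) + k·N(6)) is an assumed, commonly used form and is not the calculation formula 570 itself; the function name `downmix_51_to_20` and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def downmix_51_to_20(n: Sequence[float], d: Sequence[float]) -> Tuple[float, float]:
    """Mix one 5.1 frame N = (L, R, C, LFE, LS, RS) down to (L', R').

    d holds the producer-defined constants D = (a, k).
    The formula below is an assumption for illustration only.
    """
    if len(n) != 6 or len(d) != 2:
        raise ValueError("expected 6 channel samples and 2 constants")
    a, k = d                     # D(1) = a, D(2) = k
    l, r, c, _lfe, ls, rs = n    # N(1)..N(6); the LFE signal is dropped in this sketch
    left = l + a * c + k * ls
    right = r + a * c + k * rs
    return left, right


# Hypothetical sample values for one frame, with a = k = 0.5.
frame = (0.5, 0.25, 0.5, 0.0, 0.25, 0.5)
print(downmix_51_to_20(frame, (0.5, 0.5)))
```

Because the constants travel with the metadata rather than being fixed in the player, a producer could change a and k per title without changing the playback code.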

FIG. 8 is a diagram illustrating another embodiment of the audio metadata configuration of the present invention.

According to FIG. 8, the audio metadata 810 includes audio channel number information 820, audio channel name information 830, audio channel position information 840, and, as the backward compatibility information 850 of the audio channels, compatibility information for three channel layouts.

The value of the audio channel number information 820 is 12, which indicates that the audio raw data has 12 channels, two of which are LFE channels; that is, it is 10.2-channel audio. In addition, the audio channel name information 830 shows that the channel names of the audio signals included in the audio raw data are, in order, L, R, C, LH, RH, LS, RS, LB, RB, TC, LFE1, and LFE2.

In the audio channel position information 840, the L channel is (330, 0), placed at 330 degrees horizontally and 0 degrees vertically; the R channel is (30, 0), placed at 30 degrees horizontally and 0 degrees vertically; and the C channel is (0, 0), placed at 0 degrees horizontally and 0 degrees vertically. Likewise, the LH channel is (330, 30), placed at 330 degrees horizontally and 30 degrees vertically; the RH channel is (30, 30), placed at 30 degrees horizontally and 30 degrees vertically; and the LS channel is (270, 0), placed at 270 degrees horizontally and 0 degrees vertically. The RS channel is (90, 0), placed at 90 degrees horizontally and 0 degrees vertically; the LB channel is (210, 0), placed at 210 degrees horizontally and 0 degrees vertically; and the RB channel is (150, 0), placed at 150 degrees horizontally and 0 degrees vertically. The TC channel is (0, 90), placed at 0 degrees horizontally and 90 degrees vertically, and the LFE1 and LFE2 channels are (null, null), indicating that they may be placed at any position. Because the audio metadata 810 can thus include audio channel information not only for the horizontal plane but also for layers above it, an audio producer can configure a three-dimensional sound system.
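The 10.2-channel metadata described above can be sketched as a small data structure. The following Python is illustrative only: the class name `AudioMetadata`, its field names, and the helper methods are assumptions rather than anything defined in this document, and `None` stands in for the null position of the LFE channels.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (horizontal degrees, vertical degrees); None models the "null" value
Position = Tuple[Optional[int], Optional[int]]


@dataclass
class AudioMetadata:
    channel_count: int                  # channel number information (820)
    channel_names: List[str]            # channel name information (830)
    channel_positions: List[Position]   # channel position information (840)

    def validate(self) -> None:
        """Check that the three fields describe the same number of channels."""
        if not (self.channel_count == len(self.channel_names)
                == len(self.channel_positions)):
            raise ValueError("channel count, names and positions disagree")

    def position_of(self, name: str) -> Position:
        """Return the position stored for the named channel."""
        return self.channel_positions[self.channel_names.index(name)]

    def elevated_channels(self) -> List[str]:
        """Names of channels placed above the horizontal plane."""
        return [n for n, (_, v) in zip(self.channel_names, self.channel_positions)
                if v is not None and v > 0]


METADATA_10_2 = AudioMetadata(
    channel_count=12,
    channel_names=["L", "R", "C", "LH", "RH", "LS", "RS",
                   "LB", "RB", "TC", "LFE1", "LFE2"],
    channel_positions=[(330, 0), (30, 0), (0, 0), (330, 30), (30, 30),
                       (270, 0), (90, 0), (210, 0), (150, 0), (0, 90),
                       (None, None), (None, None)],
)
METADATA_10_2.validate()
print(METADATA_10_2.elevated_channels())
```

Keeping names and positions as parallel lists mirrors the text, where both are given in channel order.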

The backward compatibility information 850 of the audio channels shows the constants 860 and the calculation formulas 870 for downmixing from 10.2 channels to 7.1, 5.1, and 2.0 channels. The constants 860, namely (a1, c1), (a2, c2), and (a3, c3), are arbitrary constants set by the audio data producer, and the downmixing formula information 870 may be stored in order of the lower channels. Each piece of downmixing formula information may include the signal combination and conversion information for changing from 10.2 channels to the corresponding channel layout. As a result, even when an audio reproducing apparatus receives 10.2-channel audio data, it can reproduce the audio data in a form compatible with its own channel configuration. The order in which the downmixing formula information is stored may be set arbitrarily by the producer and is not limited to the embodiment above.
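One way to picture such backward compatibility information is as a set of downmix matrices keyed by target layout, each built from the producer's constants. The Python below is a sketch under that assumption: the matrix coefficients, the roles given to the two constants, and the names `row`, `stereo_matrix`, `apply_matrix`, and `compat_info` are all hypothetical, and only a 2.0 entry is filled in.

```python
from typing import Dict, List, Sequence

NAMES = ["L", "R", "C", "LH", "RH", "LS", "RS", "LB", "RB", "TC", "LFE1", "LFE2"]


def row(weights: Dict[str, float]) -> List[float]:
    """Build one matrix row: a coefficient per source channel, 0.0 if unused."""
    return [weights.get(name, 0.0) for name in NAMES]


def stereo_matrix(a: float, c: float) -> List[List[float]]:
    """Hypothetical 10.2 -> 2.0 matrix built from two producer constants."""
    left = row({"L": 1.0, "C": a, "TC": a, "LH": c, "LS": c, "LB": c})
    right = row({"R": 1.0, "C": a, "TC": a, "RH": c, "RS": c, "RB": c})
    return [left, right]


def apply_matrix(matrix: Sequence[Sequence[float]], frame: Sequence[float]) -> List[float]:
    """Combine source channel samples into one sample per output channel."""
    if any(len(r) != len(frame) for r in matrix):
        raise ValueError("matrix width must match the number of source channels")
    return [sum(w * s for w, s in zip(r, frame)) for r in matrix]


# Entries would be stored per target layout; constants here are made up.
compat_info = {"2.0": stereo_matrix(a=0.5, c=0.25)}

# One hypothetical 10.2 frame, in the channel order given by NAMES.
frame = [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.25, 0.25]
print(apply_matrix(compat_info["2.0"], frame))
```

Entries for 7.1 and 5.1 would be added to `compat_info` in the same way, each with its own constants.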

FIG. 9 is a flowchart illustrating the operation by which an audio metadata providing apparatus according to an embodiment of the present invention provides audio metadata.

In step 910, the audio metadata providing apparatus may generate audio metadata that includes information related to the audio raw data. The audio metadata generated by the audio metadata providing apparatus may be managed separately from the audio raw data and may be coded by the audio metadata providing apparatus.

In step 920, the audio metadata providing apparatus may transmit the generated audio metadata to the audio data reproducing apparatus. In another embodiment, the audio metadata providing apparatus may multiplex the audio metadata with audio raw data received from an audio raw data providing apparatus and transmit the result to the audio data reproducing apparatus.

FIG. 10 is a flowchart illustrating the operation by which an audio data providing apparatus according to an embodiment of the present invention provides audio data.

In step 1010, the audio data providing apparatus may generate audio metadata that includes information related to the audio raw data. That is, based on the audio raw data, the audio data providing apparatus may express information related to the audio signals as audio metadata.

In step 1020, the audio data providing apparatus may combine the audio raw data and the generated audio metadata into a single piece of audio data. Here, the audio data providing apparatus may code the audio raw data and the audio metadata separately, or may code them together.

In step 1030, the audio data providing apparatus may transmit the combined audio data to the audio data reproducing apparatus. The audio data providing apparatus may multiplex the coded audio data before transmitting it to the audio data reproducing apparatus.
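The combining step, and the matching separation on the reproducing side, can be sketched with a simple container. The layout below (a 4-byte big-endian length, JSON-encoded metadata, then the raw audio bytes) and the names `pack_audio_data` and `unpack_audio_data` are assumptions chosen for illustration; the document does not specify a concrete coding or multiplexing format.

```python
import json
import struct
from typing import Tuple


def pack_audio_data(metadata: dict, raw: bytes) -> bytes:
    """Combine metadata and raw audio into one blob: length, metadata, raw."""
    meta_bytes = json.dumps(metadata).encode("utf-8")
    return struct.pack(">I", len(meta_bytes)) + meta_bytes + raw


def unpack_audio_data(blob: bytes) -> Tuple[dict, bytes]:
    """Separate a packed blob back into (metadata, raw audio)."""
    if len(blob) < 4:
        raise ValueError("blob too short to contain a length header")
    (meta_len,) = struct.unpack(">I", blob[:4])
    if len(blob) < 4 + meta_len:
        raise ValueError("blob shorter than its declared metadata length")
    metadata = json.loads(blob[4:4 + meta_len].decode("utf-8"))
    return metadata, blob[4 + meta_len:]


# Hypothetical metadata and raw bytes for a round trip.
meta = {"channel_count": 2, "channel_names": ["L", "R"],
        "channel_positions": [[330, 0], [30, 0]]}
blob = pack_audio_data(meta, b"\x00\x01\x02\x03")
print(unpack_audio_data(blob))
```

Because the metadata sits in its own length-delimited section, it can be read or edited without touching the raw audio bytes.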

FIG. 11 is a flowchart illustrating the operation by which an audio data reproducing apparatus according to an embodiment of the present invention reproduces audio data.

In step 1110, the audio data reproducing apparatus may receive audio data from the audio data providing apparatus.

In step 1120, the audio data reproducing apparatus may analyze the audio raw data and the audio metadata in the received audio data. The audio data reproducing apparatus may demultiplex or decode the audio data. Through demultiplexing and decoding, it can separate the audio raw data and the audio metadata from the audio data, and it can extract the audio signal of each channel from the audio raw data.

According to another embodiment of the present invention, the audio data reproducing apparatus separates the received audio data into audio raw data and audio metadata, and the user can use the separated audio raw data and audio metadata to further edit or reconstruct the audio signals.

In step 1130, when the channel layout in the reproduction environment information is lower than the channel layout of the audio raw data, the audio data reproducing apparatus may combine the audio channel signals included in the audio raw data, according to the backward compatibility information of the audio channels included in the audio metadata, and thereby change them to the lower channel layout.

In step 1140, when the channel layout in the reproduction environment information is the same as or higher than the channel layout of the audio raw data, the audio data reproducing apparatus may reproduce the audio raw data based on the analyzed audio metadata and the preset reproduction environment information. The audio data reproducing apparatus may also reproduce the audio raw data that was changed to the lower channel layout in step 1130.
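The branching between steps 1130 and 1140 can be sketched as a single function. In the Python below, comparing channel counts is used as a stand-in for "lower channel", which is an assumption; the function name `render_frame`, the `compat` mapping from target channel count to downmix function, and the 3-channel example are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]
Downmix = Callable[[Frame], List[float]]


def render_frame(frame: Frame, source_channels: int, playback_channels: int,
                 compat: Dict[int, Downmix]) -> List[float]:
    """Return the samples to play for one frame.

    Same or higher playback layout: play the raw data as is (step 1140).
    Lower playback layout: downmix using backward compatibility info (step 1130).
    """
    if len(frame) != source_channels:
        raise ValueError("frame size does not match the declared channel count")
    if playback_channels >= source_channels:
        return list(frame)
    try:
        downmix = compat[playback_channels]
    except KeyError:
        raise ValueError(
            f"no backward compatibility info for {playback_channels} channels"
        ) from None
    return downmix(frame)


# Hypothetical 3-channel source (L, R, C) with one stereo downmix entry.
compat = {2: lambda f: [f[0] + 0.5 * f[2], f[1] + 0.5 * f[2]]}
print(render_frame([1.0, 0.5, 0.5], 3, 2, compat))   # downmixed to stereo
print(render_frame([1.0, 0.5, 0.5], 3, 3, compat))   # played unchanged
```

A real player would also map channels to physical speakers using the position information; this sketch covers only the downmix decision.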

The methods according to embodiments of the present invention may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. A computer-readable medium is any data storage device that can store data which can later be read by a computer system. Examples of computer-readable media include read-only memory, random access memory, CD-ROM, DVD, magnetic tape, optical data storage devices, and carrier waves. The computer-readable medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed fashion. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention, or may be known and available to those skilled in computer software.

As described above, although the present invention has been described with reference to limited embodiments and drawings, it is not limited to those embodiments, and a person of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims below and by their equivalents.

110: audio data
120: audio raw data
130: audio metadata
140: audio signal

Claims

An audio data providing method comprising:
generating audio metadata including channel information of audio data; and
providing the audio data and the generated audio metadata,
wherein the audio data is data associated with channels, and
wherein the audio metadata includes at least one of audio channel number information, audio channel name information, and audio channel position information.

The method of claim 1,
wherein the audio metadata further includes backward compatibility information of the audio channel.

An audio data playback method comprising:
receiving audio data; and
playing the audio data based on audio metadata of the audio data,
wherein the audio metadata includes at least one of audio channel number information, audio channel name information, and audio channel position information.

The method of claim 3,
wherein the audio metadata further includes backward compatibility information of the audio channel.

The method of claim 3,
wherein the audio channel number information is the number of audio channels associated with the audio data.

The method of claim 3,
wherein the audio channel position information is information indicating where the audio channels should be located in space.

The method of claim 3,
wherein the audio channel position information includes horizontal azimuth information and vertical azimuth information.

The method of claim 3,
wherein the audio channel name information is the name of each of the audio channels included in the audio data.

The method of claim 4,
wherein the backward compatibility information of the audio channel indicates a method of changing the signals of the audio channels included in the audio data into a lower channel.

A method comprising:
identifying audio metadata of audio data; and
extracting detailed information included in the audio metadata,
wherein the audio metadata includes at least one of audio channel number information, audio channel name information, and audio channel position information.