JP2011523247A

JP2011523247A - Audio signal processing method and apparatus

Info

Publication number: JP2011523247A
Application number: JP2011504928A
Authority: JP
Inventors: オー，ヒェン−オ; ウォンジュン，ヤン
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 2008-04-16
Filing date: 2009-04-16
Publication date: 2011-08-04
Anticipated expiration: 2029-04-16
Also published as: JP5406276B2; CN102007533A; CN102007533B

Abstract

[Problem] To provide an audio signal processing method and apparatus capable of controlling the level and position of an object using preset information and preset metadata that have already been set.
[Solution] Disclosed is an audio signal processing method comprising: receiving a downmix signal including at least one object, object information based on an attribute of the object, preset information for rendering the downmix signal, and preset attribute information representing an attribute of the preset information; rendering the downmix signal by applying the preset information to all data areas of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a configuration information area; and rendering the downmix signal by applying the preset information to one corresponding data area of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a data area.
Accordingly, the audio signal can be restored efficiently by selecting, according to the characteristics of the sound source, whether the preset information is applied to one data area of the downmix signal or to all data areas of the downmix signal.
[Selected drawing] Figure 2

Description

The present invention relates to an audio signal processing method and apparatus, and more particularly to an audio signal processing method and apparatus capable of processing an audio signal received from a digital medium, a broadcast signal, or the like.

In the process of downmixing an audio signal including a plurality of objects into a mono or stereo signal to generate a downmix signal, parameters are extracted from the objects. These parameters are used when decoding the downmixed signal, and the position and gain of each object can be controlled by user selection in addition to the parameters.

The objects included in the downmix signal must be adjusted according to the user's selection. However, when the user controls the objects, the user has the burden of directly controlling every object signal, and it is harder to reproduce the audio signal in an optimal state than when the objects are controlled by an expert.

The present invention is directed to an audio signal processing method and apparatus that substantially prevent one or more of the above problems due to limitations and disadvantages of the prior art.

The present invention has been made to solve the above problems, and an object thereof is to provide an audio signal processing method and apparatus capable of controlling the level and position of an object using preset information and preset metadata that have already been set.

Another object of the present invention is to provide an audio signal processing method and apparatus capable of adjusting an object included in a downmix signal by applying preset information and preset metadata, according to the characteristics of the sound source, either to all data areas of the downmix signal or to one data area of the downmix signal.

Still another object of the present invention is to provide an audio signal processing method and apparatus capable of selecting, based on a user's selection, one of the preset metadata displayed on a display unit and controlling the level and position of an object using the corresponding preset information.

Still another object of the present invention is to provide an audio signal processing method and apparatus capable of receiving a selection signal from a user by displaying, on a display unit, an object adjusted by applying preset information together with the selected preset metadata.

To achieve the above objects, an audio signal processing method according to the present invention includes: receiving a downmix signal including at least one object, object information based on an attribute of the object, preset information for rendering the downmix signal, and preset attribute information representing an attribute of the preset information; rendering the downmix signal by applying the preset information to all data areas of the downmix signal when, based on the preset attribute information, the preset information is included in a configuration information area; and rendering the downmix signal by applying the preset information to one corresponding data area of the downmix signal when, based on the preset attribute information, the preset information is included in a data area, wherein the preset information is obtained based on preset number information representing the number of the preset information and output channel information representing the number of output channels of the rendered downmix signal. The preset attribute information may indicate whether the preset information is included in an extension area of the data area.

The preset attribute information may indicate whether the preset information is variable or fixed.

Here, "variable" may indicate that the preset information is present in an extension area of the data area, and "fixed" may indicate that the preset information is included in an extension area of the configuration information area. The audio signal processing method according to the present invention may further include: generating, using the object information and the preset information, downmix processing information for adjusting the panning or gain of the downmix signal and multi-channel information for upmixing the downmix signal; and modifying the downmix signal using the downmix processing information.

To achieve the above objects, an audio signal processing apparatus according to the present invention includes: a signal receiving unit that receives a downmix signal including at least one object and object information based on an attribute of the object; a preset attribute information receiving unit that receives preset attribute information representing an attribute of preset information for rendering the downmix signal; a fixed preset mode receiving unit that receives a preset mode corresponding to all data areas of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a configuration information area; a variable preset mode receiving unit that receives a preset mode corresponding to one data area of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a data area; and a rendering unit that renders the downmix signal by applying the preset information to all data areas or to one data area of the downmix signal, wherein the preset mode includes the preset information and preset metadata corresponding to the preset information, and the preset metadata may represent characteristics of the preset information.

The present invention provides the following effects and advantages.

First, the level of an object's output channel can be adjusted easily, without the user configuring each object, by using a plurality of preset metadata to select one of a plurality of preset information that have already been set.

Second, the audio signal can be restored efficiently by, depending on the characteristics of the sound source, either selecting and applying preset information individually for each data area or selecting and applying the same preset information to all data areas of the downmix signal.

Third, by checking on the display unit the object adjusted by applying the preset information and the selected preset metadata, more appropriate preset information can be selected to adjust the level or position of the output channel of the object.

A conceptual diagram of a preset mode applied to an object included in a downmix signal according to an embodiment of the present invention.
A conceptual diagram of adjusting an object included in a downmix signal by applying preset information based on preset attribute information according to an embodiment of the present invention.
A conceptual diagram of adjusting an object included in a downmix signal by applying preset information based on preset attribute information according to an embodiment of the present invention.
A diagram illustrating an audio signal processing apparatus according to an embodiment of the present invention.
A block diagram illustrating a method in which preset information is applied to a rendering unit according to an embodiment of the present invention.
A block diagram illustrating a method in which preset information is applied to a rendering unit according to an embodiment of the present invention.
A block diagram illustrating a schematic configuration of a variable preset information receiving unit and a fixed preset information receiving unit according to another embodiment of the present invention.
A diagram illustrating an audio signal processing apparatus according to another embodiment of the present invention.
A diagram expressing, in various ways, syntax related to preset information in an audio signal processing method according to another embodiment of the present invention.
A diagram expressing, in various ways, syntax related to preset information in an audio signal processing method according to another embodiment of the present invention.
A diagram expressing, in various ways, syntax related to preset information in an audio signal processing method according to another embodiment of the present invention.
A diagram expressing, in various ways, syntax related to preset information in an audio signal processing method according to another embodiment of the present invention.
A diagram expressing, in various ways, syntax related to preset information in an audio signal processing method according to another embodiment of the present invention.
A diagram illustrating an audio signal processing apparatus according to still another embodiment of the present invention.
A diagram illustrating an example of a display unit of an audio signal processing apparatus according to still another embodiment of the present invention.
A diagram illustrating at least one graphic element displaying an object to which preset information is applied according to still another embodiment of the present invention.
A diagram illustrating a schematic configuration of a product in which a variable preset information receiving unit and a fixed preset mode receiving unit according to still another embodiment of the present invention are implemented.
A diagram illustrating a relationship between products in which a variable preset mode receiving unit and a fixed preset mode receiving unit according to still another embodiment of the present invention are implemented.
A diagram illustrating a relationship between products in which a variable preset mode receiving unit and a fixed preset mode receiving unit according to still another embodiment of the present invention are implemented.
A diagram illustrating a schematic configuration of a broadcast signal decoding apparatus in which a variable preset mode receiving unit and a fixed preset mode receiving unit according to still another embodiment of the present invention are implemented.

The accompanying drawings, which are included to provide a better understanding of the present invention, are incorporated in and constitute a part of this specification. They illustrate embodiments of the present invention and, together with the description, serve to explain the gist of this specification.

Other features and advantages of the invention will be set forth in the description that follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

It is to be understood that both the general description of the invention below and the detailed description that follows it are exemplary and explanatory, and are intended to further explain the invention as claimed.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The terms used in the present invention can be interpreted as described below, and terms not disclosed in this specification can be interpreted with meanings and concepts that match the technical idea of the present invention. The embodiments described in this specification and the configurations illustrated in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas; various equivalents and modifications that could replace them may exist at the time of this application.

In particular, in this specification, "information" is a term that collectively refers to values, parameters, coefficients, elements, and the like, and its meaning may be interpreted as appropriate in each case. The present invention is not limited thereto.

FIG. 1 is a conceptual diagram of a preset mode applied to an object included in a downmix signal according to an embodiment of the present invention. In this specification, a set of information that has already been set in order to adjust objects is referred to as a preset mode. Preset modes can represent the various modes a user may select according to the characteristics of the audio signal or the listening environment, and at least one preset mode can be provided. A preset mode includes preset information, which is applied to adjust the objects, and preset metadata, which is metadata expressing the attributes of the preset information. Preset metadata can be expressed as text. In addition to the attribute of the preset information (for example, concert hall mode, karaoke mode, or news mode), it can include related information describing the preset information, such as its creator, its creation date, and the names of the objects to which it is applied. Preset information, on the other hand, is the data actually applied to the objects. It corresponds to the preset metadata and can be expressed in various forms, for example as a matrix.

Referring to FIG. 1, preset mode 1 can be a concert hall mode that gives the sound-field impression of listening to a music signal in a concert hall, preset mode 2 can be a karaoke mode in which the level of the vocal object in the audio signal is reduced, and preset mode n can be a news mode in which the level of the speech object is increased. Each preset mode includes preset metadata and preset information. If the user selects preset mode 2, the karaoke mode of preset metadata 2 is displayed, and preset information 2, which is associated with preset metadata 2, is applied to the objects to adjust their levels.
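The selection described above can be sketched as follows. This is only an illustration under stated assumptions: the class `PresetMode`, the function `select_preset`, the representation of preset information as one linear gain per object, and all gain values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PresetMode:
    metadata: str           # text shown to the user, e.g. "karaoke mode"
    info: Dict[str, float]  # assumed form: one linear gain per object name


def select_preset(
    modes: List[PresetMode], index: int, levels: Dict[str, float]
) -> Tuple[str, Dict[str, float]]:
    """Return the metadata to display and the object levels after the preset is applied."""
    mode = modes[index]
    # Objects that the preset does not mention keep their level (gain 1.0).
    adjusted = {name: level * mode.info.get(name, 1.0) for name, level in levels.items()}
    return mode.metadata, adjusted


# Illustrative presets; the gain values are made up for this sketch.
modes = [
    PresetMode("concert hall mode", {"vocal": 1.0, "guitar": 1.0}),
    PresetMode("karaoke mode", {"vocal": 0.0, "guitar": 1.0}),
    PresetMode("news mode", {"vocal": 2.0, "guitar": 0.5}),
]

# Selecting the second preset mode (index 1) mutes the vocal object.
label, levels = select_preset(modes, 1, {"vocal": 0.8, "guitar": 0.5})
```

Keeping the metadata and the preset information in one record mirrors the one-to-one correspondence between them that the text describes.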

In this case, the preset information can include mono preset information, stereo preset information, and multi-channel preset information. Which preset information is used is determined by the output channel of the object. Mono preset information is applied when the output channel of the object is mono, stereo preset information is applied when the output channel is stereo, and multi-channel preset information is applied when the output channel is multi-channel. Once the output channel of the object is determined from the configuration information, the type of preset information is determined from that output channel, and the preset information is applied to the object to adjust its level or panning.
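The mapping from output channel to preset information type can be sketched as below. The function name `preset_type_for` and the use of a channel count as input are assumptions made for illustration; the specification only states that the type follows from the output channel.

```python
def preset_type_for(num_output_channels: int) -> str:
    """Pick the preset information type from the number of output channels."""
    if num_output_channels < 1:
        raise ValueError("at least one output channel is required")
    if num_output_channels == 1:
        return "mono"
    if num_output_channels == 2:
        return "stereo"
    # Anything above two channels is treated as multi-channel in this sketch.
    return "multi-channel"
```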

FIGS. 2A and 2B are conceptual diagrams of adjusting an object included in a downmix signal by applying preset information according to preset attribute information, according to an embodiment of the present invention.

The audio signal of the present invention is encoded by an encoder into a downmix signal and object information, which are transferred to a decoder in the form of a single bitstream or separate bitstreams.

Referring to FIGS. 2A and 2B, the object information included in the bitstream consists of a configuration information area and a plurality of data areas (data area 1, data area 2, ..., data area n). The configuration information area is located at the head of the object information bitstream and contains information that applies in common to all data areas of the object information. For example, it can include configuration information such as a tree structure, data area length information, and object number information. A data area, on the other hand, is a unit obtained by dividing the time domain of the entire audio signal based on the data area length information, and can include a frame. Each data area of the object information corresponds to a data area of the downmix signal and contains the object information used to upmix that data area of the downmix signal. The object information includes object level information, object gain information, and the like.
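As a small illustration of how the data area length information divides the time domain, the sketch below splits a signal into consecutive data areas. Measuring the length in samples and the name `split_into_data_areas` are assumptions for this example only.

```python
from typing import List, Tuple


def split_into_data_areas(num_samples: int, area_length: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges, end exclusive, one per data area."""
    if area_length <= 0:
        raise ValueError("area_length must be positive")
    return [
        (start, min(start + area_length, num_samples))
        for start in range(0, num_samples, area_length)
    ]


# A 10-sample signal with a data area length of 4 yields three data areas;
# the last one is shorter because the signal ends.
areas = split_into_data_areas(10, 4)
```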

First, in the audio signal processing method according to an embodiment of the present invention, preset attribute information (preset_attribute_information) is read from the object information of the bitstream. The preset attribute information indicates in which area of the bitstream the preset information is included, and in particular whether the preset information is included in the configuration information area or in a data area of the object information. The detailed meaning of the preset attribute information is shown in Table 1 below.

Referring first to FIG. 2A, when the preset attribute information is 0, indicating that the preset information is included in the configuration information area, the preset information extracted from the configuration information area is applied identically to all data areas of the downmix signal for rendering.

Referring to FIG. 2B, on the other hand, when the preset attribute information is 1, indicating that the preset information is included in the data areas, the preset information extracted from each data area is applied to the corresponding data area of the downmix signal for rendering. For example, the preset information extracted from data area 1 is applied to data area 1 of the downmix signal, and the preset information extracted from data area n is applied to data area n of the downmix signal.

The preset attribute information can also indicate whether the preset information is variable (dynamic) or fixed (static). When the preset attribute information is 0 and the preset information is included in the configuration information area, the preset information is fixed. When the preset attribute information is 1 and the preset information is included in the data areas, the preset information is variable. In the latter case, the preset information is applied only to its own data area and renders the downmix signal of the corresponding data area, so it varies from one data area to the next. Variable preset information is preferably located in an extension area of the data area, and fixed preset information is preferably located in an extension area of the configuration information area.
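The two cases can be sketched as a function that decides which preset information is used for each data area. The values 0 and 1 follow the text; the constant names, the function `presets_per_data_area`, and the use of plain lists in place of a parsed bitstream are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

FIXED = 0     # preset information carried in the configuration information area
VARIABLE = 1  # preset information carried in each data area


def presets_per_data_area(
    attribute: int,
    config_preset: Optional[T],
    data_area_presets: Sequence[T],
    num_data_areas: int,
) -> List[Optional[T]]:
    """Return the preset information to apply to each data area of the downmix signal."""
    if attribute == FIXED:
        # The same preset information is applied to every data area.
        return [config_preset] * num_data_areas
    if attribute == VARIABLE:
        if len(data_area_presets) != num_data_areas:
            raise ValueError("expected one preset per data area")
        # Data area i is rendered with the preset extracted from data area i.
        return list(data_area_presets)
    raise ValueError(f"unknown preset attribute: {attribute}")
```

For example, `presets_per_data_area(FIXED, "P", [], 3)` repeats `"P"` for all three data areas, while the variable case returns the per-area presets unchanged.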

Accordingly, by means of the preset attribute information, the audio signal processing method according to an embodiment of the present invention can render the downmix signal using either preset information suited to each data area, based on the characteristics of the sound source, or the same preset information for all data areas.

FIG. 3 is a diagram illustrating an audio signal processing apparatus 300 according to an embodiment of the present invention. Referring to FIG. 3, the audio signal processing apparatus 300 can include a preset mode generation unit 310, an information receiving unit (not shown), a variable preset mode receiving unit 320, a fixed preset mode receiving unit 330, and a rendering unit 340.

The preset mode generation unit 310 generates a preset mode for adjusting, at rendering time, the objects included in the audio signal, and can include a preset attribute determination unit 311, a preset metadata generation unit 312, and a preset information generation unit 313.

As described above, the preset attribute determination unit 311 determines the preset attribute information, which indicates whether the preset information is placed in the configuration information area and applied to all data areas, or placed in the data areas and applied to each data area separately.

Then, depending on the preset attribute information, the preset metadata generation unit 312 and the preset information generation unit 313 generate either a single set of preset metadata and preset information, or as many sets of preset metadata and preset information as there are data areas.
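A minimal sketch of this generation step is shown below, assuming the attribute values 0 (fixed) and 1 (variable) described earlier. The function `generate_presets` and the `make_preset` callback are hypothetical stand-ins for the preset metadata generation unit 312 and the preset information generation unit 313.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def generate_presets(
    attribute: int, num_data_areas: int, make_preset: Callable[[int], T]
) -> List[T]:
    """Generate one preset (attribute 0) or one preset per data area (attribute 1)."""
    if attribute not in (0, 1):
        raise ValueError(f"unknown preset attribute: {attribute}")
    count = 1 if attribute == 0 else num_data_areas
    return [make_preset(i) for i in range(count)]


# With four data areas, the fixed case yields one preset and the variable case four.
fixed = generate_presets(0, 4, lambda i: f"preset-{i}")
variable = generate_presets(1, 4, lambda i: f"preset-{i}")
```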

The preset metadata generation unit 312 can receive text describing the preset information and generate preset metadata from it. When a gain for adjusting the level of an object and/or the position of the object is input to the preset information generation unit 313, that unit can generate the preset information to be applied to the object.

Preset information can be generated so that it is applied per object, and it can take various forms, for example a channel level difference (CLD) parameter or a matrix.
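To make the matrix form concrete, the sketch below treats preset information as a matrix with one row per output channel and one column per object, and mixes one sample of each object into the output channels. This is a conceptual illustration only: the matrix layout, the function `render_with_matrix`, and the assumption that the object signals are available separately are not taken from the specification, where rendering operates on the downmix signal together with the object information.

```python
from typing import List, Sequence


def render_with_matrix(
    matrix: Sequence[Sequence[float]], object_samples: Sequence[float]
) -> List[float]:
    """Mix one sample of each object into the output channels using a gain matrix."""
    for row in matrix:
        if len(row) != len(object_samples):
            raise ValueError("each matrix row needs one gain per object")
    # Output channel c is the gain-weighted sum of all object samples.
    return [sum(g * s for g, s in zip(row, object_samples)) for row in matrix]


# Two output channels, two objects: object 0 goes fully to channel 0,
# object 1 is split equally between both channels.
stereo_preset = [
    [1.0, 0.5],
    [0.0, 0.5],
]
out = render_with_matrix(stereo_preset, [1.0, 2.0])
```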

The preset information generation unit 313 can also generate output channel information indicating the number of output channels of the object.

The preset metadata generated by the preset metadata generation unit 312, and the preset information, output channel information, and the like generated by the preset information generation unit 313, can be transferred in a single bitstream, and in particular can be carried in an ancillary area of the bitstream that contains the downmix signal.

Meanwhile, the preset mode generation unit 310 can further generate preset presence information indicating that the preset metadata, the preset information, and the output channel information are included in the bitstream. The preset presence information may be of a container type, which indicates in which area of the bitstream the preset information and the like are included, or of a flag type, which indicates only whether they are included in the bitstream without indicating the area. It is not limited to these types.

The preset mode generation unit 310 can also generate a plurality of preset modes, each of which includes preset information, preset metadata, and output channel information. In this case, the preset mode generation unit 310 can further generate preset number information indicating the number of preset modes.
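As an illustration only, the relationship between preset modes, preset presence information, and preset number information can be sketched as a small data structure. The class and field names below (`PresetMode`, `PresetPayload`, `gains`) are hypothetical and do not appear in the specification; the presence flag shown corresponds to the flag type, not the container type.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PresetMode:
    """One preset mode: metadata, output channel information, preset information."""
    metadata: str                 # text describing the preset, e.g. "news mode"
    output_channels: int          # number of output channels
    gains: List[List[float]]      # preset information as an objects x channels matrix


@dataclass
class PresetPayload:
    """Hypothetical container for everything the generation side emits."""
    modes: List[PresetMode]

    @property
    def preset_present(self) -> bool:
        # Flag-type presence information: only says whether presets exist.
        return len(self.modes) > 0

    @property
    def num_presets(self) -> int:
        # Preset number information: how many preset modes are carried.
        return len(self.modes)


payload = PresetPayload(modes=[
    PresetMode("stadium mode", 2, [[1.0, 1.0], [0.5, 0.5]]),
    PresetMode("news mode", 2, [[1.0, 1.0], [0.1, 0.1]]),
])
```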

In this way, the preset mode generation unit 310 can output the preset attribute information, the preset metadata, and the preset information in the form of a bitstream.

The bitstream has the form shown in FIGS. 2A and 2B and is input to an information receiving unit (not shown). The preset attribute information is first obtained from the bitstream input to the information receiving unit, and it is then determined in which area of the transferred bitstream the preset information is included.

The variable preset mode receiving unit 320 operates when the preset attribute information output from the preset attribute determination unit 311 indicates that the preset information is included in the data area (preset_attribute_flag = 1 in Table 1).

The variable preset mode receiving unit 320 can include a variable preset metadata receiving unit 321, which receives the preset metadata corresponding to the relevant data area, and a variable preset information receiving unit 322, which receives the preset information for each data area. The variable preset metadata receiving unit 321 receives and outputs the selected preset metadata, and the variable preset information receiving unit 322 receives the preset information. Details are described later with reference to FIGS. 4A to 5.

The fixed preset mode receiving unit 330 operates when the preset attribute information indicates that the preset information is included in the configuration information area (preset_attribute_flag = 0 in Table 1).

The fixed preset mode receiving unit 330 can include a fixed preset metadata receiving unit 331, which receives the preset metadata corresponding to all data areas, and a fixed preset information receiving unit 332, which receives the preset information.

The fixed preset metadata receiving unit 331 and the fixed preset information receiving unit 332 of the fixed preset mode receiving unit 330 have substantially the same configuration and function as the variable preset metadata receiving unit 321 and the variable preset information receiving unit 322 of the variable preset mode receiving unit 320. They differ only in the range of the downmix signal to which the received and output preset information and preset metadata correspond.
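The routing between the two receiving units can be sketched as follows. This is a minimal illustration, assuming the flag values quoted above from Table 1 (1 for the data area, 0 for the configuration information area); the function name `select_receiver` and the returned labels are hypothetical.

```python
def select_receiver(preset_attribute_flag: int) -> str:
    """Pick the receiving unit that should handle the preset information."""
    if preset_attribute_flag == 1:
        # Preset information is carried in the data area:
        # handled by the variable preset mode receiving unit 320.
        return "variable preset mode receiving unit 320"
    if preset_attribute_flag == 0:
        # Preset information is carried in the configuration information area:
        # handled by the fixed preset mode receiving unit 330.
        return "fixed preset mode receiving unit 330"
    raise ValueError(f"unexpected preset_attribute_flag: {preset_attribute_flag}")
```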

The rendering unit 340 receives the downmix signal generated by downmixing an audio signal that includes a plurality of objects, together with the preset information output from the variable preset information receiving unit 322 or from the fixed preset information receiving unit 332. This preset information can be applied to the objects included in the downmix signal to adjust the level or the position of each object.

When the audio signal processing apparatus 300 includes a display unit (not shown), the selected preset metadata output from the variable preset metadata receiving unit 321 or from the fixed preset metadata receiving unit 331 can be displayed on the display unit.

FIGS. 4A and 4B are block diagrams showing how preset information is applied in the rendering unit according to an embodiment of the present invention.

First, FIG. 4A shows how the preset information output from the variable preset mode receiving unit 320 is applied in the rendering unit 440. The variable preset mode receiving unit 320 is identical to the variable preset mode receiving unit 320 of FIG. 3 and includes the variable preset metadata receiving unit 321 and the variable preset information receiving unit 322.

The variable preset mode receiving unit 320 receives and outputs preset metadata and preset information for each data area, and this preset information is input to the rendering unit 440.

In addition to the preset information, the rendering unit 440 receives the downmix signal and performs rendering for each data area. It includes a rendering unit 441 for data area 1, a rendering unit 442 for data area 2, ..., and a rendering unit 44n for data area n. Each data area rendering unit 44X of the rendering unit 440 receives the preset information corresponding to its data area and renders by applying that information to the downmix signal.

For example, preset information_1, a stadium mode, can be applied to the first data area; preset information_3, a karaoke mode, to the second data area; and preset information_2, a news mode, to the sixth data area (here, the n in preset information_n denotes the index of the mode applied to a data area). In this case, the preset metadata is of course also output for each data area.

FIG. 4B shows how the preset information output from the fixed preset mode receiving unit 330 is applied in the rendering unit 440. The fixed preset mode receiving unit 330 has the same configuration as the fixed preset mode receiving unit 330 of FIG. 3.

The fixed preset mode receiving unit 330 receives and outputs the preset metadata and preset information corresponding to all data areas, and the rendering unit 440 receives the preset information.

Like the rendering unit of FIG. 4A, the rendering unit 440 of FIG. 4B includes as many data area rendering units 44X as there are data areas. When the rendering unit 440 receives preset information from the fixed preset mode receiving unit 330, every data area rendering unit 44X renders by applying the same received preset information to the downmix signal.

For example, when the preset information output from the fixed preset information receiving unit 332 is preset information_2, which represents a news mode, the news mode can be applied to all data areas, from the first data area to the n-th data area.
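The difference between the two cases of FIGS. 4A and 4B can be sketched as a mapping from data areas to presets. This is an illustrative sketch only: the function `assign_presets`, its parameters, and the use of `None` for data areas without a listed preset are assumptions made for the example, not part of the specification.

```python
from typing import Dict, List, Optional


def assign_presets(num_areas: int,
                   fixed: Optional[str] = None,
                   variable: Optional[Dict[int, str]] = None) -> List[Optional[str]]:
    """Return the preset used by each data area (index 0 is the first area)."""
    if fixed is not None:
        # Fixed case (FIG. 4B): every data area uses the same preset.
        return [fixed] * num_areas
    # Variable case (FIG. 4A): each data area uses its own preset, if any.
    variable = variable or {}
    return [variable.get(area) for area in range(num_areas)]


# Stadium mode for the first area, karaoke for the second, news for the sixth.
variable_plan = assign_presets(6, variable={0: "stadium", 1: "karaoke", 5: "news"})

# News mode for all areas.
fixed_plan = assign_presets(6, fixed="news")
```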

FIG. 5 shows a schematic configuration of the variable preset information receiving unit 322 included in the variable preset mode receiving unit 320, and of the fixed preset information receiving unit 332 included in the fixed preset mode receiving unit 330, of the audio signal processing apparatus 300 of the present invention.

Referring to FIG. 5, the variable or fixed preset information receiving unit 322, 332 includes an output channel information receiving unit 322a, 332a and a preset information determination unit 322b, 332b.

The output channel information receiving unit 322a, 332a receives and outputs output channel information indicating the number of output channels on which the objects included in the downmix signal are to be reproduced. The output channel information can indicate mono, stereo, or multi-channel (5.1-channel) output, but is not limited to these.

The preset information determination unit 322b, 332b receives and outputs the applicable preset information based on the output channel information input from the output channel information receiving unit 322a, 332a. This preset information can be one of mono preset information, stereo preset information, and multi-channel preset information.

When the preset information is of the matrix type, its dimensions can be determined from the number of objects and the number of output channels, and the preset matrix can have the form (number of objects) * (number of output channels). For example, when the downmix signal includes n objects and the output channel information from the output channel information receiving unit 322a, 332a indicates 5.1 channels, that is, six channels, the preset information determination unit 322b, 332b can output multi-channel preset information in n * 6 form. Here, each element of the matrix is a gain value representing the degree to which the a-th object is included in the i-th channel.
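A preset matrix of this shape can be illustrated with a simple mixing routine. The sketch below assumes, purely for illustration, that the individual object signals are available as separate sample lists; in the apparatus described here the objects are carried in a downmix signal and are rendered using object information, so this is not the decoding procedure itself. The function name `render` is hypothetical.

```python
from typing import List


def render(objects: List[List[float]], preset: List[List[float]]) -> List[List[float]]:
    """Mix n object signals into m channels using an n x m preset gain matrix.

    preset[a][i] is the gain with which object a contributes to channel i.
    """
    n = len(objects)
    if len(preset) != n:
        raise ValueError("preset matrix needs one row per object")
    m = len(preset[0])
    length = len(objects[0])
    channels = [[0.0] * length for _ in range(m)]
    for a in range(n):
        for i in range(m):
            gain = preset[a][i]
            for t in range(length):
                channels[i][t] += gain * objects[a][t]
    return channels


# Two objects, two output channels (stereo), two samples each.
objects = [[1.0, 2.0], [10.0, 20.0]]
preset = [[1.0, 0.0],   # object 0 goes only to channel 0
          [0.5, 0.5]]   # object 1 is split evenly between both channels
channels = render(objects, preset)
```

For 5.1-channel output with n objects, `preset` would have n rows and six columns.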

FIG. 6 shows an audio signal processing apparatus 600 according to another embodiment of the present invention. Referring to FIG. 6, the audio signal processing apparatus 600 mainly includes a downmixing unit 610, an object information generation unit 620, a preset mode generation unit 630, a downmix signal processing unit 640, an information processing unit 650, and a multi-channel decoding unit 660.

A plurality of objects are input to the downmixing unit 610, which generates a mono or stereo downmix signal. The objects are also input to the object information generation unit 620, which generates object information comprising: object level information representing the levels of the objects; object gain information comprising the gain values of the objects included in the downmix signal and, in the case of a stereo downmix signal, the degree to which each object is included in each downmix channel; and object correlation information indicating whether objects are correlated with one another.

Thereafter, the downmix signal and the object information are input to the preset mode generation unit 630, which generates a preset mode comprising preset attribute information indicating whether the preset information is included in the data area or in the configuration information area of the bitstream, preset information for adjusting the levels of the objects, and preset metadata describing the preset information. The process of generating the preset attribute information, the preset information, and the preset metadata is as described above for the audio signal processing apparatus and method of FIGS. 1 to 5, so a detailed description is omitted.

The preset mode generation unit 630 can further generate preset presence information indicating whether preset information exists in the bitstream, preset number information indicating the number of items of preset information, and preset metadata length information indicating the length of the preset metadata.

The object information generated by the object information generation unit 620 and the preset attribute information, preset information, preset metadata, preset presence information, preset number information, and preset metadata length information generated by the preset mode generation unit 630 can be transferred in an SAOC bitstream, or can be transferred in the form of a single bitstream that also contains the downmix signal. In this case, the bitstream containing the downmix signal and the preset-related information can be input to a signal receiving unit (not shown) of the decoding apparatus.

The information processing unit 650 includes an object information processing unit 651, a variable preset mode receiving unit 652, and a fixed preset mode receiving unit 653, and receives the SAOC bitstream. Whether the SAOC bitstream is input to the variable preset mode receiving unit 652 or to the fixed preset mode receiving unit 653 is determined from the preset attribute information included in the SAOC bitstream, as described above with reference to FIGS. 2 to 5.

The variable preset mode receiving unit 652 and the fixed preset mode receiving unit 653 receive the preset attribute information, preset presence information, preset number information, preset metadata, output channel information, and preset information (for example, a preset matrix) from the SAOC bitstream, using the methods of the various embodiments described for the audio signal processing method and apparatus of FIGS. 1 to 5.

The variable preset mode receiving unit 652 or the fixed preset mode receiving unit 653 outputs the preset metadata and the preset information.

The object information processing unit 651 uses the received preset metadata and preset information together with the object information included in the SAOC bitstream to generate downmix processing information for pre-processing the downmix signal and multi-channel information for rendering the downmix signal. Here, the preset information and preset metadata output from the variable preset mode receiving unit 652 correspond to one data area of the downmix signal, whereas the preset information and preset metadata output from the fixed preset mode receiving unit 653 correspond to all data areas of the downmix signal.

Thereafter, the downmix processing information is input to the downmix signal processing unit 640, which can perform panning by changing the channels in which the objects of the downmix signal are included. The downmix signal pre-processed in this way is input, together with the multi-channel information output from the information processing unit 650, to the multi-channel decoding unit 660 and upmixed, whereby a multi-channel audio signal can be generated.

In this way, when decoding a downmix signal containing a plurality of objects into a multi-channel signal using object information, the audio signal processing apparatus of the present invention can easily adjust the levels of the objects by additionally using preset information and preset metadata that have been set in advance. Moreover, because the preset information applied to the objects is, depending on the preset attribute information, applied either individually to each data area or identically to all data areas, the sense of sound field can be improved in a manner suited to the characteristics of the sound source.

FIGS. 7 to 11 show, in various forms, syntax for an audio signal processing method according to other embodiments of the present invention. Referring to FIG. 7, information related to the preset information can be present in the configuration information area (SAOCSpecificConfig()) of the bitstream.

First, preset number information (bsNumPresets) can be obtained from the configuration information area of the bitstream. Then, based on the preset number information, output channel information (bsPresetLevel[i]) indicating the output channels of the objects to which the preset information is applied can be obtained for each item of preset information (the i-th preset information). The meaning of the output channel information is shown in Table 2 below.

Thereafter, preset attribute information (bsPresetDynamic[i]) indicating whether the preset information is included in the configuration information area or in the data area can be obtained. As shown in FIG. 7, when the preset attribute information (bsPresetDynamic[i]) is 0, it indicates the fixed preset mode (static preset mode), and preset information (getPreset()) for adjusting the object levels or panning of the downmix signal across all data areas is obtained. In this case, preset metadata (PresetMetaData(numPresets)) corresponding to the preset information can also be included in the configuration information area. The meaning of the preset attribute information is shown in Table 3 below.

FIG. 8 shows the syntax of the data area for the case where the preset attribute information (bsPresetDynamic[i]) of FIG. 7 indicates that the preset information is included in the data area. Referring to FIG. 8, when bsPresetDynamic[i] of FIG. 7 is 1, the if(!bsPresetDynamic[i]) branch is skipped, so no preset information is obtained from the configuration information area. Instead, as shown in FIG. 8, the if(bsPresetDynamic[i]) condition in the data area (SAOCFrame()) is satisfied, so the preset information (getPreset()) can be obtained there. Because this preset information is obtained from the data area, it applies only to the corresponding data area, unlike the preset information of FIG. 7, which applies identically to all data areas.
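The control flow described for FIGS. 7 and 8 can be sketched as follows. Since the figures and their bit-field widths are not reproduced here, the sketch works on already-decoded dictionaries instead of raw bits; the functions `read_config` and `read_frame` and the `"presets"` key are hypothetical stand-ins, with a dictionary lookup taking the place of getPreset().

```python
from typing import Dict


def read_config(config: dict) -> Dict[int, str]:
    """Collect presets carried in the configuration information area."""
    static = {}
    for i in range(config["bsNumPresets"]):
        if not config["bsPresetDynamic"][i]:
            # bsPresetDynamic[i] == 0: the preset applies to all data areas.
            static[i] = config["presets"][i]
    return static


def read_frame(config: dict, frame: dict) -> Dict[int, str]:
    """Collect presets carried in one data area (one frame)."""
    dynamic = {}
    for i in range(config["bsNumPresets"]):
        if config["bsPresetDynamic"][i]:
            # bsPresetDynamic[i] == 1: the preset applies to this data area only.
            dynamic[i] = frame["presets"][i]
    return dynamic


config = {"bsNumPresets": 2, "bsPresetDynamic": [0, 1], "presets": {0: "news"}}
frames = [{"presets": {1: "stadium"}}, {"presets": {1: "karaoke"}}]

static_presets = read_config(config)
per_frame_presets = [read_frame(config, frame) for frame in frames]
```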

In FIGS. 7 and 8, the preset information is included in the configuration information area (SAOCSpecificConfig()) and the data area (SAOCFrame()), but it can also be included in the extension area of the configuration information area (SAOCExtensionConfig()) and the extension area of the data area (SAOCExtensionFrame()).

In that case, the preset information included in the extension area of the configuration information area and in the extension area of the data area is the same as the preset information described with reference to FIGS. 7 and 8. In addition to the preset information, these extension areas can also include the preset metadata corresponding to the preset information, output channel information, preset presence information, and the like.

FIG. 9 shows syntax representing preset information according to another embodiment of the present invention. Referring to FIG. 9, the preset information can be generated using EcData. Alternatively, the gain values themselves can be transferred and used instead of EcData, and the values can be quantized either with the channel level difference (CLD) table or with a separate, independent table.

FIG. 10 shows syntax representing preset metadata according to another embodiment of the present invention. As shown in FIG. 10, preset metadata length information (bsNumCharMetaData[prst]) indicating the length of the metadata corresponding to the preset information is obtained first. Then, based on the preset metadata length information, the preset metadata (bsMetaData[prst]) corresponding to each item of preset information can be obtained.

By expressing the preset metadata that describes the preset information in text form, based on the preset metadata length information indicating the length of the metadata, the audio signal processing method and apparatus of the present invention can reduce unnecessary coding.
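Length-prefixed text metadata of this kind could be read as sketched below. The layout is an assumption made for illustration: one byte of length (standing in for bsNumCharMetaData[prst]) followed by that many ASCII characters (standing in for bsMetaData[prst]); the actual field widths and character encoding are not given in this section.

```python
from typing import List


def parse_preset_metadata(data: bytes, num_presets: int) -> List[str]:
    """Read num_presets length-prefixed text labels from data."""
    labels = []
    pos = 0
    for _ in range(num_presets):
        if pos >= len(data):
            raise ValueError("metadata is truncated")
        length = data[pos]            # assumed 1-byte length field
        pos += 1
        chunk = data[pos:pos + length]
        if len(chunk) != length:
            raise ValueError("metadata is truncated")
        labels.append(chunk.decode("ascii"))
        pos += length
    return labels


blob = bytes([4]) + b"news" + bytes([7]) + b"stadium"
labels = parse_preset_metadata(blob, 2)
```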

FIG. 11 shows the syntax of a data area including preset information according to yet another embodiment of the present invention. Referring to FIG. 11, the preset information can carry, for each object and based on the number of objects (numObjects), the information that maps the object to the output channels (numRenderingChannel[i]). As shown in FIG. 11, the preset information can be obtained from the data area of the bitstream. When it is included in the extension area of the data area, it can be obtained from that extension area (SAOCExtensionFrame()), and when it is included in the configuration information area of the bitstream, it can be obtained from the configuration information area.

FIG. 12 shows an audio signal processing apparatus 1200 according to yet another embodiment of the present invention. The audio signal processing apparatus 1200 mainly includes a preset mode generation unit 1210, an information receiving unit (not shown), a preset mode input unit 1220, a preset mode selection unit 1230, a variable preset mode receiving unit 1240, a fixed preset mode receiving unit 1250, a rendering unit 1260, and a display unit 1270.

The preset mode generation unit 1210, information receiving unit (not shown), variable preset mode receiving unit 1240, fixed preset mode receiving unit 1250, and rendering unit 1260 of FIG. 12 have the same configuration and function as the preset mode generation unit 310, variable preset mode receiving unit 320, fixed preset mode receiving unit 330, and rendering unit 340 of FIG. 3, so a detailed description is omitted.

Referring to FIG. 12, the preset mode input unit 1220 first displays the plurality of preset metadata received from the preset metadata generation unit 1212 on the screen of the display unit 1270 and receives a selection signal that selects one of them. The preset mode selection unit 1230 selects the preset metadata chosen by the selection signal and the preset information corresponding to that preset metadata.

When the preset attribute information (preset_attribute_information) received from the preset attribute determination unit 1211 indicates that the preset information is included in the data area, the preset metadata selected by the preset mode selection unit 1230 and the corresponding preset information are input to the preset metadata receiving unit 1241 and the preset information receiving unit 1242 of the variable preset mode receiving unit 1240, respectively. In this case, the display unit 1270, the preset mode input unit 1220, and the preset mode selection unit 1230 can repeat the above operation once for each data area.

On the other hand, when the preset attribute information received from the preset attribute determination unit 1211 indicates that the preset information is included in the configuration information area, the preset metadata selected by the preset mode selection unit 1230 and the corresponding preset information are input to the preset metadata receiving unit 1251 and the preset information receiving unit 1252 of the fixed preset mode receiving unit 1250, respectively.

The selected preset information is output to the rendering unit 1260, while the selected preset metadata is output to the display unit 1270 and displayed on the screen.

The display unit 1270 can be the same unit as the one that displays the plurality of preset metadata so that the preset mode input unit 1220 can receive the selection signal, or it can be a different unit. When the same unit is used for both purposes, the two operations can be distinguished by varying the on-screen messages (for example, "Please select a preset mode" or "Preset mode X has been selected"), visual objects, characters, and the like.

FIG. 13 shows an example of the display unit 1270 of the audio signal processing apparatus 1200. In addition to the selected preset metadata, the display unit 1270 can include at least one graphic element representing the level or position of an object as adjusted by the preset information corresponding to the preset metadata.

Referring to FIG. 13, when the news mode is selected through the preset mode selection unit 1230 from among the plurality of preset metadata (for example, stadium mode, cave mode, news mode, live mode, and so on) displayed on the display unit 1270 of FIG. 12, the preset information corresponding to the news mode is applied to each object included in the downmix signal. In this case, the level of the vocal increases and the levels of the other objects (guitar, violin, drum, ..., cello) decrease.
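As a rough illustration of the news mode example, the following Python sketch scales per-object levels by per-object gains. The gain values, the starting levels, and the function name `apply_preset` are assumptions made for this sketch; the specification does not give numeric values.

```python
def apply_preset(levels, gains):
    """Scale each object's level by the preset gain (1.0 when no gain is given)."""
    return {name: level * gains.get(name, 1.0) for name, level in levels.items()}


# Hypothetical object levels before any preset is applied.
levels = {"vocal": 1.0, "guitar": 1.0, "violin": 1.0, "drum": 1.0, "cello": 1.0}

# Hypothetical "news mode" preset: boost the vocal, attenuate everything else.
news_mode = {"vocal": 2.0, "guitar": 0.5, "violin": 0.5, "drum": 0.5, "cello": 0.5}

adjusted = apply_preset(levels, news_mode)
```

Here only the vocal ends up above its original level, which corresponds to the behaviour described for the news mode.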

The graphic elements included in the display unit 1270 are modified to represent the activation of, or a change in, the level or position of an object. For example, as shown in FIG. 13, the switch of the graphic element representing the vocal can move to the right, and the switches of the graphic elements representing the other objects can move to the left.

A graphic element can represent, in various ways, the level or position of an object adjusted using the preset information. Each object may be represented by at least one graphic element. In this case, a first graphic element represents the level or position of the object before the preset information is applied, and a second graphic element represents the level or position of the object after adjustment by the preset information. Because the level or position of the object before and after applying the preset information can then be compared easily, it is easy to see how the preset information adjusts each object.

FIG. 14 is a diagram illustrating at least one graphic element of another shape that represents an object to which the preset information is applied. Referring to FIG. 14, the first graphic element may take the form of a bar, and the second graphic element may be an extension line (extensive line) inside the first graphic element. Here, the first graphic element represents the level or position of the object before the preset information is applied, and the second graphic element represents the level or position of the object after adjustment by the preset information.

As shown in FIG. 14, the upper graphic element shows a case in which the level of the object before the preset information is applied is the same as its level afterwards. The middle graphic element shows a case in which the level of the object adjusted by the preset information is higher than before, and the lower graphic element shows a case in which the level of the object has decreased as a result of applying the preset information.
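The three cases of FIG. 14 can be mimicked in text form. The Python sketch below is an illustrative assumption, not the display logic of the specification: `level_bar` draws the level before the preset as a filled bar and marks the level after the preset with a `|`, and `describe` names which of the three cases applies. Levels are assumed to lie between 0 and 1.

```python
def level_bar(before, after, width=10):
    """Draw a bar: '#' cells show the level before the preset,
    and a single '|' marks the level after the preset."""
    filled = round(before * width)
    marker = min(width - 1, max(0, round(after * width) - 1))
    cells = ["#" if i < filled else "." for i in range(width)]
    cells[marker] = "|"
    return "[" + "".join(cells) + "]"


def describe(before, after):
    """Name the case shown by one graphic element in FIG. 14."""
    if after > before:
        return "increased"
    if after < before:
        return "decreased"
    return "unchanged"


for before, after in [(0.5, 0.5), (0.5, 0.8), (0.5, 0.2)]:
    print(level_bar(before, after), describe(before, after))
```

Showing both values in one element is what makes the effect of a preset visible at a glance, which is the point of the paired first and second graphic elements.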

In this way, using at least one graphic element that represents the level or position of an object before and after the preset information is applied makes it easy to see how the preset information adjusts each object. Because the characteristics of the preset information can thus be grasped easily, this also helps the user select an appropriate preset mode as needed.

FIG. 15 is a diagram illustrating a schematic configuration of a product in which a variable preset mode receiving unit and a fixed preset mode receiving unit according to still another embodiment of the present invention are implemented, and FIGS. 16A and 16B are diagrams illustrating relationships between products in which the variable preset mode receiving unit and the fixed preset mode receiving unit according to an embodiment of the present invention are implemented.

Referring to FIG. 15, the wired/wireless communication unit 1510 receives a bitstream through a wired or wireless communication scheme. Specifically, the wired/wireless communication unit 1510 may include at least one of a wired communication unit 1511, an infrared communication unit 1512, a Bluetooth unit 1513, and a wireless LAN communication unit 1514.

The user authentication unit 1520 receives user information and performs user authentication, and may include at least one of a fingerprint recognition unit 1521, an iris recognition unit 1522, a face recognition unit 1523, and a voice recognition unit 1524. These units receive fingerprint, iris, facial contour, and voice information, respectively, and convert it into user information; user authentication is then performed by determining whether the user information matches previously registered user data.

The input unit 1530 is an input device through which the user enters various kinds of commands. It may include at least one of a keypad unit 1531, a touchpad unit 1532, and a remote control unit 1533, although the present invention is not limited to these. When preset metadata for a plurality of pieces of preset information, output from the metadata receiving unit 1541 described later, is displayed on the screen through the display unit 1562, the user can select preset metadata through the input unit 1530, and information on the selected preset metadata is input to the control unit 1550.

The signal decoding unit 1540 includes a variable preset mode receiving unit 1541 and a fixed preset mode receiving unit 1542. Based on the preset attribute information, the variable preset mode receiving unit 1541 receives preset information and preset metadata corresponding to each data area, and the fixed preset mode receiving unit 1542 receives preset information and preset metadata corresponding to all data areas. The preset metadata is received based on preset metadata length information indicating the length of the metadata. The preset information is obtained based on preset presence information indicating whether preset information exists, preset number information indicating the number of pieces of preset information, and output channel information based on the number of output channels, for example, information indicating that the output channel is one of mono, stereo, and multichannel. When the preset information is expressed as a matrix, the output channel information is received and the preset matrix is received based on it.
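To show how the fields named above could depend on one another during parsing, here is a minimal Python sketch. The byte layout is entirely hypothetical (one byte per field, one-byte gains, output channel codes 0, 1, and 2, and six channels for "multichannel"); the specification names the fields but this sketch does not reproduce its actual bitstream syntax.

```python
# Assumed output channel codes: mono, stereo, multichannel (taken here as 5.1).
OUTPUT_CHANNELS = {0: 1, 1: 2, 2: 6}


def parse_presets(payload: bytes, num_objects: int):
    """Parse a hypothetical preset payload into (metadata, matrix) pairs."""
    if payload[0] == 0:                         # preset presence information
        return []
    num_presets = payload[1]                    # preset number information
    num_channels = OUTPUT_CHANNELS[payload[2]]  # output channel information
    pos = 3
    presets = []
    for _ in range(num_presets):
        length = payload[pos]                   # preset metadata length information
        pos += 1
        metadata = payload[pos:pos + length].decode("utf-8")
        pos += length
        matrix = []
        for _ in range(num_channels):           # one row of gains per output channel
            matrix.append(list(payload[pos:pos + num_objects]))
            pos += num_objects
        presets.append((metadata, matrix))
    return presets


# One stereo preset named "news" for two objects.
payload = bytes([1, 1, 1, 4]) + b"news" + bytes([8, 2, 8, 2])
presets = parse_presets(payload, num_objects=2)
```

The sketch reflects two dependencies stated in the text: the metadata length must be read before the metadata itself, and the output channel information must be known before the preset matrix can be read, since it fixes the number of rows.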

The signal decoding unit 1540 decodes the audio signal using the received bitstream, preset metadata, and preset information to generate an output signal, and outputs the preset metadata in text form.

The control unit 1550 receives an input signal from the input device and controls the overall processes of the signal decoding unit 1540 and the output unit 1560. As described above, when information on the selected preset metadata is input to the control unit 1550 from the input unit 1530 in the form of an input signal, and preset attribute information (preset_attribute_information) indicating in which area of the bitstream the preset information is included is input from the wired/wireless communication unit 1510, the variable preset mode receiving unit 1541 and the fixed preset mode receiving unit 1542 receive, based on the preset attribute information and the input signal, the preset information corresponding to the selected preset metadata and use it to decode the audio signal.

The output unit 1560 is the component that outputs the output signal generated by the signal decoding unit 1540 and the like, and may include a speaker unit 1561 and a display unit 1562. When the output signal is an audio signal, it is output from the speaker unit 1561; when it is a video signal, it is output from the display unit 1562. The output unit 1560 also displays the preset metadata input from the control unit 1550 on the screen through the display unit 1562.

FIG. 16 illustrates the relationship between terminals corresponding to the product shown in FIG. 15 and the relationship between a terminal and a server. Referring to FIG. 16A, the first terminal 1610 and the second terminal 1620 can exchange data or bitstreams in both directions through their wired/wireless communication units.

The data or bitstream communicated through the wired/wireless communication unit may take the form of the bitstream shown in FIGS. 2A and 2B, or may be data including the preset attribute information, preset information, preset metadata, and the like of the present invention described with reference to FIGS. 1 to 15.

Referring to FIG. 16B, the server 1630 and the first terminal 1640 can also perform wired/wireless communication with each other.

FIG. 17 is a diagram illustrating a schematic configuration of a broadcast signal decoding apparatus 1700 in which a preset receiving unit, including a metadata receiving unit and a preset rendering data receiving unit according to an embodiment of the present invention, is implemented.

Referring to FIG. 17, the demultiplexer 1720 receives data related to a TV broadcast from the tuner 1710. The received data is separated by the demultiplexer 1720 and decoded by the data decoder 1730. The data separated by the demultiplexer 1720 can also be stored in a storage medium 1750 such as an HDD.

The data separated by the demultiplexer 1720 is input to a decoder 1740, which includes an audio decoder 1741 and a video decoder 1742 and decodes the audio signal and the video signal. The audio decoder 1741 includes a variable preset mode receiving unit 1741A and a fixed preset mode receiving unit 1741B according to an embodiment of the present invention. Based on the preset attribute information, the variable preset mode receiving unit 1741A receives preset information and preset metadata corresponding to each data area, and the fixed preset mode receiving unit 1741B receives preset information and preset metadata corresponding to all data areas.

The preset metadata is received based on preset metadata length information indicating the length of the metadata. The preset information is obtained based on preset presence information indicating whether preset information exists, preset number information indicating the number of pieces of preset information, and output channel information indicating that the output channel is one of mono, stereo, and multichannel. When the preset information is expressed as a matrix, the output channel information is received and the preset matrix is received based on it.

The audio decoder 1741 decodes the audio signal using the received bitstream, preset metadata, and preset information to generate an output signal, and outputs the preset metadata in text form.
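The practical difference between preset information for all data areas and preset information for a single data area can be sketched as follows. This Python example is an assumption-laden illustration: each data area is modelled as a list of samples, each preset as a single gain, and a per-area preset is given precedence over the fixed one, which the specification does not state.

```python
def render(regions, fixed_gain=None, variable_gains=None):
    """Apply preset gains to the data areas of a downmix signal.

    fixed_gain:     one gain applied to every data area (fixed preset mode).
    variable_gains: {area index: gain}, each applied only to its own data area
                    (variable preset mode).
    """
    variable_gains = variable_gains or {}
    default = fixed_gain if fixed_gain is not None else 1.0
    rendered = []
    for index, samples in enumerate(regions):
        # Assumed precedence: a per-area preset overrides the fixed preset.
        gain = variable_gains.get(index, default)
        rendered.append([s * gain for s in samples])
    return rendered


regions = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
fixed = render(regions, fixed_gain=0.5)              # every area is scaled
variable = render(regions, variable_gains={1: 2.0})  # only area 1 is scaled
```

A fixed preset suits material whose desired mix does not change over time, while per-area presets allow the mix to follow changes in the source.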

The display unit 1770 displays on the screen the video signal output from the video decoder 1742 and the preset metadata output from the audio decoder 1741. The display unit 1770 also includes a speaker unit (not shown), through which it outputs the audio signal from the audio decoder 1741 in which the levels of the objects have been adjusted using the preset information. The data decoded by the decoder 1740 can also be stored in a storage medium 1750 such as an HDD.

The signal decoding apparatus 1700 may further include an application manager 1760 that receives information from the user and can control the received data.

The application manager 1760 includes a user interface manager 1761 and a service manager 1762. The user interface manager 1761 controls the interface for receiving information from the user. For example, it can control the typeface of text displayed on the display unit 1770, the screen brightness, the menu configuration, and the like. When the decoder 1740 and the display unit 1770 decode and output a broadcast signal, the service manager 1762 can control the received broadcast signal using information input by the user. For example, it can provide broadcast channel setting, an alarm function setting, an adult authentication function, and the like. Data output from the application manager 1760 can be transferred to and used by the display unit 1770 as well as the decoder 1740.

Although the present invention has been described above with reference to specific embodiments and drawings, it is not limited to those examples. It is apparent to a person having ordinary skill in the art to which the present invention pertains that various modifications and variations are possible within the technical idea of the present invention and within the scope of the appended claims and their equivalents.

The present invention can be applied to the encoding and decoding of audio signals.

Claims

1. An audio signal processing method comprising:
receiving a downmix signal including at least one object, object information based on an attribute of the object, preset information for rendering the downmix signal, and preset attribute information representing an attribute of the preset information;
rendering the downmix signal by applying the preset information to all data areas of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a configuration information area; and
rendering the downmix signal by applying the preset information to one corresponding data area of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a data area.

2. The audio signal processing method according to claim 1, wherein the preset attribute information represents whether the preset information is included in an extension area of the data area.

3. The audio signal processing method according to claim 1, wherein the preset attribute information indicates that the preset information is variable or fixed.

4. The audio signal processing method according to claim 3, wherein "variable" represents that the preset information is included in an extension area of the data area, and "fixed" represents that the preset information is included in an extension area of the configuration information area.

5. The audio signal processing method according to claim 4, further comprising:
generating, using the object information and the preset information, downmix processing information for adjusting panning or gain of the downmix signal and multichannel information for upmixing the downmix signal; and
modifying the downmix signal using the downmix processing information.

6. An audio signal processing apparatus comprising:
a signal receiving unit that receives a downmix signal including at least one object and object information based on an attribute of the object;
a preset attribute information receiving unit that receives preset attribute information representing an attribute of preset information for rendering the downmix signal;
a fixed preset mode receiving unit that receives a preset mode corresponding to all data areas of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a configuration information area;
a variable preset mode receiving unit that receives a preset mode corresponding to one data area of the downmix signal when, based on the preset attribute information, the preset information is included in an extension area of a data area; and
a rendering unit that renders the downmix signal by applying the preset information to all data areas or to one data area of the downmix signal,
wherein the preset mode includes the preset information and preset metadata corresponding to the preset information, and the preset metadata represents a characteristic of the preset information.

7. The audio signal processing apparatus according to claim 6, wherein the fixed preset mode receiving unit comprises:
a fixed preset information receiving unit that receives the preset information; and
a fixed metadata receiving unit that receives the preset metadata.

8. The audio signal processing apparatus according to claim 6, wherein the variable preset mode receiving unit comprises:
a variable preset information receiving unit that receives the preset information; and
a variable preset metadata receiving unit that receives the preset metadata.

9. The audio signal processing apparatus according to claim 6, wherein the rendering unit includes a plurality of data area rendering units for rendering the data areas of the downmix signal.

10. The audio signal processing apparatus according to claim 9, wherein, when the preset information is received from the fixed preset mode receiving unit, the preset information is applied to the plurality of data area rendering units.

11. The audio signal processing apparatus according to claim 9, wherein, when the preset information is received from the variable preset mode receiving unit, the preset information is applied to the one data area rendering unit corresponding to the preset information.

12. An audio signal processing method comprising:
downmixing at least one object to generate a downmix signal;
generating object information based on an attribute of the object;
generating preset information that is applied to the downmix signal to adjust the object;
generating preset metadata corresponding to the preset information; and
determining preset attribute information representing an attribute of the preset information.

13. An audio signal processing apparatus comprising:
a downmixing unit that downmixes at least one object to generate a downmix signal;
an object information generation unit that generates object information based on an attribute of the object;
a preset information generation unit that generates preset information that is applied to the downmix signal to adjust the object;
a preset metadata generation unit that generates preset metadata corresponding to the preset information; and
a preset attribute information determination unit that determines preset attribute information representing an attribute of the preset information.