JP6633547B2 - Spectrum coding method

Info

Publication number: JP6633547B2
Application number: JP2016569544A
Authority: JP
Inventors: ソン，ホ−サン; オシポフ，コンスタンティン; ル，イ
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2014-02-17
Filing date: 2015-02-17
Publication date: 2020-01-22
Anticipated expiration: 2035-02-17
Also published as: CN110176241A; EP3109611A4; JP2017506771A; CN106233112B; EP3109611A1; KR102386738B1; KR20240008413A; CN106233112A; CN110176241B; KR20160122160A; KR102625143B1; KR20220051028A

Description

The present invention relates to the encoding and decoding of audio or speech signals and, more particularly, to a method and apparatus for encoding or decoding spectral coefficients in the frequency domain.

Various quantizers have been proposed for efficient coding of spectral coefficients in the frequency domain. Examples include trellis coded quantization (TCQ), uniform scalar quantization (USQ), factorial pulse coding (FPC), algebraic vector quantization (AVQ), and pyramid vector quantization (PVQ), and a lossless encoder optimized for each quantizer is implemented together with it.

An object of the present invention is to provide a method and apparatus for encoding or decoding spectral coefficients in the frequency domain adaptively to various bit rates or various subband sizes.

Another object of the present invention is to provide a computer-readable recording medium storing a program for causing a computer to execute the signal encoding method or the signal decoding method.

Another object of the present invention is to provide a multimedia device that employs the signal encoding device or the signal decoding device.

According to one aspect, a spectrum encoding method for achieving the above object may include: selecting an encoding method based on at least bit allocation information of each band; performing zero encoding on a zero band; and encoding information of important frequency components selected for each non-zero band.
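As an illustration only, the band-wise selection in the encoding method might be sketched as follows in Python. The rule that a band with no allocated bits is treated as a zero band, the choice of the largest-magnitude coefficients as the important frequency components, and the names `select_isc`, `encode_frame`, and `bits_per_isc` are all assumptions made for this sketch; none of them is taken from the specification.

```python
from typing import List, Tuple

Band = List[float]


def select_isc(band: Band, count: int) -> List[Tuple[int, float]]:
    """Pick the `count` largest-magnitude coefficients (assumed ISC rule).

    Returns (position, value) pairs in ascending position order.
    """
    by_magnitude = sorted(range(len(band)), key=lambda i: abs(band[i]), reverse=True)
    return [(i, band[i]) for i in sorted(by_magnitude[:count])]


def encode_frame(bands: List[Band], bits_per_band: List[int],
                 bits_per_isc: int = 4) -> List[Tuple[str, List[Tuple[int, float]]]]:
    """Choose a per-band coding path from the bit allocation.

    `bits_per_isc` is a hypothetical cost used only to decide how many
    components a band can afford.
    """
    coded = []
    for band, bits in zip(bands, bits_per_band):
        if bits <= 0:
            # Zero band: nothing but the "zero" decision is recorded.
            coded.append(("zero", []))
        else:
            count = max(1, min(len(band), bits // bits_per_isc))
            coded.append(("isc", select_isc(band, count)))
    return coded
```

For example, `encode_frame([[0.1, -0.2], [3.0, -5.0, 0.5, 1.0]], [0, 8])` marks the first band as a zero band and keeps the two largest components of the second band.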

According to one aspect, a spectrum decoding method for achieving the above object may include: selecting a decoding method based on at least bit allocation information of each band; performing zero decoding on a zero band; and decoding information of important frequency components obtained for each non-zero band.

Spectral coefficients can be encoded and decoded adaptively to various bit rates and various subband sizes. In addition, a spectrum can be encoded with TCQ at a fixed bit rate by using a bit rate control module designed for a codec that supports multiple rates. In this case, the high performance of TCQ is obtained at the exact target bit rate, which maximizes the coding performance of the codec.

A block diagram illustrating the configuration of an example of an audio encoding device to which the present invention is applied.
A block diagram illustrating the configuration of an example of an audio decoding device to which the present invention is applied.
A block diagram illustrating the configuration of another example of an audio encoding device to which the present invention is applied.
A block diagram illustrating the configuration of another example of an audio decoding device to which the present invention is applied.
A block diagram illustrating the configuration of another example of an audio encoding device to which the present invention is applied.
A block diagram illustrating the configuration of another example of an audio decoding device to which the present invention is applied.
A block diagram illustrating the configuration of another example of an audio encoding device to which the present invention is applied.
A block diagram illustrating the configuration of another example of an audio decoding device to which the present invention is applied.
A block diagram illustrating the configuration of a frequency domain audio encoding device to which the present invention is applied.
A block diagram illustrating the configuration of a frequency domain audio decoding device to which the present invention is applied.
A block diagram illustrating the configuration of a spectrum encoding device according to an embodiment.
A diagram illustrating an example of subband division.
A block diagram illustrating the configuration of a spectrum quantization device according to an embodiment.
A block diagram illustrating the configuration of a spectrum encoding device according to an embodiment.
A block diagram illustrating the configuration of an ISC encoding device according to an embodiment.
A block diagram illustrating the configuration of an ISC information encoding device according to an embodiment.
A block diagram illustrating the configuration of a spectrum encoding device according to another embodiment.
A block diagram illustrating the configuration of a spectrum encoding device according to another embodiment.
A diagram illustrating the concept of an ISC collection process and an encoding process according to an embodiment.
A diagram illustrating the concept of an ISC collection process and an encoding process according to another embodiment.
A diagram illustrating an example of the TCQ used in the present invention.
A block diagram illustrating the configuration of a frequency domain audio decoding device to which the present invention is applied.
A block diagram illustrating the configuration of a spectrum decoding device according to an embodiment.
A block diagram illustrating the configuration of a spectrum dequantization device according to an embodiment.
A block diagram illustrating the configuration of a spectrum decoding device according to an embodiment.
A block diagram illustrating the configuration of an ISC decoding device according to an embodiment.
A block diagram illustrating the configuration of an ISC information decoding device according to an embodiment.
A block diagram illustrating the configuration of a spectrum decoding device according to another embodiment.
A block diagram illustrating the configuration of a spectrum decoding device according to another embodiment.
A block diagram illustrating the configuration of an ISC information encoding device according to another embodiment.
A block diagram illustrating the configuration of an ISC information decoding device according to another embodiment.
A block diagram illustrating the configuration of a multimedia device according to an embodiment.
A block diagram illustrating the configuration of a multimedia device according to another embodiment.
A block diagram illustrating the configuration of a multimedia device according to another embodiment.
A flowchart illustrating the operation of a method of encoding the fine structure of a spectrum according to an embodiment.
A flowchart illustrating the operation of a method of decoding the fine structure of a spectrum according to an embodiment.

The present invention may be modified in various ways and may have various embodiments; particular embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to the particular embodiments, and the invention should be understood to cover all modifications, equivalents, and alternatives falling within its technical spirit and scope. In describing the present invention, a detailed description of related known art is omitted where it is judged that it would obscure the gist of the invention.

Terms such as "first" and "second" may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

The terms used herein are used only to describe particular embodiments and are not intended to limit the present invention. Wherever possible, general terms in current wide use have been selected in consideration of their functions in the present invention, but these may vary depending on the intention of those skilled in the art, precedent, or the emergence of new technology. In specific cases, some terms have been arbitrarily selected by the applicant, and their meanings are described in detail in the relevant part of the description. Therefore, the terms used herein must be defined based on their meanings and the overall content of the present invention, not simply on their names.

A singular expression includes the plural unless the context clearly indicates otherwise. In the present invention, terms such as "include" or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIGS. 1A and 1B are block diagrams showing the configurations of an example of an audio encoding device and an example of an audio decoding device, respectively, to which the present invention is applied.

The audio encoding device 110 illustrated in FIG. 1A may include a pre-processing unit 112, a frequency domain encoding unit 114, and a parameter encoding unit 116. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 1A, the pre-processing unit 112 may perform filtering, downsampling, or the like on an input signal, but is not limited thereto. The input signal may be a media signal such as audio, music, speech, or a sound representing a mixture thereof; hereinafter, it is referred to as an audio signal for convenience of description.

The frequency domain encoding unit 114 may perform a time-frequency transform on the audio signal provided from the pre-processing unit 112, select an encoding tool according to the number of channels, the encoding band, and the bit rate of the audio signal, and encode the audio signal using the selected tool. The time-frequency transform may be a modified discrete cosine transform (MDCT), a modulated lapped transform (MLT), or a fast Fourier transform (FFT), but is not limited thereto. When the given number of bits is sufficient, a general transform coding scheme is applied to the entire band; when it is not sufficient, a bandwidth extension scheme may be applied to some bands. When the audio signal is stereo or multi-channel, each channel is encoded separately if the given number of bits is sufficient, and a downmixing scheme may be applied otherwise. The frequency domain encoding unit 114 generates encoded spectral coefficients.
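The bit-budget decisions made by the frequency domain encoding unit 114 can be pictured with a small sketch. The specification does not say how "sufficient" is measured, so the thresholds `full_band_bits` and `per_channel_bits`, the `ToolChoice` type, and the function `choose_tools` are hypothetical and serve only to show the shape of the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolChoice:
    use_bandwidth_extension: bool  # True: some bands use bandwidth extension
    downmix: bool                  # True: channels are downmixed before coding


def choose_tools(available_bits: int, channels: int,
                 full_band_bits: int, per_channel_bits: int) -> ToolChoice:
    """Pick coding tools from the bit budget (thresholds are assumptions).

    full_band_bits:   bits assumed necessary to transform-code the whole band.
    per_channel_bits: bits assumed necessary to code one channel separately.
    """
    use_bwe = available_bits < full_band_bits
    downmix = channels > 1 and available_bits < channels * per_channel_bits
    return ToolChoice(use_bandwidth_extension=use_bwe, downmix=downmix)
```

With 1000 bits for a stereo frame, `full_band_bits=800`, and `per_channel_bits=600`, the sketch keeps full-band transform coding but falls back to downmixing, because 1000 bits do not cover two channels at 600 bits each.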

The parameter encoding unit 116 may extract parameters from the encoded spectral coefficients provided from the frequency domain encoding unit 114 and encode the extracted parameters. The parameters may be extracted, for example, for each subband or each band; hereinafter, both are referred to as subbands for simplicity. Each subband is a unit in which spectral coefficients are grouped, and it may reflect the critical bands and have a uniform or non-uniform length. When the lengths are non-uniform, a subband in a low frequency band may be relatively shorter than one in a high frequency band. The number and length of the subbands included in one frame depend on the codec algorithm and affect coding performance. Examples of the parameters include, but are not limited to, the scale factor, power, average energy, or norm of a subband. The spectral coefficients and parameters obtained as a result of encoding form a bitstream, which is stored on a recording medium or transmitted through a channel, for example, in packet form.
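To make the per-subband parameters concrete, the following sketch computes the average energy and a norm for subbands of non-uniform length. Taking the norm to be the square root of the average energy is an assumption of this sketch, and the band edges in the usage note are made up; an actual codec defines its own subband layout.

```python
import math
from typing import List, Tuple


def subband_params(coeffs: List[float], edges: List[int]) -> List[Tuple[float, float]]:
    """Return (average energy, norm) for each subband.

    `edges` lists subband boundaries as coefficient indices, so subband k
    covers coeffs[edges[k]:edges[k + 1]]. Narrow bands at low frequencies
    and wider bands at high frequencies give a non-uniform layout.
    """
    params = []
    for start, end in zip(edges[:-1], edges[1:]):
        band = coeffs[start:end]
        energy = sum(c * c for c in band) / len(band)
        params.append((energy, math.sqrt(energy)))
    return params
```

For instance, `subband_params([3.0, 4.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0], [0, 2, 4, 8])` describes two 2-coefficient subbands followed by one 4-coefficient subband.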

The audio decoding device 130 illustrated in FIG. 1B may include a parameter decoding unit 132, a frequency domain decoding unit 134, and a post-processing unit 136. The frequency domain decoding unit 134 may include a frame erasure concealment (FEC) algorithm or a packet loss concealment (PLC) algorithm. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 1B, the parameter decoding unit 132 may decode the encoded parameters from the received bitstream and check, from the decoded parameters, whether an error such as an erasure or a loss has occurred in units of frames. Various known methods may be used for the error check, and information on whether the current frame is a normal frame or an erased or lost frame is provided to the frequency domain decoding unit 134. Hereinafter, an erased or lost frame is referred to as an error frame for simplicity.

When the current frame is a normal frame, the frequency domain decoding unit 134 may perform decoding through a general transform decoding process and generate synthesized spectral coefficients. When the current frame is an error frame, the frequency domain decoding unit 134 may generate synthesized spectral coefficients through the FEC or PLC algorithm, either by repeating the spectral coefficients of the previous normal frame in the error frame or by scaling them through regression analysis and then repeating them. The frequency domain decoding unit 134 may perform a frequency-time transform on the synthesized spectral coefficients to generate a time domain signal.
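The two concealment options, plain repetition and repetition after regression-based scaling, might look as follows. This is a sketch under stated assumptions, not the FEC or PLC algorithm of any particular codec: it assumes the regression is a straight-line fit to the RMS level of recent normal frames, extrapolated one frame ahead, and the names `rms`, `extrapolate`, and `conceal` are hypothetical.

```python
import math
from typing import List


def rms(frame: List[float]) -> float:
    """Root-mean-square level of one frame of spectral coefficients."""
    return math.sqrt(sum(c * c for c in frame) / len(frame))


def extrapolate(values: List[float]) -> float:
    """Least-squares line through (0, v0), (1, v1), ...; value at the next index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    if sxx == 0:
        return mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values)) / sxx
    return mean_y + slope * (n - mean_x)


def conceal(good_frames: List[List[float]], use_regression: bool = True) -> List[float]:
    """Synthesize coefficients for an error frame from previous normal frames."""
    last = good_frames[-1]
    level = rms(last)
    if not use_regression or len(good_frames) < 2 or level == 0.0:
        return list(last)  # plain repetition of the last normal frame
    target = max(0.0, extrapolate([rms(f) for f in good_frames]))
    gain = target / level
    return [gain * c for c in last]  # scaled repetition
```

If the last three normal frames have RMS levels 4, 3, and 2, the fitted line predicts a level of 1, so the last frame is repeated at half its amplitude.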

The post-processing unit 136 may perform filtering, upsampling, or the like on the time domain signal provided from the frequency domain decoding unit 134 to improve sound quality, but is not limited thereto. The post-processing unit 136 provides the restored audio signal as an output signal.

FIGS. 2A and 2B are block diagrams showing the configurations of another example of an audio encoding device and an audio decoding device, respectively, to which the present invention is applied; both have a switching structure.

The audio encoding device 210 illustrated in FIG. 2A may include a pre-processing unit 212, a mode determining unit 213, a frequency domain encoding unit 214, a time domain encoding unit 215, and a parameter encoding unit 216. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 2A, the pre-processing unit 212 is substantially the same as the pre-processing unit 112 of FIG. 1A, and a description thereof is omitted.

The mode determining unit 213 may determine an encoding mode with reference to the characteristics of the input signal. Depending on those characteristics, it may determine whether the encoding mode suitable for the current frame is a speech mode or a music mode, and whether the encoding mode that is efficient for the current frame is a time domain mode or a frequency domain mode. The characteristics of the input signal may be identified using the short-term characteristics of a frame or the long-term characteristics of a plurality of frames, but the method is not limited thereto. For example, if the input signal corresponds to a speech signal, the speech mode or the time domain mode is selected; if the input signal corresponds to a signal other than a speech signal, that is, a music signal or a mixed signal, the music mode or the frequency domain mode is selected. The mode determining unit 213 may provide the output signal of the pre-processing unit 212 to the frequency domain encoding unit 214 when the characteristics of the input signal correspond to the music mode or the frequency domain mode, and to the time domain encoding unit 215 when they correspond to the speech mode or the time domain mode.
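The routing performed by the mode determining unit 213 can be sketched as below. The specification does not define how the short-term and long-term characteristics are measured or combined, so the two speech-likelihood scores, the weighting, the threshold, and the names `Mode` and `decide_mode` are purely hypothetical.

```python
from enum import Enum


class Mode(Enum):
    TIME_DOMAIN = "speech mode / time domain mode (encoding unit 215)"
    FREQUENCY_DOMAIN = "music mode / frequency domain mode (encoding unit 214)"


def decide_mode(short_term_speech_score: float,
                long_term_speech_score: float,
                weight: float = 0.5,
                threshold: float = 0.5) -> Mode:
    """Route a frame to an encoder from two assumed speech-likelihood scores.

    Both scores are assumed to lie in [0, 1]: one from the current frame
    (short-term) and one accumulated over several frames (long-term).
    """
    score = weight * short_term_speech_score + (1.0 - weight) * long_term_speech_score
    return Mode.TIME_DOMAIN if score >= threshold else Mode.FREQUENCY_DOMAIN
```

A frame scoring 0.9 short-term and 0.7 long-term would be sent to the time domain encoder, while one scoring 0.2 and 0.4 would be sent to the frequency domain encoder.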

The frequency domain encoding unit 214 is substantially the same as the frequency domain encoding unit 114 of FIG. 1A, and a description thereof is omitted.

The time domain encoding unit 215 may perform code excited linear prediction (CELP) encoding on the audio signal provided from the pre-processing unit 212. Specifically, algebraic CELP (ACELP) may be used, but the encoding is not limited thereto.

The parameter encoding unit 216 extracts parameters from the encoded spectral coefficients provided from the frequency domain encoding unit 214 or the time domain encoding unit 215 and encodes the extracted parameters. The parameter encoding unit 216 is substantially the same as the parameter encoding unit 116 of FIG. 1A, and a description thereof is omitted. The spectral coefficients and parameters obtained as a result of encoding form a bitstream together with the encoding mode information, and the bitstream is transmitted in packet form through a channel or stored on a recording medium.

The audio decoding device 230 illustrated in FIG. 2B may include a parameter decoding unit 232, a mode determining unit 233, a frequency domain decoding unit 234, a time domain decoding unit 235, and a post-processing unit 236. The frequency domain decoding unit 234 and the time domain decoding unit 235 may each include an FEC or PLC algorithm for the corresponding domain. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 2B, the parameter decoding unit 232 may decode parameters from a bitstream transmitted in packet form and check, from the decoded parameters, whether an error has occurred in units of frames. Various known methods may be used for the error check, and information on whether the current frame is a normal frame or an error frame is provided to the frequency domain decoding unit 234 or the time domain decoding unit 235.

The mode determining unit 233 checks the encoding mode information included in the bitstream and provides the current frame to the frequency domain decoding unit 234 or the time domain decoding unit 235.

The frequency domain decoding unit 234 operates when the encoding mode is the music mode or the frequency domain mode. When the current frame is a normal frame, it performs decoding through a general transform decoding process and generates synthesized spectral coefficients. When the current frame is an error frame and the encoding mode of the previous frame is the music mode or the frequency domain mode, it may generate synthesized spectral coefficients through the frequency domain FEC or PLC algorithm, either by repeating the spectral coefficients of the previous normal frame in the error frame or by scaling them through regression analysis and then repeating them. The frequency domain decoding unit 234 may perform a frequency-time transform on the synthesized spectral coefficients to generate a time domain signal.

The time domain decoding unit 235 operates when the encoding mode is the speech mode or the time domain mode. When the current frame is a normal frame, it performs decoding through a general CELP decoding process and generates a time domain signal. When the current frame is an error frame and the encoding mode of the previous frame is the speech mode or the time domain mode, it may perform the time domain FEC or PLC algorithm.
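The way the decoder of FIG. 2B picks a processing path from the encoding mode and the frame status can be summarized in a short sketch. The string labels and the function `select_decoding_path` are hypothetical; the only behavior taken from the description is that a normal frame follows its own encoding mode, while an error frame is concealed in the domain of the previous frame's encoding mode.

```python
from typing import Optional, Tuple


def select_decoding_path(frame_ok: bool,
                         current_mode: Optional[str],
                         previous_mode: str) -> Tuple[str, str]:
    """Return (decoding unit, operation) for one frame.

    Modes are "frequency" (music / frequency domain mode) or "time"
    (speech / time domain mode). `current_mode` may be None for an error
    frame in this sketch, since it is not consulted in that case.
    """
    if frame_ok:
        if current_mode == "frequency":
            return ("frequency domain decoding unit 234", "transform decoding")
        return ("time domain decoding unit 235", "CELP decoding")
    if previous_mode == "frequency":
        return ("frequency domain decoding unit 234", "frequency domain FEC/PLC")
    return ("time domain decoding unit 235", "time domain FEC/PLC")
```

For example, an error frame that follows a music-mode frame is handled by frequency domain concealment, whatever mode the lost frame would have carried.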

The post-processing unit 236 may perform filtering, upsampling, or the like on the time domain signal provided from the frequency domain decoding unit 234 or the time domain decoding unit 235, but is not limited thereto. The post-processing unit 236 provides the restored audio signal as an output signal.

FIGS. 3A and 3B are block diagrams showing the configurations of another example of an audio encoding device and an audio decoding device, respectively, to which the present invention is applied; both have a switching structure.

The audio encoding device 310 illustrated in FIG. 3A may include a pre-processing unit 312, a linear prediction (LP) analysis unit 313, a mode determining unit 314, a frequency domain excitation encoding unit 315, a time domain excitation encoding unit 316, and a parameter encoding unit 317. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 3A, the pre-processing unit 312 is substantially the same as the pre-processing unit 112 of FIG. 1A, and a description thereof is omitted.

The LP analysis unit 313 performs LP analysis on the input signal to extract LP coefficients and generates an excitation signal from the extracted LP coefficients. Depending on the encoding mode, the excitation signal is provided to either the frequency domain excitation encoding unit 315 or the time domain excitation encoding unit 316.
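One common way to realize such an LP analysis is the autocorrelation method with the Levinson-Durbin recursion, with the excitation taken as the prediction residual. The sketch below follows that textbook approach as an assumption; the specification does not state which LP analysis method or prediction order the LP analysis unit 313 uses, and the function names are hypothetical.

```python
from typing import List


def autocorrelation(x: List[float], order: int) -> List[float]:
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..order."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]


def lp_coefficients(x: List[float], order: int) -> List[float]:
    """Levinson-Durbin recursion.

    Returns a[1..order] such that x[n] is predicted by sum(a[k] * x[n - k]).
    """
    r = autocorrelation(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # signal is silent or already perfectly predicted
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def excitation(x: List[float], coeffs: List[float]) -> List[float]:
    """Prediction residual e[n] = x[n] - sum(a[k] * x[n - k])."""
    residual = []
    for n in range(len(x)):
        predicted = sum(c * x[n - k] for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(x[n] - predicted)
    return residual
```

For a decaying sequence such as `x = [0.5 ** n for n in range(20)]`, a first-order analysis yields a coefficient close to 0.5, and the residual is close to zero after the first sample. In the structure of FIG. 3A, it is this residual that would be passed on to unit 315 or unit 316.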

The mode determining unit 314 is substantially the same as the mode determining unit 213 of FIG. 2A, and a description thereof is omitted.

The frequency domain excitation encoding unit 315 operates when the encoding mode is the music mode or the frequency domain mode. Except that its input signal is the excitation signal, it is substantially the same as the frequency domain encoding unit 114 of FIG. 1A, and a description thereof is omitted.

The time domain excitation encoding unit 316 operates when the encoding mode is the speech mode or the time domain mode. Except that its input signal is the excitation signal, it is substantially the same as the time domain encoding unit 215 of FIG. 2A, and a description thereof is omitted.

The parameter encoding unit 317 extracts parameters from the encoded spectral coefficients provided from the frequency domain excitation encoding unit 315 or the time domain excitation encoding unit 316 and encodes the extracted parameters. The parameter encoding unit 317 is substantially the same as the parameter encoding unit 116 of FIG. 1A, and a description thereof is omitted. The spectral coefficients and parameters obtained as a result of encoding form a bitstream together with the encoding mode information, and the bitstream is transmitted in packet form through a channel or stored on a recording medium.

The audio decoding device 330 illustrated in FIG. 3B may include a parameter decoding unit 332, a mode determining unit 333, a frequency domain excitation decoding unit 334, a time domain excitation decoding unit 335, an LP synthesis unit 336, and a post-processing unit 337. The frequency domain excitation decoding unit 334 and the time domain excitation decoding unit 335 may each include an FEC or PLC algorithm for the corresponding domain. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 3B, the parameter decoding unit 332 may decode parameters from a bitstream transmitted in packet form and check, from the decoded parameters, whether an error has occurred in units of frames. Various known methods may be used for the error check, and information on whether the current frame is a normal frame or an error frame is provided to the frequency domain excitation decoding unit 334 or the time domain excitation decoding unit 335.

The mode determination unit 333 checks the encoding mode information included in the bitstream and provides the current frame to the frequency domain excitation decoding unit 334 or the time domain excitation decoding unit 335.

The frequency domain excitation decoding unit 334 operates when the encoding mode is the music mode or the frequency domain mode. When the current frame is a normal frame, it performs decoding through a general transform decoding process and generates synthesized spectral coefficients. When the current frame is an error frame and the encoding mode of the previous frame is the music mode or the frequency domain mode, it may generate synthesized spectral coefficients through the FEC algorithm or the PLC algorithm in the frequency domain, either by repeating the spectral coefficients of the previous normal frame in the error frame or by repeating them after scaling through regression analysis. The frequency domain excitation decoding unit 334 may perform a frequency-to-time transform on the synthesized spectral coefficients to generate an excitation signal, which is a time domain signal.
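As a non-limiting illustration of the frequency domain concealment described above, the following Python sketch repeats the spectral coefficients of the previous normal frame, optionally scaled by a gain extrapolated from a least-squares line fitted to the energies of earlier frames. The function name `conceal_frame`, the use of whole-frame energies, and the square-root gain are assumptions made for illustration; the description does not specify the regression details.

```python
import math

def conceal_frame(prev_spectrum, energy_history):
    """Return substitute coefficients for an error frame (illustrative only).

    prev_spectrum  : coefficients of the last normal frame
    energy_history : energies of past normal frames, oldest first (assumed input)
    """
    n = len(energy_history)
    if n < 2 or energy_history[-1] <= 0.0:
        return list(prev_spectrum)              # plain repetition
    # Least-squares line through (frame index, energy).
    mean_x = (n - 1) / 2.0
    mean_y = sum(energy_history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(energy_history))
    slope = sxy / sxx
    # Extrapolate the energy to the lost frame (index n), clamped at zero.
    predicted = max(mean_y + slope * (n - mean_x), 0.0)
    gain = math.sqrt(predicted / energy_history[-1])
    return [gain * c for c in prev_spectrum]
```

With a constant energy history the gain is 1 and the sketch reduces to simple repetition; with a decaying history the repeated spectrum is attenuated.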

The time domain excitation decoding unit 335 operates when the encoding mode is the voice mode or the time domain mode. When the current frame is a normal frame, it performs decoding through a general CELP decoding process and generates an excitation signal, which is a time domain signal. When the current frame is an error frame and the encoding mode of the previous frame is the voice mode or the time domain mode, it may perform the FEC algorithm or the PLC algorithm in the time domain.

The LP synthesis unit 336 performs LP synthesis on the excitation signal provided by the frequency domain excitation decoding unit 334 or the time domain excitation decoding unit 335 and generates a time domain signal.

The post-processing unit 337 may perform filtering, up-sampling, or the like on the time domain signal provided by the LP synthesis unit 336, but its processing is not limited thereto. The post-processing unit 337 provides the restored audio signal as an output signal.

FIGS. 4A and 4B are block diagrams showing configurations of an audio encoding device and an audio decoding device, respectively, according to another example to which the present invention is applied. Both devices have a switching structure.

The audio encoding device 410 illustrated in FIG. 4A may include a pre-processing unit 412, a mode determination unit 413, a frequency domain encoding unit 414, an LP analysis unit 415, a frequency domain excitation encoding unit 416, a time domain excitation encoding unit 417, and a parameter encoding unit 418. The components may be integrated into at least one module and implemented as at least one processor (not shown). Because the audio encoding device 410 of FIG. 4A can be regarded as a combination of the audio encoding device 210 of FIG. 2A and the audio encoding device 310 of FIG. 3A, the operation of the common parts is not described again, and only the operation of the mode determination unit 413 is described.

The mode determination unit 413 may determine the encoding mode of the input signal with reference to the characteristics of the input signal and the bit rate. Depending on whether the current frame is in the voice mode or the music mode according to the characteristics of the input signal, and on whether the efficient encoding mode for the current frame is the time domain mode or the frequency domain mode, the mode determination unit 413 may select either the CELP mode or one of the other modes. If the characteristics of the input signal correspond to the voice mode, the CELP mode may be selected; if they correspond to the music mode at a high bit rate, the FD mode may be selected; and if they correspond to the music mode at a low bit rate, the audio mode may be selected. In the FD mode, the mode determination unit 413 may provide the input signal to the frequency domain encoding unit 414; in the audio mode, to the frequency domain excitation encoding unit 416 via the LP analysis unit 415; and in the CELP mode, to the time domain excitation encoding unit 417 via the LP analysis unit 415.
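The three-way decision described above can be sketched in Python as follows. This is a minimal illustration only: the voice/music classification is taken as a given input, and the bit rate threshold `high_rate_threshold_bps` and its default value are hypothetical, since the description does not state where the boundary between low and high bit rates lies.

```python
from enum import Enum

class Mode(Enum):
    CELP = "celp"    # voice-like frames: LP analysis, then time domain excitation coding
    FD = "fd"        # music-like frames at high bit rates: frequency domain coding
    AUDIO = "audio"  # music-like frames at low bit rates: LP analysis, then frequency domain excitation coding

def decide_mode(is_voice, bitrate_bps, high_rate_threshold_bps=32000):
    """Pick an encoding mode from the signal class and the bit rate.

    high_rate_threshold_bps is an assumed value chosen only for this sketch.
    """
    if is_voice:
        return Mode.CELP
    if bitrate_bps >= high_rate_threshold_bps:
        return Mode.FD
    return Mode.AUDIO
```

For example, `decide_mode(False, 13200)` returns `Mode.AUDIO` under the assumed threshold, which corresponds to routing the frame through the LP analysis unit to the frequency domain excitation encoding unit.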

The frequency domain encoding unit 414 corresponds to the frequency domain encoding unit 114 of the audio encoding device 110 of FIG. 1A or the frequency domain encoding unit 214 of the audio encoding device 210 of FIG. 2A. The frequency domain excitation encoding unit 416 and the time domain excitation encoding unit 417 correspond to the frequency domain excitation encoding unit 315 and the time domain excitation encoding unit 316, respectively, of the audio encoding device 310 of FIG. 3A.

The audio decoding device 430 illustrated in FIG. 4B may include a parameter decoding unit 432, a mode determination unit 433, a frequency domain decoding unit 434, a frequency domain excitation decoding unit 435, a time domain excitation decoding unit 436, an LP synthesis unit 437, and a post-processing unit 438. Here, the frequency domain decoding unit 434, the frequency domain excitation decoding unit 435, and the time domain excitation decoding unit 436 may each include an FEC algorithm or a PLC algorithm for the corresponding domain. The components may be integrated into at least one module and implemented as at least one processor (not shown). Because the audio decoding device 430 of FIG. 4B can be regarded as a combination of the audio decoding device 230 of FIG. 2B and the audio decoding device 330 of FIG. 3B, the operation of the common parts is not described again, and only the operation of the mode determination unit 433 is described.

The mode determination unit 433 checks the encoding mode information included in the bitstream and provides the current frame to the frequency domain decoding unit 434, the frequency domain excitation decoding unit 435, or the time domain excitation decoding unit 436.

The frequency domain decoding unit 434 corresponds to the frequency domain decoding unit 134 of the audio decoding device 130 of FIG. 1B or the frequency domain decoding unit 234 of the audio decoding device 230 of FIG. 2B. The frequency domain excitation decoding unit 435 and the time domain excitation decoding unit 436 correspond to the frequency domain excitation decoding unit 334 and the time domain excitation decoding unit 335, respectively, of the audio decoding device 330 of FIG. 3B.

FIG. 5 is a block diagram showing the configuration of a frequency domain audio encoding device to which the present invention is applied.

The frequency domain audio encoding device 510 illustrated in FIG. 5 may include a transient detection unit 511, a transform unit 512, a signal classification unit 513, an energy encoding unit 514, a spectrum normalization unit 515, a bit allocation unit 516, a spectrum encoding unit 517, and a multiplexing unit 518. The components may be integrated into at least one module and implemented as at least one processor (not shown). Here, the frequency domain audio encoding device 510 may perform all functions of the frequency domain encoding unit 214 and some functions of the parameter encoding unit 216 illustrated in FIG. 2. Except for the signal classification unit 513, the frequency domain audio encoding device 510 may be replaced by the encoder configuration disclosed in the ITU-T G.719 standard, in which case the transform unit 512 may use a transform window having a 50% overlap interval. Alternatively, except for the transient detection unit 511 and the signal classification unit 513, the frequency domain audio encoding device 510 may be replaced by the encoder configuration disclosed in the ITU-T G.719 standard. In either case, although not illustrated, a noise level estimation unit may further be provided after the spectrum encoding unit 517, as in the ITU-T G.719 standard, to estimate the noise level for spectral coefficients to which zero bits were allocated in the bit allocation process and to include it in the bitstream.

Referring to FIG. 5, the transient detection unit 511 may analyze the input signal, detect a section exhibiting transient characteristics, and generate transient signaling information for each frame according to the detection result. Any of various known methods may be used to detect the transient section. According to one embodiment, the transient detection unit 511 first makes a primary determination of whether the current frame is a transient frame, and then performs a secondary verification on a current frame that has been determined to be a transient frame. The transient signaling information is included in the bitstream through the multiplexing unit 518 and is also provided to the transform unit 512.

The transform unit 512 determines the window size to be used for the transform according to the transient detection result and performs a time-to-frequency transform based on the determined window size. As one example, a short window may be applied to a subband in which a transient section is detected, and a long window to a subband in which none is detected. As another example, a short window may be applied to a frame that includes a transient section.

The signal classification unit 513 may analyze the spectrum provided by the transform unit 512 frame by frame and determine whether each frame is a harmonic frame. Any of various known methods may be used for this determination. According to one embodiment, the signal classification unit 513 divides the spectrum provided by the transform unit 512 into a plurality of subbands and obtains the peak value and the average value of the energy for each subband. Next, for each frame, it counts the subbands whose energy peak value exceeds the average value by at least a predetermined ratio, and determines a frame in which this count is at least a predetermined value to be a harmonic frame. Here, the predetermined ratio and the predetermined value may be determined in advance through experiments or simulations. The harmonic signaling information may be included in the bitstream through the multiplexing unit 518.
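The peak-to-average test described above can be sketched in Python as follows. The uniform subband width `band_size` and the default values of `peak_to_mean_ratio` and `min_bands` are hypothetical; the description only states that the ratio and the count threshold are fixed in advance by experiment or simulation.

```python
def is_harmonic_frame(spectrum, band_size=16, peak_to_mean_ratio=4.0, min_bands=3):
    """Classify one frame as harmonic from per-subband peak and mean energy.

    All default parameter values are assumptions made for this sketch.
    """
    peaky_bands = 0
    for start in range(0, len(spectrum) - band_size + 1, band_size):
        energies = [c * c for c in spectrum[start:start + band_size]]
        mean = sum(energies) / band_size
        # Count the subband when its peak energy exceeds the mean by the given ratio.
        if mean > 0.0 and max(energies) >= peak_to_mean_ratio * mean:
            peaky_bands += 1
    return peaky_bands >= min_bands
```

A spectrum with one isolated line per subband is classified as harmonic, whereas a flat spectrum is not, because its peak and mean energies coincide in every subband.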

The energy encoding unit 514 may obtain the energy of each subband and perform quantization and lossless coding on it. According to one embodiment, a Norm value corresponding to the average spectral energy of each subband may be used as the energy; a scale factor or power may be used instead, but the energy is not limited thereto. Here, the Norm value of each subband is provided to the spectrum normalization unit 515 and the bit allocation unit 516, and may also be included in the bitstream through the multiplexing unit 518.

The spectrum normalization unit 515 may normalize the spectrum using the Norm value obtained for each subband.

The bit allocation unit 516 may perform bit allocation in integer units or fractional units using the Norm value obtained for each subband. The bit allocation unit 516 may also calculate a masking threshold using the Norm value obtained for each subband and, using the masking threshold, estimate the perceptually required number of bits, that is, the allowable number of bits. The bit allocation unit 516 may then limit the number of bits allocated to each subband so that it does not exceed the allowable number of bits. In addition, the bit allocation unit 516 may allocate bits sequentially, starting from the subband having the largest Norm value, and may weight the Norm value of each subband according to the perceptual importance of the subband so that more bits are allocated to perceptually important subbands. In that case, the quantized Norm values provided from the energy encoding unit 514 to the bit allocation unit 516 are adjusted in advance to account for psycho-acoustical weighting and masking effects, as in ITU-T G.719, and are then used for bit allocation.
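A greedy allocation of the kind described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the allocation procedure of this description or of ITU-T G.719: the Norm values are taken in a log2 domain, each grant gives one bit per coefficient to the currently most important subband, and each grant lowers that subband's priority by one log2 step. The names `allocate_bits`, `norms_log2`, and `allowed_bits` are hypothetical.

```python
def allocate_bits(norms_log2, band_sizes, allowed_bits, total_bits):
    """Greedy bit allocation, largest Norm first, capped per subband.

    norms_log2   : per-subband Norm values in a log2 domain (assumed, already weighted)
    band_sizes   : number of coefficients in each subband (all positive)
    allowed_bits : per-subband cap, e.g. derived from a masking threshold
    total_bits   : bit budget for the frame
    """
    priority = list(norms_log2)          # working copy, lowered as bits are granted
    bits = [0] * len(band_sizes)
    remaining = total_bits
    while True:
        # Subbands that can still take one more bit per coefficient.
        candidates = [k for k, size in enumerate(band_sizes)
                      if size <= remaining and bits[k] + size <= allowed_bits[k]]
        if not candidates:
            break
        best = max(candidates, key=lambda k: priority[k])
        bits[best] += band_sizes[best]
        remaining -= band_sizes[best]
        priority[best] -= 1.0            # assumed: one bit per coefficient lowers priority by one step
    return bits
```

With two subbands of eight coefficients, Norm values 5 and 1, a cap of 16 bits per subband, and a budget of 24 bits, the sketch yields 16 and 8 bits: the stronger subband is served first until it reaches its allowable number of bits.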

The spectrum encoding unit 517 may quantize the normalized spectrum using the number of bits allocated to each subband and perform lossless coding on the quantized result. As one example, TCQ (trellis coded quantization), USQ (uniform scalar quantization), FPC (factorial pulse coding), AVQ (algebraic VQ), PVQ (pyramid VQ), or a combination thereof may be used for the spectrum encoding, together with a lossless encoder corresponding to each quantizer. Various spectrum encoding techniques may also be applied depending on the environment in which the codec is installed or on user needs. Information on the spectrum encoded by the spectrum encoding unit 517 may be included in the bitstream through the multiplexing unit 518.

FIG. 6 is a block diagram showing the configuration of a frequency domain audio encoding device to which the present invention is applied. The audio encoding device 600 illustrated in FIG. 6 may include a pre-processing unit 610, a frequency domain encoding unit 630, a time domain encoding unit 650, and a multiplexing unit 670. The frequency domain encoding unit 630 may include a transient detection unit 631, a transform unit 633, and a spectrum encoding unit 635. The components may be integrated into at least one module and implemented as at least one processor (not shown).

In FIG. 6, the pre-processing unit 610 may perform filtering, down-sampling, or the like on the input signal, but its processing is not limited thereto. The pre-processing unit 610 may determine the encoding mode based on signal characteristics. Based on the signal characteristics, it may determine whether the encoding mode suitable for the current frame is the voice mode or the music mode, and whether the efficient encoding mode for the current frame is the time domain mode or the frequency domain mode. Here, the signal characteristics may be identified using short-term characteristics of a frame or long-term characteristics over a plurality of frames, but are not limited thereto. For example, if the input signal is a speech signal, the voice mode or the time domain mode may be selected; if the input signal is a signal other than a speech signal, that is, a music signal or a mixed signal, the music mode or the frequency domain mode may be selected. The pre-processing unit 610 may provide the input signal to the frequency domain encoding unit 630 when the signal characteristics correspond to the music mode or the frequency domain mode, and to the time domain encoding unit 650 when they correspond to the voice mode or the time domain mode.

The frequency domain encoding unit 630 may process the audio signal provided by the pre-processing unit 610 based on transform coding. Specifically, the transient detection unit 631 may detect a transient component in the audio signal and determine whether the current frame is a transient frame. The transform unit 633 may determine the length or shape of the transform window based on the frame type provided by the transient detection unit 631, that is, the transient information, and transform the audio signal into the frequency domain based on the determined transform window. MDCT, FFT, or MLT may be applied as the transform technique. In general, a short transform window may be applied to a frame having a transient component. The spectrum encoding unit 635 may encode the audio spectrum transformed into the frequency domain. The spectrum encoding unit 635 is described in more detail with reference to FIGS. 7 and 9.

The time domain encoding unit 650 may perform CELP (code excited linear prediction) coding on the audio signal provided by the pre-processing unit 610. Specifically, ACELP (algebraic CELP) may be used, but the coding is not limited thereto.

The multiplexing unit 670 multiplexes the spectral components or signal components generated as a result of encoding in the frequency domain encoding unit 630 or the time domain encoding unit 650 with various indices to generate a bitstream. The bitstream is transmitted in packet form through a channel or stored in a recording medium.

FIG. 7 is a block diagram showing the configuration of a spectrum encoding device according to one embodiment. The device illustrated in FIG. 7 may correspond to the spectrum encoding unit 635 of FIG. 6, may be included in another frequency domain encoding device, or may be implemented independently.

The spectrum encoding device 700 illustrated in FIG. 7 may include an energy estimation unit 710, an energy quantization and encoding unit 720, a bit allocation unit 730, a spectrum normalization unit 740, a spectrum quantization and encoding unit 750, and a noise filling unit 760.

Referring to FIG. 7, the energy estimation unit 710 may divide the original spectral coefficients into subbands and estimate the energy of each subband, for example, a Norm value. Here, within one frame, the subbands may all have the same size, or the number of spectral coefficients included in each subband may increase from the low band toward the high band.

The energy quantization and encoding unit 720 may quantize and encode the Norm value estimated for each subband. The Norm value may be quantized by various methods such as vector quantization, scalar quantization, TCQ, and LVQ (lattice vector quantization). The energy quantization and encoding unit 720 may additionally perform lossless coding to further improve coding efficiency.

The bit allocation unit 730 may allocate the bits required for encoding, using the Norm value quantized for each subband and taking into account the allowable bits per frame.

The spectrum normalization unit 740 may normalize the spectrum using the Norm value quantized for each subband.
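The per-subband energy estimation and normalization described above can be sketched in Python as follows. The sketch assumes that the Norm value is the root-mean-square amplitude of the subband, which is one common reading of "average spectral energy", and for brevity it normalizes with unquantized Norm values, whereas the description uses the quantized ones. The function names and the `floor` guard are hypothetical.

```python
import math

def estimate_norms(spectrum, band_sizes):
    """Return one Norm value (assumed here to be the RMS amplitude) per subband."""
    norms, start = [], 0
    for size in band_sizes:
        band = spectrum[start:start + size]
        norms.append(math.sqrt(sum(c * c for c in band) / size))
        start += size
    return norms

def normalize(spectrum, band_sizes, norms, floor=1e-9):
    """Divide each coefficient by the Norm of its subband."""
    out, start = [], 0
    for size, norm in zip(band_sizes, norms):
        # floor avoids division by zero in silent subbands (assumed guard).
        out.extend(c / max(norm, floor) for c in spectrum[start:start + size])
        start += size
    return out
```

After normalization every non-silent subband has unit RMS amplitude, so the subsequent quantizer only has to represent the spectral shape.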

The spectrum quantization and encoding unit 750 may quantize and encode the normalized spectrum based on the bits allocated to each subband.

The noise filling unit 760 may add appropriate noise to the portions that were quantized to zero in the spectrum quantization and encoding unit 750 because of the constraint on allowable bits.
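Noise filling of this kind can be sketched in Python as follows. The uniform noise distribution, the single `noise_level` parameter, and the fixed seed are assumptions made for illustration; the description only states that appropriate noise is added where coefficients were quantized to zero.

```python
import random

def fill_noise(quantized, noise_level, seed=0):
    """Replace zero-quantized coefficients with low-level random noise.

    noise_level bounds the noise amplitude; the seed makes the sketch reproducible.
    """
    rng = random.Random(seed)
    return [q if q != 0 else noise_level * rng.uniform(-1.0, 1.0)
            for q in quantized]
```

Non-zero coefficients pass through unchanged, so the fill affects only the spectral holes left by the bit constraint.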

FIG. 8 is a diagram showing an example of subband division. Referring to FIG. 8, when the input signal uses a sampling frequency of 48 kHz and has a frame size of 20 ms, 960 samples are processed per frame. That is, when the input signal is transformed by MDCT with 50% overlapping, 960 spectral coefficients are obtained. Here, the overlapping ratio may be set differently depending on the coding scheme. In the frequency domain, processing is theoretically possible up to 24 kHz, but a band up to 20 kHz is represented in consideration of the human audible range. In the low band from 0 to 3.2 kHz, 8 spectral coefficients are grouped into one subband; in the band from 3.2 to 6.4 kHz, 16 spectral coefficients are grouped into one subband; in the band from 6.4 to 13.6 kHz, 24 spectral coefficients are grouped into one subband; and in the band from 13.6 to 20 kHz, 32 spectral coefficients are grouped into one subband. When actual Norm values are obtained and encoded, the Norm may be obtained and encoded up to a band determined by the encoder. For a specific high band above the determined band, coding based on various schemes such as bandwidth extension is possible.
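The arithmetic behind this subband layout can be checked with the short Python sketch below. It relies only on the figures stated above, plus the assumption that the 960 coefficients are spread uniformly over 0 to 24 kHz, which gives 25 Hz per coefficient. The function name `build_subbands` is hypothetical.

```python
def build_subbands(fs_hz=48000, frame_ms=20):
    """Return the list of subband sizes for the layout described for FIG. 8."""
    n_coeffs = fs_hz * frame_ms // 1000        # 960 coefficients per frame
    hz_per_coeff = (fs_hz / 2) / n_coeffs      # 25 Hz per coefficient (uniform spacing assumed)
    # (lower edge in Hz, upper edge in Hz, coefficients per subband)
    regions = [(0, 3200, 8), (3200, 6400, 16), (6400, 13600, 24), (13600, 20000, 32)]
    sizes = []
    for lo, hi, width in regions:
        coeffs_in_region = round((hi - lo) / hz_per_coeff)
        sizes.extend([width] * (coeffs_in_region // width))
    return sizes
```

Under these assumptions the four regions contain 16, 8, 12, and 8 subbands respectively, that is, 44 subbands covering the 800 coefficients that lie below 20 kHz.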

FIG. 9 is a block diagram showing the configuration of a spectrum quantization device according to one embodiment. The device illustrated in FIG. 9 may include a quantizer selection unit 910, a USQ 930, and a TCQ 950.

In FIG. 9, the quantizer selection unit 910 may select the most efficient quantizer among various quantizers according to the characteristics of the input signal, that is, the signal to be quantized. Bit allocation information for each band, band size information, and the like may be used as the characteristics of the input signal. According to the selection result, the signal to be quantized is provided to one of the USQ 930 and the TCQ 950, and the corresponding quantization is performed.

FIG. 10 is a block diagram showing the configuration of a spectrum encoding device according to one embodiment. The device illustrated in FIG. 10 may correspond to the spectrum quantization and encoding unit 750 of FIG. 7, may be included in another frequency domain encoding device, or may be implemented independently.

The device illustrated in FIG. 10 may include a coding scheme selection unit 1010, a zero encoding unit 1020, a scaling unit 1030, an ISC encoding unit 1040, a quantized component restoration unit 1050, and an inverse scaling unit 1060. Here, the quantized component restoration unit 1050 and the inverse scaling unit 1060 are optional.

In FIG. 10, the coding scheme selection unit 1010 may select a coding scheme in consideration of the characteristics of the input signal. The input signal characteristics may include the bits allocated to each band. The normalized spectrum is provided to the zero encoding unit 1020 or the scaling unit 1030 based on the coding scheme selected for each band. According to one embodiment, when the average number of bits allocated to each sample of a band is equal to or greater than a predetermined value, for example, 0.75, the band is determined to be very important and USQ is used, whereas TCQ is used for all other bands. Here, the average number of bits may be determined in consideration of the band length or band size. The selected coding scheme is signaled using a 1-bit flag.
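The per-band selection described above can be sketched in Python as follows. The threshold of 0.75 bits per sample is the example value given in the description; the function name and the string labels are hypothetical, and the zero-bit case follows the role of the zero encoding unit 1020.

```python
def select_scheme(band_bits, band_size, threshold=0.75):
    """Choose the coding scheme for one band from its average bits per sample."""
    if band_bits == 0:
        return "ZERO"                    # handled by the zero encoding unit
    average_bits = band_bits / band_size
    return "USQ" if average_bits >= threshold else "TCQ"
```

For a band of 16 samples, 12 allocated bits give exactly 0.75 bits per sample and select USQ, while 8 allocated bits select TCQ.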

The zero encoding unit 1020 may encode all samples as 0 for a band to which zero bits are allocated.

The scaling unit 1030 may adjust the bit rate by scaling the spectrum based on the bits allocated to the band. The normalized spectrum is used for this purpose. The scaling unit 1030 may perform the scaling in consideration of the average number of bits allocated to each sample included in the band, that is, to each spectral coefficient. For example, the larger the average number of bits, the greater the scaling applied.

According to one embodiment, the scaling unit 1030 may determine an appropriate scaling value for each band according to the bit allocation.

Specifically, the number of pulses for the current band may first be estimated using the band length and the bit allocation information. Here, a pulse means a unit pulse. First, the number of bits b actually required for the current band may be calculated based on Equation (1) below.

ここで、ｎは、バンド長を示し、ｍは、パルス個数（number of pulses）を意味し、ｉは、ＩＳＣ（the important spectral component）を有するノンゼロ位置の数を意味する。

Here, n indicates the band length, m indicates the number of pulses, and i indicates the number of non-zero positions having ISC (the important spectral component).

一方、ノンゼロ位置の個数は、例えば、下記数式（２）のように、確率に基づいて得られる。 On the other hand, the number of non-zero positions is obtained based on a probability, for example, as in the following equation (2).

そして、ノンゼロ位置のために必要なビット数は、下記数式（３）のように推定される。

Then, the number of bits required for the non-zero position is estimated as in the following equation (3).

最終的に、パルスの個数は、各バンドに割り当てられたビットに最も近い値を有するｂ値によって選択される。

Finally, the number of pulses is selected by the b value having the value closest to the bit assigned to each band.

次に、バンド別に求められたパルス個数推定値と、入力信号の絶対値とを利用して、初期スケーリングファクタを決定することができる。入力信号は、初期スケーリングファクタによってスケーリングされる。もしスケーリングされた原信号、すなわち、量子化された信号に対するパルス個数の和がパルス個数推定値と同じではない場合には、アップデートされたスケーリングファクタを利用して、パルス再分配（redistribution）処理を行うことができる。パルス再分配処理は、現在バンドに対して選択されたパルス個数が、バンド別に求められたパルス個数推定値より少ない場合には、スケーリングファクタを減少させてパルス個数を増加させ、反対に多い場合には、スケーリングファクタを増加させてパルス個数を減少させる。そのとき、原信号との歪曲を最小化する位置を選択し、あらかじめ決定された値ほど増加させるか、あるいは減少させることができる。 Next, an initial scaling factor can be determined using the pulse number estimate obtained for each band and the absolute value of the input signal. The input signal is scaled by the initial scaling factor. If the sum of the pulse numbers for the scaled original signal, i.e., the quantized signal, is not the same as the pulse number estimate, a pulse redistribution process can be performed using an updated scaling factor. In the pulse redistribution process, if the number of pulses selected for the current band is less than the pulse number estimate obtained for that band, the scaling factor is decreased to increase the number of pulses; conversely, if it is greater, the scaling factor is increased to decrease the number of pulses. At this time, a position that minimizes the distortion relative to the original signal is selected, and the increase or decrease can be made by a predetermined value.

ＴＣＱのための歪曲関数は、正確な距離よりは、相対的な大きさを必要とするために、下記の数式（４）のように、各バンドにおいて、それぞれ量子化及び逆量子化された値の自乗距離の和として得られる。 Since the distortion function for TCQ requires a relative magnitude rather than an exact distance, it is obtained as the sum of the squared distances between the quantized and dequantized values in each band, as in the following equation (4).

ここで、ｐｉは、実際値であり、ｑｉは、量子化された値を示す。

Here, pi is an actual value, and qi indicates a quantized value.

一方、ＵＳＱのための歪曲関数は、最善の量子化された値を決定するために、ユークリッド距離を使用することができる。そのとき、複雑度を最小化するために、スケーリングファクタを含む修正された数式を使用し、歪曲関数は、下記数式（５）によって算出される。 On the other hand, the distortion function for USQ can use the Euclidean distance to determine the best quantized value. At this time, in order to minimize the complexity, a modified formula including a scaling factor is used, and the distortion function is calculated by the following formula (5).

もしバンド当たりパルス個数が要求される値とマッチングしない場合、最小メトリックを維持しながら、所定数のパルスを加減する必要がある。それは、１つのパルスを加減する過程を、パルス個数が要求される値に至るまで反復する方法によって遂行される。

If the number of pulses per band does not match the required value, a certain number of pulses must be adjusted while maintaining the minimum metric. It is performed by a method of repeating the process of adding or subtracting one pulse until the number of pulses reaches a required value.

１つのパルスを加減するために、最適の歪曲値を求めるためのｎ個の歪曲値を求める必要がある。例えば、歪曲値ｊは、下記数式（６）のように、バンドにおいてｊ番目の位置にパルスを追加することに該当する。 In order to add or subtract one pulse, it is necessary to find n distortion values for finding an optimal distortion value. For example, the distortion value j corresponds to adding a pulse to a j-th position in a band as in the following equation (6).

前記数式（６）をｎ回遂行することを避けるために、下記数式（７）のように、同じ偏差（deviation）を使用することができる。

In order to avoid performing Equation (6) n times, the same deviation can be used as in Equation (7) below.

前記数式（７）において、

In the equation (7),

は、１回だけ計算すればよい。一方、ｎは、バンド長、すなわち、バンドにある係数数を示し、ｐは、原信号、すなわち、量子化器の入力信号を示し、ｑは、量子化された信号を示し、ｇは、スケーリングファクタを示す。最終的に、歪曲ｄを最小化する位置ｊが選択され、ｑｊがアップデートされる。

Need only be calculated once. On the other hand, n indicates the band length, that is, the number of coefficients in the band, p indicates the original signal, that is, the input signal of the quantizer, q indicates the quantized signal, and g indicates the scaling. Indicates a factor. Finally, the position j that minimizes the distortion d is selected, and qj is updated.
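
The one-pulse-at-a-time adjustment described above can be sketched as follows. Equations (5) to (7) are not reproduced in this text, so the sketch assumes a squared Euclidean distortion d = Σ (p_i − g·q_i)² with non-negative magnitudes p and a reconstruction of g·q; the function name `adjust_pulses` and this convention are assumptions, not the patent's exact formulas. Under this assumption, the distortion before the change is common to all candidate positions, so only the change in distortion at position j has to be evaluated for each candidate.

```python
def adjust_pulses(p, q, g, target):
    """Add or remove unit pulses until sum(q) == target (hypothetical sketch).

    p      : non-negative input magnitudes for one band
    q      : non-negative integer pulse counts per position
    g      : scaling factor (reconstruction is assumed to be g * q[j])
    target : required total number of pulses (>= 0)
    """
    if not q or target < 0:
        raise ValueError("band must be non-empty and target non-negative")
    q = list(q)
    while sum(q) != target:
        add = sum(q) < target
        best_j, best_delta = None, None
        for j in range(len(q)):
            if add:
                # Change in squared error when q[j] becomes q[j] + 1.
                delta = g * g * (2 * q[j] + 1) - 2 * g * p[j]
            elif q[j] > 0:
                # Change in squared error when q[j] becomes q[j] - 1.
                delta = 2 * g * p[j] - g * g * (2 * q[j] - 1)
            else:
                continue  # no pulse to remove at this position
            if best_delta is None or delta < best_delta:
                best_j, best_delta = j, delta
        q[best_j] += 1 if add else -1
    return q
```

For example, with p = [2.6, 0.4, 1.2], q = [3, 0, 1] and g = 1.0, requesting five pulses adds one at position 1, while requesting three pulses removes one from position 0.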

一方、ビット率を制御するために、スケーリングされたスペクトル係数を使用して、適切なＩＳＣを選択して符号化することができる。具体的には、量子化するためのスペクトル成分は、各バンドのビット割り当てを使用して選択される。そのとき、スペクトル成分の分布及び分散による多様な組み合わせに基づいて、スペクトル成分を選択することができる。次に、実際のノンゼロ位置を算出することができる。ノンゼロ位置は、スケーリング量と再分配動作とを分析して得ることができ、そのように選択されたノンゼロ位置は、他の言い方でＩＳＣとすることができる。要約すれば、スケーリングと再分配過程とを経た信号の大きさを分析し、最適スケーリングファクタと、ＩＳＣに該当するノンゼロ位置情報とを求めることができる。ここで、ノンゼロ位置情報は、ノンゼロ位置の個数及び位置を意味する。もしスケーリングと再分配過程とを介して、パルス個数が調節されない場合、選択されたパルスを、実際のＴＣＱ過程を介して量子化し、その結果を利用して、余剰ビットを調整することができる。その過程は、次のような例が可能である。 On the other hand, to control the bit rate, the appropriate ISC can be selected and encoded using the scaled spectral coefficients. Specifically, the spectral components to quantize are selected using the bit allocation for each band. At this time, the spectral components can be selected based on various combinations based on the distribution and dispersion of the spectral components. Next, the actual non-zero position can be calculated. The non-zero position can be obtained by analyzing the scaling amount and the redistribution operation, and the non-zero position so selected can be ISC in other words. In summary, it is possible to determine the optimal scaling factor and the non-zero position information corresponding to the ISC by analyzing the magnitude of the signal that has undergone the scaling and redistribution processes. Here, the non-zero position information means the number and positions of the non-zero positions. If the number of pulses is not adjusted through the scaling and redistribution processes, the selected pulses can be quantized through the actual TCQ process, and the surplus bits can be adjusted using the result. The following example is possible in the process.

ノンゼロ位置数と、バンド別に求められたパルス個数推定値とが同じではなく、ノンゼロ位置の個数が、所定値、例えば、１より大きく求められた量子化器選択情報がＴＣＱを示す条件の場合、実際のＴＣＱ量子化を介して、余剰ビットを調整することができる。具体的には、前記条件に該当する場合、余剰ビットを調整するために、まず、ＴＣＱ量子化過程を経る。前もってバンド別に求められたパルス個数推定値に比べ、実際のＴＣＱ量子化を介して求められた現在バンドのパルス個数がさらに少ない場合には、以前に決定されたスケーリングファクタに、１より大きい値、例えば、１．１を乗じてスケーリングファクタを増加させ、反対の場合には、１より少ない値、例えば、０．９を乗じてスケーリングファクタを減少させる。そのような過程を反復し、バンド別に求められたパルス個数推定値と、ＴＣＱ量子化を介して求められた現在バンドのパルス個数とが同じになる場合、実際のＴＣＱ量子化過程で使用されたビットを計算し、余剰ビットをアップデートする。そのように求められたノンゼロ位置が、ＩＳＣに該当する。 If the number of non-zero positions is not the same as the pulse number estimate obtained for each band, the number of non-zero positions is greater than a predetermined value, e.g., 1, and the obtained quantizer selection information indicates TCQ, the surplus bits can be adjusted through actual TCQ quantization. Specifically, when this condition is met, a TCQ quantization process is first performed in order to adjust the surplus bits. If the number of pulses of the current band obtained through actual TCQ quantization is smaller than the pulse number estimate obtained in advance for each band, the scaling factor is increased by multiplying the previously determined scaling factor by a value greater than 1, e.g., 1.1; in the opposite case, the scaling factor is decreased by multiplying it by a value less than 1, e.g., 0.9. This process is repeated, and when the pulse number estimate obtained for each band becomes equal to the number of pulses of the current band obtained through TCQ quantization, the bits used in the actual TCQ quantization process are calculated and the surplus bits are updated. The non-zero positions obtained in this way correspond to the ISCs.
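
The iterative adjustment in this paragraph can be sketched as follows. The sketch does not implement TCQ: `round_quantize` is a hypothetical stand-in that simply rounds the scaled magnitudes, and `tune_scale` and `max_iter` are likewise illustrative names. It follows the convention of this paragraph, in which a larger scaling factor yields more pulses. The iteration cap is an added safeguard, since the 1.1/0.9 updates are not guaranteed to reach the target exactly.

```python
def round_quantize(x, scale):
    """Stand-in quantizer (NOT TCQ): round each scaled magnitude to an integer."""
    return [int(abs(v) * scale + 0.5) for v in x]


def tune_scale(x, target, scale, quantize=round_quantize, max_iter=32):
    """Multiply the scaling factor by 1.1 or 0.9 until the pulse count matches.

    Returns the final scaling factor and the pulse count it produced.
    """
    pulses = sum(quantize(x, scale))
    for _ in range(max_iter):
        if pulses == target:
            break
        # Too few pulses: increase the factor; too many: decrease it.
        scale *= 1.1 if pulses < target else 0.9
        pulses = sum(quantize(x, scale))
    return scale, pulses
```

Once the loop ends with a matching count, the bits actually consumed by the quantizer would be measured and the surplus bits updated; that step depends on the entropy coder and is not sketched here.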

ＩＳＣ符号化部１０４０では、最終的に選択されたＩＳＣの個数情報及びノンゼロ位置情報を符号化することができる。その過程において、符号化効率を高めるために、無損失符号化を適用することもできる。ＩＳＣ符号化部１０４０は、割り当てられたビットが０ではないノンゼロバンドに対して選択された量子化器を利用して、符号化を行うことができる。具体的には、ＩＳＣ符号化部１０４０は、正規化されたスペクトルに対して、各バンド別にＩＳＣを選択し、各バンド別に選択されたＩＳＣの情報を、数、位置、大きさ及び符号に基づいて符号化することができる。そのとき、ＩＳＣの大きさは、数、位置及び符号とは異なる方式によって符号化することができる。一例を挙げれば、ＩＳＣの大きさは、ＵＳＱ及びＴＣＱのうち一つを利用して量子化して算術符号化する一方、ＩＳＣの数、位置及び符号については、算術符号化を行うことができる。特定バンドが重要な情報を含んでいると判断される場合、ＵＳＱを使用し、そうではない場合、ＴＣＱを使用することができる。実施形態によれば、信号特性に基づいて、ＴＣＱ及びＵＳＱのうち一つを選択することができる。ここで、信号特性は、各バンドに割り当てられたビットあるいはバンド長を含んでもよい。もしバンドに含まれた各サンプルに割り当てられた平均ビット数が臨界値、例えば、０．７５以上である場合、当該バンドは、非常に重要な情報を含んでいると判断することができるので、ＵＳＱが使用される。一方、バンド長が短い低域の場合にも、必要によっては、ＵＳＱが使用される。他の実施形態によれば、帯域幅によって、第１ジョイント方式と第２ジョイント方式とのうち一つが使用される。例えば、ＮＢ及びＷＢについては、各バンドに対する本来のビット割当て情報だけではなく、以前に符号化されたバンドからの余剰ビットに対する二次ビット割当て処理をさらに利用して、量子化器選択が行われる第１ジョイント方式が使用され、ＳＷＢ及びＦＢについては、ＵＳＱを使用すると決定されたバンドに対して、ＬＳＢ（least significant bit）については、ＴＣＱを使用する第２ジョイント方式が使用される。第１ジョイント方式において、二次ビット割当て処理は、以前符号化されたバンドからの余剰ビットを分配することにより、２バンドを選択することができる。一方、第２ジョイント方式において、残りのビットは、ＵＳＱを使用することができる。 The ISC encoding unit 1040 can encode the number information and the non-zero position information of the finally selected ISCs. In the process, lossless coding may be applied to increase coding efficiency. The ISC encoding unit 1040 can perform encoding using the quantizer selected for non-zero bands, i.e., bands whose allocated bits are not 0. Specifically, the ISC encoding unit 1040 selects ISCs for each band of the normalized spectrum and can encode the information of the ISCs selected for each band based on number, position, magnitude, and sign. The magnitude of the ISCs can be encoded by a method different from that used for the number, position, and sign. For example, the magnitude of the ISCs may be quantized using one of USQ and TCQ and then arithmetically coded, while the number, position, and sign of the ISCs may be arithmetically coded. If a specific band is determined to contain important information, USQ may be used; otherwise, TCQ may be used. According to an embodiment, one of TCQ and USQ can be selected based on signal characteristics. Here, the signal characteristics may include the bits allocated to each band or the band length. If the average number of bits allocated to each sample in a band is greater than or equal to a threshold value, e.g., 0.75, the band can be determined to contain very important information, so USQ is used. USQ may also be used, as necessary, for a low band with a short band length. According to another embodiment, one of a first joint scheme and a second joint scheme is used depending on the bandwidth. For example, for NB and WB, the first joint scheme is used, in which quantizer selection additionally uses, besides the original bit allocation information for each band, a secondary bit allocation process for surplus bits from previously encoded bands. For SWB and FB, the second joint scheme is used, in which, for a band determined to use USQ, TCQ is used for the LSB (least significant bit). In the first joint scheme, the secondary bit allocation process can select two bands by distributing surplus bits from previously encoded bands. In the second joint scheme, USQ can be used for the remaining bits.

量子化成分復元部１０５０は、量子化された成分に、ＩＳＣの位置、大きさ及び符号情報を付加し、実際の量子化された成分を復元することができる。ここで、ゼロ位置、すなわち、ゼロに符号化されたスペクトル係数には、０が割り当てられる。 The quantization component restoration unit 1050 can restore the actual quantized component by adding the position, magnitude, and sign information of the ISCs to the quantized component. Here, 0 is assigned to the zero positions, that is, the spectral coefficients coded to zero.

逆スケーリング部１０６０は、復元された量子化成分に対して逆スケーリングを行い、正規化された入力スペクトルと同一レベルの量子化されたスペクトル係数を出力することができる。スケーリング部１０３０及び逆スケーリング部１０６０においては、同一スケーリングファクタを使用することができる。 The inverse scaling unit 1060 may perform inverse scaling on the restored quantized component and output quantized spectral coefficients at the same level as the normalized input spectrum. In the scaling unit 1030 and the inverse scaling unit 1060, the same scaling factor can be used.
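
The restoration and inverse scaling steps can be sketched together as follows. This is an illustrative sketch only: the function name `restore_band` and its argument layout are assumptions, and inverse scaling is modeled here as multiplication by the scaling factor, which the text does not state explicitly.

```python
def restore_band(n, positions, magnitudes, signs, scale):
    """Rebuild one band from ISC information, then undo the scaling.

    n          : band length (number of spectral coefficients)
    positions  : indices of the non-zero coefficients (the ISCs)
    magnitudes : quantized magnitude of each ISC
    signs      : +1 or -1 for each ISC
    scale      : scaling factor shared with the scaling step (assumed multiplicative)
    """
    quantized = [0] * n  # zero positions stay at 0
    for pos, mag, sgn in zip(positions, magnitudes, signs):
        quantized[pos] = sgn * mag
    return [v * scale for v in quantized]
```

Using the same `scale` value in both the scaling and inverse scaling steps is what keeps the output at the level of the normalized input spectrum.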

図１１は、一実施形態によるＩＳＣ符号化装置の構成を示すブロック図である。図１１に図示された装置は、ＩＳＣ選択部１１１０及びＩＳＣ情報符号化部１１３０を含んでもよい。図１１の装置は、図１０のＩＳＣ符号化部１０４０に対応するか、あるいは独立した装置として具現される。 FIG. 11 is a block diagram illustrating a configuration of an ISC encoding device according to an embodiment. The apparatus illustrated in FIG. 11 may include an ISC selector 1110 and an ISC information encoder 1130. The device of FIG. 11 corresponds to the ISC encoder 1040 of FIG. 10 or is embodied as an independent device.

図１１において、ＩＳＣ選択部１１１０は、ビット率を調節するために、スケーリングされたスペクトルから、所定基準に基づいてＩＳＣを選択することができる。ＩＳＣ選択部１１１０は、スケーリングされたスペクトルから、スケーリングされた程度を分析し、実際のノンゼロ位置を求めることができる。ここで、ＩＳＣは、スケーリング以前の実際のノンゼロスペクトル係数に該当する。ＩＳＣ選択部１１１０は、バンド別に割り当てられたビットに基づいて、スペクトル係数の分布及び分散を考慮し、符号化するスペクトル係数、すなわち、ノンゼロ位置を選択することができる。ＩＳＣ選択のためにＴＣＱを使用することができる。 In FIG. 11, the ISC selection unit 1110 can select ISCs from the scaled spectrum based on a predetermined criterion in order to adjust the bit rate. The ISC selection unit 1110 can analyze the degree of scaling from the scaled spectrum to determine the actual non-zero positions. Here, an ISC corresponds to an actual non-zero spectral coefficient before scaling. The ISC selection unit 1110 can select the spectral coefficients to be encoded, that is, the non-zero positions, based on the bits allocated to each band and in consideration of the distribution and variance of the spectral coefficients. TCQ can be used for ISC selection.

ＩＳＣ情報符号化部１１３０は、選択されたＩＳＣに基づいて、ＩＳＣ情報、すなわち、ＩＳＣ個数情報、位置情報、大きさ情報及び符号を符号化することができる。 The ISC information encoding unit 1130 can encode the ISC information, that is, the ISC number information, the position information, the magnitude information, and the sign, based on the selected ISCs.

図１２は、一実施形態によるＩＳＣ情報符号化装置の構成を示すブロック図である。図１２に図示された装置は、位置情報符号化部１２１０、大きさ情報符号化部１２３０及び符号符号化部１２５０を含んでもよい。 FIG. 12 is a block diagram illustrating a configuration of an ISC information encoding device according to an embodiment. The apparatus illustrated in FIG. 12 may include a position information encoding unit 1210, a magnitude information encoding unit 1230, and a sign encoding unit 1250.

図１２において、位置情報符号化部１２１０は、ＩＳＣ選択部１１１０（図１１）で選択されたＩＳＣの位置情報、すなわち、ノンゼロスペクトル係数の位置情報を符号化することができる。位置情報は、選択されたＩＳＣの数及び位置を含んでもよい。位置情報の符号化には、算術符号化（arithmetic coding）が使用される。一方、選択されたＩＳＣを集め、新たなバッファを構成することができる。ＩＳＣ収集のために、ゼロバンドと、選択されていないスペクトルは、除外される。 In FIG. 12, the position information encoding unit 1210 can encode the position information of the ISCs selected by the ISC selection unit 1110 (FIG. 11), that is, the position information of the non-zero spectral coefficients. The position information may include the number and positions of the selected ISCs. Arithmetic coding is used for encoding the position information. Meanwhile, the selected ISCs can be collected to form a new buffer. For ISC collection, zero bands and unselected spectral coefficients are excluded.

大きさ情報符号化部１２３０は、新たに構成されたＩＳＣの大きさ情報に対して、符号化を行うことができる。そのとき、ＴＣＱ及びＵＳＱのうち一つを選択して量子化を行い、次に、算術符号化を追加して行うことができる。算術符号化の効率を高めるために、ノンゼロ位置情報、及びＩＳＣの数が使用される。 The magnitude information encoding unit 1230 can encode the magnitude information of the newly configured ISCs. At this time, quantization can be performed by selecting one of TCQ and USQ, and arithmetic coding can then be additionally performed. To increase the efficiency of the arithmetic coding, the non-zero position information and the number of ISCs are used.

符号情報符号化部１２５０は、選択されたＩＳＣの符号情報に対して、符号化を行うことができる。符号情報の符号化には、算術符号化が使用される。 The sign information encoding unit 1250 can encode the sign information of the selected ISCs. Arithmetic coding is used for encoding the sign information.

図１３は、他の実施形態によるスペクトル符号化装置の構成を示すブロック図である。図１３に図示された装置は、図７のスペクトル量子化及び符号化部７５０に対応するか、他の周波数ドメイン符号化装置に含まれるか、あるいは独立しても具現される。 FIG. 13 is a block diagram illustrating a configuration of a spectrum encoding device according to another embodiment. The apparatus shown in FIG. 13 corresponds to the spectrum quantization and coding unit 750 of FIG. 7, is included in another frequency domain coding apparatus, or is embodied independently.

図１３に図示された装置は、スケーリング部１３３０、ＩＳＣ符号化部１３４０、量子化成分復元部１３５０及び逆スケーリング部１３６０を含んでもよい。図１０と比較するとき、ゼロ符号化部１０２０と符号化方式選択部１０１０とが省略され、ＩＳＣ符号化部１３４０は、ＴＣＱを使用することができるということを除いては、各構成要素の動作は同一である。 The apparatus illustrated in FIG. 13 may include a scaling unit 1330, an ISC encoding unit 1340, a quantization component restoration unit 1350, and an inverse scaling unit 1360. Compared with FIG. 10, the operation of each component is the same, except that the zero encoding unit 1020 and the encoding scheme selection unit 1010 are omitted and the ISC encoding unit 1340 can use TCQ.

図１４は、他の実施形態によるスペクトル符号化装置の構成を示すブロック図である。図１４に図示された装置は、図７のスペクトル量子化及び符号化部７５０に対応するか、他の周波数ドメイン符号化装置に含まれるか、あるいは独立しても具現される。 FIG. 14 is a block diagram illustrating a configuration of a spectrum encoding device according to another embodiment. The device illustrated in FIG. 14 corresponds to the spectrum quantization and coding unit 750 of FIG. 7, is included in another frequency domain coding device, or is embodied independently.

図１４に図示された装置は、符号化方式選択部１４１０、スケーリング部１４３０、ＩＳＣ符号化部１４４０、量子化成分復元部１４５０及び逆スケーリング部１４６０を含んでもよい。図１０と比較するとき、ゼロ符号化部１０２０が省略されているということを除いては、各構成要素の動作は同一である。 The apparatus illustrated in FIG. 14 may include a coding scheme selection unit 1410, a scaling unit 1430, an ISC coding unit 1440, a quantization component restoration unit 1450, and an inverse scaling unit 1460. As compared with FIG. 10, the operation of each component is the same except that the zero encoding unit 1020 is omitted.

図１５は、一実施形態によるＩＳＣ収集過程及び符号化過程の概念を示す図面であり、まず、ゼロバンド（zero band）すなわち、０に量子化されるバンドは除く。次に、ノンゼロバンドに存在するスペクトル成分のうち選択されたＩＳＣを利用して、新たなバッファを構成することができる。新たに構成されたＩＳＣに対して、バンド単位でＴＣＱを遂行し、対応する無損失符号化（lossless encoding）を行うことができる。 FIG. 15 is a diagram illustrating the concept of an ISC collection process and an encoding process according to an embodiment. First, zero bands, i.e., bands quantized to 0, are excluded. Next, a new buffer can be configured using the ISCs selected from the spectral components in the non-zero bands. TCQ can be performed band by band on the newly configured ISCs, followed by the corresponding lossless encoding.

図１６は、他の実施形態によるＩＳＣ収集過程及び符号化過程の概念を示す図面であり、まず、ゼロバンド、すなわち、０に量子化されるバンドは除く。次に、ノンゼロバンドに存在するスペクトル成分のうち選択されたＩＳＣを利用して、新たなバッファを構成することができる。新たに構成されたＩＳＣに対して、バンド単位で、ＵＳＱあるいはＴＣＱを遂行し、対応する無損失符号化を行うことができる。 FIG. 16 is a diagram illustrating the concept of an ISC collection process and an encoding process according to another embodiment. First, zero bands, i.e., bands quantized to 0, are excluded. Next, a new buffer can be configured using the ISCs selected from the spectral components in the non-zero bands. USQ or TCQ can be performed band by band on the newly configured ISCs, followed by the corresponding lossless coding.
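
The ISC collection step shared by FIG. 15 and FIG. 16 can be sketched as follows. The function name `collect_iscs` and the tuple layout of the buffer are assumptions made for illustration; here a coefficient counts as a selected ISC when its quantized value is non-zero.

```python
def collect_iscs(bands, band_bits):
    """Gather the selected ISCs of all non-zero bands into one new buffer.

    bands     : list of bands, each a list of quantized spectral coefficients
    band_bits : bits allocated to each band (0 marks a zero band)
    Returns a list of (band_index, position_in_band, value) tuples.
    """
    buffer = []
    for band_index, (coeffs, bits) in enumerate(zip(bands, band_bits)):
        if bits == 0:
            continue  # zero band: excluded from collection
        for pos, value in enumerate(coeffs):
            if value != 0:  # unselected (zero) coefficients are skipped
                buffer.append((band_index, pos, value))
    return buffer
```

The quantizer (TCQ, or USQ/TCQ per band) and the lossless coder would then operate on this compact buffer rather than on the full spectrum.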

図１７は、本発明で使用されたＴＣＱの一例を示す図面であり、２つのゼロレベルを有する８ステート４コセットのトレリス構造に該当する。当該ＴＣＱについての詳細な説明は、ＵＳ７６０５７２７に開示されている。 FIG. 17 is a diagram illustrating an example of the TCQ used in the present invention, which corresponds to an 8-state 4-coset trellis structure having two zero levels. A detailed description of the TCQ is disclosed in US Pat. No. 7,605,727.

図１８は、本発明が適用される周波数ドメインオーディオ復号装置の構成を示したブロック図である。 FIG. 18 is a block diagram showing a configuration of a frequency domain audio decoding device to which the present invention is applied.

図１８に図示された周波数ドメインオーディオ復号装置１８００は、フレームエラー検出部１８１０、周波数ドメイン復号部１８３０、時間ドメイン復号部１８５０及び後処理部１８７０を含んでもよい。周波数ドメイン復号部１８３０は、スペクトル復号部１８３１、メモリ更新部１８３３、逆変換部１８３５及びＯＬＡ（overlap and add）部１８３７を含んでもよい。各構成要素は、少なくとも１以上のモジュールに一体化され、少なくとも１以上のプロセッサ（図示せず）としても具現される。 The frequency domain audio decoding device 1800 illustrated in FIG. 18 may include a frame error detection unit 1810, a frequency domain decoding unit 1830, a time domain decoding unit 1850, and a post-processing unit 1870. The frequency domain decoding unit 1830 may include a spectrum decoding unit 1831, a memory updating unit 1833, an inverse transform unit 1835, and an OLA (overlap and add) unit 1837. Each component is integrated into at least one or more modules and is also embodied as at least one or more processors (not shown).

図１８を参照すれば、フレームエラー検出部１８１０は、受信されたビットストリームから、フレームエラーが発生したか否かということを検出することができる。 Referring to FIG. 18, the frame error detection unit 1810 can detect whether a frame error has occurred from the received bitstream.

周波数ドメイン復号部１８３０は、符号化モードが、音楽モードあるいは周波数ドメインモードである場合に動作し、フレームエラーが発生した場合、ＦＥＣアルゴリズムあるいはＰＬＣアルゴリズムを動作させ、フレームエラーが発生していない場合、一般的な変換復号過程を介して、時間ドメイン信号を生成する。具体的には、スペクトル復号部１８３１は、復号されたパラメータを利用してスペクトル復号を行い、スペクトル係数を合成することができる。スペクトル復号部１８３１については、図１９及び図２０を参照し、さらに具体的に説明する。 The frequency domain decoding unit 1830 operates when the encoding mode is the music mode or the frequency domain mode. When a frame error has occurred, it runs an FEC algorithm or a PLC algorithm; when no frame error has occurred, it generates a time domain signal through a general transform decoding process. Specifically, the spectrum decoding unit 1831 can perform spectrum decoding using the decoded parameters to synthesize spectral coefficients. The spectrum decoding unit 1831 is described in more detail with reference to FIGS. 19 and 20.

メモリ更新部１８３３は、正常フレームである現在フレームに対して合成されたスペクトル係数、復号されたパラメータを利用して得られた情報、現在まで連続したエラーフレームの個数、各フレームの信号特性あるいはフレームタイプ情報などを、次のフレームのために更新することができる。ここで、信号特性は、トランジェント特性、ステーショナリ特性を含んでもよく、フレームタイプは、トランジェントフレーム、ステーショナリフレームあるいはハーモニックフレームを含んでもよい。 The memory updating unit 1833 can update, for the next frame, the spectral coefficients synthesized for the current frame, which is a normal frame, the information obtained using the decoded parameters, the number of consecutive error frames up to the present, and the signal characteristics or frame type information of each frame. Here, the signal characteristics may include a transient characteristic and a stationary characteristic, and the frame type may include a transient frame, a stationary frame, or a harmonic frame.

逆変換部１８３５は、合成されたスペクトル係数に対して、時間・周波数逆変換を行い、時間ドメイン信号を生成することができる。 The inverse transform unit 1835 can perform a time-frequency inverse transform on the synthesized spectral coefficient to generate a time-domain signal.

ＯＬＡ部１８３７は、以前フレームの時間ドメイン信号を利用して、ＯＬＡ処理を行い、その結果、現在フレームに対する最終時間ドメイン信号を生成し、後処理部１８７０に提供することができる。 The OLA unit 1837 performs an OLA process using the time domain signal of the previous frame, and as a result, generates a final time domain signal for the current frame and provides the generated signal to the post-processing unit 1870.
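
A generic overlap-and-add step can be sketched as follows. The text does not specify the window or the overlap length, so this sketch assumes a 50% overlap in which each inverse-transformed frame has twice the hop size and has already been windowed; `overlap_add` is a hypothetical helper name.

```python
def overlap_add(prev_tail, frame):
    """Combine the saved tail of the previous frame with the current frame.

    prev_tail : last N time-domain samples kept from the previous frame
    frame     : 2N time-domain samples from the current inverse transform
    Returns (output, new_tail): N finished output samples and the N samples
    to keep for the next call.
    """
    n = len(prev_tail)
    if len(frame) != 2 * n:
        raise ValueError("frame must be twice as long as prev_tail")
    output = [a + b for a, b in zip(prev_tail, frame[:n])]
    return output, frame[n:]
```

The returned tail is the part of the state that would have to be carried from one frame to the next.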

時間ドメイン復号部１８５０は、符号化モードが、音声モードあるいは時間ドメインモードである場合に動作し、フレームエラーが発生した場合、ＦＥＣアルゴリズムあるいはＰＬＣアルゴリズムを動作させ、フレームエラーが発生していない場合、一般的なＣＥＬＰ復号過程を介して、時間ドメイン信号を生成する。 The time domain decoding unit 1850 operates when the encoding mode is the audio mode or the time domain mode. When a frame error occurs, the FEC algorithm or the PLC algorithm is operated. A time domain signal is generated through a general CELP decoding process.

後処理部１８７０は、周波数ドメイン復号部１８３０あるいは時間ドメイン復号部１８５０から提供される時間ドメイン信号に対して、フィルタリングあるいはアップサンプリングなどを行うことができるが、それらに限定されるものではない。後処理部１８７０は、出力信号として、復元されたオーディオ信号を提供する。 The post-processing unit 1870 can perform filtering, up-sampling, or the like on the time domain signal provided from the frequency domain decoding unit 1830 or the time domain decoding unit 1850, but is not limited thereto. The post-processing unit 1870 provides the restored audio signal as an output signal.

図１９は、一実施形態によるスペクトル復号装置の構成を示すブロック図である。図１９に図示された装置は、図１８のスペクトル復号部１８３１に対応するか、他の周波数ドメイン復号装置に含まれるか、あるいは独立しても具現される。 FIG. 19 is a block diagram illustrating a configuration of a spectrum decoding device according to one embodiment. The device illustrated in FIG. 19 corresponds to the spectrum decoding unit 1831 in FIG. 18, is included in another frequency domain decoding device, or is embodied independently.

図１９に図示されたスペクトル復号装置１９００は、エネルギー復号及び逆量子化部１９１０、ビット割当て部１９３０、スペクトル復号及び逆量子化部１９５０、ノイズフィリング部１９７０及びスペクトルシェーピング部１９９０を含んでもよい。ここで、ノイズフィリング部１９７０は、スペクトルシェーピング部１９９０の後端に位置することもできる。各構成要素は、少なくとも１以上のモジュールに一体化され、少なくとも１以上のプロセッサ（図示せず）としても具現される。 The spectrum decoding apparatus 1900 illustrated in FIG. 19 may include an energy decoding and inverse quantization unit 1910, a bit allocation unit 1930, a spectrum decoding and inverse quantization unit 1950, a noise filling unit 1970, and a spectrum shaping unit 1990. Here, the noise filling unit 1970 may be located at the rear end of the spectrum shaping unit 1990. Each component is integrated into at least one or more modules and is also embodied as at least one or more processors (not shown).

図１９を参照すれば、エネルギー復号及び逆量子化部１９１０は、符号化過程において無損失符号化が行われたパラメータ、例えば、Ｎｏｒｍ値のようなエネルギーに対して無損失復号を行い、復号されたＮｏｒｍ値に対して逆量子化を行うことができる。符号化過程において、Ｎｏｒｍ値の量子化された方式に対応する方式を使用して逆量子化を行うことができる。 Referring to FIG. 19, the energy decoding and inverse quantization unit 1910 can perform lossless decoding on a parameter that was losslessly encoded in the encoding process, for example, energy such as a Norm value, and can perform inverse quantization on the decoded Norm value. The inverse quantization can be performed using a scheme corresponding to the scheme by which the Norm value was quantized in the encoding process.

ビット割当て部１９３０は、量子化されたＮｏｒｍ値、あるいは逆量子化されたＮｏｒｍ値に基づいて、サブバンド別に必要とするビット数を割り当てることができる。その場合、サブバンド単位に割り当てられたビット数は、符号化過程で割り当てられたビット数と同一である。 The bit allocation unit 1930 can allocate the required number of bits for each subband based on the quantized Norm value or the dequantized Norm value. In this case, the number of bits allocated for each subband is the same as the number of bits allocated in the encoding process.

スペクトル復号及び逆量子化部１９５０は、符号化されたスペクトル係数に対して、サブバンド別に割り当てられたビット数を使用して無損失復号を行い、復号されたスペクトル係数に対して逆量子化過程を行い、正規化されたスペクトル係数を生成することができる。 The spectrum decoding and inverse quantization unit 1950 performs lossless decoding on the encoded spectral coefficients using the number of bits allocated to each subband, and performs an inverse quantization process on the decoded spectral coefficients. To generate normalized spectral coefficients.

ノイズフィリング部１９７０は、正規化されたスペクトル係数のうち、サブバンド別にノイズフィリングを必要とする部分に対して、ノイズを充填することができる。 The noise filling unit 1970 can fill a portion of the normalized spectral coefficients that requires noise filling for each subband with noise.

スペクトルシェーピング部１９９０は、逆量子化されたＮｏｒｍ値を利用して、正規化されたスペクトル係数をシェーピングすることができる。スペクトルシェーピング過程を介して、最終的に復号されたスペクトル係数が得られる。 The spectrum shaping unit 1990 can shape the normalized spectral coefficients using the dequantized Norm value. Through the spectrum shaping process, finally decoded spectrum coefficients are obtained.
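
Spectrum shaping can be sketched as a per-band multiplication. This is a minimal sketch under the assumption that the dequantized Norm value of a band acts as a linear gain on that band's normalized coefficients; `shape_spectrum` and the `(start, end)` band layout are illustrative names, not taken from the text.

```python
def shape_spectrum(normalized, band_edges, norms):
    """Scale each band of the normalized spectrum by its dequantized Norm.

    normalized : list of normalized spectral coefficients
    band_edges : list of (start, end) index pairs, end exclusive
    norms      : dequantized Norm value (linear gain) for each band
    """
    shaped = list(normalized)
    for (start, end), norm in zip(band_edges, norms):
        for i in range(start, end):
            shaped[i] = normalized[i] * norm
    return shaped
```

Coefficients outside every listed band are passed through unchanged in this sketch.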

図２０は、一実施形態によるスペクトル逆量子化装置の構成を示すブロック図である。図２０に図示された装置は、逆量子化器選択部２０１０、ＵＳＱ２０３０及びＴＣＱ２０５０を含んでもよい。 FIG. 20 is a block diagram illustrating a configuration of a spectrum inverse quantization device according to an embodiment. The apparatus illustrated in FIG. 20 may include an inverse quantizer selection unit 2010, a USQ 2030, and a TCQ 2050.

図２０において、逆量子化器選択部２０１０は、入力信号、すなわち、逆量子化される信号の特性によって、多様な逆量子化器のうち、最も効率的な逆量子化器を選択することができる。入力信号の特性としては、バンド別ビット割当て情報、バンドの大きさ情報などが使用可能である。選択結果によって、逆量子化される信号をＵＳＱ２０３０及びＴＣＱ２０５０のうち一つに提供し、対応する逆量子化を行うことができる。 In FIG. 20, the inverse quantizer selection unit 2010 can select the most efficient inverse quantizer from among various inverse quantizers according to the characteristics of the input signal, that is, the signal to be inversely quantized. As the characteristics of the input signal, the bit allocation information for each band, the band size information, and the like can be used. According to the selection result, the signal to be inversely quantized is provided to one of the USQ 2030 and the TCQ 2050, and the corresponding inverse quantization can be performed.

図２１は、一実施形態によるスペクトル復号装置の構成を示すブロック図である。図２１に図示された装置は、図１９のスペクトル復号及び逆量子化部１９５０に対応するか、他の周波数ドメイン復号装置に含まれるか、あるいは独立しても具現される。 FIG. 21 is a block diagram illustrating a configuration of a spectrum decoding device according to one embodiment. The apparatus shown in FIG. 21 corresponds to the spectrum decoding and inverse quantization unit 1950 of FIG. 19, is included in another frequency domain decoding apparatus, or is implemented independently.

図２１に図示された装置は、復号方式選択部２１１０、ゼロ復号部２１３０、ＩＳＣ復号部２１５０、量子化成分復元部２１７０及び逆スケーリング部２１９０を含んでもよい。ここで、量子化成分復元部２１７０及び逆スケーリング部２１９０は、オプションとして具備される。 The apparatus illustrated in FIG. 21 may include a decoding scheme selection unit 2110, a zero decoding unit 2130, an ISC decoding unit 2150, a quantization component restoration unit 2170, and an inverse scaling unit 2190. Here, the quantization component restoration unit 2170 and the inverse scaling unit 2190 are provided as options.

図２１において、復号方式選択部２１１０は、バンド別に割り当てられたビットに基づいて、復号方式を選択することができる。正規化されたスペクトルは、バンド別に選択された復号方式に基づいて、ゼロ復号部２１３０あるいはＩＳＣ復号部２１５０に提供される。 In FIG. 21, decoding scheme selection section 2110 can select a decoding scheme based on bits allocated for each band. The normalized spectrum is provided to the zero decoding unit 2130 or the ISC decoding unit 2150 based on the decoding scheme selected for each band.

ゼロ復号部２１３０は、割り当てられたビットが０であるバンドに対して、全てのサンプルを０に復号することができる。 The zero decoding unit 2130 can decode all samples to 0 for a band where the assigned bits are 0.

ＩＳＣ復号部２１５０は、割り当てられたビットが０ではないバンドに対して選択された逆量子化器を利用して、復号を行うことができる。ＩＳＣ復号部２１５０は、符号化されたスペクトルの各バンド別に、重要周波数成分の情報を得て、各バンド別に得られた重要周波数成分の情報を、数、位置、大きさ及び符号に基づいて復号することができる。重要周波数成分の大きさは、数、位置及び符号とは異なる方式によって復号することができる。一例を挙げれば、重要周波数成分の大きさは、算術復号し、ＵＳＱ及びＴＣＱのうち一つを利用して逆量子化する一方、重要周波数成分の数、位置及び符号に対して、算術復号を行うことができる。逆量子化器選択は、図１０に図示されたＩＳＣ符号化部１０４０と同一結果を利用して行うことができる。ＩＳＣ復号部２１５０は、割り当てられたビットが０ではないバンドに対して、ＴＣＱ及びＵＳＱのうち一つを利用して逆量子化を行うことができる。 The ISC decoding unit 2150 can perform decoding using the inverse quantizer selected for bands whose allocated bits are not 0. The ISC decoding unit 2150 obtains information on the important frequency components for each band of the encoded spectrum and can decode that information based on number, position, magnitude, and sign. The magnitude of the important frequency components can be decoded by a method different from that used for the number, position, and sign. For example, the magnitude of the important frequency components may be arithmetically decoded and then inversely quantized using one of USQ and TCQ, while the number, position, and sign of the important frequency components may be arithmetically decoded. The inverse quantizer can be selected using the same result as in the ISC encoding unit 1040 shown in FIG. 10. The ISC decoding unit 2150 can perform inverse quantization on bands whose allocated bits are not 0 using one of TCQ and USQ.

量子化成分復元部２１７０は、復元されたＩＳＣの位置、大きさ及び符号情報に基づいて、実際の量子化成分を復元することができる。ここで、ゼロ位置、すなわち、ゼロに復号されたスペクトル係数である量子化されていない部分には、０が割り当てられる。 The quantization component restoration unit 2170 can restore the actual quantized component based on the restored position, magnitude, and sign information of the ISCs. Here, 0 is assigned to the zero positions, that is, the unquantized portions, which are the spectral coefficients decoded to zero.

さらに、逆スケーリング部（図示せず）を含んで復元された量子化成分に対して、逆スケーリングを行い、正規化されたスペクトルと同一レベルの量子化されたスペクトル係数を出力することができる。 Furthermore, inverse scaling is performed on the restored quantized component including the inverse scaling unit (not shown), and quantized spectral coefficients at the same level as the normalized spectrum can be output.

図２２は、一実施形態によるＩＳＣ復号装置の構成を示すブロック図である。図２２の装置は、パルス数推定部２２１０及びＩＳＣ情報復号部２２３０を含んでもよい。図２２の装置は、図２１のＩＳＣ復号部２１５０に対応するか、あるいは独立した装置で具現される。 FIG. 22 is a block diagram illustrating a configuration of the ISC decoding device according to the embodiment. The device in FIG. 22 may include a pulse number estimating unit 2210 and an ISC information decoding unit 2230. The device of FIG. 22 corresponds to the ISC decoding unit 2150 of FIG. 21 or is embodied as an independent device.

図２２において、パルス数推定部２２１０は、バンド大きさとビット割当て情報とを利用して、現在バンドで必要なパルス個数推定値を決定することができる。すなわち、現在フレームのビット割当て情報がエンコーダと同一であるので、同一ビット割当て情報を利用して、同一パルス個数推定値を導き出して復号を進める。 In FIG. 22, a pulse number estimation unit 2210 can determine an estimated pulse number value for a current band using band size and bit allocation information. That is, since the bit allocation information of the current frame is the same as that of the encoder, the same bit allocation information is used to derive the same pulse number estimation value and proceed with decoding.

ＩＳＣ情報復号部２２３０は、推定されたパルス数に基づいて、ＩＳＣ情報、すなわち、ＩＳＣ個数情報、位置情報、大きさ情報及び符号を復号することができる。 The ISC information decoding unit 2230 can decode the ISC information, that is, the ISC number information, the position information, the magnitude information, and the sign, based on the estimated number of pulses.

FIG. 23 is a block diagram illustrating the configuration of an ISC information decoding device according to an embodiment. The device illustrated in FIG. 23 may include a position information decoding unit 2310, a magnitude information decoding unit 2330, and a sign decoding unit 2350.

In FIG. 23, the position information decoding unit 2310 may decode an index related to the position information included in the bitstream and restore the number and positions of the ISCs. Arithmetic decoding is used to decode the position information. The magnitude information decoding unit 2330 may perform arithmetic decoding on an index related to the magnitude information included in the bitstream and then dequantize the decoded index using one of TCQ and USQ. The non-zero position information and the number of ISCs are used to improve the efficiency of the arithmetic decoding. The sign decoding unit 2350 may decode an index related to the sign information included in the bitstream and restore the signs of the ISCs. Arithmetic decoding is used to decode the sign information. According to an embodiment, the number of pulses required by a non-zero band may be estimated and used in decoding the position, magnitude, or sign information.

FIG. 24 is a block diagram illustrating the configuration of a spectrum decoding device according to another embodiment. The device illustrated in FIG. 24 may correspond to the spectrum decoding and inverse quantization unit 1950 of FIG. 19, may be included in another frequency-domain decoding device, or may be implemented independently.

The device illustrated in FIG. 24 may include an ISC decoding unit 2450, a quantization component restoration unit 2470, and an inverse scaling unit 2490. Compared with FIG. 21, the operation of each component is the same, except that the decoding scheme selection unit 2110 and the zero decoding unit 2130 are omitted and the ISC decoding unit 2450 uses TCQ.

FIG. 25 is a block diagram illustrating the configuration of a spectrum decoding device according to another embodiment. The device illustrated in FIG. 25 may correspond to the spectrum decoding and inverse quantization unit 1950 of FIG. 19, may be included in another frequency-domain decoding device, or may be implemented independently.

The device illustrated in FIG. 25 may include a decoding scheme selection unit 2510, an ISC decoding unit 2550, a quantization component restoration unit 2570, and an inverse scaling unit 2590. Compared with FIG. 21, the operation of each component is the same, except that the zero decoding unit 2130 is omitted.

FIG. 26 is a block diagram illustrating the configuration of an ISC information encoding device according to another embodiment. The device of FIG. 26 may include a probability calculation unit 2610 and a lossless encoding unit 2630.

In FIG. 26, the probability calculation unit 2610 may calculate probability values for magnitude coding according to Equations (8) and (9) below, using the number of ISCs, the number of pulses, and the TCQ information.

[Equations (8) and (9) and their symbols are not reproduced in this text.]

Here, the first symbol denotes the number of ISCs remaining to be encoded out of the ISCs transmitted in each band, and the second symbol denotes the number of pulses remaining to be encoded out of the pulses transmitted in each band. Ms denotes the set of magnitudes existing at the trellis state, and j denotes the number of coded pulses of the magnitude.

The lossless encoding unit 2630 may losslessly encode the TCQ magnitude information, that is, the magnitude and path information, using the calculated probability values. The number of pulses of each magnitude is encoded using two probability values (their symbols are not reproduced in this text). The first value denotes the probability of the last pulse of the previous magnitude, and the second value denotes the probability corresponding to the other pulses. Finally, the index encoded with the probability values thus obtained is output.

FIG. 27 is a block diagram illustrating the configuration of an ISC information decoding device according to another embodiment. The device of FIG. 27 may include a probability calculation unit 2710 and a lossless decoding unit 2730.

In FIG. 27, the probability calculation unit 2710 may calculate probability values for magnitude coding using the ISC information (number i and positions), the TCQ information, the number of pulses m, and the band size n. To this end, the required bit information b is first obtained from the number of pulses and the band size, as in Equation (1) above. Then, using the obtained bit information b, the number of ISCs, the ISC positions, and the TCQ information, the probability values for magnitude coding are calculated based on Equations (8) and (9) above.

The lossless decoding unit 2730 may losslessly decode the TCQ magnitude information, that is, the magnitude information and the path information, using the probability values calculated in the same way as in the encoding device together with the transmitted index information. To this end, an arithmetic coding model for the number information is first built from the probability values, and the TCQ magnitude information is then arithmetically decoded using that model. Specifically, the number of pulses of each magnitude is decoded using two probability values (their symbols are not reproduced in this text). The first value denotes the probability of the last pulse of the previous magnitude, and the second value denotes the probability corresponding to the other pulses. Finally, the TCQ information decoded with the probability values thus obtained, that is, the magnitude information and the path information, is output.

FIG. 28 is a block diagram illustrating the configuration of a multimedia device including an encoding module according to an embodiment of the present invention.

The multimedia device 2800 illustrated in FIG. 28 may include a communication unit 2810 and an encoding module 2830. Depending on the use of the audio bitstream obtained as a result of encoding, the device may further include a storage unit 2850 that stores the audio bitstream. The multimedia device 2800 may also further include a microphone 2870. That is, the storage unit 2850 and the microphone 2870 are optional. The multimedia device 2800 illustrated in FIG. 28 may further include an arbitrary decoding module (not shown), for example, a decoding module that performs a general decoding function or a decoding module according to an embodiment of the present invention. Here, the encoding module 2830 may be integrated with other components (not shown) of the multimedia device 2800 and implemented as at least one processor (not shown).

Referring to FIG. 28, the communication unit 2810 may receive at least one of externally provided audio and an encoded bitstream, or may transmit at least one of restored audio and the audio bitstream obtained as a result of encoding by the encoding module 2830.

The communication unit 2810 is configured to transmit and receive data to and from an external multimedia device or server via a wireless network such as wireless Internet, a wireless intranet, a wireless telephone network, a wireless local area network (LAN), Wi-Fi (wireless fidelity), Wi-Fi Direct (WFD), 3G (3rd generation), 4G (4th generation), Bluetooth (registered trademark), infrared communication (IrDA: Infrared Data Association), radio frequency identification (RFID), ultra wideband (UWB), ZigBee (registered trademark), or near field communication (NFC), or via a wired network such as a wired telephone network or wired Internet.

According to an embodiment, the encoding module 2830 may select important spectral components (ISCs) for each band of the normalized spectrum and encode the information on the ISCs selected for each band in terms of number, position, magnitude, and sign. The magnitudes of the ISCs may be encoded by a scheme different from that used for the number, positions, and signs. For example, the magnitudes of the ISCs may be quantized using one of USQ and TCQ and then arithmetically encoded, while the number, positions, and signs of the ISCs are arithmetically encoded. According to an embodiment, the normalized spectrum may be scaled based on the bits allocated to each band, and the ISCs may be selected from the scaled spectrum.

The storage unit 2850 may store various programs required to operate the multimedia device 2800.

The microphone 2870 may provide an audio signal from the user or from an external source to the encoding module 2830.

FIG. 29 is a block diagram illustrating the configuration of a multimedia device including a decoding module according to an embodiment of the present invention.

The multimedia device 2900 illustrated in FIG. 29 may include a communication unit 2910 and a decoding module 2920. Depending on the use of the restored audio signal obtained as a result of decoding, the device may further include a storage unit 2960 that stores the restored audio signal. The multimedia device 2900 may also further include a speaker 2970. That is, the storage unit 2960 and the speaker 2970 are optional. The multimedia device 2900 illustrated in FIG. 29 may further include an arbitrary encoding module (not shown), for example, an encoding module that performs a general encoding function or an encoding module according to an embodiment of the present invention. Here, the decoding module 2920 may be integrated with other components (not shown) of the multimedia device 2900 and implemented as at least one processor (not shown).

Referring to FIG. 29, the communication unit 2910 may receive at least one of an externally provided encoded bitstream and an audio signal, or may transmit at least one of the restored audio signal obtained as a result of decoding by the decoding module 2920 and an audio bitstream obtained as a result of encoding. The communication unit 2910 may be implemented in substantially the same way as the communication unit 2810 of FIG. 28.

According to an embodiment, the decoding module 2920 may receive the bitstream provided through the communication unit 2910, obtain information on the ISCs for each band of the encoded spectrum, and decode the ISC information obtained for each band in terms of number, position, magnitude, and sign. The magnitudes of the ISCs may be decoded by a scheme different from that used for the number, positions, and signs. For example, the magnitudes of the ISCs may be arithmetically decoded and then dequantized using one of USQ and TCQ, while the number, positions, and signs of the ISCs are arithmetically decoded.

The storage unit 2960 may store the restored audio signal generated by the decoding module 2920. The storage unit 2960 may also store various programs required to operate the multimedia device 2900.

The speaker 2970 may output the restored audio signal generated by the decoding module 2920 to the outside.

FIG. 30 is a block diagram illustrating the configuration of a multimedia device including an encoding module and a decoding module according to an embodiment of the present invention.

The multimedia device 3000 illustrated in FIG. 30 may include a communication unit 3010, an encoding module 3020, and a decoding module 3030. Depending on the use of the audio bitstream obtained as a result of encoding or of the restored audio signal obtained as a result of decoding, the device may further include a storage unit 3040 that stores the audio bitstream or the restored audio signal. The multimedia device 3000 may also further include a microphone 3050 or a speaker 3060. Here, the encoding module 3020 and the decoding module 3030 may be integrated with other components (not shown) of the multimedia device 3000 and implemented as at least one processor (not shown).

The components illustrated in FIG. 30 overlap with those of the multimedia device 2800 illustrated in FIG. 28 or the multimedia device 2900 illustrated in FIG. 29, so their detailed description is omitted.

The multimedia devices 2800, 2900, and 3000 illustrated in FIGS. 28 to 30 may include, but are not limited to: voice-communication-only terminals such as telephones and mobile phones; broadcast-only or music-only devices such as TVs (televisions) and MP3 players; hybrid terminals combining a voice-communication-only terminal with a broadcast-only or music-only device; and user terminals of teleconferencing or interaction systems. The multimedia devices 2800, 2900, and 3000 may also be used as a client, a server, or a converter placed between a client and a server.

When the multimedia device 2800, 2900, or 3000 is, for example, a mobile phone, it may further include, although not shown, a user input unit such as a keypad, a display unit that displays a user interface or information processed by the mobile phone, and a processor that controls the overall functions of the mobile phone. The mobile phone may also further include a camera unit with an imaging function and at least one component that performs a function required by the mobile phone.

When the multimedia device 2800, 2900, or 3000 is, for example, a TV, it may further include, although not shown, a user input unit such as a keypad, a display unit that displays received broadcast information, and a processor that controls the overall functions of the TV. The TV may also further include at least one component that performs a function required by the TV.

FIG. 31 is a flowchart illustrating the operation of a method of encoding the fine structure of a spectrum according to an embodiment. Referring to FIG. 31, in step 3110, an encoding scheme is selected. Information on each band and the bit allocation information are used for this purpose. Here, the encoding scheme may include a quantization scheme.

In step 3130, it is determined whether the current band is a band whose bit allocation is zero, that is, a zero band. If it is a zero band, the process proceeds to step 3150; if it is a non-zero band, the process proceeds to step 3170.

In step 3150, all samples in the zero band may be encoded as zero.

In step 3170, bands other than zero bands may be encoded based on the selected quantization scheme. According to an embodiment, the band length and the bit allocation information may be used to estimate the number of pulses per band, determine the number of non-zero positions, estimate the number of bits required for the non-zero positions, and determine the final number of pulses. Next, an initial scaling factor is determined based on the number of pulses per band and the absolute values of the input signal, and the scaling factor may be updated through scaling with the initial scaling factor and a pulse redistribution process. The spectral coefficients are scaled with the finally updated scaling factor, and appropriate ISCs are selected from the scaled spectral coefficients. The spectral components to be quantized are selected based on the bit allocation information of each band. Next, the magnitudes of the collected ISCs are quantized by the joint USQ and TCQ scheme and arithmetically encoded. Here, the non-zero positions and the number of ISCs are used to improve the efficiency of the arithmetic coding. The joint USQ and TCQ scheme has a first joint scheme and a second joint scheme depending on the bandwidth. In the first joint scheme, quantizer selection uses a secondary bit allocation process for surplus bits from the previous band; it is used for narrowband (NB) and wideband (WB). In the second joint scheme, for a band determined to use USQ, TCQ is used for the LSB and USQ for the remaining bits; it may be used for super-wideband (SWB) and fullband (FB). Meanwhile, the sign information of the selected ISCs is arithmetically encoded with equal probability for the positive and negative signs.
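As a worked note on the equal-probability sign coding: an ideal arithmetic coder spends $-\log_2 p$ bits on a symbol of probability $p$. Assuming exactly $p = 1/2$ for each sign and ignoring coder termination overhead, the cost is one bit per sign, so a band with $N_{\mathrm{ISC}}$ selected ISCs spends about $N_{\mathrm{ISC}}$ bits on signs. The symbols $b_{\mathrm{sign}}$, $B_{\mathrm{sign}}$, and $N_{\mathrm{ISC}}$ are introduced here for illustration only and do not appear in the source.

\[
b_{\mathrm{sign}} = -\log_2 \tfrac{1}{2} = 1 \text{ bit}, \qquad
B_{\mathrm{sign}} = N_{\mathrm{ISC}} \cdot b_{\mathrm{sign}} = N_{\mathrm{ISC}} \text{ bits}.
\]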

After step 3170, the method may additionally include restoring the quantized components and inverse-scaling the band. To restore the actual quantized components of each band, the position, sign, and magnitude information may be added to the quantized components. Zero is assigned to zero positions. The inverse scaling factor may be derived from the same scaling factor that was used for scaling, and inverse scaling may be performed on the restored quantized components. The inverse-scaled signal may have the same level as the normalized spectrum, that is, the input signal.

Where necessary, the operations of the components of the encoding device described above may be added to the steps of FIG. 31.
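The control flow of FIG. 31 can be sketched as a loop over bands. This is only an outline under stated assumptions: `encode_bands` is a hypothetical name, and the scheme selector and the non-zero band coder are passed in as callables because their internals (pulse estimation, scaling, ISC selection, quantization, arithmetic coding) are outside the scope of the sketch.

```python
from typing import Callable, List, Sequence, Tuple

Band = Sequence[float]


def encode_bands(
    bands: Sequence[Band],
    bits_per_band: Sequence[int],
    select_scheme: Callable[[int, int], str],
    encode_nonzero: Callable[[str, Band, int], object],
) -> List[Tuple[str, object]]:
    """Walk the bands in the order of steps 3110 to 3170."""
    results: List[Tuple[str, object]] = []
    for coeffs, bits in zip(bands, bits_per_band):
        # Step 3110: choose the scheme from band information and allocated bits.
        scheme = select_scheme(len(coeffs), bits)
        # Step 3130: a band with zero allocated bits is a zero band.
        if bits == 0:
            # Step 3150: every sample of a zero band is coded as zero.
            results.append(("zero", [0] * len(coeffs)))
        else:
            # Step 3170: code the band with the selected scheme.
            results.append((scheme, encode_nonzero(scheme, coeffs, bits)))
    return results
```

Passing the two callables keeps the branch structure visible without committing to any particular quantizer.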

FIG. 32 is a flowchart illustrating the operation of a method of decoding the fine structure of a spectrum according to an embodiment. According to the method of FIG. 32, to dequantize the fine structure of the normalized spectrum, the ISCs and the information on the selected ISCs are decoded for each band in terms of position, number, sign, and magnitude. Here, the magnitude information is decoded by arithmetic decoding together with the joint USQ and TCQ scheme, and the position, number, and sign information is decoded by arithmetic decoding.

Specifically, referring to FIG. 32, in step 3210, a decoding scheme is selected. Information on each band and the bit allocation information are used for this purpose. Here, the decoding scheme may include an inverse quantization scheme. The inverse quantization scheme is selected through the same process as the quantization scheme selection applied in the encoding device described above.
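Because the selection depends only on information that both sides already share, the encoder and the decoder can run one and the same selection function and no scheme flag has to be transmitted. The sketch below illustrates that idea. The function name, the use of average bits per spectral component as the criterion (taken from the wording of claim 1), the threshold value 0.75, and the direction of the comparison are all assumptions made for illustration, not values stated in this section.

```python
def select_scheme(band_size: int, allocated_bits: int,
                  threshold: float = 0.75) -> str:
    """Pick a scheme from data known to both encoder and decoder.

    `threshold` and the USQ/TCQ direction are hypothetical placeholders.
    """
    if band_size <= 0:
        raise ValueError("band_size must be positive")
    if allocated_bits == 0:
        return "zero"  # zero band: nothing is transmitted for it
    avg_bits = allocated_bits / band_size  # average bits per component
    return "USQ" if avg_bits >= threshold else "TCQ"


# Encoder and decoder call the same function with the same inputs,
# so their choices cannot diverge.
encoder_choice = select_scheme(band_size=8, allocated_bits=4)
decoder_choice = select_scheme(band_size=8, allocated_bits=4)
assert encoder_choice == decoder_choice
```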

In step 3230, it is determined whether the current band is a band whose bit allocation is zero, that is, a zero band. If it is a zero band, the process proceeds to step 3250; if it is a non-zero band, the process proceeds to step 3270.

In step 3250, all samples in the zero band may be decoded as zero.

In step 3270, bands other than zero bands may be decoded based on the selected inverse quantization scheme. According to an embodiment, the number of pulses per band may be estimated or determined using the band length and the bit allocation information. This is carried out through the same process as the scaling applied in the encoding device described above. Next, the position information of the ISCs, that is, their number and positions, may be restored. This is processed similarly to the encoding device, and the same probability values are used for correct decoding. Next, the magnitudes of the collected ISCs are decoded by arithmetic decoding and dequantized by the joint USQ and TCQ scheme. Here, the non-zero positions and the number of ISCs are used for the arithmetic decoding. The joint USQ and TCQ scheme has a first joint scheme and a second joint scheme depending on the bandwidth. In the first joint scheme, quantizer selection additionally uses a secondary bit allocation process for surplus bits from the previous band; it is used for NB and WB. In the second joint scheme, for a band determined to use USQ, TCQ is used for the LSB and USQ for the remaining bits; it may be used for SWB and FB. Meanwhile, the sign information of the selected ISCs is arithmetically decoded with equal probability for the positive and negative signs.
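The second joint scheme described above handles the LSB and the remaining bits with different quantizers. The sketch below shows only the bit split and its recombination, not TCQ or USQ themselves. It assumes that magnitudes are non-negative integers and that "LSB" refers to the lowest bit of that integer; the function names are hypothetical.

```python
from typing import Tuple


def split_magnitude(mag: int) -> Tuple[int, int]:
    """Split a magnitude into (remaining upper bits, LSB)."""
    if mag < 0:
        raise ValueError("magnitude must be non-negative")
    return mag >> 1, mag & 1


def join_magnitude(upper: int, lsb: int) -> int:
    """Recombine the two parts produced by split_magnitude."""
    if lsb not in (0, 1):
        raise ValueError("lsb must be 0 or 1")
    return (upper << 1) | lsb
```

In this picture the `lsb` part would go to the TCQ path and the `upper` part to the USQ path, and the decoder would recombine them after both have been recovered.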

After step 3270, the method may additionally include restoring the quantized components and inverse-scaling the band. To restore the actual quantized components of each band, the position, sign, and magnitude information may be added to the quantized components. Bands for which no data is transmitted are filled with zeros. Next, the number of pulses in each non-zero band is estimated, and the position information, including the number and positions of the ISCs, is decoded based on the estimated number of pulses. The magnitude information is decoded by lossless decoding together with the joint USQ and TCQ scheme. For non-zero magnitude values, the signs and the quantized components are finally restored. Inverse scaling is then performed on the restored quantized components using the transmitted norm information.

Where necessary, the operations of the components of the decoding device described above may be added to the steps of FIG. 32.

The embodiments described above can be written as computer-executable programs and implemented in a general-purpose digital computer that runs the programs using a computer-readable recording medium. In addition, the data structures, program instructions, or data files used in the embodiments may be recorded on a computer-readable recording medium by various means. The computer-readable recording medium may include every kind of storage device that stores data readable by a computer system. Examples of computer-readable recording media include magnetic media such as hard disks, floppy (registered trademark) disks, and magnetic tape; optical media such as CD-ROM (compact disc read-only memory) and DVD (digital versatile disc); magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM (random access memory), and flash memory. The computer-readable recording medium may also be a transmission medium that carries signals designating program instructions, data structures, and the like. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like.

Although embodiments of the present invention have been described with reference to limited embodiments and drawings, the invention is not limited to the embodiments described above, and those of ordinary skill in the art to which the invention pertains may make various modifications and variations from this description. Accordingly, the scope of the invention is defined not by the foregoing description but by the claims, and all modifications equal or equivalent to the claims fall within the technical idea of the invention.

Claims

1. A spectrum encoding method comprising:
selecting an encoding scheme based on bits allocated to a band;
encoding spectral components of the band to zero if the selected encoding scheme is a zero encoding scheme; and
if the selected encoding scheme is not the zero encoding scheme, encoding magnitudes of important spectral components using one of uniform scalar quantization (USQ) and trellis coded quantization (TCQ), according to the average number of bits allocated to the spectral components of the band.

2. The spectrum encoding method according to claim 1, further comprising encoding the number, positions, and signs of the important spectral components if the selected encoding scheme is not the zero encoding scheme.

3. The spectrum encoding method according to claim 2, wherein the magnitudes of the important spectral components are encoded by a scheme different from the scheme used for the number, positions, and signs of the important spectral components.

4. The spectrum encoding method according to claim 1, further comprising:
scaling a normalized spectrum based on the bits allocated to the band if the selected encoding scheme is not the zero encoding scheme; and
selecting the important spectral components from the scaled spectrum.