JP6872056B2

JP6872056B2 - Audio decoding device and audio decoding method

Info

Publication number: JP6872056B2
Application number: JP2020070269A
Authority: JP
Inventors: 菊入　圭; 山口　貴史
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2020-04-09
Filing date: 2020-04-09
Publication date: 2021-05-19
Anticipated expiration: 2034-03-24
Also published as: JP2020109538A

Description

The present invention relates to an audio decoding device and an audio decoding method.

Audio coding technology, which compresses the data volume of speech and audio signals to one part in several tens, is extremely important for signal transmission and storage. A widely used example is transform coding, which encodes the signal in the frequency domain.

In transform coding, adaptive bit allocation, which assigns the bits needed for coding to each frequency band according to the input signal, is widely used to obtain high quality at a low bit rate. The bit allocation that minimizes coding distortion is one that follows the signal power of each frequency band; allocation that additionally takes human auditory perception into account is also used.
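As a rough illustration of power-driven bit allocation, the following sketch gives each bit in turn to the band with the largest remaining distortion. The function name, the band powers, and the rule that one extra bit cuts distortion to a quarter (about 6 dB) are assumptions made for this example; they are not taken from the patent.

```python
def allocate_bits(band_power, total_bits):
    """Greedy allocation sketch (hypothetical helper, not from the patent).

    Each bit goes to the band whose remaining distortion is largest.
    Assumption: one additional bit reduces that band's distortion to 1/4.
    """
    distortion = list(band_power)
    bits = [0] * len(band_power)
    for _ in range(total_bits):
        # Pick the band that currently contributes the most distortion.
        k = max(range(len(distortion)), key=lambda i: distortion[i])
        bits[k] += 1
        distortion[k] /= 4.0
    return bits


# Illustrative band powers: strong low band, weak high band.
bits = allocate_bits([100.0, 10.0, 1.0, 0.1], total_bits=8)
print(bits)
```

With these illustrative powers the weakest band receives no bits at all, which is the situation the techniques described next are meant to address.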

On the other hand, there are techniques for improving the quality of frequency bands to which very few bits are allocated. Patent Document 1 discloses a method of approximating the transform coefficients of a frequency band whose allocated bit count is below a predetermined threshold by the transform coefficients of another frequency band. Patent Document 2 discloses, for components quantized to zero because their power within the frequency band is small, a method of generating a pseudo-noise signal and a method of copying the signal of components in other frequency bands that were not quantized to zero.

Furthermore, because the power of speech and audio signals is generally concentrated in the low frequency band rather than the high frequency band, and the low band has the larger effect on subjective quality, band extension techniques that generate the high frequency band of the input signal from the encoded low frequency band are also widely used. Band extension can generate the high frequency band with a small number of bits, so high quality can be obtained at a low bit rate. Patent Document 3 discloses a method of generating the high frequency band by copying the low-frequency-band spectrum to the high frequency band and then adjusting the spectral shape based on information, transmitted from the encoder, about the properties of the high-frequency-band spectrum.
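The copy-then-adjust idea behind band extension can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and a single transmitted target energy stands in for the "information about the properties of the high-frequency-band spectrum"; actual band extension schemes transmit richer side information.

```python
import math


def extend_band(low_spectrum, target_energy):
    """Copy the low band upward and rescale it to a transmitted energy.

    Hypothetical helper: target_energy is assumed to be side information
    sent by the encoder for the high band.
    """
    source_energy = sum(c * c for c in low_spectrum)
    if source_energy == 0.0:
        # Nothing to copy from; leave the high band empty.
        return [0.0] * len(low_spectrum)
    gain = math.sqrt(target_energy / source_energy)
    return [gain * c for c in low_spectrum]


low = [3.0, -4.0, 0.0, 0.0]                   # decoded low-band coefficients
high = extend_band(low, target_energy=1.0)    # generated high band
full = low + high                             # spectrum covering both bands
```

The generated band matches the target energy in the frequency domain, but nothing in this step controls how that energy is distributed over time.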

Patent Document 1: Japanese Unexamined Patent Publication No. 9-153811
Patent Document 2: U.S. Patent No. 7447631
Patent Document 3: Japanese Patent No. 5203077

In the techniques above, the components of a frequency band encoded with a small number of bits are generated so that they resemble the corresponding components of the original sound in the frequency domain. In the time domain, however, distortion can become noticeable and degrade quality.

In view of this problem, an object of the present invention is to provide an audio decoding device and an audio decoding method that can reduce the time-domain distortion of frequency-band components encoded with a small number of bits and thereby improve quality.

The audio decoding device of the present invention decodes an encoded audio signal and outputs an audio signal. It comprises a decoding unit that decodes a coded sequence containing the encoded audio signal to obtain a decoded signal, and a selective time envelope shaping unit that shapes the time envelope of a frequency band of the decoded signal based on decoding-related information concerning the decoding of the coded sequence. For some frequency bands, the decoding unit obtains the decoded signal by copying a signal from a different frequency band, and the selective time envelope shaping unit replaces, in the frequency domain, the decoded signal corresponding to frequency bands whose time envelope is not shaped with another signal.

According to the present invention, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped into a desired time envelope, improving quality.

The drawings show the following.
A diagram showing the configuration of the audio decoding device 10 according to the first embodiment.
A flowchart showing the operation of the audio decoding device 10 according to the first embodiment.
A diagram showing the configuration of the first example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A flowchart showing the operation of the first example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A diagram showing the configuration of the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A flowchart showing the operation of the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A diagram showing the configuration of the first decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A flowchart showing the operation of the first decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A diagram showing the configuration of the second decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A flowchart showing the operation of the second decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.
A diagram showing the configuration of the first example of the selective time envelope shaping unit 10b of the audio decoding device 10 according to the first embodiment.
A flowchart showing the operation of the first example of the selective time envelope shaping unit 10b of the audio decoding device 10 according to the first embodiment.
An explanatory diagram showing the time envelope shaping process.
A diagram showing the configuration of the audio decoding device 11 according to the second embodiment.
A flowchart showing the operation of the audio decoding device 11 according to the second embodiment.
A diagram showing the configuration of the audio encoding device 21 according to the second embodiment.
A flowchart showing the operation of the audio encoding device 21 according to the second embodiment.
A diagram showing the configuration of the audio decoding device 12 according to the third embodiment.
A flowchart showing the operation of the audio decoding device 12 according to the third embodiment.
A diagram showing the configuration of the audio decoding device 13 according to the fourth embodiment.
A flowchart showing the operation of the audio decoding device 13 according to the fourth embodiment.
A diagram showing the hardware configuration of a computer that functions as the audio decoding device or audio encoding device of the embodiments.
A diagram showing the program configuration for functioning as the audio decoding device.
A diagram showing the program configuration for functioning as the audio encoding device.

Embodiments of the present invention will be described with reference to the accompanying drawings. Where possible, identical parts are given identical reference numerals and duplicate description is omitted.
[First Embodiment]
FIG. 1 is a diagram showing the configuration of the audio decoding device 10 according to the first embodiment. The communication device of the audio decoding device 10 receives a coded sequence in which an audio signal is encoded, and outputs the decoded audio signal to the outside. As shown in FIG. 1, the audio decoding device 10 functionally comprises a decoding unit 10a and a selective time envelope shaping unit 10b.

FIG. 2 is a flowchart showing the operation of the audio decoding device 10 according to the first embodiment.

The decoding unit 10a decodes the coded sequence and generates a decoded signal (step S10-1).

The selective time envelope shaping unit 10b receives from the decoding unit the decoded signal and the decoding-related information, which is information obtained when the coded sequence is decoded, and selectively shapes the time envelope of components of the decoded signal into a desired time envelope (step S10-2). In the following description, the time envelope of a signal denotes the variation of the signal's energy or power (or an equivalent parameter) along the time direction.
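To make the definition of the time envelope concrete, the sketch below measures the energy of a signal in consecutive time slots. The function name and the slot length are assumptions chosen for illustration; the text only requires that the envelope describe how energy or power varies over time.

```python
def time_envelope(signal, slot_len):
    """Energy per consecutive time slot (hypothetical helper).

    slot_len is an illustrative choice; the last slot may be shorter
    if len(signal) is not a multiple of slot_len.
    """
    return [
        sum(x * x for x in signal[i:i + slot_len])
        for i in range(0, len(signal), slot_len)
    ]


envelope = time_envelope([1, 1, 2, 2, 0, 0], slot_len=2)
print(envelope)  # [2, 8, 0]
```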

FIG. 3 is a diagram showing the configuration of the first example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment. As shown in FIG. 3, the decoding unit 10a functionally comprises a decoding/inverse quantization unit 10aA, a decoding-related information output unit 10aB, and an inverse time-frequency transform unit 10aC.

FIG. 4 is a flowchart showing the operation of the first example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.

The decoding/inverse quantization unit 10aA performs at least one of decoding and inverse quantization on the coded sequence, according to the coding scheme of the coded sequence, to generate a frequency-domain decoded signal (step S10-1-1).

The decoding-related information output unit 10aB receives the decoding-related information obtained when the decoding/inverse quantization unit 10aA generates the decoded signal, and outputs it (step S10-1-2). It may also receive and analyze the coded sequence to obtain the decoding-related information and output that. The decoding-related information may be, for example, the number of coding bits per frequency band, or equivalent information (for example, the average number of coding bits per frequency component in each frequency band). It may be the number of coding bits per frequency component, the quantization step size per frequency band, or the quantized values of the frequency components. Here, a frequency component is, for example, a transform coefficient of a predetermined time-frequency transform. It may also be the energy or power per frequency band, or information indicating a predetermined frequency band (or frequency component). Further, when generation of the decoded signal includes other processing related to time envelope shaping, the decoding-related information may be information about that processing, for example at least one of: whether the processing is performed, information about the time envelope shaped by the processing, and the strength of its time envelope shaping. At least one of the above examples is output as the decoding-related information.
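One way to picture the decoding-related information is as a record in which every field is optional and at least one is filled in. The class and field names below are hypothetical, chosen only to mirror the examples listed above; the patent does not prescribe any data layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DecodingRelatedInfo:
    """Hypothetical container; each field mirrors one example in the text."""
    bits_per_band: Optional[List[int]] = None          # coding bits per band
    bits_per_component: Optional[List[int]] = None     # coding bits per component
    quant_step_per_band: Optional[List[float]] = None  # quantization step sizes
    quantized_values: Optional[List[int]] = None       # quantized coefficients
    energy_per_band: Optional[List[float]] = None      # energy or power per band
    indicated_bands: Optional[List[int]] = None        # explicitly indicated bands
    other_shaping_applied: Optional[bool] = None       # other envelope shaping used?

    def available(self) -> List[str]:
        """Names of the fields that were actually provided."""
        return [name for name, value in vars(self).items() if value is not None]


info = DecodingRelatedInfo(bits_per_band=[4, 3, 1, 0])
print(info.available())
```

Keeping every field optional reflects the statement that at least one, not all, of the listed items is output.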

The inverse time-frequency transform unit 10aC converts the frequency-domain decoded signal into a time-domain decoded signal by a predetermined inverse time-frequency transform and outputs it (step S10-1-3). However, the frequency-domain decoded signal may be output without the inverse transform, for example when the selective time envelope shaping unit 10b requires a frequency-domain signal as its input.

FIG. 5 is a diagram showing the configuration of the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment. As shown in FIG. 5, the decoding unit 10a functionally comprises a coded sequence analysis unit 10aD, a first decoding unit 10aE, and a second decoding unit 10aF.

FIG. 6 is a flowchart showing the operation of the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.

The coded sequence analysis unit 10aD analyzes the coded sequence and separates it into a first coded sequence and a second coded sequence (step S10-1-4).

The first decoding unit 10aE decodes the first coded sequence by a first decoding scheme to generate a first decoded signal, and outputs first decoding-related information, which is information about that decoding (step S10-1-5).

The second decoding unit 10aF uses the first decoded signal to decode the second coded sequence by a second decoding scheme and generate the decoded signal, and outputs second decoding-related information, which is information about that decoding (step S10-1-6). In this example, the first decoding-related information and the second decoding-related information together constitute the decoding-related information.

FIG. 7 is a diagram showing the configuration of the first decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment. As shown in FIG. 7, the first decoding unit 10aE functionally comprises a first decoding/inverse quantization unit 10aE-a and a first decoding-related information output unit 10aE-b.

FIG. 8 is a flowchart showing the operation of the first decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.

The first decoding/inverse quantization unit 10aE-a performs at least one of decoding and inverse quantization on the first coded sequence, according to the coding scheme of the first coded sequence, to generate and output the first decoded signal (step S10-1-5-1).

The first decoding-related information output unit 10aE-b receives the first decoding-related information obtained when the first decoding/inverse quantization unit 10aE-a generates the first decoded signal, and outputs it (step S10-1-5-2). It may also receive and analyze the first coded sequence to obtain the first decoding-related information and output that. Examples of the first decoding-related information may be the same as the examples of the decoding-related information output by the decoding-related information output unit 10aB. The fact that the decoding scheme of the first decoding unit is the first decoding scheme may also serve as first decoding-related information, as may information indicating the frequency bands (or frequency components) contained in the first decoded signal, that is, the frequency bands (or frequency components) of the audio signal encoded in the first coded sequence.

FIG. 9 is a diagram showing the configuration of the second decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment. As shown in FIG. 9, the second decoding unit 10aF functionally comprises a second decoding/inverse quantization unit 10aF-a, a second decoding-related information output unit 10aF-b, and a decoded signal synthesis unit 10aF-c.

FIG. 10 is a flowchart showing the operation of the second decoding unit in the second example of the decoding unit 10a of the audio decoding device 10 according to the first embodiment.

The second decoding/inverse quantization unit 10aF-a performs at least one of decoding and inverse quantization on the second coded sequence, according to the coding scheme of the second coded sequence, to generate and output a second decoded signal (step S10-1-6-1). The first decoded signal may be used in generating the second decoded signal. The decoding scheme of the second decoding unit (the second decoding scheme) may be a band extension scheme, including one that uses the first decoded signal. It may also be, as shown in Patent Document 1 (Japanese Unexamined Patent Publication No. 9-153811), a decoding scheme corresponding to a second coding scheme that approximates the transform coefficients of a frequency band, to which the first coding scheme allocated fewer bits than a predetermined threshold, by the transform coefficients of another frequency band. It may also be, as shown in Patent Document 2 (U.S. Patent No. 7447631), a decoding scheme corresponding to a second coding scheme that, for frequency components quantized to zero by the first coding scheme, generates a pseudo-noise signal or copies the signal of other frequency components. It may further be a decoding scheme corresponding to a second coding scheme that approximates those frequency components using the signal of other frequency components. Frequency components quantized to zero by the first coding scheme can be interpreted as frequency components not encoded by the first coding scheme. In these cases, the decoding scheme corresponding to the first coding scheme may be the first decoding scheme, used by the first decoding unit, and the decoding scheme corresponding to the second coding scheme may be the second decoding scheme, used by the second decoding unit.
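The handling of components quantized to zero can be sketched as below: each zero coefficient is either copied from another frequency component or replaced by pseudo-noise. The function name, the fixed copy offset, and the uniform noise distribution are assumptions for illustration and are not details of the cited patent documents.

```python
import random


def fill_zero_components(coeffs, noise_level, rng, copy_offset=None):
    """Fill coefficients that the first coding scheme quantized to zero.

    Hypothetical helper. If copy_offset is given and the component that
    many bins below is non-zero, that component is copied; otherwise
    uniform pseudo-noise in [-noise_level, noise_level] is generated.
    """
    filled = []
    for k, c in enumerate(coeffs):
        if c != 0.0:
            filled.append(c)                         # coded component: keep
        elif (copy_offset is not None and k - copy_offset >= 0
              and coeffs[k - copy_offset] != 0.0):
            filled.append(coeffs[k - copy_offset])   # copy from another component
        else:
            filled.append(rng.uniform(-noise_level, noise_level))  # pseudo-noise
    return filled


copied = fill_zero_components([2.0, -1.0, 0.0, 0.0], 0.1, random.Random(0), copy_offset=2)
noisy = fill_zero_components([2.0, 0.0], 0.1, random.Random(0))
```

Which branch was taken for each component is exactly the kind of fact that can be reported as second decoding-related information.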

The second decoding-related information output unit 10aF-b receives the second decoding-related information obtained when the second decoding/inverse quantization unit 10aF-a generates the second decoded signal, and outputs it (step S10-1-6-2). It may also receive and analyze the second coded sequence to obtain the second decoding-related information and output that. Examples of the second decoding-related information may be the same as the examples of the decoding-related information output by the decoding-related information output unit 10aB.

Information indicating that the decoding scheme of the second decoding unit is the second decoding scheme may also serve as second decoding-related information. For example, it may be information indicating that the second decoding scheme is a band extension scheme. It may be information indicating the band extension method used for each frequency band of the second decoded signal generated by band extension, for example that the signal was copied from another frequency band, that the signal at that frequency was approximated by the signal of another frequency band, that a pseudo-noise signal was generated, or that a sinusoidal signal was added. When the signal at that frequency is approximated by the signal of another frequency band, it may be information about the approximation method; if whitening is used in the approximation, information about the whitening strength; and if a pseudo-noise signal is added in the approximation, information about the level of the pseudo-noise signal. When a pseudo-noise signal is generated, it may likewise be information about the level of that pseudo-noise signal.

Further, for example, the second decoding-related information may be information indicating that the second decoding scheme corresponds to a coding scheme in which the transform coefficients of a frequency band, to which the first coding scheme allocated fewer bits than a predetermined threshold, are approximated by the transform coefficients of another frequency band, have pseudo-noise transform coefficients added to (or substituted for) them, or both. For example, it may be information about the method used to approximate the transform coefficients of that frequency band. When the approximation method whitens the transform coefficients of the other frequency band, it may be information about the whitening strength. It may also be information about the level of the pseudo-noise signal.

Further, for example, the second decoding-related information may be information indicating that the second coding scheme generates a pseudo-noise signal or copies the signal of other frequency components for frequency components quantized to zero by the first coding scheme (that is, not encoded by the first coding scheme). For example, it may be information indicating, for each frequency component, whether that component was quantized to zero by the first coding scheme. It may be information indicating whether a pseudo-noise signal is generated for the component or the signal of another frequency component is copied. When the signal of another frequency component is copied, it may be information about the copying method, for example the source frequency, whether processing is applied to the source frequency component during copying, and what that processing is. When the processing applied to the source frequency component is whitening, it may be information about the whitening strength; when it is the addition of a pseudo-noise signal, it may be information about the level of the pseudo-noise signal.

The decoded signal synthesis unit 10aF-c synthesizes the decoded signal from the first decoded signal and the second decoded signal and outputs it (step S10-1-6-3). When the second coding scheme is a band extension scheme, the first decoded signal is generally a low-frequency-band signal and the second decoded signal a high-frequency-band signal, so the decoded signal contains both frequency bands.

FIG. 11 is a diagram showing the configuration of the first example of the selective time envelope shaping unit 10b of the audio decoding device 10 according to the first embodiment. As shown in FIG. 11, the selective time envelope shaping unit 10b functionally comprises a time-frequency transform unit 10bA, a frequency selection unit 10bB, a frequency-selective time envelope shaping unit 10bC, and an inverse time-frequency transform unit 10bD.

FIG. 12 is a flowchart showing the operation of the first example of the selective time envelope shaping unit 10b of the audio decoding device 10 according to the first embodiment.

The time-frequency transform unit 10bA converts the time-domain decoded signal into a frequency-domain decoded signal by a predetermined time-frequency transform (step S10-2-1). When the decoded signal is already a frequency-domain signal, the time-frequency transform unit 10bA and step S10-2-1 can be omitted.

The frequency selection unit 10bB uses at least one of the frequency-domain decoded signal and the decoding-related information to select the frequency band of the frequency-domain decoded signal to which the time envelope shaping process is applied (step S10-2-2). The frequency selection process may instead select frequency components to be subjected to the time envelope shaping process. The selected frequency band (or frequency components) may be a part of the frequency bands (or frequency components) of the decoded signal, or all of them.

For example, when the decoding-related information is the number of coded bits for each frequency band, a frequency band whose number of coded bits is smaller than a predetermined threshold may be selected as a frequency band to which the time envelope shaping process is applied. When the information is equivalent to the number of coded bits for each frequency band, the frequency band to be shaped can clearly be selected in the same way, by comparison with a predetermined threshold. Further, for example, when the decoding-related information is the number of coded bits for each frequency component, a frequency component whose number of coded bits is smaller than a predetermined threshold may be selected as a frequency component to be shaped. For example, a frequency component whose transform coefficient is not coded may be selected as a frequency component to be shaped. Further, for example, when the decoding-related information is the quantization step size for each frequency band, a frequency band whose quantization step size is larger than a predetermined threshold may be selected as a frequency band to be shaped. Further, for example, when the decoding-related information is the quantized value of a frequency component, the frequency band to be shaped may be selected by comparing the quantized value with a predetermined threshold. For example, a component whose quantized transform coefficient is smaller than a predetermined threshold may be selected as a frequency component to be shaped. Further, for example, when the decoding-related information is the energy or power of each frequency band, the frequency band to be shaped may be selected by comparing the energy or power with a predetermined threshold. For example, when the energy or power of a frequency band that is a candidate for selective time envelope shaping is smaller than a predetermined threshold, the time envelope shaping process may be skipped for that frequency band.
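The threshold comparisons described above can be sketched in a few lines of Python. This is only an illustration: the function names, the list-based representation of per-band side information, and the example thresholds are assumptions and do not come from the text.

```python
def select_bands_by_bits(bits_per_band, threshold):
    """Indices of bands whose coded bit count is below the threshold."""
    return [k for k, bits in enumerate(bits_per_band) if bits < threshold]


def select_bands_by_step_size(step_sizes, threshold):
    """Indices of bands whose quantization step size exceeds the threshold."""
    return [k for k, step in enumerate(step_sizes) if step > threshold]


# Hypothetical decoding-related information for five bands.
bits = [24, 3, 0, 17, 2]
steps = [0.5, 2.0, 4.0, 0.7, 3.0]

shaped_by_bits = select_bands_by_bits(bits, threshold=4)
shaped_by_step = select_bands_by_step_size(steps, threshold=1.0)
```

Because both criteria return plain index lists, combining them (for example by intersecting or uniting the two lists) is straightforward.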

Further, for example, when the decoding-related information is information on another time envelope shaping process, a frequency band to which that other time envelope shaping process is not applied may be selected as a frequency band to which the time envelope shaping process of the present invention is applied.

Further, for example, when the decoding unit 10a has the configuration described in the second example of the decoding unit 10a and the decoding-related information is the coding scheme of the second decoding unit, the frequency band decoded by the second decoding unit may be selected, depending on that coding scheme, as the frequency band to which the time envelope shaping process is applied. For example, when the coding scheme of the second decoding unit is a bandwidth extension scheme, the frequency band decoded by the second decoding unit may be selected as the frequency band to be shaped. The same applies when the coding scheme of the second decoding unit is a bandwidth extension scheme in the time domain, and when it is a bandwidth extension scheme in the frequency domain. For example, a frequency band into which a signal was copied from another frequency band by the bandwidth extension scheme may be selected as the frequency band to be shaped. For example, a frequency band whose signal was approximated by the bandwidth extension scheme using the signal of another frequency band may be selected as the frequency band to be shaped. For example, a frequency band in which a pseudo-noise signal was generated by the bandwidth extension scheme may be selected as the frequency band to be shaped. For example, the frequency bands other than those to which a sinusoidal signal was added by the bandwidth extension scheme may be selected as the frequency bands to be shaped.

Further, for example, suppose that the decoding unit 10a has the configuration described in the second example of the decoding unit 10a, and that the second coding scheme obtains the transform coefficients of a frequency band or component to which the first coding scheme allocated fewer bits than a predetermined threshold (or which the first coding scheme did not code) by either or both of approximation using the transform coefficients of another frequency band or component and addition (or substitution) of the transform coefficients of a pseudo-noise signal. In that case, a frequency band or component whose transform coefficients were approximated using the transform coefficients of another frequency band or component may be selected as the frequency band or component to which the time envelope shaping process is applied. For example, a frequency band or component to which the transform coefficients of a pseudo-noise signal were added (or substituted) may be selected as the frequency band or component to be shaped. For example, the frequency band or component to be shaped may be selected according to the approximation method used when approximating the transform coefficients with those of another frequency band or component. For example, when the approximation method whitens the transform coefficients of the other frequency band or component, the frequency band or component to be shaped may be selected according to the intensity of the whitening. For example, when the transform coefficients of a pseudo-noise signal are added (or substituted), the frequency band or component to be shaped may be selected according to the level of the pseudo-noise signal.

Further, for example, suppose that the decoding unit 10a has the configuration described in the second example of the decoding unit 10a, and that the second coding scheme, for frequency components quantized to zero by the first coding scheme (that is, components not coded by the first coding scheme), generates a pseudo-noise signal or copies the signal of another frequency component (or approximates it using the signal of another frequency component). In that case, a frequency component for which a pseudo-noise signal was generated may be selected as a frequency component to which the time envelope shaping process is applied. For example, a frequency component into which the signal of another frequency component was copied (or which was approximated using the signal of another frequency component) may be selected as a frequency component to be shaped. For example, when the signal of another frequency component is copied into the frequency component (or used to approximate it), the frequency component to be shaped may be selected according to the frequency of the copy source (approximation source). For example, the frequency component to be shaped may be selected according to whether processing is applied to the frequency component of the copy source at the time of copying. For example, the frequency component to be shaped may be selected according to the processing applied to the frequency component of the copy source (approximation source) at the time of copying (or approximation). For example, when the processing applied to the frequency component of the copy source (approximation source) is whitening, the frequency component to be shaped may be selected according to the intensity of the whitening. For example, the frequency component to be shaped may be selected according to the approximation method used.

The methods of selecting frequency components or frequency bands described above may be combined. It suffices to select, using at least one of the frequency-domain decoded signal and the decoding-related information, the frequency components or bands of the frequency-domain decoded signal to which the time envelope shaping process is applied; the selection method is not limited to the above examples.

The frequency-selective time envelope shaping unit 10bC shapes the time envelope of the frequency band of the decoded signal selected by the frequency selection unit 10bB into a desired time envelope (step S10-2-3). The time envelope shaping may be performed in units of frequency components.

The time envelope may be shaped, for example, by filtering the transform coefficients of the selected frequency band with a linear prediction inverse filter that uses linear prediction coefficients obtained by linear prediction analysis of those transform coefficients, which flattens the time envelope. The transfer function A(z) of the linear prediction inverse filter represents the response of that filter in a discrete-time system and is expressed in terms of the prediction order p and the linear prediction coefficients α_i (i = 1, ..., p). As another example, the transform coefficients of the selected frequency band may be filtered with a linear prediction filter that uses the same linear prediction coefficients, which makes the time envelope rising and/or falling. The transfer function of the linear prediction filter is expressed in terms of the same coefficients.

In the time envelope shaping process that uses the linear prediction coefficients, a bandwidth expansion factor ρ may be used to adjust the strength with which the time envelope is flattened or made rising and/or falling.
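For reference, one common way to write the two transfer functions and the effect of the bandwidth expansion factor is given below. The sign convention (prediction error written as the input plus a weighted sum of past inputs) and the weighting of the i-th coefficient by ρ^i are assumptions based on standard linear prediction practice, not forms confirmed by this text.

```latex
% Linear prediction inverse filter (flattens the time envelope)
A(z) = 1 + \sum_{i=1}^{p} \alpha_i z^{-i}

% Linear prediction filter (imposes a rising and/or falling envelope)
\frac{1}{A(z)} = \frac{1}{1 + \sum_{i=1}^{p} \alpha_i z^{-i}}

% Bandwidth expansion with factor \rho, 0 \le \rho \le 1
A(z/\rho) = 1 + \sum_{i=1}^{p} \alpha_i \rho^{i} z^{-i}
```

Under these assumptions, ρ = 0 reduces both filters to the identity (no shaping) and ρ = 1 applies the filters at full strength.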

The above examples may be applied not only to transform coefficients obtained by time-frequency transforming the decoded signal, but also to the subsamples, at an arbitrary time t, of subband signals obtained by converting the decoded signal into the frequency domain with a filter bank. In the above examples, applying filtering based on linear prediction analysis to the decoded signal in the frequency domain changes the distribution of the decoded signal's power in the time domain and thereby shapes the time envelope.
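As a rough illustration of the frequency-domain linear prediction described above, the following Python sketch estimates prediction coefficients from the transform coefficients of a selected band and applies the inverse filter across frequency. The function names, the default order, the autocorrelation and Levinson-Durbin estimation, and the weighting of each coefficient by `rho ** i` to model the bandwidth expansion factor are illustrative assumptions, not details given in the text.

```python
def autocorrelation(x, order):
    """Autocorrelation r[0..order] of the coefficient sequence x."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Coefficients a[0..order] (a[0] == 1) minimising the prediction error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable input: stop early
            break
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err  # reflection coefficient
        a = ([1.0] + [a[i] + k * a[m - i] for i in range(1, m)]
             + [k] + [0.0] * (order - m))
        err *= 1.0 - k * k
    return a


def flatten_time_envelope(coeffs, order=2, rho=1.0):
    """Filter coeffs (along frequency) with the bandwidth-expanded inverse filter."""
    a = levinson_durbin(autocorrelation(coeffs, order), order)
    a = [a_i * rho ** i for i, a_i in enumerate(a)]  # rho = 0 disables shaping
    return [sum(a[i] * coeffs[k - i] for i in range(order + 1) if k - i >= 0)
            for k in range(len(coeffs))]


# Hypothetical transform coefficients of one selected band.
band = [0.5 ** n for n in range(16)]
residual = flatten_time_envelope(band, order=2, rho=1.0)
```

Intermediate values of `rho` give partial shaping, which is one way the strength adjustment mentioned above could be realised.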

Further, for example, the amplitudes of the subband signals obtained by converting the decoded signal into the frequency domain with a filter bank may be set, within an arbitrary time segment, to the average amplitude of the frequency component (or frequency band) to be shaped, thereby flattening the time envelope. This flattens the time envelope while preserving the energy that the frequency component (or frequency band) had in that time segment before the time envelope shaping process. Similarly, the time envelope may be made rising or falling by changing the amplitudes of the subband signals while preserving the energy that the frequency component (or frequency band) had in that time segment before the time envelope shaping process.
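The amplitude-based flattening can be sketched as follows. The sketch assumes complex subband samples and uses the root-mean-square magnitude of the segment as the "average amplitude", a choice made here so that the segment energy is preserved exactly; the function name and the data layout are likewise assumptions.

```python
import cmath
import math


def flatten_segment(samples):
    """Give every sample in the segment the same magnitude, keeping its phase.

    samples: complex subband samples of one band within one time segment.
    """
    if not samples:
        return []
    rms = math.sqrt(sum(abs(s) ** 2 for s in samples) / len(samples))
    return [cmath.rect(rms, cmath.phase(s)) for s in samples]


# Hypothetical segment with a strong transient in the first two samples.
segment = [complex(3, 0), complex(0, 4), complex(0, 0)]
flat = flatten_segment(segment)
```

A rising or falling envelope could be produced the same way by multiplying the common magnitude by a ramp and renormalising so that the segment energy is unchanged.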

Further, for example, as shown in FIG. 13, consider a frequency band that contains frequency components or frequency bands not selected by the frequency selection unit 10bB for time envelope shaping (referred to as non-selected frequency components or non-selected frequency bands). In such a band, the transform coefficients (or subsamples) of the non-selected frequency components (or non-selected frequency bands) of the decoded signal may be replaced with other values, the time envelope shaping process may then be applied by one of the above methods, and the transform coefficients (or subsamples) of the non-selected frequency components (or non-selected frequency bands) may afterwards be restored to their original values. In this way, the time envelope shaping process is applied to the frequency components (frequency bands) other than the non-selected ones.

As a result, even when scattered non-selected frequency components (or non-selected frequency bands) would split the frequency components (or frequency bands) to be shaped into many small pieces, those pieces can be processed together in a single time envelope shaping operation, which reduces the amount of computation. For example, in the time envelope shaping method that uses linear prediction analysis, linear prediction analysis would otherwise have to be performed separately on each of the finely divided frequency components (or frequency bands) to be shaped. With the above approach, a single linear prediction analysis covers the divided frequency components (or frequency bands) together with the non-selected frequency components (or non-selected frequency bands), and the filtering with the linear prediction inverse filter (or the linear prediction filter) can likewise be done in a single pass over all of them, so the process can be realized with a low amount of computation.
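The substitute, shape, and restore sequence described above can be sketched as a small wrapper. The names `shape_selected_only`, `substitute`, and `shape` are hypothetical, and the doubling "shaper" in the example merely stands in for a real time envelope shaping routine.

```python
def shape_selected_only(coeffs, selected, substitute, shape):
    """Shape a whole band in one pass, then restore the non-selected entries.

    coeffs:     transform coefficients (or subsamples) of the band
    selected:   list of bool, True where shaping should take effect
    substitute: function (coeffs, k) -> temporary value for non-selected index k
    shape:      function list -> list applying the time envelope shaping
    """
    work = [c if sel else substitute(coeffs, k)
            for k, (c, sel) in enumerate(zip(coeffs, selected))]
    shaped = shape(work)  # single analysis and filtering pass over the band
    return [s if sel else c for s, c, sel in zip(shaped, coeffs, selected)]


# Toy example: index 1 is non-selected and must come back unchanged.
result = shape_selected_only(
    [1.0, 2.0, 3.0, 4.0],
    [True, False, True, True],
    substitute=lambda c, k: 0.0,
    shape=lambda w: [2.0 * v for v in w],
)
```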

When replacing the transform coefficients (or subsamples) of the non-selected frequency components (or non-selected frequency bands), the amplitude of each such coefficient (or subsample) may be replaced, for example, with the average amplitude taken over that coefficient (or subsample) and those of neighboring frequency components (or frequency bands). In that case, for example, the sign of the original transform coefficient may be kept, and the phase of the original subsample may be kept. Further, for example, suppose it has been decided to apply the time envelope shaping process to frequency components (or frequency bands) whose transform coefficients (or subsamples) were not quantized or coded and were instead generated by copying or approximation from the transform coefficients (or subsamples) of other frequency components (or frequency bands), and/or by generation or addition of a pseudo-noise signal, and/or by addition of a sinusoidal signal. In that case, the transform coefficients (or subsamples) of the non-selected frequency components (or non-selected frequency bands) may be replaced with transform coefficients (or subsamples) generated in the same manner, that is, by pseudo copying or approximation from other frequency components (or frequency bands), and/or generation or addition of a pseudo-noise signal, and/or addition of a sinusoidal signal. The above methods of shaping the time envelope of the selected frequency band may be combined, and the time envelope shaping method is not limited to the above examples.
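A minimal sketch of the neighborhood-average replacement for real-valued transform coefficients is shown below. The function name, the `radius` parameter that defines the neighborhood, and the handling of band edges are assumptions made for illustration.

```python
import math


def neighbor_mean_substitute(coeffs, k, radius=1):
    """Replace |coeffs[k]| by the mean amplitude of its neighborhood, keeping the sign."""
    lo = max(0, k - radius)
    hi = min(len(coeffs), k + radius + 1)
    mean_amp = sum(abs(c) for c in coeffs[lo:hi]) / (hi - lo)
    return math.copysign(mean_amp, coeffs[k])


# Index 1 is a non-selected coefficient between two selected ones.
value = neighbor_mean_substitute([4.0, -1.0, 1.0], 1)
```

For complex subsamples, the same idea would keep the phase instead of the sign.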

The inverse time-frequency transform unit 10bD transforms the decoded signal, to which time envelope shaping has been applied frequency-selectively, into a time-domain signal and outputs it (step S10-2-4).

[Second Embodiment]

FIG. 14 is a diagram showing the configuration of the audio decoding device 11 according to the second embodiment. The communication device of the audio decoding device 11 receives an encoded sequence obtained by encoding an audio signal, and outputs the decoded audio signal to the outside. As shown in FIG. 14, the audio decoding device 11 functionally includes a demultiplexing unit 11a, a decoding unit 10a, and a selective time envelope shaping unit 11b.

FIG. 15 is a flowchart showing the operation of the audio decoding device 11 according to the second embodiment.

The demultiplexing unit 11a separates the received encoded sequence into an encoded sequence from which the decoded signal is obtained by decoding and inverse quantization, and time envelope information (step S11-1). The decoding unit 10a decodes the encoded sequence and generates a decoded signal (step S10-1). When the time envelope information has been encoded and/or quantized, it is decoded and/or inversely quantized to obtain the time envelope information.

The time envelope information may be, for example, information indicating that the time envelope of the input signal encoded by the encoding device is flat. It may also be, for example, information indicating that the time envelope of the input signal is rising, or information indicating that the time envelope of the input signal is falling.

Furthermore, the time envelope information may be, for example, information indicating the degree of flatness of the time envelope of the input signal, information indicating the degree to which the time envelope of the input signal rises, or information indicating the degree to which the time envelope of the input signal falls.

Furthermore, the time envelope information may be, for example, information indicating whether or not the selective time envelope shaping unit is to shape the time envelope.

The selective time envelope shaping unit 11b receives, from the decoding unit 10a, the decoded signal and the decoding-related information, which is information obtained when the encoded sequence is decoded, and receives the time envelope information from the demultiplexing unit. Based on at least one of these, it selectively shapes the time envelope of components of the decoded signal into a desired time envelope (step S11-2).

The selective time envelope shaping in the selective time envelope shaping unit 11b may be performed, for example, in the same way as in the selective time envelope shaping unit 10b, and may additionally take the time envelope information into account. For example, when the time envelope information indicates that the time envelope of the input signal encoded by the encoding device is flat, the time envelope may be shaped to be flat based on that information. For example, when the time envelope information indicates that the time envelope of the input signal is rising, the time envelope may be shaped to be rising based on that information. For example, when the time envelope information indicates that the time envelope of the input signal is falling, the time envelope may be shaped to be falling based on that information.

Further, for example, when the time envelope information indicates the degree of flatness of the time envelope of the input signal, the strength with which the time envelope is flattened may be adjusted based on that information. For example, when the time envelope information indicates the degree to which the time envelope of the input signal rises, the strength with which the time envelope is made rising may be adjusted based on that information. For example, when the time envelope information indicates the degree to which the time envelope of the input signal falls, the strength with which the time envelope is made falling may be adjusted based on that information.

Further, for example, when the time envelope information indicates whether or not the selective time envelope shaping unit 11b is to shape the time envelope, whether to apply the time envelope shaping process may be decided based on that information.

Further, for example, when the time envelope shaping process is applied based on the time envelope information of the above examples, the frequency band (or frequency components) to be shaped may be selected in the same way as in the first embodiment, and the time envelope of the selected frequency band (or frequency components) of the decoded signal may be shaped into the desired time envelope.
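One way the decoder-side handling of the time envelope information could be organised is sketched below. The enumeration values, the `degree` parameter, and its clamping to the range 0 to 1 are hypothetical; the text does not define a concrete syntax for the time envelope information.

```python
from enum import Enum


class EnvelopeInfo(Enum):
    NO_SHAPING = 0  # flag: do not shape the time envelope
    FLAT = 1        # input envelope was flat
    RISING = 2      # input envelope was rising
    FALLING = 3     # input envelope was falling


def choose_shaping(info, degree=1.0):
    """Map time envelope information to (target shape, strength), or None to skip."""
    if info is EnvelopeInfo.NO_SHAPING:
        return None
    strength = min(max(degree, 0.0), 1.0)  # assumed normalised strength
    return (info.name.lower(), strength)


decision = choose_shaping(EnvelopeInfo.FLAT, degree=0.8)
```

The returned strength could, for instance, be used as the bandwidth expansion factor of a linear-prediction-based shaper, applied only to the bands chosen by the frequency selection step.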

FIG. 16 is a diagram showing the configuration of the audio encoding device 21 according to the second embodiment. The communication device of the audio encoding device 21 receives the audio signal to be encoded from the outside, and outputs the resulting encoded sequence to the outside. As shown in FIG. 16, the audio encoding device 21 functionally includes an encoding unit 21a, a time envelope information encoding unit 21b, and a multiplexing unit 21c.

FIG. 17 is a flowchart showing the operation of the audio encoding device 21 according to the second embodiment.

The encoding unit 21a encodes the input audio signal and generates an encoded sequence (step S21-1). The coding scheme used for the audio signal in the encoding unit 21a is the one corresponding to the decoding scheme of the decoding unit 10a.

The time envelope information encoding unit 21b generates time envelope information from at least one of the input audio signal and the information obtained when the encoding unit 21a encodes the audio signal. The generated time envelope information may be encoded and/or quantized (step S21-2). The time envelope information may be, for example, the time envelope information obtained by the demultiplexing unit 11a of the audio decoding device 11.

Further, for example, when the decoding unit of the audio decoding device 11 performs, in generating the decoded signal, a time envelope shaping process different from that of the present invention, and the audio encoding device 21 holds information on that process, the time envelope information may be generated using that information. For example, information indicating whether or not the selective time envelope shaping unit 11b of the audio decoding device 11 is to shape the time envelope may be generated based on whether or not that other time envelope process is performed.

さらに例えば、前記音声復号装置１１の選択的時間包絡整形部１１ｂでは、前記第１の実施形態に係る音声復号装置１０の選択的時間包絡整形部１０ｂの第１の例に記載の線形予測分析を用いた時間包絡整形の処理を施す場合には、当該時間包絡整形処理での線形予測分析と同様に、入力された音声信号の変換係数（サブバンドサンプルでもよい）を線形予測分析した結果を用いて時間包絡情報を生成してもよい。具体的には、例えば、当該線形予測分析による予測利得を算出し、当該予測利得に基づいて時間包絡情報を生成してもよい。予測利得の算出の際には、入力された音声信号のすべての周波数帯域の変換係数（サブバンドサンプルでもよい）を線形予測分析してもよく、さらには入力された音声信号の一部の周波数帯域の変換係数（サブバンドサンプルでもよい）を線形予測分析してもよい。さらには、入力された音声信号を複数の周波数帯域に分割して当該周波数帯域ごとに変換係数（サブバンドサンプルでもよい）の線形予測分析をしてもよく、その際には複数の予測利得が算出でき、当該複数の予測利得を用いて時間包絡情報を生成してもよい。 Further, for example, when the selective time envelope shaping unit 11b of the voice decoding device 11 performs the time envelope shaping using linear prediction analysis described in the first example of the selective time envelope shaping unit 10b of the voice decoding device 10 according to the first embodiment, the time envelope information may be generated using the result of linear prediction analysis of the transform coefficients (or subband samples) of the input audio signal, in the same manner as the linear prediction analysis in that time envelope shaping process. Specifically, for example, the prediction gain of the linear prediction analysis may be calculated, and the time envelope information may be generated based on that prediction gain. When calculating the prediction gain, the linear prediction analysis may be applied to the transform coefficients (or subband samples) of all frequency bands of the input audio signal, or to those of only some of its frequency bands. Furthermore, the input audio signal may be divided into a plurality of frequency bands and the linear prediction analysis applied to the transform coefficients (or subband samples) of each band; in that case a plurality of prediction gains are obtained, and the time envelope information may be generated using those prediction gains.
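
The prediction-gain computation described above can be sketched as follows. This is a minimal illustration, not the patented procedure: the function names, the prediction order, the threshold, and the rule "a gain above the threshold marks the band's time envelope as not flat" are all assumptions introduced here. The only idea taken from the text is that linear prediction analysis is run over the transform coefficients of each frequency band and that the resulting prediction gains drive the time envelope information.

```python
from typing import List, Sequence, Tuple


def autocorrelation(x: Sequence[float], max_lag: int) -> List[float]:
    """Autocorrelation r[0..max_lag] of the coefficient sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r: Sequence[float], order: int) -> Tuple[List[float], float]:
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1.0."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # nothing left to predict
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a, err


def prediction_gain(coeffs: Sequence[float], order: int = 4) -> float:
    """Ratio of signal energy to prediction-residual energy (1.0 = no gain)."""
    r = autocorrelation(coeffs, order)
    if r[0] <= 0.0:
        return 1.0  # all-zero band: treat as unpredictable
    _, err = levinson_durbin(r, order)
    return r[0] / max(err, 1e-12)


def time_envelope_flags(coeffs: Sequence[float],
                        bands: Sequence[Tuple[int, int]],
                        order: int = 4,
                        threshold: float = 2.0) -> List[bool]:
    """One flag per band; True means the gain exceeds the (assumed) threshold."""
    return [prediction_gain(coeffs[lo:hi], order) > threshold for lo, hi in bands]
```

Passing a single band covering all coefficients corresponds to the "all frequency bands" case, while passing several `(start, end)` pairs corresponds to the per-band case that yields a plurality of prediction gains.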

さらに例えば、前記符号化部２１ａにて音声信号を符号化する際に得られる情報は、復号部１０ａが前記第２の例の構成の場合、第１の復号方式に対応する符号化方式（第１の符号化方式）での符号化の際に得られる情報と第２の復号方式に対応する符号化方式（第２の符号化方式）での符号化の際に得られる情報のうち少なくとも１つであってもよい。 Further, for example, when the decoding unit 10a has the configuration of the second example, the information obtained when the coding unit 21a encodes the audio signal may be at least one of the information obtained when encoding with the coding method corresponding to the first decoding method (the first coding method) and the information obtained when encoding with the coding method corresponding to the second decoding method (the second coding method).

多重化部２１ｃは、前記符号化部で得られた符号化系列と前記時間包絡情報符号化部で得られた時間包絡情報を多重化し出力する（ステップＳ２１-３）。 The multiplexing unit 21c multiplexes and outputs the coded sequence obtained by the coding unit and the time envelope information obtained by the time envelope information coding unit (step S21-3).

［第３の実施形態］ [Third Embodiment]

図１８は、第３の実施形態に係る音声復号装置１２の構成を示す図である。音声復号装置１２の通信装置は、音声信号を符号化した符号化系列を受信し、更に、復号した音声信号を外部に出力する。音声復号装置１２は、図１８に示すように、機能的には、復号部１０ａ、時間包絡整形部１２ａを備える。 FIG. 18 is a diagram showing the configuration of the voice decoding device 12 according to the third embodiment. The communication device of the voice decoding device 12 receives a coded sequence in which an audio signal is encoded, and outputs the decoded audio signal to the outside. As shown in FIG. 18, the voice decoding device 12 functionally includes a decoding unit 10a and a time envelope shaping unit 12a.

図１９は、第３の実施形態に係る音声復号装置１２の動作を示すフローチャートである。復号部１０ａは、符号化系列を復号し、復号信号を生成する（ステップＳ１０-１）。そして、時間包絡整形部１２ａは、前記復号部１０ａから出力される復号信号の時間包絡を所望の時間包絡に整形する（ステップＳ１２-１）。時間包絡の整形方法は、前記第１の実施形態と同様に、復号信号の変換係数を線形予測分析して得られた線形予測係数を用いた線形予測逆フィルタでフィルタリングすることで、時間包絡を平坦にする方法でもよく、当該線形予測係数を用いた線形予測フィルタでフィルタリングすることで、時間包絡を立ち上がりまたは/及び立ち下がりにする方法であってもよく、さらに帯域幅拡大率を用いて平坦/立ち上がり/立ち下がりの強度を制御してもよく、さらには復号信号の変換係数の代わりに復号信号をフィルタバンクによって周波数領域の信号に変換して得られるサブバンド信号の任意の時間tにおけるサブサンプルに対して上記の例の時間包絡整形を施してもよい。さらには、前記第１の実施形態と同様に、任意の時間セグメントにおいて、所望の時間包絡になるように、当該サブバンド信号の振幅を修正してもよく、例えば、時間包絡整形処理を施す周波数成分（または、周波数帯域）の平均振幅にすることにより時間包絡を平坦にしてもよい。上記の時間包絡整形は復号信号の全周波数帯域に施してもよく、所定の周波数帯域に施してもよい。 FIG. 19 is a flowchart showing the operation of the voice decoding device 12 according to the third embodiment. The decoding unit 10a decodes the coded sequence and generates a decoded signal (step S10-1). The time envelope shaping unit 12a then shapes the time envelope of the decoded signal output from the decoding unit 10a into a desired time envelope (step S12-1). As in the first embodiment, the time envelope may be flattened by filtering with a linear prediction inverse filter that uses the linear prediction coefficients obtained by linear prediction analysis of the transform coefficients of the decoded signal, or it may be made rising and/or falling by filtering with a linear prediction filter that uses those linear prediction coefficients. The strength of the flattening, rising, or falling may further be controlled using a bandwidth expansion factor. Instead of the transform coefficients of the decoded signal, the above time envelope shaping may be applied to the subsamples, at an arbitrary time t, of the subband signals obtained by converting the decoded signal into a frequency-domain signal with a filter bank. Furthermore, as in the first embodiment, the amplitude of the subband signals may be modified so as to obtain the desired time envelope in an arbitrary time segment; for example, the time envelope may be flattened by setting the amplitude to the average amplitude of the frequency components (or frequency bands) subjected to the time envelope shaping. The above time envelope shaping may be applied to the entire frequency band of the decoded signal or to a predetermined frequency band.

［第４の実施形態］ [Fourth Embodiment]

図２０は、第４の実施形態に係る音声復号装置１３の構成を示す図である。音声復号装置１３の通信装置は、音声信号を符号化した符号化系列を受信し、更に、復号した音声信号を外部に出力する。音声復号装置１３は、図２０に示すように、機能的には、逆多重化部１１ａ、復号部１０ａ、時間包絡整形部１３ａを備える。 FIG. 20 is a diagram showing the configuration of the voice decoding device 13 according to the fourth embodiment. The communication device of the voice decoding device 13 receives a coded sequence in which an audio signal is encoded, and outputs the decoded audio signal to the outside. As shown in FIG. 20, the voice decoding device 13 functionally includes a demultiplexing unit 11a, a decoding unit 10a, and a time envelope shaping unit 13a.

図２１は、第４の実施形態に係る音声復号装置１３の動作を示すフローチャートである。逆多重化部１１ａは、符号化系列を復号/逆量子化して復号信号を得る符号化系列と時間包絡情報とに分離し（ステップＳ１１-１）、復号部１０ａは、符号化系列を復号し、復号信号を生成する（ステップＳ１０-１）。そして、時間包絡整形部１３ａは、逆多重化部１１ａより時間包絡情報を受け取り、当該時間包絡情報に基づいて、復号部１０ａから出力される復号信号の時間包絡を所望の時間包絡に整形する（ステップＳ１３-１）。 FIG. 21 is a flowchart showing the operation of the voice decoding device 13 according to the fourth embodiment. The demultiplexing unit 11a separates its input into the coded sequence, which is decoded/inverse-quantized to obtain the decoded signal, and the time envelope information (step S11-1), and the decoding unit 10a decodes the coded sequence and generates a decoded signal (step S10-1). The time envelope shaping unit 13a then receives the time envelope information from the demultiplexing unit 11a and, based on that information, shapes the time envelope of the decoded signal output from the decoding unit 10a into a desired time envelope (step S13-1).

当該時間包絡情報は、前記第２の実施形態と同様に、符号化装置にて符号化した入力信号の時間包絡が平坦であることを示す情報、当該入力信号の時間包絡が立ち上がりであることを示す情報、当該入力信号の時間包絡が立ち下がりであることを示す情報であってもよく、さらには、例えば、当該入力信号の時間包絡の平坦の度合いを示す情報、当該入力信号の時間包絡の立ち上がりの度合いを示す情報、当該入力信号の時間包絡の立ち下がりの度合いを示す情報であってもよく、さらには、時間包絡整形部１３ａにて時間包絡を整形するか否かを示す情報であってもよい。 As in the second embodiment, the time envelope information may be information indicating that the time envelope of the input signal encoded by the coding device is flat, information indicating that it is rising, or information indicating that it is falling. It may also be, for example, information indicating the degree of flatness, the degree of rise, or the degree of fall of the time envelope of the input signal, or information indicating whether or not the time envelope shaping unit 13a shapes the time envelope.

［ハードウェア構成］ [Hardware configuration]

上述の音声復号装置１０，１１、１２、１３および音声符号化装置２１はそれぞれ、ＣＰＵ等のハードウェアから構成されているものである。図１１は、音声復号装置１０，１１、１２、１３および音声符号化装置２１それぞれのハードウェア構成の一例を示す図である。音声復号装置１０，１１、１２、１３および音声符号化装置２１はそれぞれ、物理的には、図１１に示すように、ＣＰＵ１００、主記憶装置であるＲＡＭ１０１及びＲＯＭ１０２、ディスプレイ等の入出力装置１０３、通信モジュール１０４、及び補助記憶装置１０５などを含むコンピュータシステムとして構成されている。 The above-mentioned voice decoding devices 10, 11, 12, 13 and the voice coding device 21 are each composed of hardware such as a CPU. FIG. 11 is a diagram showing an example of the hardware configuration of each of the voice decoding devices 10, 11, 12, 13 and the voice coding device 21. As shown in FIG. 11, each of the voice decoding devices 10, 11, 12, 13 and the voice coding device 21 is physically configured as a computer system including a CPU 100, a RAM 101 and a ROM 102 as main storage devices, an input/output device 103 such as a display, a communication module 104, an auxiliary storage device 105, and the like.

音声復号装置１０，１１、１２、１３および音声符号化装置２１はそれぞれの各機能ブロックの機能はそれぞれ、図２２に示すＣＰＵ１００、ＲＡＭ１０１等のハードウェア上に所定のコンピュータソフトウェアを読み込ませることにより、ＣＰＵ１００の制御のもとで入出力装置１０３、通信モジュール１０４、及び補助記憶装置１０５を動作させるとともに、ＲＡＭ１０１におけるデータの読み出し及び書き込みを行うことで実現される。 The functions of the functional blocks of the voice decoding devices 10, 11, 12, 13 and the voice coding device 21 are realized by loading predetermined computer software onto hardware such as the CPU 100 and the RAM 101 shown in FIG. 22, thereby operating the input/output device 103, the communication module 104, and the auxiliary storage device 105 under the control of the CPU 100, and by reading and writing data in the RAM 101.

［プログラム構成］ [Program structure]

引き続いて、上述した音声復号装置１０，１１、１２、１３および音声符号化装置２１はそれぞれによる処理をコンピュータに実行させるための音声復号プログラム５０及び音声符号化プログラム６０を説明する。 Next, the voice decoding program 50 and the voice coding program 60, which cause a computer to execute the processing of the voice decoding devices 10, 11, 12, 13 and the voice coding device 21 described above, respectively, will be described.

図２３に示すように、音声復号プログラム５０は、コンピュータに挿入されてアクセスされる、あるいはコンピュータが備える記録媒体４０に形成されたプログラム格納領域４１内に格納される。より具体的には、音声復号プログラム５０は、音声復号装置１０が備える記録媒体４０に形成されたプログラム格納領域４１内に格納される。 As shown in FIG. 23, the voice decoding program 50 is inserted into a computer and accessed, or is stored in a program storage area 41 formed in a recording medium 40 included in the computer. More specifically, the voice decoding program 50 is stored in the program storage area 41 formed in the recording medium 40 included in the voice decoding device 10.

音声復号プログラム５０は、復号モジュール５０ａ、選択的時間包絡整形モジュール５０ｂを実行させることにより実現される機能は、上述した音声復号装置１０の復号部１０ａ、選択的時間包絡整形部１０ｂの機能とそれぞれ同様である。さらに、復号モジュール５０ａは、復号／逆量子化部１０ａＡ、復号関連情報出力部１０ａＢ、および時間周波数逆変換部１０ａＣとして機能するためのモジュールを備える。また、復号モジュール５０ａは、符号化系列解析部１０ａＤ、第１復号部１０ａＥ、第２復号部１０ａＦとして機能するためのモジュールを備えるようにしてもよい。 The functions realized by executing the decoding module 50a and the selective time envelope shaping module 50b of the voice decoding program 50 are the same as those of the decoding unit 10a and the selective time envelope shaping unit 10b of the voice decoding device 10 described above, respectively. The decoding module 50a further includes modules for functioning as the decoding/inverse quantization unit 10aA, the decoding-related information output unit 10aB, and the time-frequency inverse transform unit 10aC. The decoding module 50a may also include modules for functioning as the coded sequence analysis unit 10aD, the first decoding unit 10aE, and the second decoding unit 10aF.

また、選択的時間包絡整形モジュール５０ｂは、時間周波数変換部１０ｂＡ、周波数選択部１０ｂＢ、周波数選択的時間包絡整形部１０ｂＣ、時間周波数逆変換部１０ｂＤとして機能するためのモジュールを備える。 Further, the selective time envelope shaping module 50b includes modules for functioning as the time-frequency transform unit 10bA, the frequency selection unit 10bB, the frequency-selective time envelope shaping unit 10bC, and the time-frequency inverse transform unit 10bD.

また、音声復号プログラム５０は、上述音声復号装置１１と機能するために、逆多重化部１１ａ、復号部１０ａ、選択的時間包絡整形部１１ｂとして機能するためのモジュールを備える。 Further, in order to function as the voice decoding device 11 described above, the voice decoding program 50 includes modules for functioning as the demultiplexing unit 11a, the decoding unit 10a, and the selective time envelope shaping unit 11b.

また、音声復号プログラム５０は、上述音声復号装置１２として機能するために、復号部１０ａ、時間包絡整形部１２ａとして機能するためのモジュールを備える。 Further, the voice decoding program 50 includes a module for functioning as a decoding unit 10a and a time envelope shaping unit 12a in order to function as the above-mentioned voice decoding device 12.

また、音声復号プログラム５０は、音声復号装置１３として機能するために、逆多重化部１１ａ、復号部１０ａ、時間包絡整形部１３ａとして機能するためのモジュールを備える。 Further, the voice decoding program 50 includes a module for functioning as a demultiplexing unit 11a, a decoding unit 10a, and a time envelope shaping unit 13a in order to function as the voice decoding device 13.

また、図２４に示すように、音声符号化プログラム６０は、コンピュータに挿入されてアクセスされる、あるいはコンピュータが備える記録媒体４０に形成されたプログラム格納領域４１内に格納される。より具体的には、音声符号化プログラム６０は、音声符号化装置２０が備える記録媒体４０に形成されたプログラム格納領域４１内に格納される。 Further, as shown in FIG. 24, the voice coding program 60 is inserted into a computer and accessed, or is stored in a program storage area 41 formed in a recording medium 40 included in the computer. More specifically, the voice coding program 60 is stored in the program storage area 41 formed on the recording medium 40 included in the voice coding device 20.

音声符号化プログラム６０は、符号化モジュール６０ａ、時間包絡情報符号化モジュール６０ｂ、及び多重化モジュール６０ｃを備えて構成される。符号化モジュール６０ａ、時間包絡情報符号化モジュール６０ｂ、及び多重化モジュール６０ｃを実行させることにより実現される機能は、上述した音声符号化装置２１の符号化部２１ａ、時間包絡情報符号化部２１ｂ、及び多重化部２１ｃの機能とそれぞれ同様である。 The voice coding program 60 includes a coding module 60a, a time envelope information coding module 60b, and a multiplexing module 60c. The functions realized by executing the coding module 60a, the time envelope information coding module 60b, and the multiplexing module 60c are the same as those of the coding unit 21a, the time envelope information coding unit 21b, and the multiplexing unit 21c of the voice coding device 21 described above, respectively.

なお、音声復号プログラム５０及び音声符号化プログラム６０それぞれは、その一部若しくは全部が、通信回線等の伝送媒体を介して伝送され、他の機器により受信されて記録（インストールを含む）される構成としてもよい。また、音声復号プログラム５０及び音声符号化プログラム６０それぞれの各モジュールは、１つのコンピュータでなく、複数のコンピュータのいずれかにインストールされてもよい。その場合、当該複数のコンピュータによるコンピュータシステムよって上述した音声復号プログラム５０及び音声符号化プログラム６０それぞれの処理が行われる。 A part or all of the voice decoding program 50 and the voice coding program 60 are transmitted via a transmission medium such as a communication line, and are received and recorded (including installation) by another device. May be. Further, each module of the voice decoding program 50 and the voice coding program 60 may be installed on any of a plurality of computers instead of one computer. In that case, each of the above-mentioned voice decoding program 50 and voice coding program 60 is processed by the computer system by the plurality of computers.

本実施形態における音声復号装置および音声符号化装置の一側面について以下の通り明記する。 One aspect of the voice decoding device and the voice coding device in the present embodiment will be specified as follows.

本発明の一側面に係る音声復号装置は、符号化された音声信号を復号して音声信号を出力する音声復号装置であって、前記符号化された音声信号を含む符号化系列を復号して復号信号を得る復号部と、前記符号化系列の復号に関する復号関連情報に基づいて、復号信号における周波数帯域の時間包絡を整形する選択的時間包絡整形部と、を備える。信号の時間包絡は、時間方向に対する信号のエネルギーまたはパワー（及び、これらと等価のパラメータ）の変動を表す。本構成により、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The voice decoding device according to one aspect of the present invention is a voice decoding device that decodes a coded audio signal and outputs an audio signal, and includes a decoding unit that decodes a coded sequence including the coded audio signal to obtain a decoded signal, and a selective time envelope shaping unit that shapes the time envelope of a frequency band in the decoded signal based on decoding-related information regarding the decoding of the coded sequence. The time envelope of a signal represents the variation of the energy or power of the signal (or of parameters equivalent to these) in the time direction. With this configuration, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped into a desired time envelope, and the quality can be improved.

また、本発明の別の一側面に係る音声復号装置は、符号化された音声信号を復号して音声信号を出力する音声復号装置であって、前記符号化された音声信号を含む符号化系列と当該音声信号の時間包絡に関する時間包絡情報を分離する逆多重化部と、前記符号化系列を復号して復号信号を得る復号部と、前記時間包絡情報と前記符号化系列の復号に関する復号関連情報のうち少なくとも一つに基づいて、復号信号における周波数帯域の時間包絡を整形する選択的時間包絡整形部と、を備える。本構成により、前記音声信号の符号化系列を生成し出力する音声符号化装置にて当該音声符号化装置に入力される音声信号を参照して生成された時間包絡情報に基づき、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 Further, the audio decoding device according to another aspect of the present invention is an audio decoding device that decodes a coded audio signal and outputs an audio signal, and includes a demultiplexing unit that separates a coded sequence including the coded audio signal from time envelope information regarding the time envelope of the audio signal, a decoding unit that decodes the coded sequence to obtain a decoded signal, and a selective time envelope shaping unit that shapes the time envelope of a frequency band in the decoded signal based on at least one of the time envelope information and decoding-related information regarding the decoding of the coded sequence. With this configuration, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped into a desired time envelope, and the quality improved, based on time envelope information that the audio coding device, which generates and outputs the coded sequence of the audio signal, has generated with reference to the audio signal input to it.

復号部は、前記符号化系列を復号または/および逆量子化して周波数領域の復号信号を得る復号・逆量子化部と、前記復号・逆量子化部における復号または/および逆量子化の過程で得られる情報、および前記符号化系列を解析して得られる情報のうち少なくとも一つを復号関連情報として出力する復号関連情報出力部と、前記周波数領域の復号信号を時間領域の信号に変換して出力する時間周波数逆変換部とを備える、こととしてもよい。本構成により、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The decoding unit may include a decoding/inverse quantization unit that decodes and/or inverse-quantizes the coded sequence to obtain a frequency-domain decoded signal; a decoding-related information output unit that outputs, as the decoding-related information, at least one of information obtained in the course of the decoding and/or inverse quantization in the decoding/inverse quantization unit and information obtained by analyzing the coded sequence; and a time-frequency inverse transform unit that converts the frequency-domain decoded signal into a time-domain signal and outputs it. With this configuration, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped into a desired time envelope, and the quality can be improved.

また、復号部は、前記符号化系列を第１符号化系列と第２符号化系列に分離する符号化系列解析部と、前記第１符号化系列を復号または/および逆量子化して第１復号信号を得て前記復号関連情報として第１復号関連情報を得る第１復号部と、前記第２符号化系列と第１復号信号のうち少なくとも一つを用いて第２復号信号を得て出力し、前記復号関連情報として第２復号関連情報を出力する第２復号部とを備える、こととしてもよい。本構成により、複数の復号部により復号されて復号信号が生成される際にも、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 Further, the decoding unit may include a coded sequence analysis unit that separates the coded sequence into a first coded sequence and a second coded sequence; a first decoding unit that decodes and/or inverse-quantizes the first coded sequence to obtain a first decoded signal and obtains first decoding-related information as the decoding-related information; and a second decoding unit that obtains and outputs a second decoded signal using at least one of the second coded sequence and the first decoded signal, and outputs second decoding-related information as the decoding-related information. With this configuration, even when the decoded signal is generated through decoding by a plurality of decoding units, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped into a desired time envelope, and the quality can be improved.

第１復号部は、前記第１符号化系列を復号または/および逆量子化して第１復号信号を得る第１復号・逆量子化部と、前記第１復号・逆量子化部における復号または/および逆量子化の過程で得られる情報、および前記第１符号化系列を解析して得られる情報のうち少なくとも一つを第１復号関連情報として出力する第１復号関連情報出力部とを備える、こととしてもよい。本構成により、複数の復号部により復号されて復号信号が生成される際に、少なくとも第１の復号部に関連する情報に基づいて、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The first decoding unit may include a first decoding/inverse quantization unit that decodes and/or inverse-quantizes the first coded sequence to obtain the first decoded signal, and a first decoding-related information output unit that outputs, as the first decoding-related information, at least one of information obtained in the course of the decoding and/or inverse quantization in the first decoding/inverse quantization unit and information obtained by analyzing the first coded sequence. With this configuration, when the decoded signal is generated through decoding by a plurality of decoding units, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped into a desired time envelope based on at least the information related to the first decoding unit, and the quality can be improved.

第２復号部は、前記第２符号化系列と前記第１復号信号のうち少なくとも１つを用いて第２復号信号を得る第２復号・逆量子化部と、前記第２復号・逆量子化部における第２復号信号を得る過程で得られる情報、および前記第２符号化系列を解析して得られる情報のうち少なくとも一つを第２復号関連情報として出力する第２復号関連情報出力部とを備える、こととしてもよい。本構成により、複数の復号部により復号されて復号信号が生成される際に、少なくとも第２の復号部に関連する情報に基づいて、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The second decoding unit includes a second decoding / dequantization unit that obtains a second decoding signal using at least one of the second coding series and the first decoding signal, and the second decoding / dequantization unit. A second decoding-related information output unit that outputs at least one of the information obtained in the process of obtaining the second decoding signal in the unit and the information obtained by analyzing the second coding sequence as the second decoding-related information. May be provided. According to this configuration, when the decoding signal is generated by being decoded by a plurality of decoding units, the time of the decoding signal in the frequency band encoded with a small number of bits based on at least the information related to the second decoding unit. It is possible to shape the envelope into a desired time envelope and improve the quality.

選択的時間包絡整形部は、前記復号信号を周波数領域の信号に変換する時間・周波数変換部と、前記復号関連情報に基づいて、前記周波数領域の復号信号を各周波数帯域の時間包絡を整形する周波数選択的時間包絡整形部と、前記各周波数帯域の時間包絡を整形された周波数領域の復号信号を時間領域の信号に変換する時間・周波数逆変換部とを備える、こととしてもよい。本構成により、周波数領域において少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The selective time envelope shaping unit may include a time-frequency transform unit that converts the decoded signal into a frequency-domain signal; a frequency-selective time envelope shaping unit that shapes the time envelope of each frequency band of the frequency-domain decoded signal based on the decoding-related information; and a time-frequency inverse transform unit that converts the frequency-domain decoded signal whose time envelope has been shaped in each frequency band into a time-domain signal. With this configuration, the time envelope of the decoded signal in a frequency band encoded with a small number of bits can be shaped in the frequency domain into a desired time envelope, and the quality can be improved.

復号関連情報は、各周波数帯域の符号化ビット数に関連する情報である、こととしてもよい。本構成により、各周波数帯域の符号化ビット数に応じて、当該周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The decoding-related information may be information related to the number of coding bits in each frequency band. With this configuration, it is possible to shape the time envelope of the decoded signal of the frequency band into a desired time envelope according to the number of coded bits of each frequency band, and improve the quality.
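
As a rough illustration of how the number of coded bits per band might drive the selection, the sketch below marks a band for time envelope shaping when its bit budget per transform coefficient falls below a threshold. The function name, the per-coefficient normalisation, and the threshold value are assumptions made for this example; the text only states that the decoding-related information may relate to the number of coded bits in each frequency band.

```python
from typing import List, Sequence


def select_bands_by_bits(bits_per_band: Sequence[int],
                         band_widths: Sequence[int],
                         min_bits_per_coeff: float = 1.0) -> List[bool]:
    """Return one flag per band; True means 'shape this band's time envelope'.

    A band is selected when it was coded with fewer than
    `min_bits_per_coeff` bits per coefficient (hypothetical criterion).
    """
    if len(bits_per_band) != len(band_widths):
        raise ValueError("bits_per_band and band_widths must have equal length")
    flags = []
    for bits, width in zip(bits_per_band, band_widths):
        if width <= 0:
            raise ValueError("band width must be positive")
        flags.append(bits / width < min_bits_per_coeff)
    return flags
```

For example, with band widths of 8 coefficients each, bands coded with 24, 4, and 0 bits would give the flags `[False, True, True]` under the default threshold.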

復号関連情報は、各周波数帯域の量子化ステップに関連する情報であることとしてもよい。本構成により、各周波数帯域の量子化ステップに応じて、当該周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The decoding-related information may be information related to the quantization step of each frequency band. With this configuration, it is possible to shape the time envelope of the decoded signal of the frequency band into a desired time envelope according to the quantization step of each frequency band, and improve the quality.

復号関連情報は、各周波数帯域の符号化方式に関連する情報である、こととしてもよい。本構成により、各周波数帯域の符号化方式に応じて、当該周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The decoding-related information may be information related to the coding method of each frequency band. With this configuration, it is possible to shape the time envelope of the decoded signal of the frequency band into a desired time envelope according to the coding method of each frequency band, and improve the quality.

復号関連情報は、各周波数帯域に注入される雑音成分に関連する情報である、こととしてもよい。本構成により、各周波数帯域に注入される雑音成分に応じて、当該周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 Decoding-related information may be information related to noise components injected into each frequency band. With this configuration, it is possible to shape the time envelope of the decoded signal of the frequency band into a desired time envelope according to the noise component injected into each frequency band, and improve the quality.

周波数選択的時間包絡整形部は、時間包絡を整形する周波数帯域に対応する前記復号信号を、当該復号信号を周波数領域において線形予測分析して得られた線形予測係数を用いたフィルタを用いて所望の時間包絡に整形する、こととしてもよい。本構成により、周波数領域における復号信号を用いて、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The frequency-selective time envelope shaping unit may shape the decoded signal corresponding to the frequency band whose time envelope is to be shaped into a desired time envelope using a filter that uses the linear prediction coefficients obtained by linear prediction analysis of that decoded signal in the frequency domain. With this configuration, the decoded signal in the frequency domain can be used to shape the time envelope of the decoded signal in a frequency band encoded with a small number of bits into a desired time envelope, and the quality can be improved.
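
The filtering described above can be sketched for the flattening case as follows. This is an illustrative sketch under stated assumptions rather than the patented implementation: the function names, the autocorrelation-method linear prediction, the default order, and the use of `rho` as a bandwidth expansion factor (scaling coefficient `a[i]` by `rho**i`, so that `rho = 0` leaves the band unchanged and `rho = 1` applies the full inverse filter) are choices made here for illustration.

```python
from typing import List, Sequence


def lpc(x: Sequence[float], order: int) -> List[float]:
    """Prediction-error filter a (a[0] == 1.0) by the autocorrelation method."""
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # band is silent or perfectly predicted
        k = -(r[m] + sum(a[i] * r[m - i] for i in range(1, m))) / err
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a


def flatten_band(coeffs: Sequence[float], start: int, end: int,
                 order: int = 4, rho: float = 1.0) -> List[float]:
    """Filter coeffs[start:end] along frequency with the LP inverse filter.

    Coefficients outside [start, end) are returned unchanged.
    """
    band = list(coeffs[start:end])
    # Bandwidth expansion: weaken the filter by scaling a[i] with rho**i.
    a = [c * rho ** i for i, c in enumerate(lpc(band, order))]
    out = list(coeffs)
    for k in range(len(band)):
        out[start + k] = sum(a[i] * band[k - i]
                             for i in range(min(k, order) + 1))
    return out
```

Restricting the filter to `[start, end)` mirrors the frequency-selective behaviour: only the band chosen for shaping is modified, and its prediction coefficients are estimated from that band alone.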

周波数選択的時間包絡整形部は、時間包絡を整形しない周波数帯域に対応する前記復号信号を周波数領域において他の信号に置き換えた後、時間包絡を整形する周波数および時間包絡を整形しない周波数に対応する復号信号を、周波数領域において線形予測分析して得られた線形予測係数を用いたフィルタを用いて、周波数領域において前記時間包絡を整形する周波数および時間包絡を整形しない周波数に対応する復号信号をフィルタリング処理することで所望の時間包絡に整形し、時間包絡整形後に、前記時間包絡を整形しない周波数帯域に対応する復号信号は他の信号に置き換える前の元の信号に戻す、こととしてもよい。本構成により、より少ない演算量にて、周波数領域における復号信号を用いて、少ないビット数で符号化された周波数帯域の復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 The frequency-selective time envelope shaping unit may first replace, in the frequency domain, the decoded signal corresponding to the frequency bands whose time envelope is not to be shaped with another signal. It may then filter, in the frequency domain, the decoded signal corresponding to both the frequencies to be shaped and the frequencies not to be shaped, using a filter that uses the linear prediction coefficients obtained by linear prediction analysis of that decoded signal in the frequency domain, thereby shaping it into a desired time envelope. After the time envelope shaping, the decoded signal corresponding to the frequency bands not to be shaped is restored to the original signal it had before the replacement. With this configuration, the decoded signal in the frequency domain can be used, with a smaller amount of computation, to shape the time envelope of the decoded signal in a frequency band encoded with a small number of bits into a desired time envelope, and the quality can be improved.
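
The replace, filter, and restore sequence above can be sketched as below. The helper name `shape_with_restore`, the callable `shape_fn` (standing in for any whole-spectrum shaping filter, such as a linear prediction inverse filter), and the choice of zero as the substitute signal are hypothetical; the text says only that the unshaped bands are replaced with "another signal" and later restored.

```python
from typing import Callable, List, Sequence


def shape_with_restore(coeffs: Sequence[float],
                       shape_mask: Sequence[bool],
                       shape_fn: Callable[[List[float]], List[float]],
                       filler: float = 0.0) -> List[float]:
    """Shape the whole spectrum in one pass, then undo it where not wanted.

    shape_mask[k] is True for coefficients whose time envelope is to be shaped.
    """
    if len(coeffs) != len(shape_mask):
        raise ValueError("coeffs and shape_mask must have equal length")
    original = list(coeffs)
    # 1. Replace coefficients that must not be shaped with the filler signal.
    work = [c if keep else filler for c, keep in zip(original, shape_mask)]
    # 2. Run a single analysis/filtering pass over the full spectrum.
    shaped = shape_fn(work)
    if len(shaped) != len(original):
        raise ValueError("shape_fn must preserve the number of coefficients")
    # 3. Put the original values back where no shaping was requested.
    return [s if keep else o
            for s, o, keep in zip(shaped, original, shape_mask)]
```

The design point is that one filtering pass over the whole spectrum replaces a separate analysis and filter per selected band, which is where the reduction in computation mentioned above would come from.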

また、本発明の別の一側面に係る音声復号装置は、符号化された音声信号を復号して音声信号を出力する音声復号装置であって、前記符号化された音声信号を含む符号化系列を復号して復号信号を得る復号部と、前記復号信号を周波数領域において線形予測分析して得られた線形予測係数を用いたフィルタを用いて、周波数領域において前記復号信号をフィルタリング処理することで所望の時間包絡に整形する時間包絡整形部と、を備える。本構成により、周波数領域における復号信号を用いて、当該少ないビット数で符号化された復号信号の時間包絡を所望の時間包絡に整形し、品質を改善することが可能となる。 Further, the voice decoding device according to another aspect of the present invention is a voice decoding device that decodes a coded audio signal and outputs an audio signal, and includes a decoding unit that decodes a coded sequence including the coded audio signal to obtain a decoded signal, and a time envelope shaping unit that shapes the decoded signal into a desired time envelope by filtering it in the frequency domain with a filter that uses the linear prediction coefficients obtained by linear prediction analysis of the decoded signal in the frequency domain. With this configuration, the decoded signal in the frequency domain can be used to shape the time envelope of a decoded signal encoded with a small number of bits into a desired time envelope, and the quality can be improved.

また、本発明の別の一側面に係る音声符号化装置は、入力される音声信号を符号化して符号化系列を出力する音声符号化装置であって、前記音声信号を符号化して前記音声信号を含む符号化系列を得る符号化部と、前記音声信号の時間包絡に関する情報を符号化する時間包絡情報符号化部と、前記符号化部で得られる符号化系列と、前記時間包絡情報符号化部で得られる時間包絡に関する情報の符号化系列を多重化する多重化部と、を備える。 Further, the voice coding device according to another aspect of the present invention is a voice coding device that encodes an input audio signal and outputs a coded sequence, and includes a coding unit that encodes the audio signal to obtain a coded sequence including the audio signal, a time envelope information coding unit that encodes information regarding the time envelope of the audio signal, and a multiplexing unit that multiplexes the coded sequence obtained by the coding unit with the coded sequence of the information regarding the time envelope obtained by the time envelope information coding unit.
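
To make the multiplexing and the matching demultiplexing concrete, the sketch below packs the time envelope information into a one-byte header ahead of the coded sequence. The byte layout, the `EnvelopeInfo` codes, and the function names are invented for this illustration; the document does not specify a bitstream format, and it also allows richer information (degrees of flatness, rise, or fall, or a shape/do-not-shape flag) than the three states shown here.

```python
from enum import IntEnum
from typing import Tuple


class EnvelopeInfo(IntEnum):
    """Hypothetical one-byte codes for the time envelope of the input signal."""
    FLAT = 0
    RISING = 1
    FALLING = 2


def multiplex(coded_sequence: bytes, info: EnvelopeInfo) -> bytes:
    """Prepend the envelope code to the coded sequence (assumed layout)."""
    return bytes([int(info)]) + coded_sequence


def demultiplex(stream: bytes) -> Tuple[bytes, EnvelopeInfo]:
    """Split a multiplexed stream back into coded sequence and envelope code."""
    if not stream:
        raise ValueError("empty stream")
    # EnvelopeInfo(...) raises ValueError for an unknown code.
    return stream[1:], EnvelopeInfo(stream[0])
```

A decoder following the fourth embodiment would pass the first element of the `demultiplex` result to its decoding unit and the second to its time envelope shaping unit.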

また、本発明の一側面に係る態様は、以下の通り音声復号方法、音声符号化方法、音声復号プログラム、および音声符号化プログラムとして捉えることができる。 Further, an aspect according to one aspect of the present invention can be grasped as a voice decoding method, a voice coding method, a voice decoding program, and a voice coding program as follows.

すなわち、本発明の一側面に係る音声復号方法は、符号化された音声信号を復号して音声信号を出力する音声復号装置の音声復号方法であって、前記符号化された音声信号を含む符号化系列を復号して復号信号を得る復号ステップと、前記符号化系列の復号に関する復号関連情報に基づいて、復号信号における周波数帯域の時間包絡を整形する選択的時間包絡整形ステップと、を備える。 That is, the audio decoding method according to one aspect of the present invention is an audio decoding method of an audio decoding device that decodes a coded audio signal and outputs an audio signal, and includes a decoding step of decoding a coded sequence including the coded audio signal to obtain a decoded signal, and a selective time envelope shaping step of shaping the time envelope of a frequency band in the decoded signal based on decoding-related information regarding the decoding of the coded sequence.

また、本発明の一側面に係る音声復号方法は、符号化された音声信号を復号して音声信号を出力する音声復号装置の音声復号方法であって、前記符号化された音声信号を含む符号化系列と当該音声信号の時間包絡に関する時間包絡情報を分離する逆多重化ステップと、前記符号化系列を復号して復号信号を得る復号ステップと、前記時間包絡情報と前記符号化系列の復号に関する復号関連情報のうち少なくとも一つに基づいて、復号信号における周波数帯域の時間包絡を整形する選択的時間包絡整形ステップと、を備える。 Further, the audio decoding method according to one aspect of the present invention is an audio decoding method of an audio decoding device that decodes a coded audio signal and outputs an audio signal, and includes a demultiplexing step of separating a coded sequence including the coded audio signal from time envelope information regarding the time envelope of the audio signal, a decoding step of decoding the coded sequence to obtain a decoded signal, and a selective time envelope shaping step of shaping the time envelope of a frequency band in the decoded signal based on at least one of the time envelope information and decoding-related information regarding the decoding of the coded sequence.

また、本発明の一側面に係る音声復号プログラムは、前記符号化された音声信号を含む符号化系列を復号して復号信号を得る復号ステップと、前記符号化系列の復号に関する復号関連情報に基づいて、復号信号における周波数帯域の時間包絡を整形する選択的時間包絡整形ステップと、をコンピュータに実行させる。 Further, the audio decoding program according to one aspect of the present invention causes a computer to execute a decoding step of decoding a coded sequence including the coded audio signal to obtain a decoded signal, and a selective time envelope shaping step of shaping the time envelope of a frequency band in the decoded signal based on decoding-related information regarding the decoding of the coded sequence.

また、本発明の一側面に係る音声復号方法は、符号化された音声信号を復号して音声信号を出力する音声復号装置の音声復号方法であって、前記符号化された音声信号を含む符号化系列と当該音声信号の時間包絡に関する時間包絡情報を分離する逆多重化ステップと、前記符号化系列を復号して復号信号を得る復号ステップと、前記時間包絡情報と前記符号化系列の復号に関する復号関連情報のうち少なくとも一つに基づいて、復号信号における周波数帯域の時間包絡を整形する選択的時間包絡整形ステップと、をコンピュータに実行させる。 Further, the audio decoding method according to one aspect of the present invention is an audio decoding method of an audio decoding device that decodes a coded audio signal and outputs an audio signal, and causes a computer to execute a demultiplexing step of separating a coded sequence including the coded audio signal from time envelope information regarding the time envelope of the audio signal, a decoding step of decoding the coded sequence to obtain a decoded signal, and a selective time envelope shaping step of shaping the time envelope of a frequency band in the decoded signal based on at least one of the time envelope information and decoding-related information regarding the decoding of the coded sequence.

また、本発明の一側面に係る音声復号方法は、符号化された音声信号を復号して音声信号を出力する音声復号装置の音声復号方法であって、前記符号化された音声信号を含む符号化系列を復号して復号信号を得る復号ステップと、前記復号信号を周波数領域において線形予測分析して得られた線形予測係数を用いたフィルタを用いて、周波数領域において前記復号信号をフィルタリング処理することで所望の時間包絡に整形する時間包絡整形ステップと、を備える。 Further, the voice decoding method according to one aspect of the present invention is a voice decoding method of a voice decoding device that decodes a coded audio signal and outputs an audio signal, and includes a decoding step of decoding a coded sequence including the coded audio signal to obtain a decoded signal, and a time envelope shaping step of shaping the decoded signal into a desired time envelope by filtering it in the frequency domain with a filter that uses the linear prediction coefficients obtained by linear prediction analysis of the decoded signal in the frequency domain.

Further, an audio encoding method according to one aspect of the present invention is an audio encoding method of an audio encoding device that encodes an input audio signal and outputs an encoded sequence, the method comprising: an encoding step of encoding the audio signal to obtain an encoded sequence containing the audio signal; a time envelope information encoding step of encoding information concerning the time envelope of the audio signal; and a multiplexing step of multiplexing the encoded sequence obtained in the encoding step with the encoded sequence of the time envelope information obtained in the time envelope information encoding step.
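The multiplexing step, and the matching demultiplexing step on the decoder side, can be illustrated with a small sketch. The byte layout here (two big-endian 16-bit length fields followed by the two payloads) is purely hypothetical and chosen for illustration; the text does not specify a bitstream format, and the function names are likewise assumptions.

```python
import struct

HEADER = ">HH"  # hypothetical header: two unsigned 16-bit lengths


def multiplex(encoded_sequence: bytes, envelope_info: bytes) -> bytes:
    """Pack the encoded sequence and the encoded time envelope information."""
    header = struct.pack(HEADER, len(encoded_sequence), len(envelope_info))
    return header + encoded_sequence + envelope_info


def demultiplex(frame: bytes):
    """Split a frame back into (encoded_sequence, envelope_info)."""
    n_seq, n_env = struct.unpack_from(HEADER, frame, 0)
    start = struct.calcsize(HEADER)
    seq = frame[start:start + n_seq]
    env = frame[start + n_seq:start + n_seq + n_env]
    if len(seq) != n_seq or len(env) != n_env:
        raise ValueError("truncated frame")
    return seq, env
```

A decoder built this way would pass the first payload to the decoding step and the second to the selective time envelope shaping step.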

Further, an audio decoding program according to one aspect of the present invention causes a computer to execute: a decoding step of decoding an encoded sequence containing an encoded audio signal to obtain a decoded signal; and a time envelope shaping step of shaping the decoded signal into a desired time envelope by filtering the decoded signal in the frequency domain with a filter that uses linear prediction coefficients obtained by linear prediction analysis of the decoded signal in the frequency domain.

Further, an audio encoding program according to one aspect of the present invention causes a computer to execute: an encoding step of encoding an audio signal to obtain an encoded sequence containing the audio signal; a time envelope information encoding step of encoding information concerning the time envelope of the audio signal; and a multiplexing step of multiplexing the encoded sequence obtained in the encoding step with the encoded sequence of the time envelope information obtained in the time envelope information encoding step.

10aF-1 ... inverse quantization unit, 10 ... audio decoding device, 10a ... decoding unit, 10aA ... decoding/inverse quantization unit, 10aB ... decoding-related information output unit, 10aC ... inverse time-frequency transform unit, 10aD ... encoded sequence analysis unit, 10aE ... first decoding unit, 10aE-a ... first decoding/inverse quantization unit, 10aE-b ... first decoding-related information output unit, 10aF ... second decoding unit, 10aF-a ... second decoding/inverse quantization unit, 10aF-b ... second decoding-related information output unit, 10aF-c ... decoded signal synthesis unit, 10b ... selective time envelope shaping unit, 10bA ... time-frequency transform unit, 10bB ... frequency selection unit, 10bC ... frequency-selective time envelope shaping unit, 10bD ... inverse time-frequency transform unit, 11 ... audio decoding device, 11a ... demultiplexing unit, 11b ... selective time envelope shaping unit, 12 ... audio decoding device, 12a ... time envelope shaping unit, 13 ... audio decoding device, 13a ... time envelope shaping unit, 21 ... audio encoding device, 21a ... encoding unit, 21b ... time envelope information encoding unit, 21c ... multiplexing unit.

Claims

An audio decoding device that decodes an encoded audio signal and outputs an audio signal, comprising:
a decoding unit that decodes an encoded sequence containing the encoded audio signal to obtain a decoded signal; and
a selective time envelope shaping unit that shapes the time envelope of a frequency band in the decoded signal based on decoding-related information concerning the decoding of the encoded sequence,
wherein the decoding unit obtains the decoded signal by duplicating, in some frequency bands, a signal of a frequency band different from those frequency bands, and
the selective time envelope shaping unit replaces, in the frequency domain, the decoded signal corresponding to a frequency band whose time envelope is not shaped with another signal.

An audio decoding method of an audio decoding device that decodes an encoded audio signal and outputs an audio signal, the method comprising:
a decoding step of decoding an encoded sequence containing the encoded audio signal to obtain a decoded signal; and
a selective time envelope shaping step of shaping the time envelope of a frequency band in the decoded signal based on decoding-related information concerning the decoding of the encoded sequence,
wherein, in the decoding step, the decoded signal is obtained by duplicating, in some frequency bands, a signal of a frequency band different from those frequency bands, and
in the selective time envelope shaping step, the decoded signal corresponding to a frequency band whose time envelope is not shaped is replaced, in the frequency domain, with another signal.
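The band selection and replacement recited in the claims can be pictured with the sketch below. Several points are assumptions made only for illustration: the decoding-related information is taken to be a per-band flag saying whether the band was produced by duplication, the "other signal" is taken to be zeros, and the original coefficients of non-shaped bands are restored after shaping. All names are hypothetical, and `shape` stands for any frequency-domain shaping function supplied by the caller.

```python
def select_bands(band_is_copied):
    """Indices of bands to shape (assumption: the bands filled by duplication)."""
    return {b for b, copied in enumerate(band_is_copied) if copied}


def mask_unselected(coeffs, band_edges, selected, fill=0.0):
    """Replace coefficients of non-selected bands with `fill` (assumed zeros)."""
    masked = list(coeffs)
    for b in range(len(band_edges) - 1):
        if b not in selected:
            lo, hi = band_edges[b], band_edges[b + 1]
            masked[lo:hi] = [fill] * (hi - lo)
    return masked


def selective_shape(coeffs, band_edges, band_is_copied, shape):
    """Shape only the selected bands; leave the other bands as decoded."""
    selected = select_bands(band_is_copied)
    shaped = shape(mask_unselected(coeffs, band_edges, selected))
    out = list(coeffs)
    for b in selected:
        lo, hi = band_edges[b], band_edges[b + 1]
        out[lo:hi] = shaped[lo:hi]
    return out
```

Here `band_edges` lists the coefficient index where each band starts, plus the end index of the last band, so band `b` covers `coeffs[band_edges[b]:band_edges[b + 1]]`.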