JP4978539B2

JP4978539B2 - Encoding apparatus, encoding method, and program.

Info

Publication number: JP4978539B2
Application number: JP2008099810A
Authority: JP
Inventors: 博康井手
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2008-04-07
Filing date: 2008-04-07
Publication date: 2012-07-18
Anticipated expiration: 2028-04-07
Also published as: JP2009253706A

Abstract

PROBLEM TO BE SOLVED: To improve the coding efficiency of data while securing the quality of the data. SOLUTION: A modified discrete cosine transform (MDCT) section 23 performs frequency conversion on the digital signal of each block to produce MDCT coefficients. A time-order permutation section 30 rearranges the MDCT coefficients of the same frequency in time-series order. A coefficient VQ section 31 uses a codebook 51 to vector-quantize the group of MDCT coefficients for each frequency and arranges the obtained indexes in order of frequency to produce a data stream of transform coefficient indexes. An entropy-coding section 32 codes these data streams and information about a flag. A data deletion section 34 calculates the degree of importance of the group of MDCT coefficients for each frequency. The data deletion section 34 compresses the data stream of transform coefficient indexes based on the degree of importance and changes the flag, thereby compressing the data to be coded. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to an encoding apparatus for encoding a digital signal, an encoding method for encoding a digital signal, and a program to be executed by a computer that performs digital signal processing.

A speech processing apparatus that performs speech coding based on the characteristics of human hearing and decodes the coded data has been disclosed (see, for example, Patent Documents 1 to 4).

JP 2005-128404 A; JP 2006-119363 A; JP 2006-259517 A; JP 2006-262295 A

This type of speech processing apparatus is also used to play back the pronunciation of words in a language dictionary such as an electronic dictionary device. A speech processing apparatus for a language dictionary must achieve a data rate of about 16 kbps while ensuring sufficient sound quality.

The present invention has been made in view of these circumstances, and its object is to provide an encoding apparatus, an encoding method, and a program that can improve the coding efficiency of data while securing the quality of the data.

To achieve the above object, an encoding apparatus according to the present invention includes: a dividing unit that divides a digital signal of a predetermined time length into a plurality of blocks; a frequency conversion unit that frequency-converts the digital signal of each block and generates a first transform coefficient group for each block; a band dividing unit that divides the first transform coefficient group generated by the frequency conversion unit into a plurality of small frequency bands whose bandwidth becomes wider as the frequency becomes higher; a maximum value search unit that searches, for each small frequency band, for the maximum absolute value of the first transform coefficients belonging to that band and arranges the maximum values found in order of frequency, thereby generating a maximum value sequence for each block; a maximum value sequence vector quantization unit that vector-quantizes the maximum value sequence of each block using a maximum value sequence codebook and arranges the obtained indexes in time-series order, thereby generating a data sequence of maximum value sequence indexes; a division unit that inversely quantizes, using the maximum value sequence codebook, the index of the maximum value sequence of each block obtained by the maximum value sequence vector quantization unit, and divides the first transform coefficient group belonging to each small frequency band of each block by the inverse quantization value that corresponds to that block and that small frequency band; a time-series rearrangement unit that rearranges, in time-series order, the transform coefficients of the same frequency contained in the divided first transform coefficient groups of the blocks, thereby generating a second transform coefficient group for each frequency; a transform coefficient vector quantization unit that vector-quantizes the second transform coefficient group of each frequency using a transform coefficient codebook and arranges the obtained indexes in order of frequency, thereby generating a data sequence of transform coefficient indexes; a data compression unit that compresses the data sequence of transform coefficient indexes based on the importance of the second transform coefficient group of each frequency and generates information about a flag indicating whether the second transform coefficient group of each frequency is an encoding target; and an encoding unit that encodes the data sequence of maximum value sequence indexes generated by the maximum value sequence vector quantization unit, the information about the flag generated by the data compression unit, and the compressed data sequence.

The encoding apparatus may further include a code amount determination unit that repeatedly determines whether the code amount of the data encoded by the encoding unit is smaller than a target code amount until the determination is affirmative. In that case, when the determination of the code amount determination unit is negative, the data compression unit compresses the data sequence of transform coefficient indexes by deleting second transform coefficient groups from the encoding targets in ascending order of importance and generates the information about the flag, and the encoding unit encodes the data sequence compressed by the data compression unit and the generated information about the flag until the determination of the code amount determination unit is affirmative.

The data compression unit may also select frequencies to be encoded by the encoding unit in descending order of importance until the code amount of the encoded data is smaller than, and close to, the target code amount, compress the data sequence of transform coefficient indexes with the second transform coefficient groups corresponding to the selected frequencies as encoding targets, and generate the information about the flag.

The data compression unit may also form a flag sequence by arranging the generated flags in order of frequency and, based on the formed flag sequence, generate as the information about the flag a number sequence consisting of the run lengths of identical consecutive values in the flag sequence.

In this case, when the run length of identical consecutive values in the flag sequence equals its upper limit, the data compression unit may insert 0 into the number sequence between that run length and the next run length.

The data compression unit may also insert 0 at the head of the number sequence when the flag sequence starts with 1.

The encoding apparatus may further include: an overall code amount determination unit that repeatedly determines, until the determination is affirmative, whether the sum of the code amounts of the encoded data produced by the encoding unit for a series of digital signals of the predetermined time length is smaller than an overall target code amount; an adjustment unit that, when the determination of the overall code amount determination unit is negative, excludes from the data sequence of transform coefficient indexes the transform coefficient index corresponding to the second transform coefficient group whose importance is the lowest overall, and changes the flag corresponding to the excluded second transform coefficient group to a value indicating that it is not an encoding target; and a re-encoding unit that re-encodes the data for the digital signal of the predetermined time length from which the transform coefficient index has been excluded and whose flag has been changed.

According to the present invention, coding efficiency can be improved while securing the quality of the data.

<< First Embodiment >>
Next, a first embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 1 shows a schematic configuration of a speech processing apparatus 1 according to the present embodiment. The speech processing apparatus 1 is assumed to be, for example, a mobile phone or a terminal device such as an electronic dictionary.

The speech processing apparatus 1 includes an audio input/output device 11, a storage device 12, a ROM 13, a RAM 14, and a CPU 15. These are connected via an internal bus.

The audio input/output device 11 converts input speech into a digital signal. For example, the audio input/output device 11 generates a digital signal Sound0 by sampling the input speech at a sampling frequency of 16 kHz and quantizing it with 16 bits. When supplied with a digital signal, the audio input/output device 11 outputs the corresponding sound.

The storage device 12 stores the data obtained by encoding the digital signal Sound0 generated by the audio input/output device 11. The storage device 12 also stores the data needed to decode the encoded data. These data are described later.

The ROM 13 stores various data, such as program code, needed for the processing executed by the CPU 15. The RAM 14 stores data needed for the processing executed by the CPU 15.

The CPU 15 executes processing according to the program code stored in the ROM 13. Through this processing, the CPU 15 implements an encoding unit 16 and a decoding unit 17.

The encoding unit 16 encodes the digital signal produced by the audio input/output device 11. As shown in FIG. 2, the encoding unit 16 includes a DC removal unit 21, a framing unit 22, an MDCT (Modified Discrete Cosine Transform) unit 23, a normalization unit 24, a band dividing unit 25, a maximum value search unit 26, a maximum value sequence vector quantization (VQ) unit 27, a maximum value division unit 28, a quantization unit 29, a time-order rearrangement unit 30, a coefficient vector quantization (VQ) unit 31, an entropy encoding unit 32, a code amount comparison unit 33, a data deletion unit 34, and codebooks 50 and 51.

As shown in FIG. 3, the DC removal unit 21 removes the direct current (DC) component Xdc from the digital signal Sound0 with sampling period Ts generated by the audio input/output device 11. The DC component Xdc is removed because it is unrelated to sound quality. The DC removal unit 21 can be implemented, for example, by a high-pass filter. Equation (1) below shows an example of the transfer function H(z) of the high-pass filter.
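Equation (1) is not reproduced in this text, so the following is only a minimal sketch of a DC-removing high-pass filter. It assumes the common first-order DC blocker H(z) = (1 - z^-1) / (1 - alpha * z^-1); both this form and the value of `alpha` are assumptions for illustration and are not taken from the patent.

```python
def remove_dc(samples, alpha=0.995):
    """First-order DC-blocking high-pass filter.

    Assumed transfer function (not the patent's equation (1)):
        H(z) = (1 - z^-1) / (1 - alpha * z^-1)
    """
    out = []
    prev_x = 0.0  # previous input sample
    prev_y = 0.0  # previous output sample
    for x in samples:
        y = x - prev_x + alpha * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# Sound0: a DC offset of 1000 plus a small alternating component.
sound0 = [1000.0 + (100.0 if n % 2 else -100.0) for n in range(4000)]
sound1 = remove_dc(sound0)

# After the start-up transient the offset is gone and the alternation remains.
tail_mean = sum(sound1[-1000:]) / 1000
tail_peak = max(abs(v) for v in sound1[-1000:])
```

A pole closer to 1 removes less of the low-frequency speech content but takes longer to settle.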

The DC removal unit 21 stores the digital signal Sound1, from which the DC component Xdc has been removed, in the storage device 12. The DC removal unit 21 then notifies the framing unit 22 to start processing.

On receiving this notification, the framing unit 22 reads the digital signal Sound1 stored in the storage device 12 and divides it into frames. FIG. 4 schematically shows the relationship between the digital signal Sound1 and the frame signals (digital signals of a predetermined time length) generated by dividing it into frames. As shown in FIG. 4, each frame signal partially overlaps in time with the immediately preceding frame signal. Let this overlap time be T. FIG. 4 also shows the MDCT blocks (1MDCT), which are the processing units of the MDCT described later. Each MDCT block has a time length of 2T. That is, the overlap time between MDCT blocks is half the block length. The overlap time between frames is likewise T, half the time length of an MDCT block. As a result, the time interval between MDCT blocks is constant across multiple frames (over the entire digital signal).

Although FIG. 4 shows four MDCT blocks per frame, the following description assumes that N blocks (N is an integer of 2 or more) are generated per frame. The framing unit 22 stores the frame signals generated by the frame division in the storage device 12 frame by frame. The framing unit 22 then notifies the MDCT unit 23 to start processing.
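The overlap arithmetic can be illustrated with a short sketch. It assumes T is expressed in samples, so that a frame holding N half-overlapping blocks of length 2T spans (N + 1) * T samples and successive frames start N * T samples apart. The function names are hypothetical, and dropping trailing samples that do not fill a frame is a simplification, not something the text specifies.

```python
def split_frames(signal, T, N):
    """Cut a signal into frames that overlap their predecessor by T samples."""
    frame_len = (N + 1) * T  # N blocks of length 2T, each advancing by T
    hop = N * T              # leaves an overlap of T between frames
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames


def split_blocks(frame, T):
    """Cut one frame into MDCT blocks of length 2T with 50% overlap."""
    n_blocks = len(frame) // T - 1
    return [frame[i * T:i * T + 2 * T] for i in range(n_blocks)]


signal = list(range(26))  # sample values double as sample positions
frames = split_frames(signal, T=2, N=4)
blocks = [split_blocks(frame, T=2) for frame in frames]

# Start position of every block, across all frames.
block_starts = [block[0] for frame_blocks in blocks for block in frame_blocks]
```

With these choices the block start positions advance by exactly T across frame boundaries, which is the constant block interval described above.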

On receiving this notification, the MDCT unit 23, which serves as the dividing unit and the frequency conversion unit, performs frequency conversion on each frame signal read from the storage device 12. More specifically, the MDCT unit 23 divides the frame signal read from the storage device 12 into a plurality of MDCT blocks, performs frequency conversion on each block, and calculates the MDCT coefficients X_k (k: index indicating frequency) for each block. The MDCT unit 23 calculates the MDCT coefficients X_k using equations (2) and (3) below. One MDCT yields the MDCT coefficients X_k of one MDCT block. The tap length M for the MDCT coefficients X_k is ideally 512 taps.
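Equations (2) and (3) are not reproduced in this text. As a rough illustration only, the sketch below uses the textbook MDCT with a sine window, mapping a block of M samples to M/2 coefficients. The window choice, the 0-based index k, and the absence of any scaling factor are assumptions and may differ from the patent's equations.

```python
import math


def mdct(block):
    """Textbook MDCT of one block of M samples, returning M/2 coefficients.

    Assumed form (not the patent's equations (2) and (3)):
        X_k = sum_n w_n * x_n * cos(2*pi/M * (n + 1/2 + M/4) * (k + 1/2))
    with the sine window w_n = sin(pi * (n + 1/2) / M).
    """
    M = len(block)
    coeffs = []
    for k in range(M // 2):
        acc = 0.0
        for n in range(M):
            window = math.sin(math.pi * (n + 0.5) / M)
            phase = 2.0 * math.pi / M * (n + 0.5 + M / 4.0) * (k + 0.5)
            acc += window * block[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs


# A tone aligned with the basis function of bin 3 (a small M keeps this quick).
M = 16
tone = [math.cos(2.0 * math.pi / M * (n + 0.5 + M / 4.0) * 3.5) for n in range(M)]
coeffs = mdct(tone)
peak = max(range(len(coeffs)), key=lambda k: abs(coeffs[k]))
```

This direct form costs O(M^2) per block; a practical encoder would normally use an FFT-based MDCT.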

The MDCT coefficients X_k of each MDCT block (that is, the first transform coefficient group) generated by the MDCT unit 23 are stored in the storage device 12. The MDCT unit 23 notifies the normalization unit 24 to start processing.

On receiving this notification, the normalization unit 24 reads the MDCT coefficients X_k of each MDCT block from the storage device 12. The normalization unit 24 normalizes the MDCT coefficients X_k of each MDCT block frame by frame. It does so by obtaining and separating out the maximum value gain of the MDCT coefficients X_k and dividing each MDCT coefficient X_k by this maximum value gain.

More specifically, the normalization unit 24 obtains the maximum value gain of the MDCT coefficients X_k within the frame using equation (4) below.

FIG. 5(A) shows the maximum value Xmax_i of the coefficients X_k in each block. These maximum values Xmax_i are compared across blocks, and the largest Xmax_i is taken as the final maximum value gain.

Next, the normalization unit 24 normalizes the MDCT coefficients using equation (5) below.

FIG. 5(B) shows an example of the normalized MDCT coefficients Xn_k. Through this normalization, the MDCT coefficients are quantized, for example, from 16 bits to about 8 bits. The normalization unit 24 stores the MDCT coefficients Xn_k in the storage device 12 frame by frame, grouped by block. The normalization unit 24 also stores the maximum value gain in the storage device 12. It then notifies the band dividing unit 25 to start processing.
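Equations (4) and (5) are not reproduced in this text. The sketch below shows one plausible reading: `gain` is taken as the largest absolute coefficient value over all blocks of the frame, and every coefficient is divided by it. Using the absolute value and omitting the subsequent reduction to about 8 bits are assumptions made for illustration.

```python
def normalize_frame(blocks):
    """Return (gain, normalized blocks) for one frame of MDCT coefficients."""
    gain = max(abs(x) for block in blocks for x in block)
    if gain == 0:
        # An all-zero frame has nothing to scale.
        return 0.0, [list(block) for block in blocks]
    return gain, [[x / gain for x in block] for block in blocks]


# Two blocks of three coefficients each; the largest magnitude is 4.0.
frame = [[0.5, -4.0, 1.0], [2.0, 0.25, -1.0]]
gain, normalized = normalize_frame(frame)
```

Because `gain` is stored separately, a decoder can restore the original scale by multiplying the decoded coefficients by it.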

On receiving this notification, the band dividing unit 25 logarithmically divides (partitions), for each block, the entire frequency band of the MDCT coefficients Xn_k read from the storage device 12 into P divided bands b_p (P: an integer of 2 or more; p: divided band number), as shown in FIG. 5(B). In keeping with the characteristics of human hearing, the band dividing unit 25 divides the frequency band logarithmically so that bands are narrower at low frequencies and wider at high frequencies. The divided bands b_p thus match the characteristics of hearing. The number of divided bands P is preferably about 16, for example.
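The text does not give the actual band edges, so the following sketch only illustrates one way to obtain logarithmically widening bands. The edge formula and the rule that every band keeps at least one coefficient are assumptions, and `log_bands` is a hypothetical helper name.

```python
def log_bands(n_coeffs, P):
    """Split coefficient indexes 0..n_coeffs-1 into P bands of growing width.

    Returns a list of (lo, hi) index pairs with hi exclusive.
    """
    edges = [0]
    for p in range(1, P):
        ideal = round((n_coeffs + 1) ** (p / P) - 1)  # logarithmic spacing
        edge = max(edges[-1] + 1, ideal)              # at least one coefficient
        edge = min(edge, n_coeffs - (P - p))          # leave room for later bands
        edges.append(edge)
    edges.append(n_coeffs)
    return list(zip(edges[:-1], edges[1:]))


# 256 coefficients per block (M = 512) split into P = 16 bands.
bands = log_bands(256, 16)
widths = [hi - lo for lo, hi in bands]
```

The lowest bands end up one coefficient wide while the highest spans several dozen, mirroring the narrow-low, wide-high layout described above.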

After dividing the frequency band, the band dividing unit 25 notifies the maximum value search unit 26 to start processing.

On receiving this notification, the maximum value search unit 26 searches for the maximum absolute value of the MDCT coefficients Xn_k belonging to each divided band b_p, as shown in FIG. 6(A), and obtains that maximum value env_p of each divided band b_p for each block. Then, as shown in FIG. 6(B), the maximum value search unit 26 generates, for each block, a maximum value sequence env[p] (p = 1 to P) consisting of these maximum values env_p. The maximum value search unit 26 stores the MDCT coefficients Xn_k and the maximum value sequence env[p] (p = 1 to P) in the storage device 12 frame by frame, grouped by block. The maximum value search unit 26 then notifies the maximum value sequence vector quantization unit 27 to start processing.
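The per-band maximum search can be sketched as follows. The band layout is passed in as (lo, hi) index pairs with hi exclusive, which is a representation chosen for this example, and the indexes are 0-based where the text counts from 1.

```python
def band_maxima(coeffs, bands):
    """Return env[p]: the largest absolute coefficient value in each band."""
    return [max(abs(c) for c in coeffs[lo:hi]) for lo, hi in bands]


# One block of eight coefficients split into three bands of growing width.
coeffs = [0.1, -0.8, 0.3, 0.2, -0.05, 0.4, 0.0, -0.6]
bands = [(0, 1), (1, 3), (3, 8)]
env = band_maxima(coeffs, bands)
```

Here `env` evaluates to `[0.1, 0.8, 0.6]`, a coarse spectral envelope of the block.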

On receiving this notification, the maximum value sequence vector quantization unit 27 vector-quantizes, for each block, the maximum value sequence env[p] read from the storage device 12. This vector quantization uses the codebook 50 stored in the ROM 13. FIG. 7(A) shows the codebook 50. As shown in FIG. 7(A), the codebook 50 contains q registered vectors V_j (j = 1 to q) of dimension P, the same dimension as the maximum value sequence env[p]. Referring to the codebook 50, the maximum value sequence vector quantization unit 27 obtains, for each block, the index j that minimizes the value e_j of equation (6) below.

In FIG. 7(B), the maximum value sequences env[p] of the blocks are shown as env[p]_1 to env[p]_N. The maximum value sequence vector quantization unit 27 arranges the indexes j obtained for env[p]_1 to env[p]_N in time-series order, thereby generating the data sequence index1[i] (i = 1 to N) of maximum value sequence indexes shown in FIG. 7(B). The data sequence index1[i] is stored in the storage device 12. The maximum value sequence vector quantization unit 27 then notifies the maximum value division unit 28 to start processing.
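Equation (6) is not reproduced in this text. The sketch below assumes that e_j is the squared Euclidean distance between env[p] and the codebook vector V_j, which is the usual choice for vector quantization but is an assumption here. The tiny codebook is made up for illustration, and the indexes are 0-based where the text counts from 1.

```python
def nearest_index(vector, codebook):
    """Return the index j of the codebook vector closest to `vector`."""
    best_j = 0
    best_err = float("inf")
    for j, code in enumerate(codebook):
        err = sum((v - c) ** 2 for v, c in zip(vector, code))  # assumed e_j
        if err < best_err:
            best_j, best_err = j, err
    return best_j


# Hypothetical codebook with q = 3 vectors of dimension P = 3.
codebook = [[0.1, 0.1, 0.1], [0.9, 0.5, 0.2], [0.2, 0.8, 0.6]]

# One maximum value sequence per block, in time order (N = 3 blocks).
env_per_block = [[0.1, 0.8, 0.6], [1.0, 0.4, 0.1], [0.0, 0.2, 0.1]]
index1 = [nearest_index(env, codebook) for env in env_per_block]
```

Each block's P-dimensional envelope is thereby reduced to a single index, so only N small integers per frame need to be entropy coded.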

On receiving this notification, the maximum value division unit 28 reads the MDCT coefficients Xn_k and the data sequence index1[i] from the storage device 12. The maximum value division unit 28 then divides the MDCT coefficients Xn_k belonging to each divided band of each block by the inverse quantization value for that divided band, that is, by the corresponding element of the vector in the codebook 50 selected by the block's index index1[i]. This yields the MDCT coefficients Xe_k (k = 1 to M/2-1) for each block. FIG. 8 shows an example of the MDCT coefficients Xe_k of one block generated by this division. The maximum value division unit 28 stores the MDCT coefficients Xe_k in the storage device 12 frame by frame, grouped by block. The maximum value division unit 28 then notifies the quantization unit 29 to start processing.
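A minimal sketch of this division step follows. The band representation (index pairs with hi exclusive) and the handling of a zero envelope value are choices made for this example and are not specified in the text.

```python
def divide_by_envelope(coeffs, bands, envelope):
    """Divide each coefficient by the dequantized envelope value of its band."""
    out = list(coeffs)
    for (lo, hi), scale in zip(bands, envelope):
        for k in range(lo, hi):
            # Guard against a zero codebook entry (an assumption of this sketch).
            out[k] = coeffs[k] / scale if scale else 0.0
    return out


xn = [0.1, -0.8, 0.4, 0.3, -0.6]   # normalized coefficients of one block
bands = [(0, 1), (1, 3), (3, 5)]
dequantized = [0.2, 0.8, 0.6]      # codebook vector selected by index1[i]
xe = divide_by_envelope(xn, bands, dequantized)
```

Dividing by the dequantized value rather than the exact maximum means a decoder, which only has the codebook index, can undo the scaling exactly.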

On receiving this notification, the quantization unit 29 quantizes the MDCT coefficients Xe_k read from the storage device 12 with the precision (number of bits) preset for each divided band b_p. FIGS. 9(A) and 9(B) show this quantization. By quantizing the MDCT coefficients Xe_k shown in FIG. 9(A), the quantization unit 29 obtains the MDCT coefficients Xq_k shown in FIG. 9(B). The quantization unit 29 stores the obtained MDCT coefficients Xq_k in the storage device 12. The quantization unit 29 then notifies the time-order rearrangement unit 30 to start processing.

On receiving this notification, the time-order rearrangement unit 30, which serves as the time-series rearrangement unit, reads from the storage device 12 the MDCT coefficients Xq_k contained in the MDCT coefficient group (that is, the first transform coefficient group) of each of the MDCT blocks in one frame. The time-order rearrangement unit 30 then regroups the MDCT coefficients Xq_k into groups of the same frequency and, within each regrouped MDCT coefficient group (that is, each second transform coefficient group), sorts the MDCT coefficients Xq_k in time order. FIGS. 10(A) and 10(B) show this rearrangement. As shown in FIG. 10(A), the MDCT coefficient Xq_k of the i-th block (i = 1 to N) is denoted as the correction coefficient Xq_k,i. As shown in FIG. 10(B), the correction coefficients Xq_k,i are grouped into MDCT coefficient groups of the same frequency and sorted in time-series order. The vector whose elements are the MDCT coefficients Xq_k,i of the transform coefficient group of one frequency is called the coefficient vector F_k (k = 1 to M/2-1). The time-order rearrangement unit 30 stores the coefficient vectors F_k in the storage device 12. The time-order rearrangement unit 30 then notifies the coefficient vector quantization unit 31 to start processing.
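Viewed as data, this rearrangement is a transpose: the per-block coefficient lists (one row per time step) become per-frequency vectors F_k (one row per frequency). A small sketch, with a made-up frame of N = 2 blocks and 3 frequencies:

```python
def group_by_frequency(blocks):
    """Turn blocks[i][k] (time-major) into F[k][i] (frequency-major)."""
    return [list(column) for column in zip(*blocks)]


# xq[i][k]: quantized coefficient of block i at frequency k.
xq = [
    [1, 2, 3],  # block 1
    [4, 5, 6],  # block 2
]
coefficient_vectors = group_by_frequency(xq)
```

Each F_k then has dimension N, which is why the codebook 51 used in the next step holds N-dimensional vectors.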

On receiving this notification, the coefficient vector quantization unit 31 vector-quantizes the coefficient vectors F_k read from the storage device 12. This vector quantization uses the codebook 51 stored in the ROM 13. FIG. 11(A) shows the codebook 51. As shown in FIG. 11(A), the codebook 51 contains s registered representative vectors W_j of dimension N, the same dimension as the coefficient vectors F_k. Referring to the codebook 51, the coefficient vector quantization unit 31 obtains, for each frequency (that is, for each k), the index j that minimizes the value of equation (7) below.

As a result, as shown in FIG. 11(B), the index value index2[k] (k = 1 to M/2-1) of the vector closest to the coefficient vector F_k is obtained for each frequency. The coefficient vector quantization unit 31 stores index2[k] (k = 1 to M/2-1) in the storage device 12. The coefficient vector quantization unit 31 then notifies the entropy encoding unit 32 to start processing.

On receiving this notification, the entropy encoding unit 32 reads from the storage device 12 the maximum value gain, the data sequence index1[1] to index1[N] of maximum value sequence indexes, the data sequence index2[k] to index2[K] of MDCT coefficient group indexes (initially, K = M/2-1), and the code number sequence C_t described later. The entropy encoding unit 32 then entropy-encodes the read data with an entropy coding method such as a range coder or Huffman coding, using a code table (not shown), to generate encoded data.

FIG. 12 shows the data encoded by the entropy encoding unit 32. As shown in FIG. 12, the data encoded by the entropy encoding unit 32 include the code number sequence C_t. The code number sequence C_t is obtained by encoding the presence flags FLG_k. As shown in FIG. 13, a presence flag FLG_k indicates whether the MDCT coefficient group of each frequency is an encoding target. If the presence flag FLG_k is 1, the MDCT coefficient group of the corresponding frequency is an encoding target; if the presence flag FLG_k is 0, the MDCT coefficient group of that frequency is not an encoding target. Initially, the MDCT coefficient groups of all frequencies are encoding targets, so the presence flags FLG_k are set to 1 for all frequencies, as shown in FIG. 13. That is, the initial values of the presence flags FLG_k are all 1.

The sequence of presence flags FLG_k (hereinafter abbreviated as the flag sequence FLG_k where appropriate) is thus a sequence of 0s and 1s. The code number sequence C_t is the result of encoding this flag sequence FLG_k by expressing it as the run lengths of consecutive 0s and 1s. FIGS. 14(A) and 14(B) show examples of the code number sequence C_t. As shown in FIG. 14(A), when the run lengths of consecutive 0s and 1s in the flag sequence FLG_k are, in order, 2, 2, 1, 3, 3, 1, 1, the code number sequence C_t is {2, 2, 1, 3, 3, 1, 1}.

Further, as shown in FIG. 14(B), when the length of a run of identical values in the flag sequence FLG_k equals its upper limit, a 0 is inserted in the code sequence C_t between that run length and the next run length.

Also as shown in FIG. 14(B), when the flag sequence FLG_k starts with 1, a 0 is inserted at the head of the code sequence C_t.
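The run-length rules above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `encode_flags` and the upper limit `max_run=15` are assumptions (the text does not state the limit), and the sketch inserts a 0 only when a run continues past the limit, which is one plausible reading of FIG. 14(B).

```python
def encode_flags(flags, max_run=15):
    """Run-length encode a 0/1 flag sequence into a code sequence.

    Counts alternate between runs of 0s and runs of 1s, and the first
    count always refers to 0s. A sequence that starts with 1 therefore
    gets a leading 0, and a run longer than max_run is split by a 0.
    max_run is a hypothetical upper limit chosen for illustration.
    """
    codes = []
    current = 0  # value whose run is counted next
    i = 0
    n = len(flags)
    while i < n:
        run = 0
        # Count the run of `current`, stopping at the upper limit.
        while i < n and flags[i] == current and run < max_run:
            run += 1
            i += 1
        codes.append(run)
        current ^= 1  # the next count refers to the other value
    return codes


if __name__ == "__main__":
    # Run lengths 2, 2, 1, 3, 3, 1, 1 as in the FIG. 14(A) description.
    print(encode_flags([0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]))
```

With this convention a decoder only needs to alternate between 0 and 1 for each count, so no separate marker for the starting value is required.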

As described above, the presence flags FLG_k are all 1 in the initial stage, so the initial value of the code sequence C_t is also uniquely determined. The storage device 12 stores in advance the initial value of the code sequence C_t for the case where all presence flags FLG_k are 1. In the first encoding pass, the entropy encoding unit 32 reads this initial value of the code sequence C_t and entropy-encodes it.

The entropy encoding unit 32 stores the encoded data in the storage device 12. The entropy encoding unit 32 then notifies the code amount comparison unit 33 of the start of processing.

Upon receiving this notification, the code amount comparison unit 33 reads the encoded data stored in the storage device 12 and compares its code amount with a target code amount. The target code amount is set in advance so as to achieve a data rate of about 16 kbps while ensuring sufficient sound quality.

The code amount comparison unit 33 compares the total code amount with the target code amount and determines whether the total code amount is equal to or less than the target code amount. If the determination is negative, the code amount comparison unit 33 notifies the data deletion unit 34 to that effect.

Upon receiving this notification, the data deletion unit 34 deletes part of the data to be encoded. First, for each frequency, the data deletion unit 34 calculates the importance (the degree of influence on sound quality) of the MDCT coefficient group, that is, the MDCT coefficient group corresponding to each coefficient vector F_k.

The simplest way to calculate the importance is to calculate the total energy g_k for each frequency. The total energy g_k for each frequency is expressed by the following equation (8).

Note that the data deletion unit 34 may multiply the energy g_k by a frequency-dependent weighting coefficient. For example, the data deletion unit 34 can multiply by 1.3 for MDCT coefficients belonging to the frequency band below 500 Hz, by 1.1 for those belonging to the band from 500 Hz to less than 3500 Hz, and by 1.0 for those belonging to the band at 3500 Hz or above. That is, the low frequencies can be given a larger weight.
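The weighting described above can be sketched as follows. Equation (8) is not reproduced in this text, so the sum of squared coefficients used here is an assumed form of g_k, and the function name `weighted_energy` is hypothetical. Only the band edges and the weights 1.3, 1.1 and 1.0 come from the description.

```python
def weighted_energy(group, freq_hz):
    """Return a frequency-weighted energy for one MDCT coefficient group.

    group   : coefficients of one frequency, one value per block
    freq_hz : centre frequency of that group in Hz

    The sum of squares is an ASSUMED stand-in for equation (8).
    """
    energy = sum(x * x for x in group)
    if freq_hz < 500:
        weight = 1.3   # low band gets the largest weight
    elif freq_hz < 3500:
        weight = 1.1
    else:
        weight = 1.0
    return weight * energy


if __name__ == "__main__":
    print(weighted_energy([1.0, 2.0], 100))   # low band
    print(weighted_energy([1.0, 2.0], 4000))  # high band
```

Because the weight only scales the energy, it changes the order in which groups are dropped without changing the coefficients themselves.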

Next, as shown in FIG. 15, the data deletion unit 34 compresses index2[k] by deleting from it the element corresponding to the frequency whose energy g_k is the minimum. Further, the data deletion unit 34 sets to 0 the presence flag FLG_k corresponding to the frequency whose energy g_k has the value 0.

Further, the data deletion unit 34 generates the code sequence C_t from the changed presence flags FLG_k using the method described above. As a result of this deletion, more presence flags FLG_k take the value 0, the runs of 0s become longer, and the code sequence C_t becomes shorter. The data deletion unit 34 stores the compressed index2[k] and the code sequence C_t in the storage device 12. The data deletion unit 34 then notifies the entropy encoding unit 32 of the start of processing.

The entropy encoding unit 32 then reads the data shown in FIG. 12 from the storage device 12, performs entropy encoding again, and stores the encoded data in the storage device 12. The entropy encoding unit 32 then notifies the code amount comparison unit 33 of the start of processing.

The code amount comparison unit 33 again compares the code amount of the encoded data with the target code amount. In this way, data deletion in the data deletion unit 34 and entropy encoding in the entropy encoding unit 32 are repeated until the determination in the code amount comparison unit 33 is affirmative. During this repetition, as shown in FIG. 15, MDCT coefficient groups are excluded from the encoding targets in ascending order of energy g_k, the number of elements K of index2[k] (K < M/2 − 1) decreases, and the code sequence C_t shrinks, so that the code amount of the data to be encoded becomes progressively smaller.

When the code amount becomes equal to or less than the target code amount and the determination in the code amount comparison unit 33 is affirmative, the encoded data at that point is stored in the storage device 12 as the code string.
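The delete-and-re-encode loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: `reduce_to_target` is a hypothetical name, and `code_size` is a caller-supplied stand-in for the entropy encoding unit 32 that returns the code amount in bits for a given set of retained frequencies.

```python
def reduce_to_target(energies, code_size, target_bits):
    """Drop the lowest-energy frequencies until the code amount fits.

    energies    : dict mapping frequency index k -> energy g_k
    code_size   : callable(set of kept k) -> code amount in bits
                  (hypothetical stand-in for the entropy coder)
    target_bits : target code amount
    Returns the kept frequency set and the presence flags FLG_k.
    """
    kept = set(energies)  # initially every frequency is an encoding target
    while kept and code_size(kept) > target_bits:
        weakest = min(kept, key=lambda k: energies[k])
        kept.remove(weakest)  # exclude the least important group
    flags = [1 if k in kept else 0 for k in sorted(energies)]
    return kept, flags


if __name__ == "__main__":
    g = {0: 5.0, 1: 0.1, 2: 3.0, 3: 0.5}
    # Toy cost model: 8 bits per retained frequency.
    print(reduce_to_target(g, lambda kept: 8 * len(kept), 16))
```

Each pass re-evaluates the full code amount, which mirrors the repeated entropy encoding in the text and is also why the second embodiment, which avoids this loop, can be faster.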

Next, the decoding unit 17 will be described. The decoding unit 17 reads the encoded data from the storage device 12, decodes it, and generates a digital audio signal. As shown in FIG. 16, the decoding unit 17 includes an entropy decoding unit 41, a coefficient inverse vector quantization (VQ) unit 42, a frequency rearrangement unit 43, an inverse quantization unit 44, a maximum value sequence inverse vector quantization (VQ) unit 45, a maximum value multiplication unit 46, a gain synthesis unit 47, an IMDCT unit 48, and codebooks 50 and 51.

The entropy decoding unit 41 performs entropy decoding of the encoded data read from the storage device 12 and obtains the various data shown in FIG. 12. These data are stored in the storage device 12. The entropy decoding unit 41 then notifies the coefficient inverse vector quantization (VQ) unit 42 of the start of processing.

Upon receiving this notification, the coefficient inverse vector quantization (VQ) unit 42 performs inverse vector quantization with reference to the codebook 51, based on the code sequence C_t and index2[k] (k = 1 to K), and generates the MDCT coefficients Xq_k. More specifically, the coefficient inverse VQ unit 42 first decodes the presence flags FLG_k from the code sequence C_t, as shown in FIG. 17. In decoding, the elements of index2[k] (k = 1 to K) correspond to the positions where the presence flag FLG_k is 1. Accordingly, the coefficient inverse VQ unit 42 refers to the decoded presence flags FLG_k. At frequencies where FLG_k is 0, it sets the coefficient vector F_k to the zero vector; at frequencies where FLG_k is 1, it sets the coefficient vector of the codebook 51 corresponding to index2[k] as the coefficient vector F_k, in order of k.
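The decoding step described above can be sketched as follows. The names `decode_flags` and `restore_vectors` are hypothetical, the codebook is a toy list of vectors, and the run-length convention assumed is that counts alternate between 0s and 1s starting with 0s.

```python
def decode_flags(codes):
    """Expand a run-length code sequence C_t into presence flags FLG_k.

    Assumes counts alternate between 0s and 1s, starting with 0s, so a
    count of 0 simply switches the value without emitting any flags.
    """
    flags = []
    value = 0
    for run in codes:
        flags.extend([value] * run)
        value ^= 1
    return flags


def restore_vectors(codes, index2, codebook, dim):
    """Rebuild one coefficient vector F_k per frequency.

    Frequencies with FLG_k == 0 get a zero vector; frequencies with
    FLG_k == 1 consume the next element of index2 and look it up in
    the codebook.
    """
    remaining = iter(index2)
    vectors = []
    for flag in decode_flags(codes):
        if flag:
            vectors.append(list(codebook[next(remaining)]))
        else:
            vectors.append([0] * dim)
    return vectors


if __name__ == "__main__":
    toy_codebook = [[1, 2], [3, 4]]  # illustrative values only
    print(restore_vectors([1, 2, 1], [1, 0], toy_codebook, 2))
```

Because index2 holds entries only for retained frequencies, the flags are what map each entry back to its original frequency position.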

In this way, the coefficient inverse vector quantization (VQ) unit 42 generates the MDCT coefficient groups F_k for all frequencies. FIG. 17 schematically shows an example of the MDCT coefficient groups F_k generated in this way. Each MDCT coefficient group F_k is either a coefficient group whose elements are all 0 or a coefficient group corresponding to one of the coefficient vectors W_1 to W_s in the codebook 51. Each individual element of an MDCT coefficient group F_k is an MDCT coefficient Xq_{k,i}. The coefficient inverse VQ unit 42 stores the MDCT coefficients Xq_{k,i} in the storage device 12 and then notifies the frequency rearrangement unit 43 of the start of processing.

Upon receiving this notification, the frequency rearrangement unit 43 rearranges the MDCT coefficients Xq_{k,i} stored in the storage device 12 into the MDCT coefficient groups of the individual blocks, that is, into frequency order, and stores them in the storage device 12. The frequency rearrangement unit 43 then notifies the inverse quantization unit 44 of the start of processing.
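Viewed as data, this rearrangement amounts to a transpose from "one list per frequency" to "one list per block". The following minimal sketch assumes the coefficients are held as nested lists; the function name `to_block_order` is hypothetical.

```python
def to_block_order(groups):
    """Transpose per-frequency groups into per-block coefficient lists.

    groups[k][i] is the coefficient Xq_{k,i} of frequency k in block i.
    The result r satisfies r[i][k] == groups[k][i].
    """
    return [list(block) for block in zip(*groups)]


if __name__ == "__main__":
    # Two frequencies, three blocks.
    print(to_block_order([[1, 2, 3], [4, 5, 6]]))
```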

Upon receiving this notification, the inverse quantization unit 44 inverse-quantizes the MDCT coefficients Xq_k of each block read from the storage device 12, with the precision set in advance for each divided band b_p, and generates the MDCT coefficients Xe_k for each block. The inverse quantization unit 44 stores the obtained MDCT coefficients Xe_k in the storage device 12 and notifies the maximum value sequence inverse vector quantization (VQ) unit 45 of the start of processing.

Upon receiving this notification, the maximum value sequence inverse vector quantization (VQ) unit 45 refers to the codebook 50 and performs inverse vector quantization based on the decoded maximum value sequence data index1[i] stored in the storage device 12, obtaining the maximum value sequence env[p]_N. The maximum value sequence env[p]_N is stored in the storage device 12. The maximum value sequence inverse VQ unit 45 then notifies the maximum value multiplication unit 46 of the start of processing.

Upon receiving this notification, the maximum value multiplication unit 46 extracts the MDCT coefficients Xe_k belonging to each divided band b_p and multiplies the extracted MDCT coefficients Xe_k by the maximum value env_p stored in env[p]. This multiplication yields the MDCT coefficients Xn_k of each block. The maximum value multiplication unit 46 stores the obtained MDCT coefficients Xn_k of each block in the storage device 12 and notifies the gain synthesis unit 47 of the start of processing.

Upon receiving this notification, the gain synthesis unit 47 multiplies the MDCT coefficients Xn_k of each block read from the storage device 12 by the decoded maximum value gain, obtaining the MDCT coefficients X_k of each block. The gain synthesis unit 47 stores the obtained MDCT coefficients X_k in the storage device 12 and notifies the IMDCT unit 48 of the start of processing.
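The two multiplications performed by the maximum value multiplication unit 46 and the gain synthesis unit 47 can be sketched for a single block as follows. The function name `rescale_block` and the `band_of` lookup table (mapping each frequency index to its divided band) are assumptions introduced for illustration.

```python
def rescale_block(xe, band_of, env, gain):
    """Undo the per-band and global normalization for one block.

    xe      : inverse-quantized coefficients Xe_k of the block
    band_of : band_of[k] is the divided band p that frequency k belongs to
              (hypothetical lookup table)
    env     : env[p] is the decoded maximum value of band p
    gain    : decoded global maximum value
    """
    xn = [x * env[band_of[k]] for k, x in enumerate(xe)]  # Xe_k -> Xn_k
    return [x * gain for x in xn]                          # Xn_k -> X_k


if __name__ == "__main__":
    print(rescale_block([0.5, -1.0, 0.25], [0, 0, 1], [4.0, 8.0], 2.0))
```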

The IMDCT unit 48 performs an inverse MDCT, block by block, on the MDCT coefficients X_k read from the storage device 12. The IMDCT unit 48 further combines the digital signals of the blocks obtained by this inverse MDCT to restore the digital audio signal Sound1. The restored digital audio signal Sound1 is sent to the audio input/output device 11 and reproduced.

Next, the operation of the speech processing apparatus 1 according to the present embodiment will be described. In the encoding operation of the encoding unit 16, first, as shown in FIG. 18, the digital signal from which the DC component X_dc has been removed in the DC deletion unit 21 is divided into N block signals in the MDCT unit 23. The number of samples per block is M.

Thereafter, the MDCT unit 23 performs an MDCT on each block. FIG. 19 schematically shows the code amount at this point. If the bit length of one transform coefficient is 16 bits, the code amount per frame at this point is 16 × N (number of blocks) × (M/2 − 1) (number of transform coefficients per block).

Thereafter, the normalization unit 24 normalizes the MDCT coefficients. After normalization, the data length of every MDCT coefficient is reduced, for example, from 16 bits to 8 bits, and the code amount per frame is halved, as shown in FIG. 20.
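As a worked check of these figures, the sketch below evaluates the per-frame code amount before and after normalization. The values N = 8 and M = 256 are hypothetical, chosen only to make the arithmetic concrete, and the function name `frame_bits` is likewise an assumption.

```python
def frame_bits(n_blocks, samples_per_block, bits_per_coeff):
    """Code amount of one frame in bits.

    Uses M/2 - 1 transform coefficients per block, as stated in the text.
    """
    coeffs_per_block = samples_per_block // 2 - 1
    return bits_per_coeff * n_blocks * coeffs_per_block


if __name__ == "__main__":
    before = frame_bits(8, 256, 16)  # after the MDCT, 16-bit coefficients
    after = frame_bits(8, 256, 8)    # after normalization, 8-bit coefficients
    print(before, after)             # 16256 8128
```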

Next, as shown in FIG. 21, the band division unit 25 divides the frequency band into P bands and finds the maximum values env[1] to env[P] in the respective divided bands. Then, as shown in FIG. 22, the maximum value sequence VQ unit 27 vector-quantizes the obtained maximum value sequence env[1] to env[P] with reference to the codebook 50, generating the maximum value sequence data index1[i] (i = 1 to N).

Next, the maximum value division unit 28 divides the MDCT coefficients using the maximum values env[1] to env[P]. FIG. 23 schematically shows how the MDCT coefficients belonging to each divided band are divided by the elements (inverse-quantized values) of the vector of the codebook 50 corresponding to index1[i]. This division further reduces the number of bits of the MDCT coefficients.

Next, after the quantization unit 29 quantizes the MDCT coefficients, the time order rearrangement unit 30 rearranges them, as shown in FIG. 24. In the quantization, the number of bits of an MDCT coefficient increases as the frequency becomes lower, but in FIG. 24 the number of bits is shown as the same over the entire frequency range to keep the drawing simple. Then, as shown in FIG. 25, the coefficient VQ unit 31 vector-quantizes the MDCT coefficients with reference to the codebook 51 to obtain index2[k]. Then, as shown in FIG. 26, the data deletion unit 34 calculates the energy g_k, deletes MDCT coefficient groups in ascending order of energy g_k, compresses index2[k], changes the flags FLG_k, and compresses the code sequence C_t.

Then, as shown in FIG. 27, the entropy encoding unit 32 encodes gain, index1[i], index2[k], and C_t. The more MDCT coefficient groups are deleted in ascending order of energy, the shorter the data lengths of index2[k] and the code sequence C_t become, and the higher the data compression rate.

Next, during decoding in the decoding unit 17, the flags FLG_k are decoded from the code sequence C_t, as shown in FIG. 28. Then, index2[k] is restored based on the flags FLG_k and the compressed index2[k]. Based on the restored index2[k], the MDCT coefficient group of each frequency is restored with reference to the codebook 51. Meanwhile, the maximum value sequence env[p] (env_p) is restored based on index1[i]. Then, for each divided band, the decoded MDCT coefficient group is multiplied by the maximum value env_p to restore the MDCT coefficients Xn_k. The MDCT coefficients Xn_k are then multiplied by the maximum value gain to restore the MDCT coefficients X_k. An inverse MDCT of the restored MDCT coefficients X_k restores the audio signal of each block, and these audio signals are combined into an audio signal in units of frames.

In this way, data encoded as described above is decoded by the decoding unit 17 and restored to the original audio signal, which was quantized with 16 bits and sampled at 16 kHz. As a result, the sound quality of the audio reproduced by the speech processing apparatus 1 is suitable for learning applications.

As described above, according to the present embodiment, the data of the MDCT coefficient groups is deleted in ascending order of energy. Data that does not affect sound quality can thus be deleted preferentially, so encoding efficiency can be increased without degrading sound quality.

<< Second Embodiment >>
Next, a second embodiment of the present invention will be described. In the present embodiment, the configuration of the encoding unit 16 differs from that of the encoding unit 16 according to the first embodiment.

As shown in FIG. 29, the encoding unit 16 according to the present embodiment includes an encoding frequency selection unit 35 in place of the data deletion unit 34. In addition, the encoding unit 16 according to the present embodiment has no code amount comparison unit 33 and no loop driven by its comparison result.

The encoding frequency selection unit 35 selects the frequencies to be encoded by the entropy encoding unit 32. It calculates an importance for each frequency and selects the frequencies to be encoded based on this importance.

More specifically, the encoding frequency selection unit 35 calculates the energy g_k according to equation (8) above and determines that the higher the energy g_k, the higher the importance of that frequency. The encoding frequency selection unit 35 then selects the frequencies to be encoded in descending order of importance.

Note that the encoding frequency selection unit 35 may multiply the energy g_k by a frequency-dependent weighting coefficient. For example, it may multiply by 1.3 for MDCT coefficients at frequencies in the band below 500 Hz, by 1.1 for those in the band from 500 Hz to less than 3500 Hz, and by 1.0 for those in the band at 3500 Hz or above.

The encoding frequency selection unit 35 determines whether the code amount of the encoded data has reached the target code amount, and keeps selecting frequencies to be encoded until it does. The encoding frequency selection unit 35 generates index2[k] by arranging, in frequency order, the indices obtained by vector quantization of the MDCT coefficient groups of the selected frequencies. It also sets the flags FLG_k corresponding to the MDCT coefficient groups of the selected frequencies to 1 and sets the remaining flags FLG_k to 0. The encoding frequency selection unit 35 then encodes the generated flags FLG_k into the code sequence C_t and stores index2[k] and the code sequence C_t in the storage device 12. Finally, it notifies the entropy encoding unit 32 of the start of processing.
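The selection procedure of this embodiment can be sketched as a greedy pick in descending order of energy. This is a simplified illustration: the function name `select_frequencies` is hypothetical, and a fixed cost `cost_bits` per selected frequency is assumed in place of the actual code amount calculation, which the text does not detail.

```python
def select_frequencies(energies, cost_bits, target_bits):
    """Pick frequencies in descending order of energy within a bit budget.

    energies    : dict mapping frequency index k -> energy g_k
    cost_bits   : ASSUMED fixed code amount per selected frequency
    target_bits : target code amount
    Returns the selected frequencies in frequency order and the flags FLG_k.
    """
    selected = set()
    used = 0
    for k in sorted(energies, key=energies.get, reverse=True):
        if used + cost_bits > target_bits:
            break  # the budget is reached; stop selecting
        selected.add(k)
        used += cost_bits
    flags = [1 if k in selected else 0 for k in sorted(energies)]
    return sorted(selected), flags


if __name__ == "__main__":
    g = {0: 5.0, 1: 0.1, 2: 3.0, 3: 0.5}
    print(select_frequencies(g, cost_bits=8, target_bits=20))
```

Unlike the deletion loop of the first embodiment, this makes a single pass over the sorted frequencies and never re-encodes already selected data.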

As in the first embodiment, the entropy encoding unit 32 encodes the data shown in FIG. 12 to generate the encoded data.

As described above, selecting data in descending order of importance, as in the present embodiment, rather than deleting data in ascending order of importance, reduces the amount of data that is encoded before the target code amount is reached, and therefore shortens the time required for encoding.

<< Third Embodiment >>
Next, a third embodiment of the present invention will be described. The configuration of the speech processing apparatus according to the present embodiment is the same as that of the speech processing apparatus according to each of the above embodiments, so a detailed description is omitted.

In the present embodiment, audio data is compressed not only frame by frame but also across a plurality of frames. FIG. 30 shows a flowchart of the encoding operation of the speech processing apparatus according to the present embodiment. First, in step 201, the CPU 15 causes the encoding unit 16 to encode the digital audio signal frame by frame, as described in the above embodiments. Here, the target code amount can be changed for each frame. In the next step 203, the CPU 15 determines whether all frames have been encoded. If this determination is negative, the CPU 15 returns to step 201. In this way, all frames are encoded.

If the determination in step 203 is affirmative, the CPU 15 proceeds to step 205. In step 205, the CPU 15 calculates the sum of the code amounts of all frames. In the next step 207, the CPU 15 determines whether this sum is equal to or less than the overall target code amount. If this determination is affirmative, the CPU 15 ends the encoding process. If it is negative, the CPU 15 proceeds to step 209.

In step 209, the CPU 15 searches all frames for the MDCT coefficient group with the lowest importance. For example, as shown in FIG. 31, suppose that five MDCT coefficient groups are encoding targets in frame 1, four in frame 2, and four in frame 3. Let g_{i,k} be the energy of frequency k in the i-th frame. In this case, the CPU 15 sorts all MDCT coefficient groups in ascending order of energy g_{i,k} and finds the MDCT coefficient group with the minimum energy g_{i,k}. In the example of FIG. 31, the energy g_{1,9} of the MDCT coefficient group F_9 of frame 1 is the minimum.

In the next step 211, the CPU 15 excludes the MDCT coefficient group with the minimum energy g_{i,k} from the encoding targets. In the example of FIG. 31, the MDCT coefficient group F_9 of frame 1 is excluded. In the next step 213, the CPU 15 re-encodes the entire frame from which the MDCT coefficient group was excluded. In the example of FIG. 31, the data for frame 1 is re-encoded.

After step 213, the CPU 15 returns to step 205, calculates the sum of the code amounts of all frames (step 205), and compares it with the overall target code amount (step 207). Steps 205, 207, 209, 211, and 213 are repeated in this order until the determination in step 207 is affirmative. In this way, the code amount of the plurality of frames is kept within the overall target code amount.
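The flow of steps 201 to 213 can be sketched as follows. This is an illustrative sketch under stated assumptions: `fit_frames` is a hypothetical name, and `frame_bits` is a caller-supplied stand-in for encoding one frame and measuring its code amount.

```python
def fit_frames(frame_energies, frame_bits, total_target):
    """Drop the globally weakest coefficient groups until all frames fit.

    frame_energies : list of dicts, one per frame, mapping k -> g_{i,k}
    frame_bits     : callable(set of kept k) -> bits for one frame
                     (hypothetical stand-in for per-frame encoding)
    total_target   : overall target code amount in bits
    """
    kept = [set(e) for e in frame_energies]
    sizes = [frame_bits(k) for k in kept]            # step 201: encode every frame
    while sum(sizes) > total_target:                 # steps 205 and 207
        candidates = [(frame_energies[i][k], i, k)
                      for i, ks in enumerate(kept) for k in ks]
        if not candidates:
            break                                    # nothing left to remove
        _, i, k = min(candidates)                    # step 209: global minimum
        kept[i].remove(k)                            # step 211: exclude it
        sizes[i] = frame_bits(kept[i])               # step 213: re-encode that frame only
    return kept, sum(sizes)


if __name__ == "__main__":
    frames = [{1: 4.0, 9: 0.2}, {1: 3.0, 2: 0.5}]
    print(fit_frames(frames, lambda ks: 8 * len(ks), 16))
```

Only the frame that lost a coefficient group is re-encoded on each pass, so the cost of an iteration does not grow with the number of untouched frames.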

Although step 211 above excludes MDCT coefficient groups one at a time, a plurality of groups may be excluded at once.

In the present embodiment, step 207 corresponds to the overall code amount determination unit, steps 209 and 211 correspond to the adjustment unit, and step 213 corresponds to the re-encoding unit. According to the present embodiment, data can be compressed across a plurality of frames as a whole. Consequently, the data compression rate can be kept low for frames in which it must be kept low to maintain sound quality, while it is raised for frames in which a higher data compression rate has little effect on sound quality, so that the overall data compression rate is improved.

For example, suppose the target code amount of each frame is 16 to 20 kbps and the overall target code amount is 12 kbps. Then some frames can be given a code amount of about 20 kbps so as not to degrade sound quality, while other frames, in which a higher data compression rate does not affect sound quality, are compressed more strongly, so that the code amount as a whole is 12 kbps or less. As a result, the encoding efficiency of the data can be improved while sound quality is ensured.

As described in detail above, according to each of the above embodiments, the time order rearrangement unit 30 groups the MDCT coefficients by frequency, and groups with low energy are removed. This increases encoding efficiency without impairing sound quality. Furthermore, because the MDCT coefficient groups collected for each frequency are vector-quantized, a higher data compression rate is obtained than with scalar quantization. As a result, encoding efficiency can be increased without impairing sound quality.

Further, according to each of the above embodiments, the maximum value sequence VQ unit 27 vector-quantizes the maximum value sequence, so a higher data compression rate is obtained than with scalar quantization. As a result, encoding efficiency can be increased without impairing sound quality.

By performing the encoding operation according to each of the above embodiments, an audio signal sampled at 16 kHz can be compressed to about 12 kbps while maintaining sound quality suitable for learning applications.

Further, according to each of the above embodiments, the data deletion unit 34 or the encoding frequency selection unit 35 converts the flag sequence FLG_k into the sequence C_t of the lengths of its runs of identical values. This further increases the data compression rate while remaining reversible. As a result, encoding efficiency can be increased without impairing sound quality.

Further, according to each of the above embodiments, in the data deletion unit 34 or the encoding frequency selection unit 35, when a run length of identical values in the flag sequence FLG_k is equal to the upper limit value, 0 is inserted between that run length and the next run length in the code number sequence C_t. Providing an upper limit on the run length in this way keeps the data length of the code number sequence uniformly short, whatever pattern of run lengths appears in the flag sequence FLG_k.

Further, according to each of the above embodiments, in the data deletion unit 34 or the encoding frequency selection unit 35, when the flag sequence FLG_k starts with 1, 0 is inserted at the beginning of the code number sequence C_t. This makes reversible data compression of the flag sequence FLG_k possible.
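
The conversion of a flag sequence into a run-length number sequence, with an upper limit and inserted zeros, can be sketched as follows. This is one possible reading of the rules stated above, not the procedure of the embodiments: it assumes that the codes alternate between runs of 0 and runs of 1, beginning with a run of 0, so that a code of 0 denotes an empty run. Under that assumption a leading 0 appears when the flags start with 1, and a 0 follows a capped run that continues past the limit. How a run that ends exactly at the limit is handled is also an assumption here. The names `encode_runs` and `decode_runs` are hypothetical.

```python
from typing import List


def encode_runs(flags: List[int], limit: int) -> List[int]:
    """Encode a 0/1 flag list as run lengths no larger than `limit`.

    Codes alternate between 0-runs and 1-runs, starting with a 0-run
    (assumed convention), so a code of 0 marks an empty run.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if any(f not in (0, 1) for f in flags):
        raise ValueError("flags must contain only 0 and 1")
    codes: List[int] = []
    current = 0  # value counted by the next code
    i = 0
    while i < len(flags):
        run = 0
        # Count identical values, stopping at the upper limit.
        while i < len(flags) and flags[i] == current and run < limit:
            run += 1
            i += 1
        codes.append(run)
        current ^= 1  # the next code counts the other value
    return codes


def decode_runs(codes: List[int]) -> List[int]:
    """Invert encode_runs: expand alternating run lengths back into flags."""
    flags: List[int] = []
    value = 0
    for n in codes:
        flags.extend([value] * n)
        value ^= 1
    return flags


starts_with_one = encode_runs([1, 1, 0], limit=4)            # leading 0 inserted
long_run = encode_runs([0, 0, 0, 0, 0, 0, 1], limit=4)       # 0 after the capped run
restored = decode_runs(long_run)
```

Because every code is bounded by the limit, each code fits in a fixed number of bits, and decoding the codes reproduces the original flags exactly.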

Given the characteristics of human hearing, to improve the sound quality of an audio signal it is desirable to allocate as much code as possible to the low frequency range and relatively less to the high frequency range. For this reason, in each of the above embodiments, the number of bits in the quantization unit, the weight by which the energy g_k is multiplied, and the like are made larger in the low range. From this viewpoint, two codebooks may be prepared for the vector quantization of the MDCT coefficients, one for the low range and one for the high range, with the low-range codebook containing a larger number q of vectors and the high-range codebook a relatively smaller number q.
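
The two-codebook variation can be illustrated with a minimal sketch. The nearest-neighbour search by squared Euclidean distance, the function names, the split index, and the toy codebooks are all assumptions made for illustration; the source states only that the low-range codebook holds more vectors than the high-range one.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_index(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codebook vector closest to `vec` (squared Euclidean distance)."""
    def dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, c))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def quantize_groups(
    groups: Sequence[Vector],
    low_cb: Sequence[Vector],
    high_cb: Sequence[Vector],
    split: int,
) -> List[int]:
    """Quantize each frequency group; frequency indices below `split`
    use the larger low-range codebook, the rest use the high-range one."""
    return [
        nearest_index(g, low_cb if k < split else high_cb)
        for k, g in enumerate(groups)
    ]


# Toy codebooks: 4 vectors (2-bit index) for the low range,
# 2 vectors (1-bit index) for the high range.
low_cb = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
high_cb = [(0.0, 0.0), (1.0, 1.0)]
groups = [(0.9, 0.1), (0.2, 0.8), (0.9, 0.3)]
indexes = quantize_groups(groups, low_cb, high_cb, split=2)
```

The smaller high-range codebook needs fewer bits per index at the cost of a coarser approximation, which matches the aim of spending more code on the low range.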

In each of the above embodiments, the MDCT is used as the frequency transform. However, the frequency transform is not limited to the MDCT; the DCT may also be used.
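
For reference, the two transforms named above are commonly defined as shown below. These are the standard textbook forms without scaling or windowing; the normalization and window actually used in the embodiments are not stated in this passage, so the symbols x_n, X_k, and N are assumptions introduced here. The MDCT maps 2N overlapping input samples to N coefficients, whereas the DCT (type II) maps N samples to N coefficients.

```latex
% MDCT: 2N input samples, N output coefficients
X_k = \sum_{n=0}^{2N-1} x_n \cos\!\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right],
\qquad k = 0, 1, \dots, N-1

% DCT-II: N input samples, N output coefficients
X_k = \sum_{n=0}^{N-1} x_n \cos\!\left[\frac{\pi}{N}\left(n + \frac{1}{2}\right)k\right],
\qquad k = 0, 1, \dots, N-1
```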

In each of the above embodiments, the programs were described as being stored in advance in a memory or the like. However, a program for executing the above-described processing may be stored and distributed on a computer-readable recording medium such as a flexible disk, a CD-ROM (Compact Disk Read-Only Memory), a DVD (Digital Versatile Disk), or an MO (Magneto Optical disk), and installed on another computer so that the computer operates as the above-described means or executes the above-described steps.

Furthermore, the program may be stored in a disk device or the like of a server device on the Internet and, for example, superimposed on a carrier wave and downloaded to a computer.

The present invention is not limited to the above embodiments, and various modifications and applications are possible. The hardware configurations, block configurations, and flowcharts described above are examples and are not limiting. For example, each of the above embodiments was described assuming a mobile phone or an electronic dictionary as the speech processing device. However, the present invention can also be readily applied to a PHS (Personal Handyphone System), a PDA (Personal Digital Assistant), or a general-purpose personal computer. That is, the above embodiments are for explanation and do not limit the scope of the present invention.

FIG. 1 is a block diagram showing a schematic configuration of the speech processing device 1 according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the encoding unit of FIG. 1.
FIG. 3 is a diagram showing an example of DC component removal from a digital signal.
FIG. 4 is a diagram showing an example of frame division.
FIG. 5A is a diagram showing an example of the maximum value of the transform coefficients of each block, and FIG. 5B is a diagram showing the divided bands.
FIG. 6A is a diagram showing an example of the maximum values, and FIG. 6B is a diagram showing an example of the maximum value sequence.
FIG. 7A is a diagram showing an example of the codebook of maximum value sequences, and FIG. 7B is a diagram showing an example of the data sequence of maximum value sequence indexes.
FIG. 8 is a diagram showing an example of division by the maximum value.
FIG. 9A is a diagram showing an example of MDCT coefficients before quantization, and FIG. 9B is a diagram showing an example of MDCT coefficients after quantization.
FIG. 10A is a diagram showing an example of the MDCT coefficient groups before rearrangement, and FIG. 10B is a diagram showing an example of the MDCT coefficient groups after rearrangement.
FIG. 11A is a diagram showing an example of the MDCT coefficient codebook, and FIG. 11B is a diagram showing an example of the data sequence of MDCT coefficient indexes obtained by vector quantization.
FIG. 12 is a diagram showing an example of the data to be encoded.
FIG. 13 is a diagram showing an example of the data sequence of MDCT coefficient indexes and the flag sequence generated according to importance.
FIG. 14A and FIG. 14B are diagrams showing an example of a method of generating the code number sequence from the flag sequence.
FIG. 15 is a diagram showing an example of data deletion.
FIG. 16 is a block diagram showing the configuration of the decoding unit of FIG. 1.
FIG. 17 is a diagram schematically showing an example of inverse vector quantization of MDCT coefficients.
FIG. 18 is a diagram schematically showing the change in data during block division.
FIG. 19 is a diagram schematically showing the change in data during the MDCT.
FIG. 20 is a diagram schematically showing the change in data during normalization.
FIG. 21 is a diagram schematically showing the change in data during band division and maximum value search.
FIG. 22 is a diagram schematically showing the change in data during vector quantization of the maximum value sequence.
FIG. 23 is a diagram schematically showing the change in data during division by the maximum value.
FIG. 24 is a diagram schematically showing the change in data during time-order rearrangement.
FIG. 25 is a diagram schematically showing the change in data during vector quantization of the MDCT coefficients.
FIG. 26 is a diagram schematically showing the change in data during data compression.
FIG. 27 is a diagram schematically showing the change in data during encoding.
FIG. 28 is a diagram schematically showing the change in data in the decoding operation.
FIG. 29 is a block diagram showing the configuration of the encoding unit according to the second embodiment of the present invention.
FIG. 30 is a flowchart of the encoding operation according to the third embodiment of the present invention.
FIG. 31 is a diagram showing an example of data compression between frames.

Explanation of symbols

1 Speech processing device
11 Voice input/output device
12 Storage device
13 ROM
14 RAM
15 CPU
16 Encoding unit
17 Decoding unit
21 DC removal unit
22 Framing unit
23 MDCT unit
24 Normalization unit
25 Band division unit
26 Maximum value search unit
27 Maximum value sequence vector quantization unit
28 Maximum value division unit
29 Quantization unit
30 Time-order rearrangement unit
31 Coefficient vector quantization unit
32 Entropy encoding unit
33 Code amount comparison unit
34 Data deletion unit
35 Encoding frequency selection unit
41 Entropy decoding unit
42 Coefficient inverse vector quantization unit
43 Frequency rearrangement unit
44 Inverse quantization unit
45 Maximum value sequence inverse vector quantization unit
46 Maximum value multiplication unit
47 Gain synthesis unit
48 IMDCT unit
50, 51 Codebook

Claims

An encoding device comprising:
a dividing unit that divides a digital signal having a predetermined time length into a plurality of blocks;
a frequency conversion unit that frequency-converts the digital signal of each block and generates a first transform coefficient group for each block;
a band dividing unit that divides the first transform coefficient group generated by the frequency conversion unit into a plurality of small frequency bands such that the bandwidth increases as the frequency increases;
a maximum value search unit that generates a maximum value sequence for each block by searching, for each small frequency band, for the maximum absolute value of the first transform coefficients belonging to that small frequency band, and arranging the found maximum values in order of frequency;
a maximum value sequence vector quantization unit that generates a data sequence of maximum value sequence indexes by vector-quantizing the maximum value sequence of each block using a maximum value sequence codebook and arranging the obtained indexes in time-series order;
a division unit that inversely quantizes, using the maximum value sequence codebook, the maximum value sequence index of each block obtained by the maximum value sequence vector quantization unit, and divides the first transform coefficient group belonging to each small frequency band of each block by the inverse quantization value corresponding to that block and that small frequency band;
a time-series rearrangement unit that generates a second transform coefficient group for each frequency by rearranging, in time series, the transform coefficients of the same frequency included in the first transform coefficient groups of the blocks divided by the division unit;
a transform coefficient vector quantization unit that generates a data sequence of transform coefficient indexes by vector-quantizing the second transform coefficient group of each frequency using a transform coefficient codebook and arranging the obtained indexes in order of frequency;
a data compression unit that compresses the data sequence of transform coefficient indexes based on the importance of the second transform coefficient group of each frequency, and generates information about flags each indicating whether the second transform coefficient group of the corresponding frequency is an encoding target; and
an encoding unit that encodes the data sequence of maximum value sequence indexes generated by the maximum value sequence vector quantization unit, the information about the flags generated by the data compression unit, and the compressed data sequence.

The encoding device according to claim 1, further comprising
a code amount determination unit that repeatedly determines, until the determination is affirmative, whether the code amount of the data encoded by the encoding unit is smaller than a target code amount,
wherein the data compression unit,
when the determination of the code amount determination unit is negative, compresses the data sequence of transform coefficient indexes by removing second transform coefficient groups from the encoding targets in ascending order of importance, and generates the information about the flags, and
the encoding unit
encodes the data sequence compressed by the data compression unit and the generated information about the flags until the determination of the code amount determination unit is affirmative.

The encoding device according to claim 1, wherein the data compression unit
selects frequencies to be encoded by the encoding unit in descending order of importance until the code amount of the encoded data is below a target code amount and close to that target code amount, and
sets the second transform coefficient groups corresponding to the selected frequencies as encoding targets, compresses the data sequence of transform coefficient indexes, and generates the information about the flags.

The encoding device according to claim 1, wherein the data compression unit
forms a flag sequence by arranging the generated flags in order of frequency, and
based on the formed flag sequence, generates, as the information about the flags, a number sequence of run lengths, each being the number of times the same value continues in the flag sequence.

The encoding device according to claim 4, wherein the data compression unit,
when a run length of identical values in the flag sequence is equal to an upper limit value, inserts 0 between that run length and the next run length in the number sequence.

The encoding device according to claim 4 or 5, wherein the data compression unit,
when the flag sequence starts with 1, inserts 0 at the beginning of the number sequence.

The encoding device according to any one of claims 1 to 6, further comprising:
an overall code amount determination unit that repeatedly determines, until the determination is affirmative, whether the sum of the code amounts of the encoded data produced by the encoding unit for a series of a plurality of digital signals each having the predetermined time length is smaller than an overall target code amount;
an adjustment unit that, when the determination of the overall code amount determination unit is negative, excludes from the data sequence of transform coefficient indexes the transform coefficient index corresponding to the second transform coefficient group whose importance is the lowest overall, and changes the flag corresponding to the excluded second transform coefficient group to a value indicating that it is not an encoding target; and
a re-encoding unit that re-encodes the data of the digital signal of the predetermined time length from which the transform coefficient index has been excluded and whose flag has been changed.

An encoding method including:
a dividing step of dividing a digital signal having a predetermined time length into a plurality of blocks;
a frequency conversion step of frequency-converting the digital signal of each block to generate a first transform coefficient group for each block;
a band dividing step of dividing the first transform coefficient group generated in the frequency conversion step into a plurality of small frequency bands such that the bandwidth increases as the frequency increases;
a maximum value search step of generating a maximum value sequence for each block by searching, for each small frequency band, for the maximum absolute value of the first transform coefficients belonging to that small frequency band, and arranging the found maximum values in order of frequency;
a maximum value sequence vector quantization step of generating a data sequence of maximum value sequence indexes by vector-quantizing the maximum value sequence of each block using a maximum value sequence codebook and arranging the obtained indexes in time-series order;
a division step of inversely quantizing, using the maximum value sequence codebook, the maximum value sequence index of each block obtained in the maximum value sequence vector quantization step, and dividing the first transform coefficient group belonging to each small frequency band of each block by the inverse quantization value corresponding to that block and that small frequency band;
a time-series rearrangement step of generating a second transform coefficient group for each frequency by rearranging, in time series, the transform coefficients of the same frequency included in the first transform coefficient groups of the blocks divided in the division step;
a transform coefficient vector quantization step of generating a data sequence of transform coefficient indexes by vector-quantizing the second transform coefficient group of each frequency using a transform coefficient codebook and arranging the obtained indexes in order of frequency;
a data compression step of compressing the data sequence of transform coefficient indexes based on the importance of the second transform coefficient group of each frequency, and generating information about flags each indicating whether the second transform coefficient group of the corresponding frequency is an encoding target; and
an encoding step of encoding the data sequence of maximum value sequence indexes generated in the maximum value sequence vector quantization step, the information about the flags generated in the data compression step, and the compressed data sequence.

A program that causes a computer to execute:
a dividing procedure of dividing a digital signal having a predetermined time length into a plurality of blocks;
a frequency conversion procedure of frequency-converting the digital signal of each block and generating a first transform coefficient group for each block;
a band dividing procedure of dividing the first transform coefficient group generated by the frequency conversion procedure into a plurality of small frequency bands such that the bandwidth increases as the frequency increases;
a maximum value search procedure of generating a maximum value sequence for each block by searching, for each small frequency band, for the maximum absolute value of the first transform coefficients belonging to that small frequency band, and arranging the found maximum values in order of frequency;
a maximum value sequence vector quantization procedure of generating a data sequence of maximum value sequence indexes by vector-quantizing the maximum value sequence of each block using a maximum value sequence codebook and arranging the obtained indexes in time-series order;
a division procedure of inversely quantizing, using the maximum value sequence codebook, the maximum value sequence index of each block obtained by the maximum value sequence vector quantization procedure, and dividing the first transform coefficient group belonging to each small frequency band of each block by the inverse quantization value corresponding to that block and that small frequency band;
a time-series rearrangement procedure of generating a second transform coefficient group for each frequency by rearranging, in time series, the transform coefficients of the same frequency included in the first transform coefficient groups of the blocks divided by the division procedure;
a transform coefficient vector quantization procedure of generating a data sequence of transform coefficient indexes by vector-quantizing the second transform coefficient group of each frequency using a transform coefficient codebook and arranging the obtained indexes in order of frequency;
a data compression procedure of compressing the data sequence of transform coefficient indexes based on the importance of the second transform coefficient group of each frequency, and generating information about flags each indicating whether the second transform coefficient group of the corresponding frequency is an encoding target; and
an encoding procedure of encoding the data sequence of maximum value sequence indexes generated by the maximum value sequence vector quantization procedure, the information about the flags generated by the data compression procedure, and the compressed data sequence.