JP4888048B2 - Audio signal encoding / decoding method, apparatus and program for implementing the method

Info

Publication number: JP4888048B2
Application number: JP2006291299A
Authority: JP
Inventors: 雅之上田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2006-10-26
Filing date: 2006-10-26
Publication date: 2012-02-29
Anticipated expiration: 2026-10-26
Also published as: JP2008107629A

Abstract

PROBLEM TO BE SOLVED: To provide an audio signal encoding and decoding technique with high encoding efficiency and small encoding and decoding processing delay.

SOLUTION: A preceding frame similar to the frame to be encoded is searched for in an audio signal that has been converted to the frequency domain. When a similar frame is found, information indicating the similar frame and information indicating the difference between the frequency components of the frame to be encoded and those of the similar frame are encoded. When no similar frame is found, the frequency components of the frame to be encoded are encoded.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to audio signal encoding and decoding techniques, and more particularly to encoding and decoding techniques that exploit the periodicity of an audio signal.

As digital music players such as the iPod become widespread, it is desirable to reduce the file size per song, without degrading perceived sound quality, so that more songs fit in the memory installed in the player. In other words, the amount of code per piece of content (song) should be reduced without lowering the audible sound quality of the music.

The technique described in JP-A-2005-292640 (Patent Document 1) is known as a technique that addresses this need.

In the technique of Patent Document 1, the encoding side divides the audio signal into frames of, for example, about 2 milliseconds and encodes the audio signal frame by frame. As preprocessing, the encoding side first classifies all frames into independent frames and prediction frames. An independent frame is a frame whose audio signal is encoded without reference to the audio signal of any other frame. A prediction frame is a frame whose audio signal is encoded by predictive encoding based on the audio signal of an independent frame. Once the classification is complete, the encoding side first encodes the independent frames and then predictively encodes the prediction frames.

On the decoding side of Patent Document 1, the independent frames are decoded first, and the prediction error signals of the prediction frames are then predictively decoded using the reproduced audio signals of the independent frames.

The technique of Patent Document 1 can predictively encode the audio signal of a prediction frame using not only the audio signals of temporally past frames but also the audio signals of future independent frames. It therefore provides high encoding efficiency.

Patent Document 1: JP-A-2005-292640

However, the technique described in Patent Document 1 suffers from large processing delays in both encoding and decoding, for the following reasons.

In this prior art, the encoding apparatus first classifies all frames into independent frames and prediction frames, then encodes the independent frames, and finally predictively encodes the prediction frames. The encoding side therefore has to reorder the frames. The classification and the frame reordering increase the encoding delay.

Likewise, the decoding apparatus first decodes the independent frames, then predictively decodes the prediction frames, and finally reorders the decoded independent frames and decoded prediction frames to reproduce the audio signal. The decoding side therefore also has to reorder the frames, which increases the decoding delay.

An object of the present invention is to provide an audio signal encoding and decoding technique with high encoding efficiency and small encoding and decoding delays.

In one aspect of the present invention, the encoding side normalizes an audio signal that has been converted to the frequency domain, based on the peak value of each frame of that signal, and searches the normalized audio signal for one song, from its beginning up to the frame immediately before the encoding target frame, for a preceding frame similar to the encoding target frame. When a similar frame is found, the encoding side encodes information indicating that the encoding target frame with that frame number was determined to have a similar frame, together with the frequency component difference data between the encoding target frame and the similar frame. When no similar frame is found, the encoding side encodes the frequency components of the encoding target frame.

On the decoding side, for a frame to which no similar frame information is attached, the encoded frequency component information is decoded and the decoded frequency components are inverse frequency transformed. For a frame to which similar frame information is attached, the decoding side decodes the encoded frequency component difference data, adds the decoded difference to the frequency components of the frame with the similar frame number to restore the frequency component information of the frame, and inverse frequency transforms the restored frequency components.

Thus, according to one aspect of the present invention, a preceding frame similar to the encoding target frame is searched for, and the frequency component difference data between the encoding target frame and the similar frame is encoded. The difference signal between similar frames is smaller than the ordinary signal. This aspect can therefore reduce the overall amount of code without degrading the audible sound quality.

In addition, because this aspect searches for similar frames only among frames that precede the encoding target frame, no frame-by-frame reordering of audio data is needed. This reduces both the encoding delay and the decoding delay.

Next, an encoding apparatus according to an embodiment of the present invention is described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the encoding apparatus 100.

As shown in FIG. 1, the encoding apparatus 100 comprises an input buffer 101, a frequency converter 102, a normalization unit 103, a differential signal extraction processing unit 104, a differential signal threshold processing unit 105 (hereinafter, difference threshold processing unit 105), a quantization/encoding processing unit 106, and an output memory 107.

In the following description, it is assumed that the audio signal (PCM signal) for one song is already stored in the input buffer 101 when encoding starts.

The frequency converter 102 reads the audio signal from the input buffer 101 in predetermined time units, for example 2 milliseconds, and frequency-transforms it. This predetermined unit is one frame of the audio signal. A frame number is attached to the transformed data, and the resulting frame-by-frame frequency component data is supplied to the normalization unit 103. The Fourier transform, the discrete cosine transform (DCT), the modified discrete cosine transform (MDCT), or the like can be used as the frequency transform. Needless to say, the frequency converter 102 is unnecessary when an already frequency-transformed audio signal, that is, frequency component data, is stored in the input buffer.
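
The framing and transform step can be sketched as follows. This is only an illustrative sketch: the function names, the 48 kHz sample rate, the zero-padding of the last frame, and the choice of a naive, unscaled DCT-II are assumptions made for the example, not details given in the description (which allows Fourier transform, DCT, MDCT, or similar).

```python
import math

def samples_per_frame(sample_rate_hz, frame_ms=2):
    """Number of PCM samples in one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000

def split_into_frames(pcm, frame_len):
    """Cut a PCM sample list into (frame_number, samples) pairs.

    The last frame is zero-padded to frame_len (an assumption of this sketch).
    """
    frames = []
    for number, start in enumerate(range(0, len(pcm), frame_len)):
        chunk = list(pcm[start:start + frame_len])
        chunk += [0.0] * (frame_len - len(chunk))
        frames.append((number, chunk))
    return frames

def dct2(samples):
    """Naive O(N^2) unscaled DCT-II, standing in for the frequency transform."""
    n = len(samples)
    return [
        sum(samples[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]

# Usage: frame a short signal and transform each frame.
frame_len = samples_per_frame(48000)           # 96 samples at the assumed 48 kHz
frames = split_into_frames([0.0] * 200, frame_len)
spectra = [(number, dct2(samples)) for number, samples in frames]
```

A production encoder would use a fast transform; the quadratic loop here is chosen only to keep the example self-contained.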

The normalization unit 103 normalizes the supplied frequency component data frame by frame. The normalization is performed according to the level of the peak frequency component within the frame. The normalization coefficient used is supplied to the quantization/encoding processing unit 106. The normalized frequency component data X(k) (k = 0, 1, ..., K-1) is supplied to the differential signal extraction processing unit 104.

This normalization is particularly effective at raising prediction efficiency when, for example, the audio signals in different frames have the same pitch and the same timbre and differ only in amplitude.
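
A minimal sketch of peak normalization is shown below. It assumes that the normalization coefficient is the peak absolute value of the frame and that normalization is a division by that coefficient; the description only says the normalization follows the level of the peak frequency component, so both choices, and the name `normalize_frame`, are assumptions. The example illustrates the point made above: two frames that differ only in amplitude become identical after normalization.

```python
def normalize_frame(spectrum):
    """Scale a frame so its largest magnitude is 1.0.

    Returns (normalized_spectrum, coefficient). An all-zero frame is
    returned unchanged with coefficient 1.0 to avoid dividing by zero.
    """
    peak = max((abs(c) for c in spectrum), default=0.0)
    if peak == 0.0:
        return list(spectrum), 1.0
    return [c / peak for c in spectrum], peak

# Two frames with the same spectral shape but different loudness.
loud, loud_coeff = normalize_frame([4.0, -8.0, 2.0])
quiet, quiet_coeff = normalize_frame([1.0, -2.0, 0.5])
same_shape = loud == quiet   # the normalized spectra coincide
```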

Needless to say, the frequency converter 102 and the normalization unit 103 are unnecessary when normalized frequency component data is already stored in the input buffer 101.

The differential signal extraction processing unit 104 searches the normalized frequency component data of the frames up to and including the immediately preceding frame for data similar to the normalized frequency component data of the current encoding target frame.

This search can be performed, for example, by evaluating equation (1) for a suitable number of preceding frames.

Here, X_diff denotes the dissimilarity; the smaller its value, the higher the similarity. X(k) denotes the k-th normalized frequency component of the encoding target frame, X_reference(k) denotes the k-th normalized frequency component of the preceding frame, and N denotes the number of frequency components per frame.

When the differential signal extraction processing unit 104 cannot find any preceding frame for which the result of equation (1) is at or below a threshold (T), it determines that there is no similar frame. When it detects at least one preceding frame for which the result of equation (1) is at or below T, it selects one of the detected preceding frames as the similar frame.
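
The search can be sketched as below. Equation (1) itself is not reproduced in this text, so the sketch assumes a sum of absolute differences over the N components as the dissimilarity X_diff; the actual formula may differ (for example, a squared or averaged difference). The function names and the `history` layout are likewise hypothetical.

```python
def dissimilarity(current, reference):
    """Assumed stand-in for equation (1): sum of |X(k) - X_reference(k)|."""
    return sum(abs(c - r) for c, r in zip(current, reference))

def find_similar_frame(current, history, threshold):
    """Return the frame number of the most similar preceding frame.

    history is a list of (frame_number, normalized_spectrum) pairs for
    preceding frames. Returns None when no frame has X_diff <= threshold.
    """
    best_number, best_score = None, None
    for number, spectrum in history:
        score = dissimilarity(current, spectrum)
        if score <= threshold and (best_score is None or score < best_score):
            best_number, best_score = number, score
    return best_number

history = [(0, [1.0, 0.0, 0.0]), (1, [0.0, 1.0, 0.5])]
match = find_similar_frame([0.0, 1.0, 0.25], history, threshold=0.5)
no_match = find_similar_frame([0.0, 1.0, 0.25], history, threshold=0.1)
```

Picking the lowest-scoring candidate follows the stated preference for the most similar frame; returning the first candidate under the threshold would also satisfy the description.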

When the differential signal extraction processing unit 104 cannot find a preceding frame similar to the current encoding target frame, it classifies the encoding target frame as a general frame (G frame). It then supplies a flag indicating that the encoding target frame is a G frame, together with the normalized frequency component data of that frame, to the difference threshold processing unit 105.

When the differential signal extraction processing unit 104 finds a preceding frame similar to the current encoding target frame, it classifies the encoding target frame as a differential frame (D frame) and supplies the following to the difference threshold processing unit 105: a flag indicating that the encoding target frame is a D frame; the frame number of the similar preceding frame, that is, the similar frame number; and the frequency component difference data, which is the difference between the normalized frequency component data of the encoding target frame and that of the frame with the similar frame number. More than one preceding frame may be similar to the encoding target frame, but the similar frame number supplied to the difference threshold processing unit 105 is preferably that of the preceding frame most similar to the encoding target frame.
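
The G/D decision and the difference computation can be sketched as follows. The dictionary layout and the name `classify_frame` are hypothetical; `similar_number` is taken to be the result of the similarity search (or `None` when no similar frame exists).

```python
def classify_frame(current, similar_number, stored_spectra):
    """Build the data handed to the difference threshold stage.

    current: normalized spectrum of the encoding target frame.
    similar_number: frame number of the chosen similar frame, or None.
    stored_spectra: mapping from frame number to normalized spectrum.
    """
    if similar_number is None:
        # G frame: the normalized spectrum is passed on as-is.
        return {"type": "G", "data": list(current)}
    reference = stored_spectra[similar_number]
    # D frame: only the component-wise difference is passed on.
    diff = [c - r for c, r in zip(current, reference)]
    return {"type": "D", "similar_frame": similar_number, "data": diff}

stored = {0: [0.25, 1.0]}
g_frame = classify_frame([0.5, 1.0], None, stored)
d_frame = classify_frame([0.5, 1.0], 0, stored)
```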

The flag indicating the frame type (G frame or D frame) is also supplied to the quantization/encoding processing unit 106 via the difference threshold processing unit 105.

When the supplied data belongs to a G frame, the difference threshold processing unit 105 forwards the supplied frequency component data and the flag indicating a G frame unchanged to the quantization/encoding processing unit 106.

When the supplied data belongs to a D frame, the difference threshold processing unit 105 compares the supplied frequency component difference data with a difference threshold (DT) and replaces any difference value smaller than DT with zero. The thresholded frequency component difference data is supplied to the quantization/encoding processing unit 106. The difference threshold processing unit 105 also forwards the flag (indicating a D frame) and the similar frame number received from the differential signal extraction processing unit 104 unchanged to the quantization/encoding processing unit 106. Needless to say, the difference threshold processing unit 105 is unnecessary in this embodiment if the difference threshold processing is not performed.
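
A minimal sketch of the difference threshold processing follows. It assumes the comparison is made on the magnitude of each difference value, since differences can be negative; the description does not state this explicitly. The function name is hypothetical.

```python
def apply_difference_threshold(diff, dt):
    """Zero every difference whose magnitude is below the threshold DT."""
    return [0.0 if abs(d) < dt else d for d in diff]

diff = [0.3, -0.05, 0.0, -0.4]
thresholded = apply_difference_threshold(diff, dt=0.1)   # small values dropped
untouched = apply_difference_threshold(diff, dt=0.0)     # nothing is zeroed
```

With `dt=0.0` no value satisfies the comparison, so every difference component is passed on to the quantization/encoding stage unchanged.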

The quantization/encoding processing unit 106 encodes the frequency component data or frequency component difference data supplied from the difference threshold processing unit 105 and generates encoded frequency component data or encoded frequency component difference data.

When the frame is a G frame, the quantization/encoding processing unit 106 encodes the frequency component data supplied from the difference threshold processing unit 105 to generate encoded frequency component data. It then assembles the G frame illustrated in FIG. 2 from the frame number, the flag, the normalization coefficient, and the encoded frequency component data, and stores it in the output memory 107.

In FIG. 2, "frame number" is the frame number of this frame, "FT" is the frame type (here, G frame), "normalization coefficient" is the normalization coefficient supplied from the normalization unit 103, and "frequency component data" is the encoded frequency component data.

When the frame is a D frame, the quantization/encoding processing unit 106 encodes the frequency component difference data supplied from the difference threshold processing unit 105 to generate encoded frequency component difference data. It then assembles the D frame illustrated in FIG. 3 from the frame number, the flag, the normalization coefficient, the similar frame number, and the encoded frequency component difference data, and stores it in the output memory 107.

In FIG. 3, "frame number" is the frame number of this frame, "FT" is the frame type (here, D frame), "similar frame number" is the frame number of the preceding frame selected by the differential signal extraction processing unit, "normalization coefficient" is the normalization coefficient supplied from the normalization unit 103, and "frequency component difference data" is the encoded frequency component difference data.
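
The two frame layouts can be modelled with a single record type, as sketched below. The field names mirror the labels described for FIGS. 2 and 3, but the class name, the Python types, and the use of a list of integers as a stand-in for the coded bit stream are assumptions; the description does not specify field widths or ordering.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class EncodedFrame:
    frame_number: int
    frame_type: str                      # "G" or "D" (the FT field)
    normalization_coefficient: float
    payload: List[int]                   # placeholder for the coded data
    similar_frame_number: Optional[int] = None   # present only in D frames

    def __post_init__(self):
        if self.frame_type not in ("G", "D"):
            raise ValueError("frame_type must be 'G' or 'D'")
        has_ref = self.similar_frame_number is not None
        if (self.frame_type == "D") != has_ref:
            raise ValueError("exactly D frames carry a similar frame number")

g = EncodedFrame(0, "G", 8.0, [1, 2])
d = EncodedFrame(1, "D", 2.0, [0, 1], similar_frame_number=0)
```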

Thus, the encoding apparatus of FIG. 1 searches for a preceding frame similar to the encoding target frame and encodes the frequency component difference data between the encoding target frame and the similar frame. The difference signal between similar frames is smaller than the frequency components themselves. The encoding apparatus of FIG. 1 can therefore reduce the overall amount of code without degrading the audible sound quality.

Furthermore, when the frequency component difference data for a given frequency component is at or below the difference threshold (DT), the encoding apparatus of FIG. 1 treats that difference as 0 (the same component as in the similar frame), which reduces the amount of code further. If this threshold is set to zero so that all difference components are encoded, lossless encoding can be realized. In other words, the apparatus can move seamlessly between lossless and lossy encoding simply by changing the difference threshold (DT).

In addition, because the encoding apparatus of FIG. 1 searches for similar frames only among frames that precede the encoding target frame, no frame-by-frame reordering of audio data is needed. This reduces the encoding delay of the encoding apparatus.

Next, a decoding apparatus according to an embodiment of the present invention is described with reference to FIG. 4. FIG. 4 is a block diagram showing the configuration of the decoding apparatus 200.

As shown in FIG. 4, the decoding apparatus 200 comprises an input buffer 201, a decoding processing unit 202, an addition processing unit 203, a denormalization unit 204, an inverse frequency transformer 205, and an output memory 206.

It is assumed that, when decoding starts, the input buffer 201 holds a predetermined number of the G frames and D frames generated by the encoding apparatus of FIG. 1.

The decoding processing unit 202 reads G frames or D frames from the input buffer one frame at a time in frame number order. It first separates the frame type information ("TYPE" in FIGS. 2 and 3) and the normalization coefficient from the frame it has read. The frame type information is supplied to the addition processing unit 203, and the normalization coefficient is supplied to the denormalization unit 204. When the frame is a D frame, the decoding processing unit 202 also supplies the similar frame number to the addition processing unit 203. Finally, it decodes the encoded frequency component data or encoded frequency component difference data and supplies the result to the addition processing unit 203.

When the frame type information indicates a G frame, the addition processing unit 203 passes the frequency component data supplied from the decoding processing unit 202 unchanged to the denormalization unit 204. It also stores this frequency component data together with the frame number of the frame.

When the frame type information indicates a D frame, the addition processing unit 203 adds the frequency component difference data of the decoding target frame, supplied from the decoding processing unit 202, to the frequency component data of the similar frame, thereby reproducing the frequency component data of the decoding target frame. It supplies the reproduced frequency component data to the denormalization unit 204 and also stores it together with the frame number of the frame.
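
The behavior of the addition processing unit can be sketched as follows. The function name and the dictionary used as the per-frame store are hypothetical; the sketch works on already decoded (dequantized) values.

```python
def reconstruct_spectrum(frame_number, frame_type, data, similar_frame_number, store):
    """Rebuild a frame's normalized spectrum and remember it for later D frames.

    store maps frame numbers to previously reconstructed spectra.
    """
    if frame_type == "G":
        spectrum = list(data)
    elif frame_type == "D":
        reference = store[similar_frame_number]
        spectrum = [d + r for d, r in zip(data, reference)]
    else:
        raise ValueError("unknown frame type")
    store[frame_number] = spectrum
    return spectrum

store = {}
first = reconstruct_spectrum(0, "G", [0.25, 1.0], None, store)
second = reconstruct_spectrum(1, "D", [0.25, 0.0], 0, store)
```

Because frames are processed in frame number order and a D frame only refers to a preceding frame, the reference is always already present in the store when it is needed.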

The denormalization unit 204 multiplies the frequency component data supplied from the addition processing unit by the normalization coefficient supplied from the decoding processing unit 202. The product is supplied to the inverse frequency transformer 205.

The inverse frequency transformer 205 inverse-transforms the denormalized frequency component data supplied from the denormalization unit to reproduce the time-domain audio signal, and stores it in the output memory 206. The time-domain audio signal stored in the output memory 206 is read out on request from an audio playback device or the like.
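
Denormalization and the inverse transform can be sketched as below. The inverse shown here matches an unscaled DCT-II forward transform, which is an assumption of this example; an implementation that used a Fourier transform or MDCT on the encoding side would need the corresponding inverse. The function names are hypothetical.

```python
import math

def denormalize(spectrum, coefficient):
    """Undo peak normalization by multiplying with the stored coefficient."""
    return [c * coefficient for c in spectrum]

def inverse_dct2(coeffs):
    """Inverse of the unscaled DCT-II X_k = sum_i x_i cos(pi/N (i + 1/2) k)."""
    n = len(coeffs)
    return [
        (coeffs[0] + 2.0 * sum(
            coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
            for k in range(1, n)
        )) / n
        for i in range(n)
    ]

# Usage: a normalized spectrum plus its coefficient yields time-domain samples.
samples = inverse_dct2(denormalize([1.0, 0.0, 0.0, 0.0], 4.0))
```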

It should be clear that, in this decoding apparatus, the denormalization unit 204 may instead be placed between the decoding processing unit 202 and the addition processing unit 203.

Thus, the decoding apparatus of FIG. 4 performs decoding in frame number order and needs no frame reordering inside the apparatus, so its decoding delay is small. The decoding apparatus of this embodiment, working together with the encoding apparatus of FIG. 1, therefore solves the problem that Patent Document 1 could not avoid.

Next, programs for implementing the encoding apparatus of FIG. 1 and the decoding apparatus of FIG. 4 on a computer are described with reference to the flowcharts of FIGS. 5 and 6. FIG. 5 is a flowchart of the program that implements the encoding apparatus of FIG. 1 on a computer. FIG. 6 is a flowchart of the program that implements the decoding apparatus of FIG. 4 on a computer.

First, FIG. 5 is described. It is assumed that audio data for a predetermined number of frames is stored in the input buffer of FIG. 1 before step S11 and the subsequent steps are executed. In steps S11 to S19 of FIG. 5, the central processing unit (CPU) of the computer executes processing equivalent to that performed by the frequency converter 102 through the quantization/encoding processing unit 106 of FIG. 1.

In step S11, the CPU reads one frame of audio data from the input buffer of FIG. 1 and assigns a frame number to it.

Next, in step S12, the CPU transforms the one frame of audio data into the frequency domain. In step S13, the CPU normalizes this frequency domain data as described above to generate frequency component data. The CPU stores the normalization coefficient used in a register or the like, and stores the normalized frequency component data together with the frame number in internal memory or the like.

In step S14, the CPU searches the normalized frequency component data of past frames for a frame whose data is similar to the normalized frequency component data of the current frame. The CPU performs this search by evaluating equation (1) above for the past frames. The range of this similarity calculation may be limited, for example to about the 100 frames immediately preceding the encoding target frame. The amount of computation required for the similar frame search can also be reduced by separately determining the pitch frequency of the audio signal and choosing the past frames to be searched on the basis of that pitch frequency.
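
Limiting the search range can be sketched as a helper that lists which past frame numbers to examine. The name `candidate_frames`, the default of 100, and the newest-first ordering are assumptions of this sketch; the pitch-based selection mentioned above is not covered here.

```python
def candidate_frames(current_number, max_lookback=100):
    """Frame numbers to compare against, newest first.

    Covers at most max_lookback frames immediately preceding current_number.
    """
    first = max(0, current_number - max_lookback)
    return range(current_number - 1, first - 1, -1)

early = list(candidate_frames(3))        # fewer than 100 frames exist yet
late = candidate_frames(250)             # window is capped at 100 frames
```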

In step S15, the CPU determines from the result of step S14 whether a similar frame exists. If there is no similar frame, the CPU sets a flag indicating a G frame for the current frame and proceeds to step S18. If there is a similar frame, the CPU proceeds to step S16.

In step S16, the CPU selects one similar frame, preferably the most similar one, and stores the frame number of the selected similar frame in a register. The CPU also calculates the frequency component difference data, which is the difference between the normalized frequency component data of the current frame and that of the similar frame. The CPU then proceeds to step S17.

In step S17, the CPU performs the difference threshold processing described above. The difference threshold (DT) may be a fixed value, but it can also be varied according to the quality required of the reproduced audio signal. For example, when the difference threshold (DT) is set to zero, all frequency component difference data is encoded, and encoding close to lossless encoding is achieved. In other words, this embodiment can seamlessly cover the range from lossless to lossy encoding simply by changing the difference threshold. The CPU then proceeds to step S18.

ステップＳ１８では、ＣＰＵは、現フレームが、Ｇフレームの場合には、正規化された周波数成分データに、量子化及び符号化を施し、オーディオ信号信号自体のビットストリーム（図２の「周波数成分データ」）を生成する。また、ＣＰＵは、図２に示されたＧフレームを作成し、図１の出力メモリ１０７に書き込む。 In step S18, if the current frame is a G frame, the CPU performs quantization and encoding on the normalized frequency component data, and the bit stream of the audio signal signal itself (“frequency component data in FIG. 2). )). Further, the CPU creates the G frame shown in FIG. 2 and writes it in the output memory 107 of FIG.

Also in step S18, when the current frame is a D frame, the CPU quantizes and encodes the normalized frequency component difference data to generate a bit stream of the audio signal itself (the "frequency component difference data" in FIG. 3). The CPU then creates the D frame shown in FIG. 3 and writes it to the output memory 107 of FIG. 1.

In step S19, the CPU determines whether processing of the final frame has been completed. If the determination result is "Yes", the CPU ends the encoding process. Otherwise, the processing of the CPU returns to step S11, and the audio signal of the next frame is encoded.

Next, FIG. 6 will be described. FIG. 6 is a flowchart for explaining a program that implements the decoding apparatus of FIG. 4 on a computer.

In FIG. 6, it is assumed that, before step S21 and the subsequent steps are executed, a predetermined number of frames of encoded data created by the encoding apparatus of FIG. 1 have been stored in the input buffer 201 of FIG. 4. In steps S21 to S26 of FIG. 6, the CPU executes processing equivalent to that of the decoding processing unit 202 through the frequency inverse transformer 205 of FIG. 4.

In step S21, the CPU reads one frame of encoded data from the input buffer 201 of FIG. 4. This encoded data has the format shown in FIG. 2 or FIG. 3.

From the data thus read, the CPU extracts the frame number, the frame type (TYPE), and the normalization coefficient, and sets each value in a register or the like. If the frame is a D frame, the CPU also extracts the similar frame number and sets that value in a register.

If the frame is a G frame, the CPU restores the normalized frequency component data and stores it in an internal memory or the like in association with the frame number. If the frame is a D frame, the CPU restores the normalized frequency component difference data and stores it in an internal memory or the like in association with the frame number.

In step S22, the CPU determines whether the current frame is a G frame or a D frame. If it is a G frame, the processing of the CPU proceeds to step S24. If it is a D frame, the processing of the CPU proceeds to step S23.

In step S23, the CPU adds the normalized frequency component difference data of the current frame to the normalized frequency component data stored for the similar frame number, thereby restoring the normalized frequency component data of the current frame. The restored frequency component data is stored in the internal memory in association with the frame number.

In step S24, the CPU denormalizes the normalized frequency component data restored in step S21 or step S23 to restore the frequency component data of the current frame.

Note that the denormalization of step S24 may instead be performed in step S21. In that case, in step S21 the CPU restores denormalized frequency component data for a G frame and denormalized frequency component difference data for a D frame.

In step S25, the CPU applies inverse frequency transform processing to the restored frequency component data of the current frame to reproduce the audio signal.

In step S26, the CPU determines whether decoding has been completed for all frames. If the determination result is "NO", the processing of the CPU returns to step S21 and the subsequent frame is decoded. If the determination result is "YES", the CPU ends the decoding process.
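The decoding loop of steps S21 to S26 can be summarized in a short sketch. It rests on several assumptions that are not stated in the text: the `Frame` container and its field names are hypothetical, the payload is taken as already entropy-decoded, normalization is assumed to divide by the coefficient (so denormalization multiplies by it), the value added to the difference is assumed to be the restored normalized spectrum of the similar frame, and the inverse frequency transform is passed in as a placeholder callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Frame:
    number: int
    frame_type: str                       # "G" or "D"
    norm_coeff: float                     # normalization coefficient
    data: List[float]                     # normalized spectrum (G) or difference (D)
    similar_number: Optional[int] = None  # set only for D frames


def decode_stream(
    frames: Iterable[Frame],
    inverse_transform: Callable[[List[float]], List[float]],
) -> List[List[float]]:
    restored: Dict[int, List[float]] = {}  # normalized spectra by frame number
    output: List[List[float]] = []
    for frame in frames:                                        # S21: read one frame
        if frame.frame_type == "G":                             # S22: G or D?
            normalized = list(frame.data)
        else:                                                   # S23: add difference
            reference = restored[frame.similar_number]
            normalized = [r + d for r, d in zip(reference, frame.data)]
        restored[frame.number] = normalized
        spectrum = [v * frame.norm_coeff for v in normalized]   # S24: denormalize
        output.append(inverse_transform(spectrum))              # S25: inverse transform
    return output                                               # S26: all frames done
```

Because every restored normalized spectrum is kept in `restored`, a D frame may refer to any earlier frame, including one that was itself decoded from a D frame.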

FIG. 1 is a block diagram showing a configuration example of an encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of a G frame in the present invention.
FIG. 3 is a diagram showing an example of a D frame in the present invention.
FIG. 4 is a block diagram showing a configuration example of a decoding apparatus according to an embodiment of the present invention.
FIG. 5 is a flowchart for explaining a program for implementing the encoding apparatus of FIG. 1 on a computer.
FIG. 6 is a flowchart for explaining a program for implementing the decoding apparatus of FIG. 4 on a computer.

Explanation of symbols

100 Encoding apparatus
101 Input buffer
102 Frequency converter
103 Normalization unit
104 Difference signal extraction processing unit
105 Difference signal threshold processing unit
106 Quantization / encoding processing unit
107 Output memory
200 Decoding apparatus
201 Input buffer
202 Decoding processing unit
203 Addition processing unit
204 Denormalization unit
205 Frequency inverse transformer
206 Output memory

Claims

An audio signal encoding method including the following steps (A) to (D):
(A) normalizing the audio signal converted to the frequency domain, based on the peak value of the frame of the audio signal converted to the frequency domain;
(B) searching for a preceding frame similar to the encoding target frame, from the beginning of the normalized audio signal for one song to immediately before the encoding target frame;
(C) when a similar frame is found, encoding the frame number of the encoding target frame, information indicating the frame determined to be similar to the encoding target frame of that frame number, and frequency component difference data between the encoding target frame and the similar frame; and
(D) when no similar frame is found, encoding the frame number of the encoding target frame and the frequency component of the encoding target frame.

The audio signal encoding method according to claim 1, further comprising a step of storing a normalization coefficient used for the normalization.

The encoding method according to claim 1, wherein, in the step (B),
it is determined that a similar frame exists when a preceding frame is found for which the sum of squares of the difference between the frequency component of the preceding frame and the frequency component of the encoding target frame is equal to or less than a predetermined threshold.

The encoding method according to claim 2, wherein, in the step (B),
it is determined that a similar frame exists when a preceding frame is found for which the sum of squares of the difference between its normalized frequency component and the normalized frequency component of the encoding target frame is equal to or less than a predetermined threshold.

The encoding method according to claim 1, wherein, in the step (C),
the preceding frame having the frequency component most similar to the frequency component of the encoding target frame is set as the similar frame.

An audio signal decoding method for decoding an audio signal from a bit stream in which general frames (G frames), each including a frame number and frequency component information of the frame having that frame number, and difference frames (D frames), each including a frame number, the frame number of a frame similar to the frame having that frame number, and frequency component difference data relative to the similar frame, are mixed, wherein
the G frame and the D frame each further include a normalization coefficient, the G frame includes, as the frequency component information, frequency component data normalized based on the normalization coefficient included in the G frame, and the D frame includes, as the frequency component difference data, difference frequency data normalized based on the normalization coefficient included in the D frame, the method including the following steps (A) to (C):
(A) when the current frame is a G frame, decoding the frequency component information to restore the frequency component data;
(B) when the current frame is a D frame,
decoding the frequency component difference data to obtain the difference frequency component of the current frame, and
restoring the frequency component of the current frame from the decoded difference frequency component of the current frame and the frequency component of the similar frame; and
(C) inversely transforming the frequency component restored in step (A) or (B) to generate an audio signal of the current frame.

The audio signal decoding method according to claim 6,
further comprising the step of denormalizing the restored frequency component based on the normalization coefficient included in the G frame or the D frame, and supplying the denormalized frequency component to the step (C).

The audio signal decoding method according to claim 6, wherein
the step (A) restores the frequency component data by denormalizing the frequency component data based on the normalization coefficient included in the G frame, and
the step (B) denormalizes the frequency component difference data based on the normalization coefficient included in the D frame, and restores the frequency component based on the denormalized frequency component difference data.

An audio signal encoding device comprising:
a normalization unit that normalizes the audio signal converted into the frequency domain, using a normalization coefficient based on the peak value of the frame of the audio signal converted into the frequency domain;
a search unit that searches for a preceding frame similar to the encoding target frame, from the beginning of the normalized audio signal for one song to immediately before the encoding target frame; and
an encoding unit that, when a similar frame is found, encodes the frame number of the encoding target frame, information indicating the frame determined to be similar to the encoding target frame of that frame number, and frequency component difference data between the encoding target frame and the similar frame, and that, when no similar frame is found, encodes the frame number of the encoding target frame and the frequency component of the encoding target frame.

The audio signal encoding device according to claim 9, wherein the normalization unit further stores the normalization coefficient.

The encoding device according to claim 9, wherein
the search unit determines that a similar frame exists when a preceding frame is found for which the sum of squares of the difference between the frequency component of the preceding frame and the frequency component of the encoding target frame is equal to or less than a predetermined threshold.

The encoding device according to claim 10, wherein
the search unit determines that a similar frame exists when a preceding frame is found for which the sum of squares of the difference between its normalized frequency component and the normalized frequency component of the encoding target frame is equal to or less than a predetermined threshold.

The encoding device according to any one of claims 9 to 12, wherein
the encoding unit sets, as the similar frame, the preceding frame having the frequency component most similar to the frequency component of the encoding target frame.

An audio signal decoding device that decodes an audio signal from a bit stream in which general frames (G frames), each including a frame number and frequency component information of the frame having that frame number, and difference frames (D frames), each including a frame number, the frame number of a frame similar to the frame having that frame number, and frequency component difference data relative to the similar frame, are mixed, wherein
the G frame and the D frame each further include a normalization coefficient, the G frame includes, as the frequency component information, frequency component data normalized based on the normalization coefficient included in the G frame, and the D frame includes, as the frequency component difference data, difference frequency data normalized based on the normalization coefficient included in the D frame, the device comprising:
a decoding unit that, when the current frame is a G frame, decodes the frequency component information to restore the frequency component data and, when the current frame is a D frame, decodes the frequency component difference data to obtain the difference frequency component of the current frame;
an addition processing unit that, when the current frame is a D frame, restores the frequency component of the current frame from the decoded difference frequency component of the current frame and the frequency component of the similar frame; and
an inverse transform unit that inversely transforms the frequency component restored by the decoding unit or the addition processing unit to generate an audio signal of the current frame.

The audio signal decoding device according to claim 14,
further comprising a denormalization unit that denormalizes the restored frequency component based on the normalization coefficient included in the G frame or the D frame.

A program for causing a computer to execute an audio signal encoding process, the program including the following steps (A) to (D):
(A) normalizing the audio signal converted to the frequency domain, based on the peak value of the frame of the audio signal converted to the frequency domain;
(B) searching for a preceding frame similar to the encoding target frame, from the beginning of the normalized audio signal for one song to immediately before the encoding target frame;
(C) when a similar frame is found, encoding the frame number of the encoding target frame, information indicating the frame determined to be similar to the encoding target frame of that frame number, and frequency component difference data between the encoding target frame and the similar frame; and
(D) when no similar frame is found, encoding the frame number of the encoding target frame and the frequency component of the encoding target frame.

The program according to claim 16, further comprising a step of storing a normalization coefficient used for the normalization.

The program according to claim 16, wherein, in the step (B),
it is determined that a similar frame exists when a preceding frame is found for which the sum of squares of the difference between the frequency component of the preceding frame and the frequency component of the encoding target frame is equal to or less than a predetermined threshold.

The program according to claim 17, wherein, in the step (B),
it is determined that a similar frame exists when a preceding frame is found for which the sum of squares of the difference between its normalized frequency component and the normalized frequency component of the encoding target frame is equal to or less than a predetermined threshold.

The program according to any one of claims 16 to 19, wherein, in the step (C),
the preceding frame having the frequency component most similar to the frequency component of the encoding target frame is set as the similar frame.

A program for causing a computer to execute an audio signal decoding process,
the program decoding an audio signal from a bit stream in which general frames (G frames), each including a frame number and frequency component information of the frame having that frame number, and difference frames (D frames), each including a frame number, the frame number of a frame similar to the frame having that frame number, and frequency component difference data relative to the similar frame, are mixed, wherein
the G frame and the D frame each further include a normalization coefficient, the G frame includes, as the frequency component information, frequency component data normalized based on the normalization coefficient included in the G frame, and the D frame includes, as the frequency component difference data, difference frequency data normalized based on the normalization coefficient included in the D frame, the program including the following steps (A) to (C):
(A) when the current frame is a G frame, decoding the frequency component information to restore the frequency component data;
(B) when the current frame is a D frame,
decoding the frequency component difference data to obtain the difference frequency component of the current frame, and
restoring the frequency component of the current frame from the decoded difference frequency component of the current frame and the frequency component of the similar frame; and
(C) inversely transforming the frequency component restored in step (A) or (B) to generate an audio signal of the current frame.

The program according to claim 21,
further including the step of denormalizing the restored frequency component based on the normalization coefficient included in the G frame or the D frame, and supplying the denormalized frequency component to the step (C).

The program according to claim 21, wherein
the step (A) restores the frequency component data by denormalizing the frequency component data based on the normalization coefficient included in the G frame, and
the step (B) denormalizes the frequency component difference data based on the normalization coefficient included in the D frame, and restores the frequency component based on the denormalized frequency component difference data.