JP2001184090A

JP2001184090A - Signal encoding device and signal decoding device, and computer-readable recording medium with recorded signal encoding program and computer-readable recording medium with recorded signal decoding program

Info

Publication number: JP2001184090A
Application number: JP36957199A
Authority: JP
Inventors: Toshikazu Fujinaga; 利和藤長
Original assignee: FUJI TECHNO ENTERPRISE KK
Current assignee: FUJI TECHNO ENTERPRISE KK
Priority date: 1999-12-27
Filing date: 1999-12-27
Publication date: 2001-07-06

Abstract

PROBLEM TO BE SOLVED: Conventional encoding and decoding devices for acoustic signals and the like have difficulty satisfying the requirements of compression ratio, quality, and amount of computation all at a high level. SOLUTION: A high compression ratio and high quality are obtained with a small amount of computation by determining the number of bits allocated to each frequency component according to the spectrum envelope of the frequency spectrum of the target signal, and by encoding the spectrum envelope together with the frequency spectrum quantized according to the allocated bit numbers.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to devices for encoding and decoding a target signal that varies in time or space, such as an acoustic signal (including speech signals and musical-tone signals) or a video signal, and to computer-readable recording media on which programs for performing such encoding and decoding are recorded.

[0002]

[Prior art] With the rapid spread of the Internet in recent years, the communication lines available to general consumers have been diversifying, and some services capable of providing a bandwidth of several Mbps have begun to appear. At present, however, only a few users subscribe to such relatively high-speed services, and the bandwidth available to most users is at most about 128 kbps. This bandwidth can hardly be called sufficient for transmitting so-called multimedia information such as music and video.

The most basic coding scheme for digitizing multimedia information is linear PCM (Pulse Code Modulation). Linear PCM replaces the target signal with codes by quantizing, in equal steps, the magnitude of a signal that has been discretized in time or space. The bit rate of information encoded by linear PCM is sampling frequency × number of quantization bits × number of channels; for example, the bit rate of information equivalent to an audio CD (sampling rate 44.1 kHz, 16 quantization bits, stereo) is about 1411 kbps. The bandwidth available to most users is far smaller than this bit rate: over the roughly 128 kbps link mentioned above, obtaining a musical-tone signal about 4 minutes long takes at least about 44 minutes. Naturally, real-time playback of the musical-tone signal cannot be expected under these conditions.

Multimedia information, however, contains a relatively large amount of redundancy, so it can be encoded with high efficiency (compressed), either reversibly or irreversibly. When a relatively low-speed line such as the one described above is used, high-efficiency coding of the target signal is indispensable in practice. Representative high-efficiency coding schemes for musical-tone signals include those conforming to the decoding method specified in MPEG-1 audio layer 3 (so-called MP3), which is part of the international standard MPEG-1 (Moving Picture Experts Group), and TwinVQ (Transform-domain Weighted INterleave Vector Quantization), developed by NTT Human Interface Laboratories. By exploiting the characteristics of human hearing, among other techniques, these schemes compress musical-tone signals to about 1/10 to 1/20 while limiting the degradation of sound quality. Such schemes have been a major factor in the spread of music distribution services over the Internet and elsewhere, and have also made portable players based on semiconductor memory commercially feasible. A portable player that uses MP3 or TwinVQ can record a little over one hour at CD-equivalent sound quality in about 64 Mbytes of memory, which is sufficiently practical.

High-efficiency coding also conserves limited transmission resources. In the field of mobile radio communication, typified by mobile phones, speech signals have been encoded with ADPCM (Adaptive Differential Pulse Code Modulation) and various CELP (Code Excited Linear Prediction) schemes in view of the limits on radio bandwidth. Encoding a speech signal with linear PCM is considered to require a bandwidth of at least 64 kbps, whereas a speech signal encoded with ADPCM has a bit rate of 32 kbps, and one encoded with the various CELP schemes has a bit rate of about 4 kbps to 16 kbps. Compared with linear PCM, CELP coding therefore lowers the bit rate considerably and saves the required radio-frequency bandwidth. These speech coding schemes are used not only in mobile radio communication but also in VoIP (Voice over IP), which carries voice over IP (Internet Protocol) networks, and low-cost long-distance telephone services over the Internet are anticipated.
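The linear PCM figures quoted in paragraph [0002] can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical, and the 64 Mbyte case assumes 64 × 10^6 bytes and a 1/10 compression ratio, neither of which the text states explicitly.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Linear PCM bit rate in bit/s: sampling frequency x bits x channels."""
    return sample_rate_hz * bits_per_sample * channels


def transfer_minutes(duration_min, stream_bps, link_bps):
    """Minutes needed to move `duration_min` minutes of audio over a link."""
    return duration_min * stream_bps / link_bps


def recording_minutes(memory_bytes, stream_bps):
    """Minutes of audio that fit in `memory_bytes` at `stream_bps`."""
    return memory_bytes * 8 / stream_bps / 60


# CD-equivalent capture: 44.1 kHz, 16 bits, stereo.
cd_bps = pcm_bit_rate(44_100, 16, 2)
print(cd_bps / 1000, "kbit/s")

# A 4-minute track sent uncompressed over a 128 kbps link.
print(transfer_minutes(4, cd_bps, 128_000), "minutes")

# Assumed: 64 * 10**6 bytes of memory and 1/10 compression.
print(recording_minutes(64 * 10**6, cd_bps / 10), "minutes")
```

Under these assumptions the results agree with the text: about 1411 kbps, about 44 minutes of transfer time, and a little over one hour of recording.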

[0003]

[Problem to be solved by the invention] In high-efficiency coding of acoustic signals, including the musical-tone and speech signals mentioned above, the main evaluation criteria are compression ratio, quality, and amount of computation. The compression ratio is, of course, the ratio of the bit rates before and after encoding. Quality is the perceptual evaluation of the encoded signal when it is decoded and played back. The amount of computation is the computation required to encode or decode the signal, and it reflects the complexity of the processing. The high-efficiency coding schemes described above (and their corresponding decoding schemes) meet these criteria with a balance suited to their application, but obtaining reasonable sound quality together with a high compression ratio tends to complicate the processing and increase the amount of computation, so it is difficult to satisfy all three criteria at a high level. The same problem arises not only for signals that vary in time, such as acoustic signals, but also when high-efficiency coding is applied to signals that vary in space, such as the images contained in still or moving pictures.

To solve these problems of the prior art, the present invention improves the signal encoding device, the signal encoding program, the signal decoding device, and the signal decoding program. Its object is to provide a signal encoding device, and a computer-readable recording medium recording a signal encoding program, that can obtain a high compression ratio with a small amount of computation while maintaining quality; and a signal decoding device, and a computer-readable recording medium recording a signal decoding program, that can decode an encoded signal generated by the signal encoding device or the signal encoding program with a small amount of computation and reproduce it with high quality.

[0004]

[Means for solving the problem] To achieve the above object, the invention according to claim 1 is configured as a signal encoding device comprising: frequency spectrum calculation means for obtaining the frequency spectrum of a target signal that varies in time or space; spectrum envelope extraction means for extracting a spectrum envelope from the frequency spectrum of the target signal calculated by the frequency spectrum calculation means; allocated-bit-number determination means for determining, according to the spectrum envelope extracted by the spectrum envelope extraction means, the number of bits to allocate to each frequency component of the frequency spectrum of the target signal; frequency spectrum quantization means for quantizing the frequency spectrum of the target signal based on the allocated bit numbers determined by the allocated-bit-number determination means; and encoded signal generation means for generating an encoded signal in which frequency spectrum information on the frequency spectrum quantized by the frequency spectrum quantization means and spectrum envelope information on the spectrum envelope are encoded.

The encoded signal generated by the signal encoding device of claim 1 contains frequency spectrum information on the frequency spectrum of the target signal that varies in time or space, and spectrum envelope information on the spectrum envelope of the target signal. The frequency spectrum information is the frequency spectrum of the target signal, quantized based on the numbers of bits allocated according to the spectrum envelope and then encoded. Its bit rate is far lower than when the target signal is encoded by linear PCM, and is equal to or lower than that of high-efficiency coding schemes that use the spectrum envelope, such as MP3 and TwinVQ. In addition, because the frequency spectrum is encoded almost as it is, degradation of the quality of the target signal can be limited. Moreover, the transform from the time domain to the frequency domain needs to be performed only once for encoding, so the amount of computation is small and encoding can be completed in a short time.
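The encoder structure of claim 1 (spectrum, envelope, bit allocation, quantization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: the band-peak envelope, the log2 allocation rule, the `floor` and `max_bits` parameters, and all function names are hypothetical, and a naive DFT stands in for whatever transform an implementation would actually use.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real frame; returns bins 0 .. N//2 as complex numbers."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
            for k in range(n // 2 + 1)]


def band_envelope(spectrum, band_size):
    """Assumed envelope: peak magnitude within each band of `band_size` bins."""
    return [max(abs(c) for c in spectrum[i:i + band_size])
            for i in range(0, len(spectrum), band_size)]


def allocate_bits(envelope, floor, max_bits):
    """Assumed rule: more bits where the envelope is larger; none at or below `floor`."""
    bits = []
    for e in envelope:
        if e <= floor:
            bits.append(0)
        else:
            bits.append(min(max_bits, 1 + math.ceil(math.log2(e / floor))))
    return bits


def quantize(value, scale, bits):
    """Uniform signed quantizer for a value in [-scale, scale]."""
    if bits < 2:
        return 0
    levels = (1 << (bits - 1)) - 1      # e.g. 5 bits -> codes -15 .. 15
    return round(value / scale * levels)


def encode_frame(frame, band_size=4, floor=1.0, max_bits=8):
    """Return (envelope, bits, codes) for one frame of samples."""
    spectrum = dft(frame)                            # frequency spectrum
    envelope = band_envelope(spectrum, band_size)    # spectrum envelope
    bits = allocate_bits(envelope, floor, max_bits)  # bits per band
    codes = []
    for k, c in enumerate(spectrum):
        b = k // band_size
        codes.append((quantize(c.real, envelope[b], bits[b]),
                      quantize(c.imag, envelope[b], bits[b])))
    return envelope, bits, codes


# A 32-sample tone centred on bin 3 with amplitude 0.75.
frame = [0.75 * math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
envelope, bits, codes = encode_frame(frame)
print(bits)
print(codes[3])
```

In this sketch only the band containing the tone receives bits, so the empty bands cost nothing in the encoded signal. The forward transform is computed once per frame, which is the property the paragraph above relies on for its low computation claim.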

[0005]

The invention according to claim 2 is a signal decoding device comprising: spectrum envelope restoration means for acquiring the spectrum envelope information from an encoded signal in which frequency spectrum information on the frequency spectrum of a target signal that varies in time or space and spectrum envelope information on the spectrum envelope of the target signal are encoded, and for restoring, based on the acquired spectrum envelope information, the spectrum envelope for each frequency component of the frequency spectrum of the target signal; allocated-bit-number acquisition means for acquiring, from the spectrum envelope restored by the spectrum envelope restoration means, the number of bits allocated to each frequency component of the frequency spectrum of the target signal; frequency spectrum decoding means for acquiring the frequency spectrum information from the encoded signal and decoding the frequency spectrum of the target signal from the acquired frequency spectrum information, based on the allocated bit numbers acquired by the allocated-bit-number acquisition means; and signal restoration means for restoring the target signal based on the frequency spectrum of the target signal decoded by the frequency spectrum decoding means.

The invention of claim 2 provides a signal decoding device suited to decoding the encoded signal generated by the signal encoding device of claim 1 and restoring the target signal that varies in time or space. In the signal decoding device of claim 2, the main operations performed for decoding are only a single inverse transform from the frequency domain to the time domain and the relatively simple operations needed to obtain the allocated bit numbers. The target signal can therefore be decoded in a short time, and real-time playback also becomes easy.
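The decoder of claim 2 derives the bit allocation from the restored envelope instead of reading it from the encoded signal. The sketch below illustrates that idea under stated assumptions: the allocation rule, the band layout, the `floor` and `max_bits` parameters, and the function names are all hypothetical, and a naive inverse DFT is used purely for self-containment.

```python
import cmath
import math


def bits_from_envelope(envelope, floor, max_bits):
    """Re-derive the per-band bit allocation from the envelope alone.

    Assumed rule: it must match whatever rule the encoder used.
    """
    bits = []
    for e in envelope:
        if e <= floor:
            bits.append(0)
        else:
            bits.append(min(max_bits, 1 + math.ceil(math.log2(e / floor))))
    return bits


def dequantize(code, scale, bits):
    """Inverse of a uniform signed quantizer over [-scale, scale]."""
    if bits < 2:
        return 0.0
    levels = (1 << (bits - 1)) - 1
    return code / levels * scale


def inverse_dft(half_spectrum, n):
    """Rebuild n real samples from bins 0 .. n//2 using conjugate symmetry."""
    full = list(half_spectrum) + [half_spectrum[n - k].conjugate()
                                  for k in range(n // 2 + 1, n)]
    return [sum(c * cmath.exp(2j * math.pi * k * t / n)
                for k, c in enumerate(full)).real / n
            for t in range(n)]


def decode_frame(envelope, codes, n, band_size=4, floor=1.0, max_bits=8):
    """Envelope + quantized codes -> n time-domain samples."""
    bits = bits_from_envelope(envelope, floor, max_bits)
    spectrum = []
    for k, (re_code, im_code) in enumerate(codes):
        b = k // band_size
        spectrum.append(complex(dequantize(re_code, envelope[b], bits[b]),
                                dequantize(im_code, envelope[b], bits[b])))
    return inverse_dft(spectrum, n)   # the single inverse transform


# Hand-built input: one active band whose only non-zero code sits at bin 3.
envelope = [12.0, 0.0, 0.0, 0.0, 0.0]
codes = [(0, 0)] * 17
codes[3] = (0, -15)
samples = decode_frame(envelope, codes, 32)
print(samples[:4])
```

With this input the decoder reconstructs a sine of amplitude 0.75 at bin 3. The only costly operation is the one inverse transform; obtaining the bit allocation is a short loop over the envelope, which matches the low decoding cost described above.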

[0006]

The invention according to claim 3 is a computer-readable recording medium recording a signal encoding program for causing a computer to execute: a procedure of obtaining the frequency spectrum of a target signal that varies in space or time; a procedure of extracting a spectrum envelope from the frequency spectrum of the target signal; a procedure of determining, according to the spectrum envelope, the number of bits to allocate to each frequency component of the frequency spectrum of the target signal; a procedure of quantizing the frequency spectrum of the target signal based on the allocated bit numbers; and a procedure of generating an encoded signal in which frequency spectrum information on the frequency spectrum of the target signal and spectrum envelope information defining the spectrum envelope are encoded.

The computer-readable recording medium of claim 3 provides a signal encoding program suited to realizing the signal encoding device of claim 1 on a computer. As with the signal encoding device of claim 1, the encoded signal that a computer generates in accordance with the signal encoding program of claim 3 contains frequency spectrum information on the frequency spectrum of the target signal that varies in time or space, and spectrum envelope information on the spectrum envelope of the target signal. The frequency spectrum information is the frequency spectrum of the target signal, quantized based on the numbers of bits allocated according to the spectrum envelope and then encoded. The amount of information required to transmit or record it is far smaller than when the target signal is encoded by linear PCM, and is equal to or smaller than that of high-efficiency coding schemes that use the spectrum envelope, such as MP3 and TwinVQ. In addition, because the frequency spectrum is encoded almost as it is, degradation of the target signal can be limited. Moreover, operations such as the FFT that accompany spectrum analysis need to be performed only once for encoding, so the computer's computational load is small and encoding can be completed in a short time.

[0007]

The invention according to claim 4 is a computer-readable recording medium recording a signal decoding program for causing a computer to execute: a procedure of acquiring the spectrum envelope information from an encoded signal in which frequency spectrum information on the frequency spectrum of a target signal that varies in time or space and spectrum envelope information on the spectrum envelope of the target signal are encoded, and restoring, based on the spectrum envelope information, the spectrum envelope for each frequency component of the frequency spectrum of the target signal; a procedure of acquiring, from the spectrum envelope, the number of bits allocated to each frequency component of the frequency spectrum of the target signal; a procedure of acquiring the frequency spectrum information from the encoded signal and decoding the frequency spectrum of the target signal from the frequency spectrum information based on the allocated bit numbers; and a procedure of restoring the target signal based on the frequency spectrum of the target signal.

The computer-readable recording medium of claim 4 provides a signal decoding program suited to decoding an encoded signal generated by the signal encoding device of claim 1, or generated by a computer in accordance with the signal encoding program of claim 3, and restoring the target signal that varies in time or space. In the signal decoding program of claim 4 as well, the inverse transform from the frequency domain to the time domain needs to be performed only once for decoding. The total amount of computation performed by the computer is therefore small, the target signal can be decoded in a short time, and real-time playback is possible even on a relatively slow computer.

[0008]

[Embodiments of the invention] Embodiments of the present invention are described below with reference to the accompanying drawings, to aid understanding of the invention. The following embodiments are specific examples of the present invention and do not limit its technical scope.

As shown in FIG. 1, the signal encoding device 1 according to the embodiment of the present invention comprises: a frequency spectrum calculation unit 103 that obtains the frequency spectrum of an acoustic signal (an example of the target signal) that varies in time and contains a speech signal, a musical-tone signal, or both; a spectrum envelope extraction unit 104 that extracts a spectrum envelope from the frequency spectrum of the acoustic signal calculated by the frequency spectrum calculation unit 103; an allocated-bit-number determination unit 105 that determines, according to the spectrum envelope extracted by the spectrum envelope extraction unit 104, the number of bits to allocate to each frequency component of the frequency spectrum of the acoustic signal; a frequency spectrum quantization unit 106 that quantizes the frequency spectrum of the acoustic signal based on the allocated bit numbers determined by the allocated-bit-number determination unit 105; and an encoded signal generation unit 107 that generates an encoded signal in which frequency spectrum information on the frequency spectrum quantized by the frequency spectrum quantization unit 106 and spectrum envelope information on the spectrum envelope are encoded.

As shown in FIG. 2, the signal decoding device 2 according to the embodiment of the present invention comprises: a spectrum envelope restoration unit 203 that acquires the spectrum envelope information from an encoded signal in which frequency spectrum information on the frequency spectrum of the time-varying acoustic signal and spectrum envelope information on the spectrum envelope of the acoustic signal are encoded, and that restores, based on the acquired spectrum envelope information, the spectrum envelope for each frequency component of the frequency spectrum of the acoustic signal; an allocated-bit-number acquisition unit 204 that acquires, from the spectrum envelope restored by the spectrum envelope restoration unit 203, the number of bits allocated to each frequency component of the frequency spectrum of the acoustic signal; a frequency spectrum decoding unit 205 that acquires the frequency spectrum information from the encoded signal and decodes the frequency spectrum of the acoustic signal from it, based on the allocated bit numbers acquired by the allocated-bit-number acquisition unit 204; and a signal restoration unit 206 that restores the acoustic signal based on the frequency spectrum decoded by the frequency spectrum decoding unit 205.

[0009]

The signal encoding program according to the embodiment of the present invention causes a computer to execute, as shown in FIG. 3: step S101 of obtaining the frequency spectrum of the acoustic signal; step S102 of extracting a spectrum envelope from the frequency spectrum of the acoustic signal; step S103 of determining, according to the spectrum envelope, the number of bits to allocate to each frequency component of the frequency spectrum of the acoustic signal; step S104 of quantizing the frequency spectrum of the acoustic signal based on the allocated bit numbers; and step S105 of generating an encoded signal in which frequency spectrum information on the frequency spectrum of the acoustic signal and spectrum envelope information defining the spectrum envelope are encoded. A CD-ROM, for example, on which this signal encoding program is recorded is a specific example of the computer-readable recording medium recording the signal encoding program according to the embodiment of the present invention.

The signal decoding program according to the embodiment of the present invention causes a computer to execute, as shown in FIG. 4: step S201 of acquiring the spectrum envelope information from an encoded signal in which frequency spectrum information on the frequency spectrum of the time-varying acoustic signal and spectrum envelope information on the spectrum envelope of the acoustic signal are encoded, and restoring, based on the acquired spectrum envelope information, the spectrum envelope for each frequency component of the frequency spectrum of the acoustic signal; step S202 of acquiring, from the spectrum envelope, the number of bits allocated to each frequency component of the frequency spectrum of the acoustic signal; step S203 of acquiring the frequency spectrum information from the encoded signal and decoding the frequency spectrum of the acoustic signal from the frequency spectrum information based on the allocated bit numbers; and step S204 of restoring the acoustic signal based on the frequency spectrum of the acoustic signal. A CD-ROM, for example, on which this signal decoding program is recorded is a specific example of the computer-readable recording medium recording the signal decoding program according to the embodiment of the present invention.

[0010]

The signal encoding device 1 and the signal decoding device 2 are each embodied, for example, as a computer made capable of executing the signal encoding program and the signal decoding program, respectively. As shown in FIG. 5, the computer has a standard configuration comprising an input device 501 such as a keyboard and mouse, an output device 502 such as a display, a CD-ROM drive 503, a hard disk drive 504, a RAM 505, an arithmetic unit 506, a sound input/output board 507 to which a speaker 5071 and a microphone 5072 are connected, and a communication device 508 such as a LAN board or modem.

When a CD-ROM on which one or both of the signal encoding program and the signal decoding program are recorded in compressed form is inserted into the CD-ROM drive 503, and the installation program on the CD-ROM is executed in accordance with an instruction given by the user through the input device 501, the program or programs read from the CD-ROM are expanded into an executable state on a storage medium such as the hard disk drive 504. When the user issues, via the input device 501, an instruction to execute one or both of the signal encoding program and the signal decoding program, part or all of the program or programs expanded on the hard disk drive 504 or the like is read from the RAM 505, the hard disk drive 504, or the like, and the arithmetic unit 506 executes the program or programs.

【００１１】以下，前記信号符号化装置１，及び前記信
号復号化装置２が，同一のコンピュータによって実現さ
れた場合を例にして，前記信号符号化装置１，前記信号
復号化装置２，前記信号符号化プログラム，及び前記信
号復号化プログラムの詳細を説明する。前記信号符号化
プログラムが実行され，前記コンピュータが前記信号符
号化装置１として動作すると，はじめに例えば符号化対
象となる前記音響信号の指定や，符号化後のビットレー
トの指定など各種設定が可能なダイアログが前記出力装
置５０２に表示される。前記符号化後のビットレートの
指定は，前記信号符号化装置１による符号化処理の際に
利用可能なビット数の上限を定めるのに用いられる。ま
た，符号化対象となる前記音響信号は，そのときにマイ
クから入力するものや，既に何らかの符号化形式により
ディジタル化され，前記ハードディスクドライブ５０４
上に格納されているものなどである。前記音響信号をマ
イクから入力することが使用者により選択された場合に
は，前記出力装置５０２に，取り込み時の量子化ビット
数や，サンプリングレート，ステレオ・モノラルの区別
などの情報を使用者が指定するためのダイアログが表示
される。使用者が，前記入力装置５０１を用いて前記取
り込み時に必要な情報を与え，更に取り込みの指示を与
えると，マイクから入力される前記音響信号に対して，
符号化処理が開始される。前記信号符号化装置１におい
て，マイクなどから入力端子１０１を介して入力された
前記音響信号は，Ａ／Ｄ変換器１０２を介して前記周波
数スペクトル演算部１０３に供給される。前記信号符号
化装置１が前記コンピュータによって実現されるこの例
では，前記入力端子１０１に接続されるマイクとして，
例えば前記音響入出力用ボード５０７に接続されたマイ
ク５０７２が，前記Ａ／Ｄ変換器１０２には，前記音響
入出力用ボード５０７に実装された図示しないＡ／Ｄ変
換器が用いられる。前記音響入出力用ボード５０７に実
装されたＡ／Ｄ変換器から出力された信号は，実際に
は，前記ハードディスクドライブ５０４上などに一時的
に格納される。前記Ａ／Ｄ変換器１０２から出力され前
記ハードディスクドライブ５０４上などに一時的に格納
されたディジタル化後の前記音響信号は，前記線形PCM
方式により符号化したものに対応する。ＣＤ相当の取り
込み条件であれば，既述の通り，そのビットレートは，
約１４１１ｋｂｐｓである。前記ディジタル化後の前記音
響信号は，前記ＲＡＭ５０５などに格納された前記信号
符号化プログラムに従って，前記周波数スペクトル演算
部１０３として動作する前記演算装置５０６によって読
み出される。Hereinafter, the details of the signal encoding device 1, the signal decoding device 2, the signal encoding program, and the signal decoding program will be described, taking as an example the case where the signal encoding device 1 and the signal decoding device 2 are realized by the same computer. When the signal encoding program is executed and the computer operates as the signal encoding device 1, a dialog is first displayed on the output device 502 in which various settings can be made, such as designating the acoustic signal to be encoded and the bit rate after encoding. The designated bit rate after encoding is used to determine the upper limit of the number of bits available to the encoding process of the signal encoding device 1. The acoustic signal to be encoded is, for example, one that is input from a microphone at that time, or one that has already been digitized in some encoding format and stored on the hard disk drive 504. When the user chooses to input the acoustic signal from the microphone, a dialog is displayed on the output device 502 in which the user specifies information such as the number of quantization bits at capture, the sampling rate, and stereo or monaural. When the user supplies the information required for capture using the input device 501 and then gives the capture instruction, the encoding process is started on the acoustic signal input from the microphone. In the signal encoding device 1, the acoustic signal input from a microphone or the like via an input terminal 101 is supplied to the frequency spectrum calculation unit 103 via an A/D converter 102. In this example, in which the signal encoding device 1 is realized by the computer, the microphone 5072 connected to the sound input/output board 507 serves as the microphone connected to the input terminal 101, and an A/D converter (not shown) mounted on the sound input/output board 507 serves as the A/D converter 102. The signal output from the A/D converter mounted on the sound input/output board 507 is, in practice, temporarily stored on the hard disk drive 504 or the like. The digitized acoustic signal output from the A/D converter 102 and temporarily stored on the hard disk drive 504 or the like corresponds to a signal encoded by the linear PCM method. Under CD-equivalent capture conditions, its bit rate is, as already mentioned, about 1411 kbps. The digitized acoustic signal is read out by the arithmetic unit 506, which operates as the frequency spectrum calculation unit 103 in accordance with the signal encoding program stored in the RAM 505 or the like.
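
As a quick check of the CD-equivalent figure, the bit rate of linear PCM is the product of the sampling rate, the number of bits per sample, and the number of channels. The sketch below assumes the usual CD parameters (44.1 kHz, 16 bits, stereo), which this passage does not spell out; the function name is illustrative only.

```python
def pcm_bit_rate(sampling_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the linear PCM bit rate in bits per second."""
    return sampling_rate_hz * bits_per_sample * channels


# Assumed CD-equivalent capture conditions: 44.1 kHz, 16 bits, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_rate)         # 1411200 bits per second
print(cd_rate / 1000)  # 1411.2 kbps
```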

【００１２】前記周波数スペクトル演算部１０３は，入
力された前記音響信号についてＦＦＴ(Fast Fourier Tr
ansform)演算を行い，前記音響信号の周波数スペクトル
を求める。前記周波数スペクトル演算部１０３に供給さ
れる前記音響信号は，予め時系列に分割されたものであ
る。分割の際には各フレーム間で一部が重複するように
処理される。時系列に分割された前記音響信号のフレー
ムの幅は，２０ｍｓ〜６０ｍｓ程度のフレーム幅を採用
することが多い他の符号化方式よりも長めの１００ｍｓ
程度が適当である。これは，本実施の形態に係る符号化
方式が，周波数スペクトルから抽出したスペクトル包絡
に基づいて各周波数成分の割当ビット数を定め，その割
当ビット数に応じて各周波数成分を量子化することによ
り，低ビットレート化を行うため，前記スペクトル包絡
を良好に抽出する必要があるということに起因する。前
記フレーム幅は各フレームに対して一定である必要はな
く，また対象となるフレームの信号に応じて動的に変化
させるようにしてもよい。前記音響信号の各フレームに
は，必要に応じて前記ＦＦＴ演算の前にハミング窓など
の窓関数が掛けられる。前記周波数スペクトル演算部１
０３は，各フレームについて前記ＦＦＴ演算を行い，周
波数領域における前記音響信号のパワー成分と位相成分
とを定める。前記音響信号に対するサンプリングレート
が４４．１ｋＨｚである場合，前記ＦＦＴ演算の際のサ
ンプル点数は，４０９６が適当である。前記フレーム幅
に対する設定や，窓関数の有無，窓関数の種類などの情
報は，前記出力装置５０２にダイアログを表示させるな
どして，使用者が適宜変更し得るようにしてもよい。
尚，前記周波数スペクトル演算部１０３が行う上述の通
りの処理が，前記信号符号化プログラムが前記コンピュ
ータに実行させる手順Ｓ１０１に対応する処理である。
前記周波数スペクトル演算部１０３により定められた前
記音響信号の周波数スペクトルのパワー成分と位相成分
は，前記ＲＡＭ５０５や前記演算装置５０６などの一次
キャッシュなどに一時的に保持される。前記演算装置５
０６は，前記信号符号化プログラムに従って，次にスペ
クトル包絡抽出部１０４として動作する。前記スペクト
ル包絡抽出部１０４は，前記音響信号の周波数スペクト
ルのパワー成分から，スペクトル包絡を抽出する。ＬＰ
Ｃ(Linear prediction code)などを用いることも可能で
あるが，本実施の形態に係る信号符号化装置（及び信号
符号化プログラム）の場合，スペクトル包絡を定めるの
に必要な情報量ができるだけ少なく，また周波数−強度
の２次元空間で前記スペクトル包絡が占める面積ができ
るだけ小さくなることが重要である。[0012] The frequency spectrum calculation unit 103 performs an FFT (Fast Fourier Transform) operation on the input acoustic signal to obtain the frequency spectrum of the acoustic signal. The acoustic signal supplied to the frequency spectrum calculation unit 103 has been divided into frames in time series beforehand, in such a way that adjacent frames partially overlap. A frame width of about 100 ms is appropriate, which is longer than in other encoding methods, which often use frame widths of about 20 ms to 60 ms. This is because the encoding method according to the present embodiment lowers the bit rate by determining the number of bits allocated to each frequency component from the spectrum envelope extracted from the frequency spectrum and quantizing each frequency component according to that number of bits, so the spectrum envelope must be extracted well. The frame width need not be the same for every frame, and may be changed dynamically according to the signal in the frame concerned. Where necessary, each frame of the acoustic signal is multiplied by a window function such as a Hamming window before the FFT operation. The frequency spectrum calculation unit 103 performs the FFT operation on each frame and determines the power component and the phase component of the acoustic signal in the frequency domain. When the sampling rate of the acoustic signal is 44.1 kHz, 4096 is an appropriate number of sample points for the FFT operation. Settings such as the frame width, whether a window function is applied, and the type of window function may be made changeable by the user, for example by displaying a dialog on the output device 502. The processing performed by the frequency spectrum calculation unit 103 as described above corresponds to step S101 that the signal encoding program causes the computer to execute. The power component and the phase component of the frequency spectrum of the acoustic signal determined by the frequency spectrum calculation unit 103 are temporarily held in the RAM 505, the primary cache of the arithmetic unit 506, or the like. The arithmetic unit 506 then operates as the spectrum envelope extraction unit 104 in accordance with the signal encoding program. The spectrum envelope extraction unit 104 extracts a spectrum envelope from the power component of the frequency spectrum of the acoustic signal. LPC (linear predictive coding) or the like could also be used, but for the signal encoding device (and signal encoding program) according to the present embodiment it is important that the amount of information needed to define the spectrum envelope be as small as possible and that the area occupied by the spectrum envelope in the two-dimensional frequency-intensity space be as small as possible.
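
The framing, windowing, and FFT of step S101 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation: the 50 % overlap, the recursive radix-2 FFT, the test tone, and all function names are assumptions made for the example.

```python
import cmath
import math


def hamming(n_points):
    """Hamming window coefficients for n_points samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n_points - 1))
            for i in range(n_points)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frames(signal, size, hop):
    """Split signal into overlapping frames of `size` samples, `hop` apart."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]


def power_and_phase(frame, window):
    """Window one frame, transform it, and keep the non-negative frequencies."""
    spectrum = fft([s * w for s, w in zip(frame, window)])
    half = spectrum[:len(frame) // 2 + 1]
    power = [abs(c) ** 2 for c in half]
    phase = [cmath.phase(c) for c in half]
    return power, phase


SAMPLE_RATE = 44_100
FRAME_SIZE = 4096        # about 93 ms at 44.1 kHz
HOP = FRAME_SIZE // 2    # assumed 50 % overlap between frames

# A 1 kHz test tone, two frames long.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
        for t in range(2 * FRAME_SIZE)]
power, phase = power_and_phase(frames(tone, FRAME_SIZE, HOP)[0],
                               hamming(FRAME_SIZE))
peak_bin = max(range(len(power)), key=lambda k: power[k])
print(peak_bin * SAMPLE_RATE / FRAME_SIZE)  # frequency of the strongest bin, close to 1000 Hz
```

With 4096 points at 44.1 kHz, each frame lasts about 93 ms, which is in line with the frame width of about 100 ms recommended in the text.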

【００１３】本実施の形態に係る信号符号化装置（及び
信号符号化プログラム）に，より適したスペクトル包絡
抽出の具体的な演算手順を図６に示す。図６に示す通
り，まず，前記パワー成分の周波数領域を複数の周波数
帯に分割し，各周波数帯における前記パワー成分の最大
値を抽出する（Ｓ１００１）。前記パワー成分の周波数
領域の分割数は，圧縮率に関係するので，予め使用者な
どにより指定されたビットレートに従って定められる。
次に，前記各周波数帯における前記パワー成分の最大値
を結んだ仮のスペクトル包絡を設定する（Ｓ１００
２）。次に，前記パワー成分のうち前記仮のスペクトル
包絡よりも著しく大きい周波数成分があるか否かを，全
周波数領域について探索する（Ｓ１００３）。前記仮の
スペクトル包絡より当該周波数成分の大きさが著しく大
きいか否かは，例えば予め設定したしきい値よりその差
が大きいか否かに基づいて判別すればよい。次に，前記
仮のスペクトル包絡よりも著しく大きいと判断された周
波数成分が存在する場合には，当該周波数成分と前記各
周波数帯における前記パワー成分の最大値を，前記仮の
スペクトル包絡よりも著しく大きいと判断された周波数
成分が存在しない場合には，前記各周波数帯における前
記パワー成分の最大値のみを，抽出しようとするスペク
トル包絡のピーク点として設定する（Ｓ１００４）。
尚，スペクトル包絡を得るための各周波数帯の大きさは
同じである必要はなく，周波数に応じて異ならせてもよ
い。前記ピーク点の点数は，低中周波数領域に多い方が
人間の聴覚特性上好ましい。高周波数領域の音はさほど
精密に再現しなくても，聴感上問題は少ない。前記スペ
クトル包絡のピーク点が設定されたら，次に，隣り合う
ピーク点を支点として糸を垂らす要領で，両ピーク点間
にある周波数スペクトルの谷の部分を表現する。谷の部
分を表現するために使用する関数を，以下，谷関数とい
う。前記谷関数の具体例としては，正弦波関数Ｌ＝Ａ×
ｓｉｎ（ω×ｎ）や，正弦波の平方根関数Ｌ＝Ａ×√
（ｓｉｎ（ω×ｎ）），正弦波のべき乗関数Ｌ＝Ａ×ｓ
ｉｎ^c（ω×ｎ），放物線関数Ｌ＝Ａ×ｎ²＋ｂ，高次
関数，懸垂曲線Ｌ＝Ａ×ｃｏｓｈ（ω×ｎ）などが挙げ
られる。但し，ｎは両ピーク点間にある（周波数軸上
の）各サンプルを一方のピーク点から数え始めたときの
サンプル数であり，Ｌは両ピーク点間にある各サンプル
の大きさである。例えば前記正弦波関数を前記谷関数と
して用いる場合，両ピーク点間のスペクトル包絡を，次
式（１）に従って表す。Ｌ＝ａ×ｎ−Ａ×ｓｉｎ（ω×ｎ）＋Ｃ（１）図７に示す通り，上式（１）におけるＮは一方のピーク
点Ｐ１から数え始めたときの他方のピーク点Ｐ２までの
サンプル数であって，０≦ｎ≦Ｎ，ω＝π／Ｎである。
また，ａは両ピーク点Ｐ１，Ｐ２間の前記仮のスペクト
ル包絡Ｌ′の傾き，Ｃは一方のピーク点Ｐ１の大きさで
ある。尚，前記ピーク点Ｐの大きさが，両ピーク点Ｐ間
にあるサンプル数と較べてかなり大きい場合，即ち前記
周波数スペクトルの谷部分が深くなる場合には，前記正
弦波関数よりも，正弦波のべき乗関数を用いる方がよ
い。上式（１）に含まれる前記谷関数の係数Ａを変化さ
せ，両ピーク点Ｐ１，Ｐ２間のスペクトル包絡を定め
る。例えば両ピーク点Ｐ１，Ｐ２間にあるサンプルのう
ち，前記仮のスペクトル包絡Ｌ′よりも小さい，いずれ
かのサンプルＥ１の大きさと前記スペクトル包絡の値Ｌ
とが一致したときの係数Ａを用いた上式（１）によって
表されるスペクトル包絡を，抽出しようとするスペクト
ル包絡の一部（両ピーク点間のスペクトル包絡）として
定める（Ｓ１００５）。前記手順Ｓ１００５を，全ての
ピーク点間に対して繰り返すことによって（Ｓ１００
６），例えば図８に示すような各周波数成分に対するス
ペクトル包絡が得られる。図８に示す通り，上述の手順
に従って抽出したスペクトル包絡は，各周波数成分の大
きさよりも所定値以上小さくなることはなく，また周波
数−強度２次元空間で前記スペクトル包絡が占める面積
が小さい。FIG. 6 shows a specific calculation procedure for spectrum envelope extraction that is better suited to the signal encoding device (and signal encoding program) according to the present embodiment. As shown in FIG. 6, the frequency domain of the power component is first divided into a plurality of frequency bands, and the maximum value of the power component in each frequency band is extracted (S1001). Because the number of bands into which the frequency domain is divided affects the compression ratio, it is determined according to the bit rate specified in advance by the user or the like. Next, a provisional spectrum envelope connecting the maximum values of the power component in the respective frequency bands is set (S1002). Next, the entire frequency domain is searched for frequency components of the power component that are significantly larger than the provisional spectrum envelope (S1003). Whether a frequency component is significantly larger than the provisional spectrum envelope may be judged, for example, by whether the difference exceeds a preset threshold. Next, the peak points of the spectrum envelope to be extracted are set (S1004): if there are frequency components judged to be significantly larger than the provisional spectrum envelope, those frequency components together with the maximum values of the power component in the respective frequency bands become the peak points; if there are none, only the maximum values of the power component in the respective frequency bands become the peak points. The frequency bands used to obtain the spectrum envelope need not all have the same width, and the width may vary with frequency. In view of human auditory characteristics, it is preferable to have more peak points in the low and middle frequency regions. Sound in the high frequency region causes few audible problems even if it is not reproduced very precisely. Once the peak points of the spectrum envelope have been set, the valley of the frequency spectrum between each pair of adjacent peak points is represented as if a thread were hung between the two peak points as supports. The function used to represent the valley is hereinafter referred to as a valley function. Specific examples of the valley function include a sine function L = A × sin(ω × n), the square root of a sine L = A × √(sin(ω × n)), a power of a sine L = A × sin^c(ω × n), a parabola L = A × n² + b, higher-order functions, and a catenary L = A × cosh(ω × n). Here, n is the index (on the frequency axis) of each sample between the two peak points, counted from one of the peak points, and L is the magnitude at each sample between the two peak points. For example, when the sine function is used as the valley function, the spectrum envelope between the two peak points is expressed by the following equation (1). L = a × n − A × sin(ω × n) + C (1) As shown in FIG. 7, N is the number of samples from one peak point P1 to the other peak point P2, and in equation (1) 0 ≤ n ≤ N and ω = π / N. In addition, a is the slope of the provisional spectrum envelope L′ between the peak points P1 and P2, and C is the magnitude of the peak point P1. When the magnitude of the peak points P is considerably large compared with the number of samples between the two peak points P, that is, when the valley of the frequency spectrum is deep, it is better to use a power of a sine than the sine function. The spectrum envelope between the peak points P1 and P2 is determined by varying the coefficient A of the valley function in equation (1). For example, among the samples between the peak points P1 and P2 that are smaller than the provisional spectrum envelope L′, the coefficient A at which the envelope value L coincides with the magnitude of one such sample E1 is taken, and the spectrum envelope given by equation (1) with that coefficient A is adopted as the part of the spectrum envelope to be extracted that lies between the two peak points (S1005). By repeating step S1005 for every pair of adjacent peak points (S1006), a spectrum envelope over the frequency components such as that shown in FIG. 8 is obtained. As shown in FIG. 8, the spectrum envelope extracted by the above procedure never falls below the magnitude of any frequency component by more than a predetermined value, and the area it occupies in the two-dimensional frequency-intensity space is small.
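
The following sketch illustrates steps S1001 and S1005 with the sine valley function of equation (1). It rests on several assumptions that the text leaves open: the bands have equal width, the search for outlying components (S1003, S1004) is omitted, and the coefficient A is chosen as the largest value that keeps the envelope on or above every sample between the two peaks, which is only one possible reading of "coincides with a sample E1". All names are hypothetical.

```python
import math


def band_peaks(power, n_bands):
    """S1001: (index, value) of the maximum in each of n_bands equal-width bands."""
    size = len(power) // n_bands
    peaks = []
    for b in range(n_bands):
        lo = b * size
        hi = len(power) if b == n_bands - 1 else lo + size
        idx = max(range(lo, hi), key=lambda i: power[i])
        peaks.append((idx, power[idx]))
    return peaks


def valley_amplitude(power, p1, p2):
    """S1005 (assumed rule): largest A keeping equation (1) on or above all samples."""
    (i1, c), (i2, v2) = p1, p2
    n_total = i2 - i1
    slope = (v2 - c) / n_total          # a: slope of the provisional envelope
    omega = math.pi / n_total
    best = None
    for n in range(1, n_total):
        gap = (slope * n + c) - power[i1 + n]   # provisional envelope minus sample
        a = max(gap, 0.0) / math.sin(omega * n)
        best = a if best is None else min(best, a)
    return 0.0 if best is None else best


def envelope_between(p1, p2, amplitude):
    """Evaluate L = a*n - A*sin(omega*n) + C for n = 0..N."""
    (i1, c), (i2, v2) = p1, p2
    n_total = i2 - i1
    slope = (v2 - c) / n_total
    omega = math.pi / n_total
    return [slope * n - amplitude * math.sin(omega * n) + c
            for n in range(n_total + 1)]


power = [1, 9, 2, 1, 1, 3, 8, 2]
peaks = band_peaks(power, 2)
print(peaks)  # [(1, 9), (6, 8)]
A = valley_amplitude(power, peaks[0], peaks[1])
print(envelope_between(peaks[0], peaks[1], A))
```

Under this rule the envelope touches the sample that limits A and stays above the others, so only the two peak points and the single coefficient A need to be stored for each valley.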

【００１４】ところで，上述の通りの好適なスペクトル
包絡を得るためには，前記ピーク点間が広く，前記谷部
分が深い方がよい。既述した１００ｍｓ程度のフレーム
幅が好適であるという理由は，この点に関係する。２０
ｍｓ程度のフレーム幅を採用した場合でも，好適なスペ
クトル包絡を得るには，１００ｍｓ程度の時と同等の情
報量が必要となる。しかしながら，フレーム毎に復号化
のためのスペクトル包絡の情報が必要となるので，フレ
ーム幅が小さくなればなるほど，スペクトル包絡の情報
量が増加してしまう。一方，フレーム幅を大きくして
も，前記ピーク点の数はさほど増えず，前記谷部分の深
さは大きくなる。これは，例えば有声音声の場合，基本
周波数及びその高調波がピークとして現れるが，フレー
ム幅を大きくすると，ピーク間の間隔が周波数でみたと
きには同じであっても，次数でみたときには広がるから
である。また，楽音の場合も，ランダムに周波数成分が
折り重なっている訳ではなく，１２段階の音程により規
則性があり，楽器の音も基本周波数及びその高調波で成
り立っている場合が多く，ほぼ有声音声の場合と同様で
ある。また，１００ｍｓ程度のフレーム幅では，信号の
定常性が増すという理由もある。尚，前記スペクトル包
絡抽出部１０４が行う上述の通りの処理が，前記信号符
号化プログラムが前記コンピュータに実行させる手順Ｓ
１０２に対応する処理である。Incidentally, to obtain a suitable spectrum envelope as described above, it is better for the peak points to be widely spaced and the valleys to be deep. This is why a frame width of about 100 ms, as already mentioned, is suitable. Even with a frame width of about 20 ms, obtaining a suitable spectrum envelope requires about the same amount of information as with about 100 ms. However, because the spectrum envelope information needed for decoding is required for every frame, the smaller the frame width, the greater the amount of spectrum envelope information. Conversely, increasing the frame width does not increase the number of peak points much, while the valleys become deeper. This is because, in voiced speech for example, the fundamental frequency and its harmonics appear as peaks, and when the frame width is increased the spacing between the peaks stays the same in frequency but widens in terms of order, that is, the index of the spectral samples. Musical tones, too, are not random superpositions of frequency components: they have regularity owing to the 12-step pitch scale, and the sounds of instruments mostly consist of a fundamental frequency and its harmonics, so the situation is much the same as for voiced speech. A further reason is that the stationarity of the signal increases with a frame width of about 100 ms. The processing performed by the spectrum envelope extraction unit 104 as described above corresponds to step S102 that the signal encoding program causes the computer to execute.
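
The remark that harmonic peaks move apart "in terms of order" as the frame grows can be checked with simple arithmetic. The sketch assumes that the transform length equals the frame length, so that the spacing between spectral samples is the reciprocal of the frame duration; the 200 Hz fundamental and the function name are illustrative.

```python
def harmonic_spacing_in_bins(f0_hz: float, frame_ms: float) -> float:
    """Spectral samples between adjacent harmonics of f0.

    Assumes the transform length equals the frame length, so the bin
    spacing is 1 / (frame duration).
    """
    return f0_hz * frame_ms / 1000.0


for frame_ms in (20, 60, 100):
    print(frame_ms, harmonic_spacing_in_bins(200.0, frame_ms))
# 20 4.0
# 60 12.0
# 100 20.0
```

The harmonics stay 200 Hz apart in every case, but at 100 ms there are five times as many spectral samples between them as at 20 ms, which leaves room for wider and deeper valleys between the peak points.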

【００１５】前記スペクトル包絡抽出部１０４により抽
出された前記スペクトル包絡は，前記ＲＡＭ５０５や前
記演算装置５０６の一次キャッシュなどに一時的に保持
される。次に前記演算装置５０６は，前記信号符号化プ
ログラムに従って，割当ビット数決定部１０５として動
作する。前記割当ビット数決定部１０５は，前記スペク
トル包絡の各値に応じたビット数を各周波数成分に割り
当てる。基本的には，前記スペクトル包絡の各値が大き
い周波数成分ほど，多くのビットを割り当てる。例え
ば，前記スペクトル包絡の各値を対数化する。対数化は
必須ではないが，ビットの持つ特性上最も好ましい。次
に，予め設けておいた下限値が０になるように，前記ス
ペクトル包絡を正規化する。次に，対数化した前記スペ
クトル包絡の各値の総和を，複数に分割した各周波数帯
毎に求める。次に，各周波数帯における前記スペクトル
包絡の各値の和が，当該周波数帯の総ビット数と等しく
なるような係数（＝総ビット数／スペクトル包絡の各値
の和）を，当該周波数帯における前記スペクトル包絡の
各値に乗じ，その値に応じて各周波数成分に割り当てる
ビット数を算出する。予め定められた全周波数領域にお
ける全総ビット数を，複数の周波数帯毎に配分するの
は，ある周波数領域に大きな成分が集中した場合に，他
の周波数領域に割り当てるビット数が極端に少なくなる
ことを防止するためである。大きな成分が集中する周波
数領域の近くの領域にある成分は，マスク効果によって
人間が知覚し難いが，離れた領域にある成分は，比較的
知覚し易いからである。ビット数配分の割合は予め定め
ておいてもよいし，符号化の対象となった前記音響信号
の周波数スペクトルに応じて定めるようにしてもよい。
基本的には，低中領域における周波数帯に，より多くの
ビットを配分した方が，人間の聴覚特性に合致する。前
記スペクトル包絡の各値からビット数を定める際，少数
点以下は切り捨てるか，若しくは四捨五入等の丸めを行
う。また少ないビット数には丸め，多いビット数には切
り捨てを適用するなど，ビット数に応じて異ならせても
よい。また，必要以上に多くのビットを割り当てても，
音質が向上することはないので，ビットの浪費を防止す
るため，例えば各周波数帯毎に割り当てるビット数の上
限値を定めておき，前記スペクトル包絡の各値に応じた
ビット数が前記上限値を越える場合には，当該周波数成
分に応じたビット数の代わりに前記上限値を割り当て
る。更に，必要に応じて，切り捨て丸め等により余った
ビット数に対して，その周波数帯に配分した総ビット数
と実際に配分したビット数の合計が等しくなるか，予め
設けた範囲内に入るまで，上述した正規化の手順から繰
り返すか，前記係数を調整する。尚，前記割当ビット数
決定部１０５が行う上述の通りの処理が，前記信号符号
化プログラムが前記コンピュータに実行させる手順Ｓ１
０３に対応する処理である。The spectrum envelope extracted by the spectrum envelope extraction unit 104 is temporarily held in the RAM 505, the primary cache of the arithmetic unit 506, or the like. Next, the arithmetic unit 506 operates as the allocated bit number determination unit 105 in accordance with the signal encoding program. The allocated bit number determination unit 105 allocates to each frequency component a number of bits that depends on the corresponding value of the spectrum envelope. Basically, the larger the spectrum envelope value of a frequency component, the more bits it is allocated. For example, each value of the spectrum envelope is first converted to a logarithm. Taking the logarithm is not essential, but it is the most suitable choice given the nature of bits. Next, the spectrum envelope is normalized so that a preset lower limit becomes 0. Next, the sum of the logarithmic spectrum envelope values is obtained for each of a plurality of frequency bands into which the spectrum is divided. Next, the spectrum envelope values in each frequency band are multiplied by a coefficient (= total number of bits / sum of the spectrum envelope values) chosen so that their sum equals the total number of bits for that frequency band, and the number of bits allocated to each frequency component is calculated from the resulting value. The predetermined total number of bits for the whole frequency range is distributed among the frequency bands in order to prevent the number of bits allocated to other frequency regions from becoming extremely small when large components are concentrated in one frequency region. Components close to a region where large components are concentrated are hard for humans to perceive because of the masking effect, but components in distant regions are comparatively easy to perceive. The proportions in which the bits are distributed may be fixed in advance, or may be determined according to the frequency spectrum of the acoustic signal being encoded. Basically, distributing more bits to the frequency bands in the low and middle regions matches human auditory characteristics. When the number of bits is determined from a spectrum envelope value, the fractional part is either truncated or rounded, for example to the nearest integer. The treatment may also depend on the number of bits, for example rounding for small bit counts and truncation for large ones. Allocating more bits than necessary does not improve sound quality, so to avoid wasting bits, an upper limit on the number of bits allocated may, for example, be set for each frequency band; when the number of bits derived from a spectrum envelope value exceeds that upper limit, the upper limit is allocated to the frequency component instead. Furthermore, where necessary, to deal with bits left over through truncation or rounding, the procedure is repeated from the normalization step described above, or the coefficient is adjusted, until the total number of bits actually allocated equals the total number of bits allotted to the frequency band or falls within a preset range. The processing performed by the allocated bit number determination unit 105 as described above corresponds to step S103 that the signal encoding program causes the computer to execute.
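
The allocation steps described above (logarithm, normalization to a lower limit, per-band scaling, rounding, upper limit) might be sketched as below. This is a simplified illustration under stated assumptions: a base-10 logarithm, a single upper limit shared by all bands, rounding to the nearest integer, and no redistribution of leftover bits. The parameter names are hypothetical.

```python
import math


def allocate_bits(envelope, band_edges, band_bits, floor_log=0.0, max_bits=8):
    """Per-component bit counts derived from a spectrum envelope.

    envelope   : positive envelope values, one per frequency component
    band_edges : list of (start, stop) index pairs, one per frequency band
    band_bits  : total number of bits budgeted for each band
    floor_log  : log10 value that the normalization maps to 0 (assumed)
    max_bits   : upper limit per component (assumed, shared by all bands)
    """
    # Take logarithms and shift so that the preset lower limit becomes 0.
    norm = [max(math.log10(v) - floor_log, 0.0) for v in envelope]
    bits = [0] * len(envelope)
    for (start, stop), budget in zip(band_edges, band_bits):
        total = sum(norm[start:stop])
        if total == 0.0:
            continue  # nothing above the lower limit in this band
        coeff = budget / total  # total bits / sum of envelope values
        for i in range(start, stop):
            # Round to the nearest integer, then apply the upper limit.
            bits[i] = min(int(norm[i] * coeff + 0.5), max_bits)
    return bits


envelope = [1000, 100, 10, 1, 100, 100]
print(allocate_bits(envelope, [(0, 4), (4, 6)], [12, 4]))  # [6, 4, 2, 0, 2, 2]
```

Because each band has its own budget, the strong component in the first band cannot starve the second band of bits, which is the purpose of the per-band distribution described in the text.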

【００１６】前記割当ビット数決定部１０５により定め
られた割当ビット数は，前記ＲＡＭ５０５や前記演算装
置５０６の一次キャッシュなどに一時的に保持される。
次に前記演算装置５０６は，前記信号符号化プログラム
に従って，周波数スペクトル量子化部１０６として動作
する。前記周波数スペクトル量子化部１０６は，前記周
波数スペクトルのパワー成分を量子化するパワー成分量
子化部１０６１，位相成分を量子化する位相成分量子化
部１０６２，及び前記パワー成分が０となる周波数成分
を位相成分から排除するための０成分排除部１０６３か
らなる。前記パワー成分量子化部１０６１には，前記周
波数スペクトル演算部１０３から前記パワー成分が供給
されており，前記割当ビット数決定部１０５から供給さ
れた割当ビット数に従って，前記パワー成分の各周波数
成分が線形的又は非線形的に量子化される。線形的に量
子化を行う場合には，各周波数成分の量子化後の大きさ
は，当該周波数成分の大きさと当該周波数スペクトル成
分のスペクトル包絡との比に前記割当ビット数で表現で
きる最大値を乗じた値によって表される。量子化演算の
際に発生した小数点については，四捨五入等の丸めを行
う。但し，前記割当ビット数が少ない場合には，量子化
ひずみが大きくなる可能性がある。このため，例えば前
記割当ビット数が１ビットの場合と，２ビット以上の場
合とで，次式（２ａ），（２ｂ）の通り丸め方を異なら
せるなどして，量子化ひずみを低減するようにした方が
好ましい。Ｒ＝Ｉ_nt（Ｓｐ／Ｅｎ＋０．３）（２ａ）Ｒ＝Ｉ_nt（Ｓｐ×（２^Ba−１）／Ｅｎ＋０．５）（２ｂ）ここで，Ｒは量子化後の当該周波数成分の大きさ，Ｓｐ
は当該周波数成分の大きさ，Ｅｎは当該周波数成分のス
ペクトル包絡，Ｂａは当該周波数成分の割当ビット数，
Ｉ_ntは整数化関数である。また，前記位相成分量子化部
１０６２には，前記周波数スペクトル演算部１０３から
前記位相成分が供給されており，例えば前記パワー成分
量子化部１０６１と同じく前記割当ビット数決定部１０
５から供給された割当ビット数に従って，前記位相成分
の各周波数成分が量子化される。但し，前記パワー成分
が０となる周波数成分は復元されないので，前記パワー
成分量子化部１０６１から出力されたパワー成分に従っ
て，予め０成分排除部１０６３により，前記位相成分量
子化部１０６２が量子化する対象から，当該周波数成分
が除外される。前記位相成分は，復元した楽音（特に打
撃音等）の明瞭さに大きな影響を与えるため，１ビット
や２ビットなどの少ないビット数しか割り当てられなか
った周波数成分に対して，余計に１ビット〜数ビットの
ビットを付加して，前記位相成分の再現性を向上させる
ようにしてもよい。また，前記位相成分では，極近辺の
成分の位相差が特に重要となるため，絶対的な値を量子
化するより，隣のサンプルとの差分を量子化するように
する方が好ましい。但し，隣のサンプルとの差分を量子
化する場合には，演算誤差等により絶対的な値からずれ
が生じる恐れがある。このため，隣のサンプルとの差分
を量子化する場合には，差分の対象を，隣のサンプルの
絶対的な値ではなく，量子化した値の累積値とした方が
よい。尚，前記周波数スペクトル量子化部１０６が行う
上述の通りの処理が，前記信号符号化プログラムが前記
コンピュータに実行させる手順Ｓ１０４に対応する処理
である。The number of allocated bits determined by the allocated bit number determination unit 105 is temporarily held in the RAM 505, the primary cache of the arithmetic unit 506, or the like. Next, the arithmetic unit 506 operates as the frequency spectrum quantization unit 106 in accordance with the signal encoding program. The frequency spectrum quantization unit 106 comprises a power component quantization unit 1061 that quantizes the power component of the frequency spectrum, a phase component quantization unit 1062 that quantizes the phase component, and a zero component elimination unit 1063 that excludes from the phase component those frequency components whose power component is 0. The power component quantization unit 1061 is supplied with the power component from the frequency spectrum calculation unit 103, and quantizes each frequency component of the power component linearly or nonlinearly according to the number of allocated bits supplied from the allocated bit number determination unit 105. With linear quantization, the quantized magnitude of each frequency component is the ratio of the magnitude of that frequency component to its spectrum envelope value, multiplied by the maximum value expressible with the number of allocated bits. Fractions arising in the quantization calculation are rounded, for example to the nearest integer. When the number of allocated bits is small, however, the quantization distortion may become large. It is therefore preferable to reduce the quantization distortion by, for example, rounding differently when the number of allocated bits is 1 and when it is 2 or more, as in the following equations (2a) and (2b). R = Int(Sp / En + 0.3) (2a) R = Int(Sp × (2^Ba − 1) / En + 0.5) (2b) Here, R is the magnitude of the frequency component after quantization, Sp is the magnitude of the frequency component, En is the spectrum envelope value for the frequency component, Ba is the number of bits allocated to the frequency component, and Int is the integer conversion function. The phase component quantization unit 1062 is supplied with the phase component from the frequency spectrum calculation unit 103, and quantizes each frequency component of the phase component according to, for example, the same number of allocated bits supplied from the allocated bit number determination unit 105 as is used by the power component quantization unit 1061. However, a frequency component whose power component is 0 is not restored, so the zero component elimination unit 1063 excludes such frequency components in advance from the quantization performed by the phase component quantization unit 1062, based on the power component output from the power component quantization unit 1061. Because the phase component strongly affects the clarity of the restored musical sound (percussive sounds in particular), one to several extra bits may be added for frequency components that were allocated only a small number of bits, such as 1 or 2, to improve the reproduction of the phase component. Moreover, since what matters most in the phase component is the phase difference between components that are very close to one another, it is preferable to quantize the difference from the adjacent sample rather than the absolute value. When differences from the adjacent sample are quantized, however, calculation errors and the like may cause drift from the absolute value. For this reason, the difference should be taken not from the absolute value of the adjacent sample but from the accumulated sum of the quantized values. The processing performed by the frequency spectrum quantization unit 106 as described above corresponds to step S104 that the signal encoding program causes the computer to execute.
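
Equations (2a) and (2b) and the drift-free differential quantization of the phase can be illustrated as follows. The sketch assumes that (2a) is the 1-bit case and (2b) the case of 2 bits or more, that Int truncates toward zero, and that the phase uses a uniform step; the clamp to the largest representable code and the omission of phase wrapping are simplifications of this example, not statements from the text.

```python
def quantize_power(sp, en, ba):
    """Quantize magnitude sp against envelope value en using ba bits."""
    if ba <= 0:
        return 0
    if ba == 1:
        r = int(sp / en + 0.3)                  # (2a), assumed to be the 1-bit case
    else:
        r = int(sp * (2 ** ba - 1) / en + 0.5)  # (2b)
    return min(r, 2 ** ba - 1)  # clamp: an added safeguard, not in the source


def quantize_phase_differences(phases, step):
    """Quantize each phase relative to the running sum of quantized differences."""
    codes = []
    accumulated = 0.0  # the value a decoder will have reconstructed so far
    for p in phases:
        code = round((p - accumulated) / step)
        codes.append(code)
        accumulated += code * step
    return codes


def reconstruct_phases(codes, step):
    """Decoder side: accumulate the quantized differences."""
    out, accumulated = [], 0.0
    for c in codes:
        accumulated += c * step
        out.append(accumulated)
    return out


print(quantize_power(0.8, 1.0, 1))  # 1
print(quantize_power(0.5, 1.0, 3))  # 4

phases = [0.1, 0.25, 0.4, 0.9]
codes = quantize_phase_differences(phases, 0.25)
print(codes)  # [0, 1, 1, 2]
print(reconstruct_phases(codes, 0.25))
```

Taking each difference from the accumulated quantized value keeps every reconstructed phase within half a step of the original, whereas differences taken from the unquantized neighbour would let rounding errors add up along the frequency axis.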

【００１７】上述の通り量子化された前記パワー成分と
前記位相成分は，量子化ビット数こそ各周波数成分で異
なるものの，ほぼ前記周波数スペクトルそのものであ
る。また，量子化ビット数は前記スペクトル包絡に応じ
て変化するが，各周波数成分よりも所定値以上小さくな
ることがないように抽出した前記スペクトル包絡は，マ
スク効果など人間の聴覚特性に結果的に合致しており，
品質上量子化前後で失われる情報は極僅かである。前記
周波数スペクトル量子化部１０６により量子化された前
記パワー成分と前記位相成分は，前記ＲＡＭ５０５や前
記演算装置５０６の一次キャッシュなどに一時的に保持
される。次に前記演算装置５０６は，前記信号符号化プ
ログラムに従って，符号化信号生成部１０７として動作
する。前記符号化信号生成部１０７では，前記スペクト
ル包絡抽出部１０４により抽出されたスペクトル包絡，
前記パワー成分量子化部１０６１により量子化されたパ
ワー成分，及び前記位相成分量子化部１０６２により量
子化された位相成分を基に，前記スペクトル包絡に関す
るスペクトル包絡情報，前記パワー成分に関するパワー
成分情報，前記位相成分に関する位相成分情報を符号化
した符号化信号が生成される。前記スペクトル包絡情報
は，前記スペクトル包絡を定めるのに必要な情報であ
る。前記スペクトル包絡を上述の好適な手順で抽出した
場合には，前記スペクトル包絡情報は，前記ピーク点の
大きさ，前記ピーク点の周波数，前記谷関数の種類，前
記谷関数のパラメータを含む。上述の４つのパラメータ
は，ほぼ前記ピーク点の点数に比例する数だけ必要とな
る。即ち，前記ピーク点の点数を変更することによっ
て，圧縮率を変化させることができる。各ピーク点の大
きさは，前記パワー成分の量子化とは別途に量子化され
る。各ピーク点の大きさを量子化する際の量子化ビット
数は，例えば予め定められたものを用いればよい。この
各ピーク点に対する量子化ビット数も，複数の周波数帯
毎に定めるようにしてもよい。複数の周波数帯毎に前記
ピーク点の大きさを量子化する場合，前記パワー成分の
量子化と同様，低中周波数領域に配分するビット数を多
くした方がよい。また，前記ピーク点の周波数は，絶対
周波数よりも隣のピーク点との差分周波数で表現した方
が，必要なビット数を低減させることができる。前記谷
関数の種類は，複数の谷関数を使用する場合にのみ必要
で，予め谷関数を定める場合には，省略可能である。ま
た，前記谷関数のパラメータは，前記谷関数の種類によ
り，一つの場合もあるし，複数の場合もある。前記正弦
波関数やその平方根関数を用いる場合には，前記パラメ
ータは，振幅Ａの一つのみでよい。この振幅は，２〜４
ビットの量子化ビット数で表現するのが適当である。前
記パワー成分情報は，基本的に，上述のようにして量子
化されたパワー成分そのものである。但し，前記ピーク
点の大きさや，前記ピーク点の周波数は，前記スペクト
ル包絡情報に含まれるので，前記パワー成分から取り除
くことが可能である。また，前記位相成分情報も，基本
的には，上述のようにして量子化された位相成分そのも
のである。もちろん，上述のように量子化された前記パ
ワー成分や前記位相成分に何らかの符号化処理を施した
ものを前記パワー成分情報や前記位相成分情報とするよ
うにしてもよい。例えば前記位相成分について絶対値で
はなく差分値に対して量子化を行った場合，小さな値が
出現する頻度が高くなるため，必要に応じて，ハフマン
符号化を施すと，符号化信号のデータ量をより小さくす
ることができる。尚，前記符号化信号生成部１０７が行
う上述の通りの処理が，前記信号符号化プログラムが前
記コンピュータに実行させる手順Ｓ１０５に対応する処
理である。前記符号化信号生成部１０７から出力された
符号化信号は，出力端子１０８を介して，例えば前記コ
ンピュータのハードディスクドライブ５０４上などに保
存される。このようにして，前記信号符号化装置により
符号化された符号化信号を保存するのに必要な記憶容量
は，線形PCM 方式により前記音響信号を符号化した場合
よりずっと少なく，前記MP3 やTwinVQと同等かそれより
少ない。また，周波数スペクトルをほぼそのまま符号化
するため，音質の劣化は少ない。しかも，前記周波数ス
ペクトル包絡情報を定める際には一度だけFFT 変換を行
えばよく, スペクトル包絡情報を用いる前記MP3 やTwin
VQと較べて演算量を大幅に低減することができる。The power component and the phase component quantized as described above are substantially the frequency spectrum itself, although the number of quantization bits differs for each frequency component. Although the number of quantization bits changes according to the spectral envelope, the spectral envelope extracted so as not to be smaller than each frequency component by a predetermined value or more results in human auditory characteristics such as a mask effect. Match,
There is very little information lost before and after quantization in terms of quality. The power component and the phase component quantized by the frequency spectrum quantization unit 106 are temporarily stored in the RAM 505, the primary cache of the arithmetic device 506, or the like. Next, the arithmetic unit 506 operates as the coded signal generation unit 107 according to the signal coding program. In the coded signal generation unit 107, the spectrum envelope extracted by the spectrum envelope extraction unit 104,
Based on the power component quantized by the power component quantization unit 1061 and the phase component quantized by the phase component quantization unit 1062, spectrum envelope information on the spectrum envelope, power component information on the power component, An encoded signal is generated by encoding the phase component information relating to the phase component. The spectrum envelope information is information necessary for determining the spectrum envelope. When the spectrum envelope is extracted by the above-described preferred procedure, the spectrum envelope information includes the size of the peak point, the frequency of the peak point, the type of the valley function, and parameters of the valley function. The above four parameters are required in a number substantially proportional to the number of the peak points. That is, the compression ratio can be changed by changing the number of the peak points. The size of each peak point is quantized separately from the quantization of the power component. For example, a predetermined number may be used as the number of quantization bits when quantizing the size of each peak point. The number of quantization bits for each peak point may also be determined for each of a plurality of frequency bands. When quantizing the magnitude of the peak point for each of a plurality of frequency bands, it is preferable to increase the number of bits allocated to the low and middle frequency regions, similarly to the quantization of the power component. In addition, when the frequency of the peak point is represented by the difference frequency from the adjacent peak point, the required number of bits can be reduced rather than the absolute frequency. The type of the valley function is necessary only when a plurality of valley functions are used, and can be omitted when a valley function is determined in advance. 
The valley function has one or more parameters, depending on its type. When a sine function or the square root of a sine function is used, the amplitude A may be the only parameter. It is appropriate to express this amplitude with a quantization bit number of 2 to 4 bits. The power component information is basically the power component itself, quantized as described above. However, because the magnitudes and frequencies of the peak points are already included in the spectrum envelope information, they can be removed from the power component. Likewise, the phase component information is basically the phase component itself, quantized as described above. Needless to say, the quantized power and phase components may be subjected to some further encoding before being used as the power component information and the phase component information. For example, when the phase component is quantized as a difference value rather than an absolute value, small values appear more frequently, so Huffman coding can be applied as necessary to further reduce the data amount of the encoded signal. The above-described processing performed by the encoded signal generation unit 107 corresponds to step S105 that the signal encoding program causes the computer to execute. The encoded signal output from the encoded signal generation unit 107 is stored, via an output terminal 108, on, for example, the hard disk drive 504 of the computer. The storage capacity required for an encoded signal produced by the signal encoding device in this way is far smaller than when the acoustic signal is encoded by linear PCM, and is equal to or smaller than that required by MP3 or TwinVQ. Moreover, since the frequency spectrum is encoded almost as it is, there is little deterioration in sound quality. In addition, the FFT needs to be performed only once to determine the spectrum envelope information, so the amount of calculation can be significantly reduced compared with MP3 or TwinVQ, which use spectrum envelope information.
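
As a rough illustration of why difference values help, the following Python sketch builds a Huffman code over the successive differences of some made-up quantized phase values and compares the total bit count with fixed-width storage. The sample data, the 4-bit phase resolution, and the function name `huffman_code_lengths` are assumptions for illustration; the embodiment does not specify a particular Huffman construction or how differences are taken.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap items: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

phases = [3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9, 9]  # made-up 4-bit values
diffs = [b - a for a, b in zip([0] + phases, phases)]       # first value is kept as-is
lengths = huffman_code_lengths(diffs)
huffman_bits = sum(lengths[d] for d in diffs)               # 23 bits for this data
fixed_bits = 4 * len(phases)                                # 64 bits at 4 bits per value
```

The comparison ignores the cost of conveying the code table and does not handle phase wrap-around; it only shows that a distribution concentrated on small differences yields short codes.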

【００１８】[0018]

The signal decoding device 2 is a device suitable for decoding the encoded signal generated by the signal encoding device 1. When the signal decoding program is executed and the computer operates as the signal decoding device 2, a dialog for, for example, designating the encoded signal to be decoded is first displayed on the output device 502. When the user designates the encoded signal to be decoded, decoding of that encoded signal starts. In the signal decoding device 2, the encoded signal input via an input terminal 201 is temporarily stored in a buffer 202. In this example, in which the signal decoding device 2 is realized by the computer, the hard disk drive 504, the RAM 505, or the like serves as the buffer 202. The encoded signal stored on the hard disk drive 504 or the like serving as the buffer 202 is read out by the arithmetic unit 506, which operates as a spectrum envelope restoration unit 203 and a frequency spectrum decoding unit 205. The spectrum envelope restoration unit 203 acquires the spectrum envelope information from the encoded signal and, based on the acquired information, restores the spectrum envelope for each frequency component of the frequency spectrum of the acoustic signal. As already described, the spectrum envelope information includes, for example, the magnitudes of the peak points, the frequencies of the peak points, the type of the valley function, and the parameters of the valley function. If the number of quantization bits used to express the magnitudes of the peak points is included in the spectrum envelope information, that number is used to restore the magnitudes; however, to keep the data amount of the spectrum envelope information as small as possible, it is desirable that the signal encoding device 1 and the signal decoding device 2 agree on the number of quantization bits in advance. Once the magnitude and frequency of each peak point are determined, the spectrum envelope is restored by obtaining the type of the valley function and its parameters (for example, the coefficient A) from the spectrum envelope information. The above-described processing performed by the spectrum envelope restoration unit 203 corresponds to step S201 that the signal decoding program causes the computer to execute. The spectrum envelope restored by the spectrum envelope restoration unit 203 is supplied to an allocated bit number acquisition unit 204. The allocated bit number acquisition unit 204 obtains, from the spectrum envelope, the number of bits allocated to each frequency component of the frequency spectrum of the acoustic signal. This is the same processing as that performed by the allocated bit number determination unit 105 of the signal encoding device 1. The above-described processing performed by the allocated bit number acquisition unit 204 corresponds to step S202 that the signal decoding program causes the computer to execute. The number of allocated bits obtained for each frequency component by the allocated bit number acquisition unit 204 is supplied to the frequency spectrum decoding unit 205.
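
The key property of steps S201 and S202 is that the decoder re-derives the bit allocation from the restored envelope with the same rule the encoder used, so the per-component bit counts themselves need not be transmitted. The Python sketch below illustrates only that idea. The valley shape (a straight line between neighbouring peaks, lowered by a sine-shaped dip whose depth stands in for the parameter A), the allocation rule (one bit fewer for each halving of the envelope), and all function and parameter names are assumptions for illustration; this excerpt does not define them.

```python
import math

def restore_envelope(peaks, dip_depth, n_bins):
    """Rebuild an envelope over n_bins bins from (bin index, magnitude) peak points.

    Hypothetical valley function: between two neighbouring peaks the envelope
    is the straight line joining them, scaled by 1 - dip_depth * sin(pi * t).
    """
    peaks = sorted(peaks)
    env = [0.0] * n_bins
    for k, m in peaks:
        env[k] = m
    for (k0, m0), (k1, m1) in zip(peaks, peaks[1:]):
        for k in range(k0 + 1, k1):          # interior bins of the valley only
            t = (k - k0) / (k1 - k0)
            line = m0 + (m1 - m0) * t
            env[k] = line * (1.0 - dip_depth * math.sin(math.pi * t))
    return env

def allocate_bits(envelope, max_bits=8):
    """Hypothetical rule shared by encoder and decoder: larger envelope, more bits."""
    top = max(envelope)
    bits = []
    for e in envelope:
        if e <= 0.0:
            bits.append(0)
        else:
            bits.append(max(0, max_bits + math.floor(math.log2(e / top))))
    return bits

peaks = [(0, 100.0), (8, 25.0)]              # made-up peak points
envelope = restore_envelope(peaks, dip_depth=0.5, n_bins=9)
bits = allocate_bits(envelope)               # same call on both sides gives the same result
```

Because `allocate_bits` is deterministic, running it on the encoder's envelope and on the decoder's restored envelope gives identical allocations as long as both envelopes are built from the same quantized peak data.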

【００１９】[0019]

The frequency spectrum decoding unit 205 comprises a power component decoding unit 2051 that decodes the power component, a phase component decoding unit 2052 that decodes the phase component, and a zero-component exclusion unit 2053 that excludes zero components. The power component decoding unit 2051 obtains the power component information of the frequency spectrum information from the encoded signal stored in the buffer 202, and decodes the power component according to the number of bits allocated to each frequency component as obtained by the allocated bit number acquisition unit 204. If the power component information is in bitstream form, the power component is decoded by sequentially extracting, for each frequency component, the number of bits allocated to it. Similarly, the phase component decoding unit 2052 obtains the phase component information of the frequency spectrum information from the encoded signal stored in the buffer 202, and decodes the phase component according to the number of bits allocated to each frequency component as obtained by the allocated bit number acquisition unit 204.
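
The sequential extraction described here can be sketched in Python as follows. The most-significant-bit-first packing order, the treatment of components with zero allocated bits (no bits consumed, value 0), and the name `unpack_fields` are assumptions for illustration; the actual bitstream layout is not given in this excerpt.

```python
def unpack_fields(data: bytes, bit_counts):
    """Read consecutive unsigned fields from data, MSB first.

    bit_counts[i] is the number of bits allocated to frequency component i.
    """
    values = []
    pos = 0  # current bit position within data
    for n in bit_counts:
        v = 0
        for _ in range(n):
            byte = data[pos >> 3]
            bit = (byte >> (7 - (pos & 7))) & 1
            v = (v << 1) | bit
            pos += 1
        values.append(v)
    return values

packed = bytes([0b10110100, 0b11000000])
values = unpack_fields(packed, [3, 0, 2, 5])   # -> [5, 0, 2, 19]
```

No field delimiters are needed in the stream, because the decoder already knows every field width from the restored spectrum envelope.
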
However, frequency components whose power component is zero are not restored. Based on the power component decoded by the power component decoding unit 2051, the zero-component exclusion unit 2053 therefore excludes those frequency components from the targets decoded by the phase component decoding unit 2052. If the power component and the phase component were quantized using equations (2a) and (2b) above, with the rounding varied according to the number of allocated bits, decoding is performed according to the following equations (3a) and (3b):

Sp = 0.85 × R × Ep    (3a)
Sp = R × Ep / (2^Ba − 1)    (3b)

The above-described processing performed by the frequency spectrum decoding unit 205 corresponds to step S203 that the signal decoding program causes the computer to execute. Once the power component and the phase component have been decoded by the power component decoding unit 2051 and the phase component decoding unit 2052, the arithmetic unit 506 next operates as a signal restoration unit 206. The signal restoration unit 206 derives the real and imaginary components from the power component decoded by the power component decoding unit 2051 and the phase component decoded by the phase component decoding unit 2052, performs an IFFT (inverse FFT) to convert them into a time-domain signal, and thereby restores the acoustic signal. If a window function was applied to each frame in the signal encoding device 1, the values after the IFFT must be multiplied by the inverse of the corresponding window function. Information on the window function is either shared in advance between the signal encoding device 1 and the signal decoding device 2 or included in the encoded signal. At the joints between frames, overlapping the windowed portions removes the noise caused by discontinuities at the joints. It is also effective to discard the portion near the windowed portion and reproduce the joint by multiplying both ends of the remaining portion by a window function; in that case, the portion to be discarded must be reserved in advance as extra margin when the frames are divided. The above-described processing performed by the signal restoration unit 206 corresponds to step S204 that the signal decoding program causes the computer to execute.
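
Steps S203 and S204 can be sketched in Python as below. Several points are assumptions made for illustration and are not stated in this excerpt: that equation (3a) applies to components with one allocated bit and (3b) to all others, that the decoded "power component" is used directly as the magnitude of each bin, that the phases have already been dequantized to radians, and that a frame of even length N is represented by bins 0 to N/2. The function names are hypothetical, and a plain inverse DFT stands in for the IFFT.

```python
import cmath
import math

def dequantize(r, ep, ba):
    """Restore Sp from quantized value R, envelope value Ep and bit count Ba."""
    if ba == 0:
        return 0.0
    if ba == 1:
        return 0.85 * r * ep              # equation (3a), assumed to cover Ba = 1
    return r * ep / (2 ** ba - 1)         # equation (3b)

def restore_frame(magnitudes, phases):
    """Inverse DFT of bins 0..N/2 of a real frame of even length N."""
    half = [m * cmath.exp(1j * p) for m, p in zip(magnitudes, phases)]
    n = 2 * (len(half) - 1)
    # Mirror the bins to get the conjugate-symmetric spectrum of a real signal.
    full = half + [half[n - k].conjugate() for k in range(len(half), n)]
    return [
        sum(full[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def decode_frame(q_power, phases_nonzero, envelope, bits):
    """Dequantize magnitudes, skip phases of zero components, return time samples."""
    mags = [dequantize(r, ep, ba) for r, ep, ba in zip(q_power, envelope, bits)]
    phase_iter = iter(phases_nonzero)     # phases exist only for non-zero components
    phases = [next(phase_iter) if m > 0 else 0.0 for m in mags]
    return restore_frame(mags, phases)

# Made-up frame: one non-zero bin (k = 1) with 4 allocated bits, N = 8 samples.
frame = decode_frame([0, 15, 0, 0, 0], [0.0], [1.0, 4.0, 1.0, 1.0, 1.0], [0, 4, 0, 0, 0])
```

With these inputs the single bin dequantizes to magnitude 4 and the restored frame is one period of a unit-amplitude cosine. Window compensation and overlap between frames are left out of the sketch.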

【００２０】[0020]

When an instruction to play back the acoustic signal restored by the signal restoration unit 206 is given, the acoustic signal is converted to analog form by a D/A converter 207 and then output from a speaker or the like via an output terminal 208. As described above, the signal decoding device according to the embodiment of the present invention is suitable for decoding the encoded signal generated by the signal encoding device and restoring a temporally changing target signal. In the signal decoding device as well, the inverse transform from the frequency domain to the time domain needs to be performed only once in the course of decoding, so the overall amount of calculation is small, the target signal can be decoded in a short time, and real-time playback becomes easy. In addition, most of the processing of the signal decoding device (and the signal decoding program) is either common to, or the counterpart of, the processing of the signal encoding device (and the signal encoding program), so it is relatively easy to implement both in a single device and a single program. FIG. 9 compares the power components of the source signal, the signal after encoding and decoding by MP3, and the signal after encoding and decoding by the scheme of the present embodiment. The bit rate is 128 kbps for both MP3 and the scheme of the present embodiment. As shown in FIG. 9, the frequency spectrum obtained with the scheme of the present embodiment agrees well with that of the source signal from the low-frequency region to the high-frequency region, whereas the spectrum obtained with MP3 differs greatly from that of the source signal, particularly in the low-frequency and high-frequency regions. That is, the encoding/decoding scheme of the present embodiment secures high quality at the same compression ratio, and does so with a small amount of calculation.

【００２１】[0021]

[Examples] In the above embodiment, an example was described in which the signal encoding device 1 and the signal decoding device 2 are realized by a single computer. However, the invention is not limited to this; for example, the signal encoding device 1 and the signal decoding device 2 may be realized by separate computers. In that case, the encoded signal generated by the computer realizing the signal encoding device 1 may be transmitted, using the communication device provided in each computer, via the Internet or the like to the computer realizing the signal decoding device 2. The computer realizing the signal encoding device 1 stores the encoded signal in, for example, packets conforming to the protocol used for the communication, and the signal decoding device 2 depacketizes the packets and then performs the decoding processing. When real-time playback is performed over a network that does not guarantee transmission time, the spectrum envelope information and the frequency spectrum information (the power component information and the phase component information) contained in the encoded signal are preferably stored in the same packet. Even when each kind of information is stored in a separate packet, however, playback on the receiving side is still possible, albeit with degraded sound quality, as long as packets have arrived that contain, for example, only the spectrum envelope information and the power component information, only the spectrum envelope information and the phase component information, or only the frequency spectrum information. The signal encoding device 1 and the signal decoding device 2 may also be realized by dedicated hardware including a DSP or the like rather than by a general-purpose computer. In this case too, a single device may include both the signal encoding device 1 and the signal decoding device 2, or separate devices may include the signal encoding device 1 and the signal decoding device 2, respectively. Furthermore, although the above embodiment described an example of encoding and decoding an acoustic signal, the present invention is not limited to this and can also be applied to other target signals that change temporally or spatially, such as still images and moving images. For example, when the target signal is a spatially changing still image, applying a transform such as the FFT in two dimensions yields a spectral distribution in the spatial frequency domain. The number of bits allocated to each spatial frequency component is then varied according to the spectral intensity of that component. In addition, since the frame width affects the compression ratio as described above, processing that dynamically optimizes the frame width for the target signal may be added. Optimization methods include, for example, the following (1) to (4).

(1) Vary the frame width and adopt the frame width that minimizes the area of the spectrum envelope.
(2) Vary the frame width, compress at about the same compression ratio for each width, and adopt the frame width that minimizes the difference between the compressed spectrum and the uncompressed spectrum.
(3) Vary the frame width, compress for each width at a compression ratio that gives about the same spectral accuracy, and adopt the frame width that gives the smallest compression ratio.
(4) Using any one of (1), (2), and (3) above, or a combination of them, determine in advance by a statistical method the frame width that gives the best compression ratio or spectral accuracy, and use that frame width. Here it is also effective to calculate the frame width separately for speech (by gender) and for each type of music (pop, rock, enka, and so on).

Finally, although the above embodiment uses the FFT and the IFFT in connection with the frequency spectrum, the invention is not limited to these; other transforms and inverse transforms, such as the MDCT (Modified Discrete Cosine Transform), may be used.
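
Optimization method (2) above amounts to a search over candidate frame widths. The Python sketch below shows the shape of such a search with a deliberately simple stand-in for the codec: each frame's magnitude spectrum is uniformly quantized with a fixed number of bits per component, and the mean squared difference from the unquantized spectrum is the quantity minimized. The stand-in codec, the error measure, the test signal, and all names are assumptions for illustration and are not the quantization scheme of the embodiment.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of one frame, normalised by the frame length N."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))) / n
        for k in range(n // 2 + 1)
    ]

def spectral_error(signal, width, bits):
    """Mean squared difference between unquantized and quantized spectra (stand-in codec)."""
    levels = 2 ** bits - 1
    total, count = 0.0, 0
    for start in range(0, len(signal) - width + 1, width):
        mags = magnitude_spectrum(signal[start:start + width])
        top = max(mags)
        for m in mags:
            # Uniform quantization relative to the frame maximum, then reconstruction.
            q = round(m / top * levels) * top / levels if top > 0 else 0.0
            total += (m - q) ** 2
            count += 1
    if count == 0:
        raise ValueError("frame width exceeds signal length")
    return total / count

def choose_frame_width(signal, candidates, bits=3):
    """Keep the bits per component fixed and pick the width with the smallest error."""
    return min(candidates, key=lambda w: spectral_error(signal, w, bits))

# Made-up test signal: two sinusoids, 64 samples.
signal = [math.sin(2 * math.pi * 5 * t / 64) + 0.3 * math.sin(2 * math.pi * 13 * t / 64)
          for t in range(64)]
candidates = [16, 32, 64]
best = choose_frame_width(signal, candidates)
```

Methods (1) and (3) would follow the same pattern with a different quantity passed to `min`, for example the envelope area or the achieved compression ratio.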

【００２２】[0022]

[Effects of the Invention] As described above, the signal encoding device according to claim 1 can secure a high compression ratio and high quality with a small amount of calculation. The invention according to claim 2 can provide a signal decoding device that is suitable for decoding the encoded signal generated by the signal encoding device according to claim 1 and restoring a temporally or spatially changing target signal, and that requires only a small amount of calculation for decoding. The computer-readable recording medium according to claim 3 can provide a signal encoding program suitable for realizing the signal encoding device according to claim 1 on a computer. The computer-readable recording medium according to claim 4 can provide a signal decoding program suitable for decoding the encoded signal generated by the signal encoding device according to claim 1, or the encoded signal generated by a computer in accordance with the signal encoding program according to claim 3, and restoring a temporally or spatially changing target signal.

[Brief description of the drawings]

FIG. 1 is a diagram showing the schematic configuration of a signal encoding device according to an embodiment of the present invention.

FIG. 2 is a diagram showing the schematic configuration of a signal decoding device according to the embodiment of the present invention.

FIG. 3 is a flowchart for explaining a signal encoding program according to the embodiment of the present invention.

FIG. 4 is a flowchart for explaining a signal decoding program according to the embodiment of the present invention.

FIG. 5 is a diagram showing a configuration example of a computer that realizes the signal encoding device and the signal decoding device.

FIG. 6 is a flowchart for explaining the spectrum envelope extraction processing.

FIG. 7 is a diagram for explaining how the valley function is determined in the spectrum envelope extraction processing.

FIG. 8 is a diagram showing an example of an extracted spectrum envelope.

FIG. 9 is a diagram comparing the frequency spectra of the source signal, the present scheme, and another encoding scheme.

[Explanation of symbols]

103: frequency spectrum calculation unit
104: spectrum envelope extraction unit
105: allocated bit number determination unit
106: frequency spectrum quantization unit
107: encoded signal generation unit
203: spectrum envelope restoration unit
204: allocated bit number acquisition unit
205: frequency spectrum decoding unit
206: signal restoration unit

Claims

[Claims]

1. A signal encoding device comprising: frequency spectrum calculating means for calculating a frequency spectrum of a target signal that changes temporally or spatially; spectrum envelope extracting means for extracting a spectrum envelope from the frequency spectrum of the target signal calculated by the frequency spectrum calculating means; allocated bit number determining means for determining, according to the spectrum envelope extracted by the spectrum envelope extracting means, the number of bits to be allocated to each frequency component of the frequency spectrum of the target signal; frequency spectrum quantization means for quantizing the frequency spectrum of the target signal based on the number of allocated bits determined by the allocated bit number determining means; and encoded signal generating means for generating an encoded signal in which frequency spectrum information on the frequency spectrum of the target signal quantized by the frequency spectrum quantization means and spectrum envelope information on the spectrum envelope are encoded.

2. A signal decoding device comprising: spectrum envelope restoring means for obtaining spectrum envelope information from an encoded signal in which frequency spectrum information on a frequency spectrum of a target signal that changes temporally or spatially and the spectrum envelope information on a spectrum envelope of the target signal are encoded, and for restoring the spectrum envelope for each frequency component of the frequency spectrum of the target signal based on the obtained spectrum envelope information; allocated bit number obtaining means for obtaining, from the spectrum envelope restored by the spectrum envelope restoring means, the number of bits allocated to each frequency component of the frequency spectrum of the target signal; frequency spectrum decoding means for obtaining the frequency spectrum information from the encoded signal and decoding the frequency spectrum of the target signal from the obtained frequency spectrum information based on the number of allocated bits obtained by the allocated bit number obtaining means; and signal restoring means for restoring the target signal based on the frequency spectrum of the target signal decoded by the frequency spectrum decoding means.

3. A computer-readable recording medium on which is recorded a signal encoding program for causing a computer to execute: a procedure of obtaining a frequency spectrum of a target signal that changes temporally or spatially; a procedure of extracting a spectrum envelope from the frequency spectrum of the target signal; a procedure of determining, according to the spectrum envelope, the number of bits to be allocated to each frequency component of the frequency spectrum; a procedure of quantizing the frequency spectrum of the target signal based on the number of allocated bits; and a procedure of generating an encoded signal in which frequency spectrum information on the frequency spectrum of the target signal and spectrum envelope information defining the spectrum envelope are encoded.

4. A computer-readable recording medium on which is recorded a signal decoding program for causing a computer to execute: a procedure of obtaining spectrum envelope information from an encoded signal in which frequency spectrum information on a frequency spectrum of a target signal that changes temporally or spatially and the spectrum envelope information on a spectrum envelope of the target signal are encoded, and restoring the spectrum envelope for each frequency component of the frequency spectrum of the target signal based on the spectrum envelope information; a procedure of obtaining, from the spectrum envelope, the number of bits allocated to each frequency component of the frequency spectrum of the target signal; a procedure of obtaining the frequency spectrum information from the encoded signal and decoding the frequency spectrum of the target signal from the frequency spectrum information based on the number of allocated bits; and a procedure of restoring the target signal based on the frequency spectrum of the target signal.