JPH08202398A - Voice coding device

Info

Publication number: JPH08202398A
Application number: JP7013072A
Authority: JP
Inventors: Shinichi Taumi; 真一田海; Kazunori Ozawa; 一範小澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1995-01-30
Filing date: 1995-01-30
Publication date: 1996-08-09
Anticipated expiration: 2015-06-05
Also published as: JP3047761B2; CA2167552C; CA2167552A1

Abstract

PURPOSE: To provide a voice coding device that achieves good voice quality, without degradation caused by temporal fluctuation in mode discrimination, even when the frame length is shortened to 5 ms-10 ms or less to achieve low delay. CONSTITUTION: A perceptually weighted signal is input frame by frame from an input terminal 2010, and a spectrum parameter is input from an input terminal 2020. A feature quantity calculating circuit A 2030 calculates, for example, a pitch prediction gain PG as a feature quantity and outputs it. A feature quantity calculating circuit B 2040 calculates a short-term prediction gain SG as a feature quantity and outputs it. A mode discriminating circuit 2050 compares the output value PG of the feature quantity calculating circuit A 2030 and the output value SG of the feature quantity calculating circuit B 2040 with a plurality of threshold values determined in advance according to the mode information of the preceding frame, which is stored in a delay circuit 2060, performs mode discrimination, and outputs mode information.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech coding apparatus for coding a speech signal with high quality and low delay, in particular in short frames of 5 ms to 10 ms or less.

[0002]

[Prior Art] A known method for coding speech signals is described, for example, in the paper by K. Ozawa et al. entitled "M-LCELP Speech Coding at 4 kb/s with Multi-Mode and Multi-Codebook" (IEICE Trans. Commun., vol. E77-B, No. 9, pp. 1114-1121, 1994) (Reference 1). In this conventional example, the transmitting side extracts, for each frame (for example, 40 ms), spectrum parameters representing the spectral characteristics of the speech signal by linear prediction (LPC) analysis, calculates feature quantities from the frame signal or from a perceptually weighted version of the frame signal, performs mode discrimination (for example, vowel part versus consonant part) using the feature quantities, and codes the signal while switching the algorithm or the codebook according to the mode discrimination result. The coding section further divides the frame into subframes (for example, 8 ms). For each subframe, it extracts the adaptive codebook parameters (a delay parameter corresponding to the pitch period, and a gain parameter) on the basis of the past excitation signal, and pitch-predicts the speech signal of the subframe with the adaptive codebook. For the residual signal obtained by pitch prediction, it selects the optimum excitation code vector from an excitation codebook (vector quantization codebook) consisting of predetermined kinds of noise signals and calculates the optimum gain, thereby quantizing the excitation signal. The excitation code vector is selected so as to minimize the error power between the signal synthesized from the selected noise signal and the residual signal. The index indicating the selected code vector and the gain are then combined with the spectrum parameters and the adaptive codebook parameters by a multiplexer section and transmitted. A description of the receiving side is omitted.
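The code vector selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Reference 1: the function name `search_codebook` and the toy codebook are hypothetical, and the code vectors are assumed to have already been passed through the synthesis filter, so the filter itself is omitted.

```python
def search_codebook(target, synthesized_codebook):
    """Return (index, gain, error) minimizing ||target - gain * c||^2.

    `synthesized_codebook` is assumed to hold code vectors that have
    already been passed through the synthesis filter.
    """
    best = None
    target_energy = sum(t * t for t in target)
    for index, c in enumerate(synthesized_codebook):
        energy = sum(v * v for v in c)
        if energy == 0.0:
            continue  # an all-zero vector cannot reduce the error
        corr = sum(t * v for t, v in zip(target, c))
        gain = corr / energy  # optimum gain for this code vector
        error = target_energy - corr * corr / energy  # error power at that gain
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Toy example with three hypothetical code vectors.
codebook = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -1.0],
]
index, gain, error = search_codebook([2.0, 2.0, 0.0, 0.0], codebook)
```

For each candidate the optimum gain has a closed form, so the search only needs one correlation and one energy per code vector.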

[0003]

[Problems to be Solved by the Invention] In the conventional method, when the frame length is reduced to, for example, 5 ms or less in order to reduce the processing delay, and the mode information, pitch, or level is obtained frame by frame, these values fluctuate strongly over time. As a result, unstable and erroneous mode switching, unstable and erroneous pitch extraction, or unstable and erroneous level extraction occurs, and speech quality deteriorates. An object of the present invention is to solve this problem by providing correct mode discrimination, correct pitch extraction, or correct level extraction, and thereby to suppress the speech quality degradation caused by these errors.

[0004]

[Means for Solving the Problems] According to the present invention, there is provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length and a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, the apparatus coding the speech signal according to the discrimination result and having a function of discriminating the mode of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame.
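As an illustration of mode discrimination that depends on the previous frame's mode, the following Python sketch compares a pitch prediction gain and a short-term prediction gain with thresholds selected by the preceding mode, in the manner outlined in the abstract. The two-mode setup, the threshold values, and all names are assumptions chosen for illustration; the source does not give concrete values.

```python
# Hypothetical thresholds (dB), indexed by the previous frame's mode.
# Remaining in a mode is made easier than entering it (hysteresis).
THRESHOLDS = {
    0: {"pg": 4.0, "sg": 8.0},  # previous frame was mode 0 (e.g. consonant-like)
    1: {"pg": 2.0, "sg": 5.0},  # previous frame was mode 1 (e.g. vowel-like)
}


def discriminate_mode(pg, sg, previous_mode):
    """Decide the current mode from PG, SG and the previous frame's mode."""
    th = THRESHOLDS[previous_mode]
    return 1 if (pg >= th["pg"] and sg >= th["sg"]) else 0


def discriminate_sequence(features, initial_mode=0):
    """Run the decision frame by frame, feeding each result back as history."""
    modes = []
    mode = initial_mode
    for pg, sg in features:
        mode = discriminate_mode(pg, sg, mode)
        modes.append(mode)
    return modes


# (PG, SG) pairs for five consecutive frames, made up for the example.
frames = [(5.0, 9.0), (3.0, 6.0), (3.0, 6.0), (1.0, 6.0), (3.0, 6.0)]
modes = discriminate_sequence(frames)
```

The same feature pair (3.0, 6.0) is classified as mode 1 while the previous frame is mode 1 and as mode 0 after the mode has dropped, which is the stabilizing effect that history-dependent thresholds are meant to provide.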

[0005] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length and a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, the apparatus coding the speech signal according to the discrimination result and having a function of discriminating the mode of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the temporal change ratio of at least one kind of feature quantity.

[0006] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length and a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, the apparatus coding the speech signal according to the discrimination result and having a function of discriminating the mode of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the ratio between the feature quantities of any two frames among the current frame and at least one past frame.
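A ratio feature of the kind described above could be computed as in the short Python sketch below. The function name, the floor that guards against division by zero, and the assumption that the feature is a non-negative linear-domain quantity (such as a level) are illustrative choices, not taken from the source.

```python
def feature_ratio(current, past, floor=1e-9):
    """Ratio of a feature in one frame to the same feature in another frame.

    Assumes a non-negative, linear-domain feature; `floor` avoids
    division by zero for silent frames.
    """
    return current / max(past, floor)


# Frame levels made up for the example; the jump at the third frame
# shows up as a ratio well above 1.
levels = [100.0, 110.0, 440.0, 420.0]
ratios = [feature_ratio(levels[i], levels[i - 1]) for i in range(1, len(levels))]
```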

[0007] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length and a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, the apparatus coding the speech signal according to the discrimination result and having a function of discriminating the mode of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0008] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length and a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, the apparatus coding the speech signal according to the discrimination result and having a function of discriminating the mode of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the temporal change ratio of at least one kind of feature quantity, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0009] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length and a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, the apparatus coding the speech signal according to the discrimination result and having a function of discriminating the mode of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the ratio between the feature quantities of any two frames among the current frame and at least one past frame, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0010] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a pitch extracting section that extracts a pitch from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the pitch extracting section corrects the pitch of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame.
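One way such a pitch correction could work is sketched below in Python. The rule itself is an assumption made for illustration, since the passage above does not specify one: when the previous frame was in a hypothetical "voiced" mode and the new pitch deviates from the previous pitch by more than a tolerance ratio (as happens with pitch doubling or halving), the previous pitch is reused.

```python
def correct_pitch(pitch, prev_pitch, prev_mode, voiced_mode=1, max_ratio=1.5):
    """Replace an implausible pitch estimate with the previous frame's pitch.

    All parameters and the decision rule are illustrative assumptions.
    """
    if prev_mode != voiced_mode or prev_pitch <= 0:
        return pitch  # no reliable history: keep the raw estimate
    ratio = pitch / prev_pitch
    if ratio > max_ratio or ratio < 1.0 / max_ratio:
        return prev_pitch  # likely doubling/halving error: reuse past pitch
    return pitch


corrected = correct_pitch(80, 41, prev_mode=1)  # 80 is nearly double 41
```

When the previous mode does not indicate a stable voiced segment, the raw estimate is passed through unchanged, so genuine pitch changes at segment boundaries are not suppressed.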

[0011] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a pitch extracting section that extracts a pitch from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the pitch extracting section corrects the pitch of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, and wherein the feature quantities include the temporal change ratio of at least one kind of feature quantity.

[0012] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a pitch extracting section that extracts a pitch from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the pitch extracting section corrects the pitch of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, and wherein the feature quantities include the ratio between the feature quantities of any two frames among the current frame and at least one past frame.

[0013] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a pitch extracting section that extracts a pitch from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the pitch extracting section corrects the pitch of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0014] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a pitch extracting section that extracts a pitch from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the pitch extracting section corrects the pitch of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the temporal change ratio of at least one kind of feature quantity, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0015] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a pitch extracting section that extracts a pitch from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the pitch extracting section corrects the pitch of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the ratio between the feature quantities of any two frames among the current frame and at least one past frame, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0016] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a level extracting section that extracts a level from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the level extracting section corrects the level of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame.
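A level correction along these lines might look like the following Python sketch. As with the pitch case, the concrete rule is an assumption for illustration only: when the previous frame was in a hypothetical "steady" mode, the current level is clamped so that it cannot differ from the previous level by more than a fixed ratio.

```python
def correct_level(level, prev_level, prev_mode, steady_mode=1, max_ratio=2.0):
    """Limit the frame-to-frame level change when the past mode was steady.

    The clamping rule and all parameter values are illustrative assumptions.
    """
    if prev_mode != steady_mode or prev_level <= 0.0:
        return level  # no reliable history: keep the raw estimate
    upper = prev_level * max_ratio
    lower = prev_level / max_ratio
    return min(max(level, lower), upper)


corrected = correct_level(1000.0, 100.0, prev_mode=1)  # clamped to the upper bound
```

Clamping, rather than copying the previous value outright, still lets the level follow gradual changes across several short frames.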

[0017] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a level extracting section that extracts a level from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the level extracting section corrects the level of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, and wherein the feature quantities include the temporal change ratio of at least one kind of feature quantity.

[0018] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a level extracting section that extracts a level from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the level extracting section corrects the level of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, and wherein the feature quantities include the ratio between the feature quantities of any two frames among the current frame and at least one past frame.

[0019] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a level extracting section that extracts a level from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the level extracting section corrects the level of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0020] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a level extracting section that extracts a level from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the level extracting section corrects the level of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the temporal change ratio of at least one kind of feature quantity, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0021] According to the present invention, there is also provided a speech coding apparatus comprising a frame dividing section that divides a speech signal into frames of a predetermined length, a mode discriminating section that calculates feature quantities from the speech signal and performs mode discrimination, and a level extracting section that extracts a level from the speech signal, the apparatus coding the speech signal according to the discrimination result, wherein the level extracting section corrects the level of the current frame using at least one kind of feature quantity obtained from each of the current frame and at least one past frame, together with mode discrimination information obtained from at least one past frame, wherein the feature quantities include the ratio between the feature quantities of any two frames among the current frame and at least one past frame, and wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch.

[0022]

[Operation] With the above configuration, when the frames before and after the current frame show correct mode information, correct pitch extraction, or correct level extraction, but these values are erroneous in the current frame alone, information from past frames that show the correct mode information, pitch, or level is applied to the current frame. The mode information, pitch extraction, or level extraction of the current frame can thus be corrected using information that spans a longer time. Consequently, correct mode information, correct pitch, or correct level extraction is supplied to the speech coding process, and the speech quality degradation caused by these errors can be suppressed.

[0023]

[Embodiments] FIG. 1 shows an embodiment relating to claim 4, "the speech coding apparatus according to claim 1, wherein the feature quantities include at least one of pitch prediction gain, short-term prediction gain, level, and pitch".

[0024] In the figure, a speech signal is input from an input terminal 100. A frame dividing circuit 110 divides the speech signal into frames (for example, 5 ms), and a subframe dividing circuit 120 divides the speech signal of each frame into subframes shorter than the frame (for example, 2.5 ms).
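The frame and subframe division can be sketched in Python as follows. The 8 kHz sampling rate is an assumption (the passage above gives only the durations), as are the function name and the choice to drop a trailing partial frame.

```python
def split_frames(samples, sample_rate_hz=8000, frame_ms=5.0, subframe_ms=2.5):
    """Split samples into frames, each a list of equal-length subframes.

    At the assumed 8 kHz rate, a 5 ms frame is 40 samples and a 2.5 ms
    subframe is 20 samples. A trailing partial frame is discarded.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    sub_len = int(round(sample_rate_hz * subframe_ms / 1000.0))
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        subframes = [frame[i:i + sub_len] for i in range(0, frame_len, sub_len)]
        frames.append(subframes)
    return frames


# 100 dummy samples yield two full frames of two subframes each.
frames = split_frames(list(range(100)))
```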

【００２５】スペクトルパラメータ計算回路２００で
は、少なくとも一つのサブフレームの音声信号に対し
て、サブフレーム長よりも長い窓（例えば２４ｍｓ）を
かけて音声を切り出してスペクトルパラメータをあらか
じめ定められた次数（例えばＰ＝１０次）計算する。こ
こでスペクトルパラメータの計算には、周知のＬＰＣ分
析や、Ｂｕｒｇ分析等を用いることができる。ここで
は、Ｂｅｒｇ分析を用いることとする。Ｂｕｒｇ分析の
詳細については、中溝著による“信号解析とシステム同
定”と題した単行本（コロナ社１９８８年刊）の８２〜
８７頁（文献２）等に記載されているので説明は略す
る。さらにスペクトルパラメータ計算部では、Ｂｕｒｇ
法により計算された線形予測係数α_i（ｉ＝１，…，１
０）を量子化や補間に適したＬＳＰパラメータに変換す
る。ここで、線形予測係数からＬＳＰへの変換は、菅村
他による“線スペクトル対（ＬＳＰ）音声分析合成方法
による音声情報圧縮”と題した論文（電子通信学会論文
誌、Ｊ６４−Ａ、ｐｐ．５９９−６０６、１９８１年）
（文献３）を参照することができる。つまり、第２サブ
フレームでＢｕｒｇ法により求めた線形予測係数を、Ｌ
ＳＰパラメータに変換し、第１サブフレームのＬＳＰを
直線補間により求めて、第１サブフレームのＬＳＰを逆
変換して線形予測係数に戻し、第１、２サブフレームの
線形予測係数α_il（ｉ＝１，…，１０，ｌ＝１，…，
５）を聴感重み付け回路２３０に出力する。また、第
１、２サブフレームのＬＳＰをスペクトルパラメータ量
子化回路２１０へ出力する。The spectrum parameter calculation circuit 200 applies a window longer than the subframe length (for example, 24 ms) to the speech signal of at least one subframe, cuts out the speech, and calculates spectrum parameters of a predetermined order (for example, P = 10). Well-known LPC analysis, Burg analysis, or the like can be used for this calculation; Burg analysis is used here. Details of Burg analysis are given on pages 82-87 of the book "Signal Analysis and System Identification" by Nakamizo (Corona Publishing, 1988) (Reference 2), so the description is omitted. The spectrum parameter calculation unit further converts the linear prediction coefficients α_i (i = 1, ..., 10) calculated by the Burg method into LSP parameters, which are suitable for quantization and interpolation. For the conversion from linear prediction coefficients to LSPs, see the paper by Sugamura et al., "Speech information compression by line spectrum pair (LSP) speech analysis and synthesis method" (Transactions of the Institute of Electronics and Communication Engineers of Japan, J64-A, pp. 599-606, 1981) (Reference 3). That is, the linear prediction coefficients obtained by the Burg method in the second subframe are converted into LSP parameters, the LSP of the first subframe is obtained by linear interpolation, and the LSP of the first subframe is converted back into linear prediction coefficients. The linear prediction coefficients α_il (i = 1, ..., 10, l = 1, ..., 5) of the first and second subframes are output to the perceptual weighting circuit 230, and the LSPs of the first and second subframes are output to the spectrum parameter quantization circuit 210.
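As a rough illustration of the Burg analysis mentioned above, the following sketch estimates a prediction-error filter from a block of samples. It is a textbook form of the Burg recursion and not necessarily the variant described in Reference 2. The function name `burg_lpc` and the sign convention are assumptions: the result is A(z) = 1 + a_1 z^-1 + ... + a_P z^-P, so under the common predictor convention the α_i of the text would correspond to −a_i.

```python
import numpy as np

def burg_lpc(x, order):
    """Return [1, a_1, ..., a_order] of the prediction-error filter A(z)."""
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()     # forward prediction errors
    b = x[:-1].copy()    # backward prediction errors, delayed by one sample
    a = np.array([1.0])
    for _ in range(order):
        # Reflection coefficient minimizing forward + backward error energy
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # Levinson-style coefficient update
        a = np.concatenate((a, [0.0]))
        a = a + k * a[::-1]
        # Lattice update of the error sequences, then realign them
        f_next = f + k * b
        b_next = b + k * f
        f, b = f_next[1:], b_next[:-1]
    return a
```

For a tenth-order analysis as in the text, `burg_lpc(windowed_speech, 10)` would be called on the windowed segment; conversion of the result to LSPs is outside this sketch.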

【００２６】スペクトルパラメータ量子化回路２１０で
は、あらかじめ定められたサブフレームのＬＳＰパラメ
ータを効率的に量子化する。以下では、量子化法とし
て、ベクトル量子化を用いるものとし、第２サブフレー
ムのＬＳＰパラメータを量子化するものとする。ＬＳＰ
パラメータのベクトル量子化の手法は周知の手法を用い
ることができる。具体的な方法は例えば、特開平４−１
７１５００号公報（特願平２−２９７６００号）（文献
４）や特開平４−３６３０００号公報（特願平３−２６
１９２５号）（文献５）や、特開平５−６１９９号公報
（特願平３−１５５０４９号）（文献６）や、Ｔ．Ｎｏ
ｍｕｒａｅｔａｌ．，による“ＬＳＰＣｏｄｉｎｇ
ＵｓｉｎｇＶＱ−ＳＶＱＷｉｔｈＩｎｔｅｒｐ
ｏｌａｔｉｏｎｉｎ４．０７５ｋｂｐｓＭ−Ｌ
ＣＥＬＰＳｐｅｅｃｈＣｏｄｅｒ”と題した論文
（Ｐｒｏｃ．ＭｏｂｉｌｅＭｕｌｔｉｍｅｄｉａＣ
ｏｍｍｕｎｉｃａｔｉｏｎｓ，ｐｐ．Ｂ．２．５，１９
９３）（文献７）等を参照できるのでここでは説明は略
する。また、スペクトルパラメータ量子化回路２１０で
は、第２サブフレームで量子化したＬＳＰパラメータを
もとに、第１，２サブフレームのＬＳＰパラメータを復
元する。ここでは、現フレームの第２サブフレームの量
子化ＬＳＰパラメータと１つ過去のフレームの第２サブ
フレームの量子化ＬＳＰを直線補間して、第１，２サブ
フレームのＬＳＰを復元する。ここで、量子化前のＬＳ
Ｐと量子化後のＬＳＰとの誤差電力を最小化するコード
ベクトルを１種類選択した後に、直線補間により第１〜
第４サブフレームのＬＳＰを復元できる。さらに性能を
向上させるためには、前記誤差電力を最小化するコード
ベクトルを複数候補選択したのちに、各々の候補につい
て、累積歪を評価し、累積歪を最小化する候補と補間Ｌ
ＳＰの組を選択するようにすることができる。The spectrum parameter quantization circuit 210 efficiently quantizes the LSP parameters of a predetermined subframe. In the following, vector quantization is used as the quantization method, and the LSP parameters of the second subframe are quantized. Well-known methods can be used for vector quantization of LSP parameters. For specific methods, see, for example, Japanese Patent Laid-Open No. 4-171500 (Japanese Patent Application No. 2-297600) (Reference 4), Japanese Patent Laid-Open No. 4-363000 (Japanese Patent Application No. 3-261925) (Reference 5), Japanese Patent Laid-Open No. 5-6199 (Japanese Patent Application No. 3-155049) (Reference 6), and the paper by T. Nomura et al., "LSP Coding Using VQ-SVQ With Interpolation in 4.075 kbps M-LCELP Speech Coder" (Proc. Mobile Multimedia Communications, pp. B.2.5, 1993) (Reference 7); the description is omitted here. The spectrum parameter quantization circuit 210 also restores the LSP parameters of the first and second subframes from the LSP parameters quantized in the second subframe. Here, the quantized LSP parameters of the second subframe of the current frame and the quantized LSP of the second subframe of the immediately preceding frame are linearly interpolated to restore the LSPs of the first and second subframes. After one code vector that minimizes the error power between the LSP before quantization and the LSP after quantization is selected, the LSPs of the first to fourth subframes can be restored by linear interpolation. To improve performance further, a plurality of candidate code vectors that minimize the error power may be selected, the cumulative distortion evaluated for each candidate, and the combination of candidate and interpolated LSP that minimizes the cumulative distortion selected.
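The restoration by linear interpolation described above might look like the following sketch. The equal-step weighting and the name `restore_subframe_lsps` are assumptions; the text says only that the previous and current quantized LSPs are linearly interpolated.

```python
import numpy as np

def restore_subframe_lsps(prev_q_lsp, cur_q_lsp, n_subframes=2):
    """Interpolate between the previous frame's and the current frame's
    quantized last-subframe LSPs. The last subframe equals cur_q_lsp."""
    prev_q_lsp = np.asarray(prev_q_lsp, dtype=float)
    cur_q_lsp = np.asarray(cur_q_lsp, dtype=float)
    rows = []
    for l in range(1, n_subframes + 1):
        w = l / n_subframes                  # weight on the current frame (assumed)
        rows.append((1.0 - w) * prev_q_lsp + w * cur_q_lsp)
    return np.stack(rows)                    # shape: (n_subframes, LSP order)
```

With two subframes, the first row is the midpoint of the two quantized vectors and the second row is the current quantized vector itself.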

【００２７】以上により復元した第１，２サブフレーム
のＬＳＰと第２サブフレームの量子化ＬＳＰをサブフレ
ーム毎に線形予測係数α′_il（ｉ＝１，…，１０，ｌ＝
１，…，５）に変換し、インパルス応答計算回路３１０
へ出力する。また、第２サブフレームの量子化ＬＳＰの
コードベクトルを表すインデクスをマルチプレクサ４０
０に出力する。The LSPs of the first and second subframes restored as described above and the quantized LSP of the second subframe are converted, subframe by subframe, into linear prediction coefficients α′_il (i = 1, ..., 10, l = 1, ..., 5) and output to the impulse response calculation circuit 310. An index representing the code vector of the quantized LSP of the second subframe is output to the multiplexer 400.

【００２８】上記において、直線補間のかわりに、ＬＳ
Ｐの補間パターンをあらかじめ定められたビット数（例
えば２ビット）分用意しておき、これらのパターンの各
々に対して１，２サブフレームのＬＳＰを復元して累積
歪を最小化するコードベクトルと補間パターンの組を選
択するようにしてもよい。このようにすると補間パター
ンのビット数だけ伝送情報が増加するが、ＬＳＰのフレ
ーム内での時間的な変化をより精密に表すことができ
る。ここで、補間パターンは、トレーニング用のＬＳＰ
データを用いてあらかじめ学習して作成してもよいし、
あらかじめ定められたパターンを格納しておいてもよ
い。あらかじめ定められたパターンとしては、例えば、
Ｔ．Ｔａｎｉｇｕｃｈｉｅｔａｌ．による“Ｉｍｐ
ｒｏｖｅｄＣＥＬＰｓｐｅｅｃｈｃｏｄｉｎｇａ
ｔ４ｋｂ／ｓａｎｄｂｅｌｏｗ”と題した論文
（Ｐｒｏｃ．ＩＣＳＬＰ，ｐｐ．４１−４４，１９９
２）（文献８）等に記載のパターンを用いることができ
る。また、さらに性能を改善するためには、補間パター
ンを選択した後に、あらかじめ定められたサブフレーム
において、ＬＳＰの真の値とＬＳＰの補間値との誤差信
号を求め、前記誤差信号をさらに誤差コードブックで表
すようにしてもよい。Instead of the linear interpolation described above, LSP interpolation patterns may be prepared for a predetermined number of bits (for example, 2 bits); the LSPs of the first and second subframes are then restored for each pattern, and the combination of code vector and interpolation pattern that minimizes the cumulative distortion is selected. This increases the transmitted information by the number of bits of the interpolation pattern, but represents the temporal change of the LSP within the frame more precisely. The interpolation patterns may be trained in advance on LSP training data, or predetermined patterns may be stored. As predetermined patterns, for example, those described in the paper by T. Taniguchi et al., "Improved CELP speech coding at 4 kb/s and below" (Proc. ICSLP, pp. 41-44, 1992) (Reference 8) can be used. To improve performance further, after an interpolation pattern is selected, an error signal between the true LSP value and the interpolated LSP value may be obtained in a predetermined subframe and represented with an error codebook.
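The joint choice of code vector and interpolation pattern can be sketched as an exhaustive search. Everything concrete here is hypothetical: the four weights in `PATTERNS` stand in for a 2-bit pattern table (the source lists no pattern values), squared error stands in for the distortion measure, and `select_codevector_and_pattern` is an illustrative name.

```python
import numpy as np

# Hypothetical 2-bit pattern table: weight given to the current frame's
# quantized LSP when rebuilding the first subframe.
PATTERNS = np.array([0.25, 0.5, 0.75, 1.0])

def select_codevector_and_pattern(true_sf1, true_sf2, prev_q_lsp, candidates):
    """Return (candidate index, pattern index, cumulative distortion)."""
    true_sf1 = np.asarray(true_sf1, dtype=float)
    true_sf2 = np.asarray(true_sf2, dtype=float)
    prev_q_lsp = np.asarray(prev_q_lsp, dtype=float)
    best = None
    for c_idx, cand in enumerate(candidates):
        cand = np.asarray(cand, dtype=float)
        d_sf2 = np.sum((true_sf2 - cand) ** 2)           # quantization error
        for p_idx, w in enumerate(PATTERNS):
            interp = (1.0 - w) * prev_q_lsp + w * cand   # restored first subframe
            d = d_sf2 + np.sum((true_sf1 - interp) ** 2) # cumulative distortion
            if best is None or d < best[0]:
                best = (float(d), c_idx, p_idx)
    return best[1], best[2], best[0]
```

The pattern index is what would cost the extra bits mentioned in the text; the candidate list would come from a preliminary vector quantization search.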

【００２９】聴感重み付け回路２３０は、スペクトルパ
ラメータ計算回路２００から、各サブフレーム毎に量子
化前の線形予測係数α_il（ｉ＝１，…，１０，ｌ＝１，
…，５）を入力し、前記文献１にもとづき、サブフレー
ムの音声信号に対して聴感重み付けを行ない、聴感重み
付け信号を出力する。The perceptual weighting circuit 230 receives the unquantized linear prediction coefficients α_il (i = 1, ..., 10, l = 1, ..., 5) for each subframe from the spectrum parameter calculation circuit 200, applies perceptual weighting to the subframe speech signal in accordance with Reference 1, and outputs the perceptually weighted signal.

【００３０】提案型モード判別回路２０００は、聴感重
み付け回路２３０からフレーム単位で聴感重み付け信号
を受け取り、スペクトルパラメータ計算回路２００から
スペクトルパラメータを受け取り、モード判別情報を出
力する。提案型モード判別回路の構成を図２に示す。The proposed mode discrimination circuit 2000 receives the perceptual weighting signal from the perceptual weighting circuit 230 on a frame-by-frame basis, receives the spectrum parameter from the spectrum parameter calculation circuit 200, and outputs the mode discrimination information. The configuration of the proposed mode discrimination circuit is shown in FIG.

【００３１】図２において、入力端子２０１０からフレ
ーム単位に、聴感重み付け信号を入力し、入力端子２０
２０からスペクトルパラメータを入力する。特徴量計算
回路Ａ２０３０では特徴量として、例えばピッチ予測ゲ
インＰＧを計算し出力する。特徴量計算回路Ｂ２０４０
では特徴量として、例えば短期予測ゲインＳＧを計算し
出力する。In FIG. 2, the perceptually weighted signal is input frame by frame from input terminal 2010, and the spectrum parameters are input from input terminal 2020. The feature quantity calculation circuit A 2030 calculates and outputs, for example, the pitch prediction gain PG as a feature quantity. The feature quantity calculation circuit B 2040 calculates and outputs, for example, the short-term prediction gain SG as a feature quantity.

【００３２】モード判別回路２０５０では、遅延器２０
６０に格納されている過去の一つ前のフレームのモード
情報に応じて、２０３０の出力値ＰＧと、２０４０の出
力値ＳＧを、あらかじめ定められた複数個のしきいと比
較して、モード判別を行ない、モード情報を出力する。
モード判別回路２０５０は、モード判別結果を適応コー
ドブック回路５００、音源量子化回路３５０へ出力す
る。In the mode discrimination circuit 2050, the output value PG of circuit 2030 and the output value SG of circuit 2040 are compared with a plurality of thresholds predetermined in accordance with the mode information of the immediately preceding frame, which is stored in the delay unit 2060; the mode is thereby determined and the mode information is output. The mode discrimination circuit 2050 outputs the mode discrimination result to the adaptive codebook circuit 500 and the excitation quantization circuit 350.
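The idea of choosing thresholds according to the previous frame's mode can be sketched as below. The source gives no threshold values, number of modes, or decision rule, so `BASE`, `MARGIN`, `SG_FLOOR_DB`, the four-mode layout, and the rule itself are all hypothetical; the sketch only shows how previous-mode-dependent thresholds can damp frame-to-frame mode flicker.

```python
from bisect import bisect_right

BASE = (1.0, 3.5, 7.0)   # hypothetical PG boundaries (dB) between modes 0..3
MARGIN = 0.5             # hypothetical hysteresis margin (dB)
SG_FLOOR_DB = 1.0        # hypothetical: below this SG, mode 0 is forced

def thresholds_for(prev_mode):
    """Widen the PG interval of the previous mode by MARGIN on both sides."""
    th = list(BASE)
    if prev_mode > 0:
        th[prev_mode - 1] -= MARGIN   # lower boundary of the previous mode
    if prev_mode < len(BASE):
        th[prev_mode] += MARGIN       # upper boundary of the previous mode
    return th

def decide_mode(pg_db, sg_db, prev_mode):
    """Return the mode for the current frame given PG, SG, and the last mode."""
    if sg_db < SG_FLOOR_DB:
        return 0
    return bisect_right(thresholds_for(prev_mode), pg_db)
```

With these assumed numbers, a frame with PG = 1.2 dB stays in mode 0 if the previous frame was mode 0, but stays in mode 1 if the previous frame was mode 1.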

【００３３】図１にもどり、応答信号計算回路２４０
は、スペクトルパラメータ計算回路２００から、各サブ
フレーム毎に線形予測係数α_ilを入力し、スペクトルパ
ラメータ量子化回路２１０から、量子化、補間して復元
した線形予測係数α′_ilをサブフレーム毎に入力し、保
存されているフィルタメモリの値を用いて、入力信号ｄ
（ｎ）＝０とした応答信号を１サブフレーム分計算し、
減算器２３５へ出力する。ここで、応答信号ｘ_z（ｎ）
は下式（１）で表される。Returning to FIG. 1, the response signal calculation circuit 240 receives the linear prediction coefficients α_il for each subframe from the spectrum parameter calculation circuit 200, and the quantized, interpolated, and restored linear prediction coefficients α′_il for each subframe from the spectrum parameter quantization circuit 210. Using the stored filter memory values, it calculates the response signal for one subframe with the input signal set to d(n) = 0, and outputs it to the subtractor 235. The response signal x_z(n) is given by equation (1) below.

【００３４】[0034]

【数１】 [Equation 1]

【００３５】ここで、γは、聴感重み付け量を制御する
重み係数であり、下記の（３）式と同一の値である。Here, γ is a weighting coefficient that controls the amount of perceptual weighting, and takes the same value as in equation (3) below.

【００３６】減算器２３５は、下式（２）により、聴感
重み付け信号から応答信号を１サブフレーム分減算し、
ｘ′_w（ｎ）を適応コードブック回路５００へ出力す
る。The subtractor 235 subtracts the response signal for one subframe from the perceptually weighted signal according to equation (2) below, and outputs x′_w(n) to the adaptive codebook circuit 500.

【００３７】ｘ′_w（ｎ）＝ｘ_w（ｎ）−ｘ_z（ｎ）　（２）
インパルス応答計算回路３１０は、ｚ変換が下式（３）
で表される重み付けフィルタのインパルス応答ｈ
_w（ｎ）をあらかじめ定められた点数Ｌだけ計算し、適
応コードブック回路５００、音源量子化回路３５０へ出
力する。x′_w(n) = x_w(n) − x_z(n) (2) The impulse response calculation circuit 310 calculates a predetermined number L of points of the impulse response h_w(n) of the weighting filter whose z-transform is given by equation (3) below, and outputs it to the adaptive codebook circuit 500 and the excitation quantization circuit 350.

【００３８】[0038]

【数２】 [Equation 2]

【００３９】適応コードブック回路５００は、ピッチパ
ラメータを求める。詳細は前記文献２を参照することが
できる。また、適応コードブックによりピッチ予測を下
式（４）に従い行ない、適応コードブック予測残差信号
ｚ（ｎ）を出力する。The adaptive codebook circuit 500 determines the pitch parameters; for details, see Reference 2 above. It also performs pitch prediction with the adaptive codebook according to equation (4) below, and outputs the adaptive codebook prediction residual signal z(n).

【００４０】ｚ（ｎ）＝ｘ′_w（ｎ）−ｂ（ｎ）　（４）
ここで、ｂ（ｎ）は、適応コードブックピッチ予測信号
であり、下式（５）で表せる。z(n) = x′_w(n) − b(n) (4) Here, b(n) is the adaptive codebook pitch prediction signal, given by equation (5) below.

【００４１】ｂ（ｎ）＝βｖ（ｎ−Ｔ）＊ｈ_w（ｎ）　（５）
ここで、β、Ｔは、それぞれ、適応コードブックのゲイ
ン、遅延を示す。ｖ（ｎ）は適応コードベクトルであ
る。記号＊は畳み込み演算を示す。b(n) = β v(n−T) * h_w(n) (5) Here, β and T denote the gain and the delay of the adaptive codebook, respectively, v(n) is the adaptive code vector, and the symbol * denotes convolution.
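Equations (4) and (5) can be illustrated with a short sketch that forms the pitch prediction b(n) by convolution and subtracts it from x′_w(n) to give the residual passed on to the excitation search. Assumptions: the delay T is at least one subframe long (no periodic extension of the excitation), the convolution is truncated to the subframe, and the name `adaptive_codebook_residual` is illustrative.

```python
import numpy as np

def adaptive_codebook_residual(x_w_prime, past_excitation, h_w, beta, T):
    """Return x'_w(n) - beta * (v(n - T) * h_w(n)) over one subframe.

    past_excitation holds v(n) for n < 0, with v(-1) as its last element.
    """
    x_w_prime = np.asarray(x_w_prime, dtype=float)
    past = np.asarray(past_excitation, dtype=float)
    N = len(x_w_prime)
    if T < N or T > len(past):
        raise ValueError("sketch assumes subframe length <= T <= history length")
    start = len(past) - T
    v_delayed = past[start:start + N]             # v(n - T) for n = 0..N-1
    b = beta * np.convolve(v_delayed, h_w)[:N]    # equation (5), truncated
    return x_w_prime - b                          # equation (4)
```

In a full coder, β and T would be searched to minimize the energy of this residual; that search is omitted here.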

【００４２】不均一パルス数型スパース音源コードブッ
ク３５１は、各々のベクトルの０でない成分の個数が異
なるスパースコードブックである。The nonuniform-pulse-number sparse excitation codebook 351 is a sparse codebook in which the number of nonzero components differs from vector to vector.

【００４３】音源量子化回路３５０では、音源コードブ
ック３５１に格納された音源コードベクトルの全部ある
いは一部に対して、式（６）を最小化するように、最良
の音源コードベクトルｃ_j（ｎ）を選択する。このと
き、最良のコードベクトルを１種選択してもよいし、２
種以上のコードベクトルを選んでおいて、ゲイン量子化
の際に、１種に本選択してもよい。ここでは、２種以上
のコードベクトルを選んでおくものとする。The excitation quantization circuit 350 selects the best excitation code vector c_j(n) so as to minimize equation (6) over all or some of the excitation code vectors stored in the excitation codebook 351. A single best code vector may be selected at this point, or two or more code vectors may be selected and narrowed to one during gain quantization. Here, two or more code vectors are selected.

【００４４】Ｄ_j＝Σ_n（ｚ（ｎ）−γ_jｃ_j（ｎ）＊ｈ_w（ｎ））²　（６）
なお、一部の音源コードベクトルに対してのみ、式
（６）を適用するときは、複数個の音源コードベクトル
をあらかじめ予備選択しておき、予備選択された音源コ
ードベクトルに対して、式（６）を適用することもでき
る。D_j = Σ_n (z(n) − γ_j c_j(n) * h_w(n))² (6) When equation (6) is applied to only some of the excitation code vectors, a plurality of excitation code vectors may be preselected and equation (6) applied to the preselected vectors.
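A search that minimizes equation (6) can be sketched as follows. The source does not say how γ_j is chosen during the search; this sketch assumes the per-vector least-squares gain, which is a common choice in CELP coders. It keeps the `n_best` lowest-distortion candidates, matching the statement that two or more code vectors are retained. The name `search_excitation` is illustrative.

```python
import numpy as np

def search_excitation(z, codebook, h_w, n_best=2):
    """Return up to n_best tuples (index j, gain gamma_j, distortion D_j)."""
    z = np.asarray(z, dtype=float)
    N = len(z)
    scored = []
    for j, c in enumerate(codebook):
        y = np.convolve(c, h_w)[:N]          # c_j(n) * h_w(n), truncated
        energy = float(np.dot(y, y))
        if energy == 0.0:
            continue                         # all-zero filtered vector
        corr = float(np.dot(z, y))
        gamma = corr / energy                # least-squares gain (assumed)
        d = float(np.dot(z, z)) - gamma * corr
        scored.append((d, j, gamma))
    scored.sort()                            # ascending distortion
    return [(j, gamma, d) for d, j, gamma in scored[:n_best]]
```

For a sparse codebook such as 351, the convolution could be restricted to the nonzero pulses; the dense form is kept here for clarity.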

【００４５】ゲイン量子化回路３６５は、ゲインコード
ブック３５５からゲインコードベクトルを読みだし、選
択された音源コードベクトルに対して、式（７）を最小
化するように、音源コードベクトルとゲインコードベク
トルの組み合わせを選択する。The gain quantization circuit 365 reads the gain code vector from the gain code book 355, and for the selected excitation code vector, the excitation code vector and the gain code vector are set so as to minimize the equation (7). Select the combination of.

【００４６】Ｄ_j,k＝Σ_n（ｘ_w（ｎ）−β′_kｖ（ｎ−Ｔ）＊ｈ_w（ｎ）−γ′_kｃ_j（ｎ）＊ｈ_w（ｎ））²　（７）
ここで、β′_k、γ′_kは、ゲインコードブック３５５
に格納された２次元ゲインコードブックにおけるｋ番目
のコードベクトルである。選択された音源コードベクト
ルとゲインコードベクトルを表すインデクスをマルチプ
レクサ４００に出力する。D_{j,k} = Σ_n (x_w(n) − β′_k v(n−T) * h_w(n) − γ′_k c_j(n) * h_w(n))² (7) Here, β′_k and γ′_k are the components of the k-th code vector of the two-dimensional gain codebook stored in the gain codebook 355. Indexes representing the selected excitation code vector and gain code vector are output to the multiplexer 400.
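The joint selection in equation (7) can be sketched as an exhaustive search over the retained excitation candidates and the two-dimensional gain codebook. The function and argument names are illustrative, and convolutions are truncated to the subframe as an assumption.

```python
import numpy as np

def search_gain(x_w, v_delayed, candidates, codebook, gain_codebook, h_w):
    """Return (excitation index j, gain index k, distortion D_jk).

    candidates    : excitation indexes kept by the preceding search
    gain_codebook : sequence of (beta_k, gamma_k) pairs
    """
    x_w = np.asarray(x_w, dtype=float)
    N = len(x_w)
    ya = np.convolve(v_delayed, h_w)[:N]            # v(n - T) * h_w(n)
    best = None
    for j in candidates:
        yc = np.convolve(codebook[j], h_w)[:N]      # c_j(n) * h_w(n)
        for k, (beta_k, gamma_k) in enumerate(gain_codebook):
            err = x_w - beta_k * ya - gamma_k * yc
            d = float(np.dot(err, err))             # equation (7)
            if best is None or d < best[0]:
                best = (d, j, k)
    return best[1], best[2], best[0]
```

The returned pair (j, k) corresponds to the indexes sent to the multiplexer 400.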

【００４７】重み付け信号計算回路３６０は、スペクト
ルパラメータ計算回路の出力パラメータ及び、それぞれ
のインデクスを入力し、インデクスからそれに対応する
コードベクトルを読みだし、まず下式（８）にもとづく
駆動音源信号ｖ（ｎ）を求める。The weighting signal calculation circuit 360 receives the output parameters of the spectrum parameter calculation circuit and the respective indexes, reads out the code vectors corresponding to the indexes, and first obtains the drive excitation signal v(n) according to equation (8) below.

【００４８】ｖ（ｎ）＝β′_kｖ（ｎ−Ｔ）＋γ′_kｃ_j（ｎ）　（８）
次に、スペクトルパラメータ計算回路２００の出力パラ
メータ、スペクトルパラメータ量子化回路２１０の出力
パラメータを用いて下式（９）により、重み付け信号ｓ
_w（ｎ）をサブフレーム毎に計算し、応答信号計算回路
２４０へ出力する。v(n) = β′_k v(n−T) + γ′_k c_j(n) (8) Next, using the output parameters of the spectrum parameter calculation circuit 200 and of the spectrum parameter quantization circuit 210, the weighted signal s_w(n) is calculated for each subframe according to equation (9) below and output to the response signal calculation circuit 240.

【００４９】[0049]

【数３】 (Equation 3)

【００５０】以上により、請求項４の“前記特徴量とし
て、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピッ
チの少なくとも一種以上を特徴量として含めることを特
徴とする請求項１記載の音声符号化装置”に関わる実施
例の説明を終える。This concludes the description of the embodiment of claim 4, "the speech coding apparatus according to claim 1, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch."

【００５１】請求項４の“前記特徴量として、ピッチ予
測ゲイン、短期予測ゲイン、レベル、ピッチの少なくと
も一種以上を特徴量として含めることを特徴とする請求
項２記載の音声符号化装置”に関わる実施例を図３に示
す。FIG. 3 shows an embodiment of claim 4, "the speech coding apparatus according to claim 2, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch."

【００５２】ここでの発明では、請求項４の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項１記載の音声符号化装置”に関
わる実施例である図１の提案型モード判別回路２０００
と本実施例の提案型モード判別回路の構成が異なるの
で、本実施例の提案型モード判別回路の構成を図３を用
いて説明する。In this embodiment, the proposed mode discrimination circuit differs in configuration from the proposed mode discrimination circuit 2000 of FIG. 1, the embodiment of claim 4 ("the speech coding apparatus according to claim 1, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch"). Its configuration is therefore described with reference to FIG. 3.

【００５３】提案型モード判別回路は、提案型モード判
別回路２０００と同様に、聴感重み付け回路２３０から
フレーム単位で聴感重み付け信号とスペクトルパラメー
タ計算回路２００よりスペクトルパラメータを受け取
り、モード判別情報を出力する。提案型モード判別回路
の構成を図３に示す。Like the proposed mode discrimination circuit 2000, the proposed mode discrimination circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 on a frame-by-frame basis and the spectrum parameter from the spectrum parameter calculation circuit 200, and outputs the mode discrimination information. The configuration of the proposed mode discrimination circuit is shown in FIG.

【００５４】図３において、入力端子３０１０からフレ
ーム単位に、聴感重み付け信号を入力し、入力端子３０
２０からスペクトルパラメータを入力する。In FIG. 3, the perceptually weighted signal is input frame by frame from input terminal 3010, and the spectrum parameters are input from input terminal 3020.

【００５５】特徴量計算回路Ａ３０３０では特徴量とし
て、例えばピッチ予測ゲインＰＧを計算し出力する。特
徴量計算回路Ｂ３０４０では特徴量として、例えばＲＭ
Ｓ比ＲＲを計算し出力する。特徴量計算回路Ｃ３０５０
では特徴量として、例えば短期予測ゲインＳＧと短期予
測ゲイン比ＳＧＲを計算し出力する。The feature quantity calculation circuit A 3030 calculates and outputs, for example, the pitch prediction gain PG as a feature quantity. The feature quantity calculation circuit B 3040 calculates and outputs, for example, the RMS ratio RR. The feature quantity calculation circuit C 3050 calculates and outputs, for example, the short-term prediction gain SG and the short-term prediction gain ratio SGR.

【００５６】モード判別回路３０６０では、遅延器３０
７０に格納された過去の一つ前のフレームのモード情報
に応じて、３０３０の出力値ＰＧと、３０４０の出力値
ＲＲと、３０５０の出力値ＳＧとＳＧＲを、あらかじめ
定められた複数個のしきいと比較して、モード判別を行
ない、モード情報を出力する。モード判別回路３０６０
は、提案型モード判別回路２０００と同様に、モード判
別結果を適応コードブック回路５００、音源量子化回路
３５０へ出力する。In the mode discrimination circuit 3060, the output value PG of circuit 3030, the output value RR of circuit 3040, and the output values SG and SGR of circuit 3050 are compared with a plurality of thresholds predetermined in accordance with the mode information of the immediately preceding frame, which is stored in the delay unit 3070; the mode is thereby determined and the mode information is output. Like the proposed mode discrimination circuit 2000, the mode discrimination circuit 3060 outputs the mode discrimination result to the adaptive codebook circuit 500 and the excitation quantization circuit 350.

【００５７】特徴量計算回路Ｂ３０４０の構成を図４に
示す。図４において、入力端子４０１０からフレーム単
位に、聴感重み付け信号を入力し、ＲＭＳ計算回路４０
２０でＲＭＳ値Ｒを計算し、この値と遅延器４０３０に
格納された過去のＲＭＳ値とを用いてＲＭＳ計算回路４
０４０でＲＭＳ比ＲＲを計算し、これを出力端子４０５
０により出力する。ここで、ＲＭＳ比ＲＲはフレーム単
位に時間軸をとったときのＲＭＳの変化率である。FIG. 4 shows the configuration of the feature quantity calculation circuit B 3040. In FIG. 4, the perceptually weighted signal is input frame by frame from input terminal 4010, and the RMS calculation circuit 4020 calculates the RMS value R. Using this value and the past RMS value stored in the delay unit 4030, the RMS calculation circuit 4040 calculates the RMS ratio RR, which is output from output terminal 4050. Here, the RMS ratio RR is the rate of change of the RMS along a time axis measured in frames.
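The RMS and RMS-ratio computation with a delay element can be sketched as below. The source does not give a formula for RR beyond calling it a rate of change, so the plain ratio R / R_past is an assumption, as are the class name `RmsRatio`, the `eps` guard, and the value 1.0 returned while the delay line is still filling.

```python
from collections import deque
import numpy as np

class RmsRatio:
    """Track the frame RMS value R and its ratio RR to a delayed RMS value."""

    def __init__(self, delay=1, eps=1e-9):
        self.past = deque(maxlen=delay)   # plays the role of the delay unit
        self.eps = eps                    # guards against division by zero

    def update(self, frame):
        r = float(np.sqrt(np.mean(np.square(np.asarray(frame, dtype=float)))))
        if len(self.past) == self.past.maxlen:
            rr = r / max(self.past[0], self.eps)   # oldest stored value
        else:
            rr = 1.0                               # no history yet (assumed)
        self.past.append(r)
        return r, rr
```

Setting `delay=2` makes the comparison run against the frame two frames earlier instead of the immediately preceding one.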

【００５８】特徴量計算回路Ｃ３０５０の構成を図５に
示す。図５において、入力端子５０１０からフレーム単
位に、聴感重み付け信号を入力し、入力端子５０２０か
らフレーム単位に、スペクトルパラメータを入力し、短
期予測ゲイン計算回路５０３０で短期予測ゲインＳＧを
計算し、この値を出力端子５０７０により出力する。ま
た、５０３０で計算された短期予測ゲインＳＧと遅延器
５０４０に格納された過去のフレームの短期予測ゲイン
とを用いて短期予測ゲイン比計算回路５０５０で短期予
測ゲイン比を計算し、これを出力端子５０６０により出
力する。FIG. 5 shows the configuration of the feature quantity calculation circuit C 3050. In FIG. 5, the perceptually weighted signal is input frame by frame from input terminal 5010, and the spectrum parameters are input frame by frame from input terminal 5020. The short-term prediction gain calculation circuit 5030 calculates the short-term prediction gain SG, which is output from output terminal 5070. In addition, the short-term prediction gain ratio calculation circuit 5050 calculates the short-term prediction gain ratio from the SG calculated by circuit 5030 and the short-term prediction gain of a past frame stored in the delay unit 5040, and outputs it from output terminal 5060.
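One plausible way to compute a short-term prediction gain from a frame and its linear prediction coefficients is sketched below. The source does not define SG, so the definition used here (signal energy over prediction-residual energy, in dB), the zero initial filter state, and the name `short_term_prediction_gain_db` are assumptions.

```python
import numpy as np

def short_term_prediction_gain_db(frame, alpha, eps=1e-12):
    """SG in dB for the predictor x_hat(n) = sum_i alpha_i * x(n - i)."""
    x = np.asarray(frame, dtype=float)
    # Prediction-error filter A(z) = 1 - sum_i alpha_i z^-i
    a = np.concatenate(([1.0], -np.asarray(alpha, dtype=float)))
    resid = np.convolve(x, a)[:len(x)]       # residual, zero history assumed
    return 10.0 * np.log10((np.dot(x, x) + eps) / (np.dot(resid, resid) + eps))
```

A gain ratio across frames could then be formed by dividing the current value by a stored past value, in the same way the delay unit is used in the text.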

【００５９】以上により、請求項４の“前記特徴量とし
て、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピッ
チの少なくとも一種以上を特徴量として含めることを特
徴とする請求項２記載の音声符号化装置”に関わる実施
例の説明を終える。This concludes the description of the embodiment of claim 4, "the speech coding apparatus according to claim 2, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch."

【００６０】請求項４の“前記特徴量として、ピッチ予
測ゲイン、短期予測ゲイン、レベル、ピッチの少なくと
も一種以上を特徴量として含めることを特徴とする請求
項３記載の音声符号化装置”に関わる実施例を図９に示
す。FIG. 9 shows an embodiment of claim 4, "the speech coding apparatus according to claim 3, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch."

【００６１】ここでの発明では、請求項４の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項２記載の音声符号化装置”に関
わる実施例である図３の特徴量計算回路Ｃ３０５０と本
実施例の特徴量計算回路Ｃの構成が異なるので、本実施
例の特徴量計算回路Ｃの構成を図９を用いて説明する。In this embodiment, the feature quantity calculation circuit C differs in configuration from the feature quantity calculation circuit C 3050 of FIG. 3, the embodiment of claim 4 ("the speech coding apparatus according to claim 2, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch"). Its configuration is therefore described with reference to FIG. 9.

【００６２】図９において、入力端子８０１０からフレ
ーム単位に、聴感重み付け信号を入力し、入力端子８０
２０からフレーム単位に、スペクトルパラメータを入力
し、短期予測ゲイン計算回路８０３０で短期予測ゲイン
ＳＧを計算し、この値を出力端子８０７０により出力す
る。また、８０３０で計算された短期予測ゲインＳＧと
遅延器８０５０に格納された２つ前の過去のフレームの
短期予測ゲインとを用いて短期予測ゲイン比計算回路で
短期予測ゲイン比を計算し、これを出力端子８０６０に
より出力する。In FIG. 9, the perceptually weighted signal is input frame by frame from input terminal 8010, and the spectrum parameters are input frame by frame from input terminal 8020. The short-term prediction gain calculation circuit 8030 calculates the short-term prediction gain SG, which is output from output terminal 8070. In addition, the short-term prediction gain ratio calculation circuit calculates the short-term prediction gain ratio from the SG calculated by circuit 8030 and the short-term prediction gain of the frame two frames earlier, which is stored in the delay unit 8050, and outputs it from output terminal 8060.

【００６３】以上により、請求項４の“前記特徴量とし
て、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピッ
チの少なくとも一種以上を特徴量として含めることを特
徴とする請求項３記載の音声符号化装置”に関わる実施
例の説明を終える。This concludes the description of the embodiment of claim 4, "the speech coding apparatus according to claim 3, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch."

【００６４】請求項３に関わる実施例を図１０に示す。An embodiment relating to claim 3 is shown in FIG.

【００６５】ここでの発明では、請求項４の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項２記載の音声符号化装置”に関
わる実施例である提案型モード判別回路と本実施例の提
案型モード判別回路の構成が異なるので、本実施例の提
案型モード判別回路の構成を図１０を用いて説明する。In this embodiment, the proposed mode discrimination circuit differs in configuration from that of the embodiment of claim 4 ("the speech coding apparatus according to claim 2, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch"). Its configuration is therefore described with reference to FIG. 10.

【００６６】提案型モード判別回路は、聴感重み付け回
路２３０からフレーム単位で聴感重み付け信号とスペク
トルパラメータ計算回路２００よりスペクトルパラメー
タを受け取り、モード判別情報を出力する。提案型モー
ド判別回路の構成を図１０に示す。The proposed mode discrimination circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 on a frame-by-frame basis and the spectrum parameter from the spectrum parameter calculation circuit 200, and outputs the mode discrimination information. The configuration of the proposed mode discrimination circuit is shown in FIG.

【００６７】図１０において、入力端子９０１０からフ
レーム単位に、聴感重み付け信号を入力し、入力端子９
０２０からスペクトルパラメータを入力する。In FIG. 10, the perceptually weighted signal is input frame by frame from input terminal 9010, and the spectrum parameters are input from input terminal 9020.

【００６８】特徴量計算回路Ａ９０３０では特徴量とし
て、例えばピッチ予測ゲインＰＧを計算し出力する。特
徴量計算回路Ｂ９０４０では特徴量として、例えばＲＭ
Ｓ値ＲとＲＭＳ比ＲＲを計算し出力する。特徴量計算回
路Ｃ９０５０では特徴量として、例えば短期予測ゲイン
ＳＧと短期予測ゲイン比ＳＧＲを計算し出力する。The feature quantity calculation circuit A 9030 calculates and outputs, for example, the pitch prediction gain PG as a feature quantity. The feature quantity calculation circuit B 9040 calculates and outputs, for example, the RMS value R and the RMS ratio RR. The feature quantity calculation circuit C 9050 calculates and outputs, for example, the short-term prediction gain SG and the short-term prediction gain ratio SGR.

【００６９】モード判別回路９０６０では、遅延器９０
７０に格納された過去の一つ前のフレームのモード情報
に応じて、９０３０の出力値ＰＧと、９０４０の出力値
ＲとＲＲと、９０５０の出力値ＳＧとＳＧＲを、あらか
じめ定められた複数個のしきいと比較して、モード判別
を行ない、モード情報を出力する。モード判別回路９０
６０は、モード判別結果を適応コードブック回路５０
０、音源量子化回路３５０へ出力する。In the mode discrimination circuit 9060, the output value PG of circuit 9030, the output values R and RR of circuit 9040, and the output values SG and SGR of circuit 9050 are compared with a plurality of thresholds predetermined in accordance with the mode information of the immediately preceding frame, which is stored in the delay unit 9070; the mode is thereby determined and the mode information is output. The mode discrimination circuit 9060 outputs the mode discrimination result to the adaptive codebook circuit 500 and the excitation quantization circuit 350.

【００７０】特徴量計算回路Ｂ９０４０の構成を図１１
に示す。図１１において、入力端子１１０１０からフレー
ム単位に、聴感重み付け信号を入力し、ＲＭＳ計算回路
１１０２０でＲＭＳ値Ｒを計算し出力端子１１０６０か
ら出力する。また、ＲＭＳ計算回路１１０２０の出力値
Ｒと遅延器１１０３０に格納された過去の２つ前のフレ
ームのＲＭＳ値とを用いてＲＭＳ比計算回路１１０４０
でＲＭＳ比ＲＲを計算し、これを出力端子１１０５０に
より出力する。FIG. 11 shows the configuration of the feature quantity calculation circuit B 9040. In FIG. 11, the perceptually weighted signal is input frame by frame from input terminal 11010, and the RMS calculation circuit 11020 calculates the RMS value R, which is output from output terminal 11060. In addition, the RMS ratio calculation circuit 11040 calculates the RMS ratio RR from the output value R of the RMS calculation circuit 11020 and the RMS value of the frame two frames earlier, which is stored in the delay unit 11030, and outputs it from output terminal 11050.

【００７１】特徴量計算回路Ｃ９０５０は、請求項４の
“前記特徴量として、ピッチ予測ゲイン、短期予測ゲイ
ン、レベル、ピッチの少なくとも一種以上を特徴量とし
て含めることを特徴とする請求項２記載の音声符号化装
置”に関わる実施例の図３の特徴量計算回路Ｃ３０５０
と同じである。The feature quantity calculation circuit C 9050 is the same as the feature quantity calculation circuit C 3050 of FIG. 3 in the embodiment of claim 4 ("the speech coding apparatus according to claim 2, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch").

【００７２】以上により、請求項３に関わる実施例の説
明を終える。This is the end of the description of the embodiment according to claim 3.

【００７３】請求項２に関わる実施例を図２７に示す。An embodiment relating to claim 2 is shown in FIG.

【００７４】ここでの発明では、請求項３に関わる実施
例である図１０の特徴量計算回路Ｂと本実施例の特徴量
計算回路Ｂの構成が異なるので、この構成を図２７を用
いて説明する。In this embodiment, the feature quantity calculation circuit B differs in configuration from the feature quantity calculation circuit B of FIG. 10, the embodiment of claim 3. Its configuration is therefore described with reference to FIG. 27.

【００７５】提案型モード判別回路は、聴感重み付け回
路２３０からフレーム単位で聴感重み付け信号とスペク
トルパラメータ計算回路２００よりスペクトルパラメー
タを受け取り、モード判別情報を出力する。提案型モー
ド判別回路の構成を図２７に示す。The proposed mode discriminating circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 on a frame-by-frame basis and the spectrum parameter from the spectrum parameter calculating circuit 200, and outputs the mode discriminating information. FIG. 27 shows the configuration of the proposed mode discrimination circuit.

【００７６】図２７において、入力端子１００１０から
フレーム単位に、聴感重み付け信号を入力し、ＲＭＳ計
算回路１００２０でＲＭＳ値Ｒを計算し出力端子１００
６０から出力する。また、ＲＭＳ計算回路１００２０の
出力値Ｒと遅延器１００３０に格納された過去のフレー
ムのＲＭＳ値とを用いてＲＭＳ比計算回路１００４０で
ＲＭＳ比ＲＲを計算し、これを出力端子１００５０によ
り出力する。ここで、ＲＭＳ比ＲＲはフレーム単位に時
間軸をとったときのＲＭＳの変化率である。In FIG. 27, the perceptually weighted signal is input frame by frame from input terminal 10010, and the RMS calculation circuit 10020 calculates the RMS value R, which is output from output terminal 10060. In addition, the RMS ratio calculation circuit 10040 calculates the RMS ratio RR from the output value R of the RMS calculation circuit 10020 and the RMS value of a past frame stored in the delay unit 10030, and outputs it from output terminal 10050. Here, the RMS ratio RR is the rate of change of the RMS along a time axis measured in frames.

【００７７】以上により、請求項２に関わる実施例の説
明を終える。This is the end of the description of the embodiment according to claim 2.

【００７８】請求項１に関わる実施例を図１２に示す。An embodiment relating to claim 1 is shown in FIG.

【００７９】ここでの発明では、請求項４の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項２記載の音声符号化装置”に関
わる実施例である図３の提案型モード判別回路と本実施
例の提案型モード判別回路の構成が異なるので、提案型
モード判別回路の構成を図１２を用いて説明する。In this embodiment, the proposed mode discrimination circuit differs in configuration from the proposed mode discrimination circuit of FIG. 3, the embodiment of claim 4 ("the speech coding apparatus according to claim 2, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch"). Its configuration is therefore described with reference to FIG. 12.

【００８０】提案型モード判別回路は、聴感重み付け回
路２３０からフレーム単位で聴感重み付け信号とスペク
トルパラメータ計算回路２００よりスペクトルパラメー
タを受け取り、モード判別情報を出力する。The proposed mode discriminating circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 on a frame-by-frame basis and the spectrum parameter from the spectrum parameter calculating circuit 200, and outputs the mode discriminating information.

【００８１】特徴量計算回路Ａ１２０３０では特徴量と
して、例えばピッチ予測ゲインＰＧを計算し出力する。
特徴量計算回路Ｂ１２０４０では特徴量として、例えば
ＲＭＳ値Ｒを計算し出力する。特徴量計算回路Ｃ１２０
５０では特徴量として、例えば短期予測ゲインＳＧを計
算し出力する。The feature quantity calculation circuit A 12030 calculates and outputs, for example, the pitch prediction gain PG as a feature quantity. The feature quantity calculation circuit B 12040 calculates and outputs, for example, the RMS value R. The feature quantity calculation circuit C 12050 calculates and outputs, for example, the short-term prediction gain SG.

【００８２】モード判別回路１２０６０では、遅延器１
２０７０に格納された過去の一つ前のフレームのモード
情報に応じて、１２０３０の出力値ＰＧと、１２０４０
の出力値Ｒと、１２０５０の出力値ＳＧを、あらかじめ
定められた複数個のしきいと比較して、モード判別を行
ない、モード情報を出力する。モード判別回路１２０６
０は、モード判別結果を適応コードブック回路５００、
音源量子化回路３５０へ出力する。In the mode discrimination circuit 12060, the output value PG of circuit 12030, the output value R of circuit 12040, and the output value SG of circuit 12050 are compared with a plurality of thresholds predetermined in accordance with the mode information of the immediately preceding frame, which is stored in the delay unit 12070; the mode is thereby determined and the mode information is output. The mode discrimination circuit 12060 outputs the mode discrimination result to the adaptive codebook circuit 500 and the excitation quantization circuit 350.

【００８３】以上により、請求項１に関わる実施例の説
明を終える。With the above, the description of the embodiment according to claim 1 is completed.

【００８４】請求項８の“前記特徴量として、ピッチ予
測ゲイン、短期予測ゲイン、レベル、ピッチの少なくと
も一種以上を特徴量として含めることを特徴とする請求
項６記載の音声符号化装置”に関わる実施例を図６に示
す。FIG. 6 shows an embodiment of claim 8, "the speech coding apparatus according to claim 6, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch."

【００８５】ここでの発明では、請求項４の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項１記載の音声符号化装置”に関
わる実施例である図１の提案型モード判別回路２０００
と適応コードブック回路５００が、本実施例の提案型モ
ード判別回路４０００と適応コードブック回路５０００
のそれぞれに対しその構成が異なるので、本実施例では
これらと提案型ピッチ抽出回路６０００の構成について
図６を用いて説明する。In this embodiment, the proposed mode discrimination circuit 4000 and the adaptive codebook circuit 5000 differ in configuration from the proposed mode discrimination circuit 2000 and the adaptive codebook circuit 500 of FIG. 1, the embodiment of claim 4 ("the speech coding apparatus according to claim 1, wherein the feature quantities include at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch"). These circuits and the proposed pitch extraction circuit 6000 are therefore described with reference to FIG. 6.

【００８６】モード判別回路４０００は、聴感重み付け
回路２３０からフレーム単位で聴感重み付け信号を受取
り、ピッチ予測ゲインＰＧを計算し、これを、あらかじ
め定められた複数個のしきいと比較して、モード判別を
行ない、モード情報を出力する。モード判別回路４００
０は、モード判別結果を適応コードブック回路５０００
及び音源量子化回路３５０及び提案型ピッチ抽出回路６
０００へ出力する。The mode discrimination circuit 4000 receives the perceptual weighting signal from the perceptual weighting circuit 230 frame by frame, calculates the pitch prediction gain PG, compares it with a plurality of predetermined thresholds to perform mode discrimination, and outputs mode information. The mode discrimination circuit 4000 outputs the mode discrimination result to the adaptive codebook circuit 5000, the excitation quantization circuit 350, and the proposed pitch extraction circuit 6000.

【００８７】提案型ピッチ抽出回路６０００は、聴感重
み付け回路２３０からフレーム単位で聴感重み付け信号
とモード判別回路４０００よりモード判別情報と提案型
ピッチ抽出回路６０００の出力値を受け取り、適応コー
ドブック回路５０００と提案型ピッチ抽出回路６０００
に抽出したピッチＣＰを出力する。The proposed pitch extraction circuit 6000 receives the perceptual weighting signal from the perceptual weighting circuit 230 frame by frame, the mode discrimination information from the mode discrimination circuit 4000, and its own output value, and outputs the extracted pitch CP to the adaptive codebook circuit 5000 and back to the proposed pitch extraction circuit 6000.

【００８８】提案型ピッチ抽出回路６０００の構成を図
７に示す。FIG. 7 shows the configuration of the proposed pitch extraction circuit 6000.

【００８９】図７において、入力端子６０１０からモー
ド判別情報を入力し、入力端子６０２０から聴感重み付
け信号を入力し、入力端子６０７０からピッチを入力す
る。In FIG. 7, mode discrimination information is input from the input terminal 6010, a perceptual weighting signal is input from the input terminal 6020, and a pitch is input from the input terminal 6070.

【００９０】特徴量Ｄ計算回路６０４０では特徴量とし
て、例えば現フレームのピッチＣＰ、過去のフレームの
ピッチＰＰ、ピッチ比ＤＲを計算し出力する。ここで、
ピッチ比ＤＲはフレーム単位に時間軸をとったときのピ
ッチの変化率である。The feature amount D calculation circuit 6040 calculates and outputs, as feature amounts, for example the pitch CP of the current frame, the pitch PP of a past frame, and the pitch ratio DR. Here, the pitch ratio DR is the rate of change of the pitch along a time axis measured in frame units.

【００９１】特徴量Ｄ補正計算回路６０５０では、入力
端子６０１０からの現在のモード情報と、遅延器６０３
０に格納された過去の一つ前のフレームのモード情報に
応じて、６０４０の出力値ピッチ比ＤＲをあらかじめ定
められた閾値と比較して、現フレームのピッチＣＰを過
去のフレームのピッチＰＰで補正した値ＣＰＰを出力す
る。According to the current mode information from the input terminal 6010 and the mode information of the immediately preceding frame stored in the delay unit 6030, the feature amount D correction calculation circuit 6050 compares the pitch ratio DR output from 6040 with a predetermined threshold and outputs a value CPP obtained by correcting the pitch CP of the current frame with the pitch PP of the past frame.
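
The pitch ratio and the correction step can be sketched as follows. The specification states only that DR is compared with a threshold and that CP is corrected with PP; the definition DR = CP / PP, the threshold of 1.5, the "same mode in both frames" condition, and the rule of substituting PP are all assumptions introduced here for illustration.

```python
def pitch_ratio(cp, pp):
    """Pitch ratio DR, assumed here to be current pitch over past pitch."""
    return cp / pp


def correct_pitch(cp, pp, mode, prev_mode, ratio_threshold=1.5):
    """Return a corrected pitch CPP under a hypothetical rule.

    If the mode is unchanged from the previous frame and the pitch jumped
    by more than ratio_threshold in either direction, the jump is treated
    as an extraction error and the past pitch PP is used instead of CP.
    """
    dr = pitch_ratio(cp, pp)
    same_mode = mode == prev_mode
    jumped = dr > ratio_threshold or dr < 1.0 / ratio_threshold
    if same_mode and jumped:
        return pp
    return cp
```

For example, under these assumptions a current pitch of 80 samples following a past pitch of 40 samples in the same mode would be replaced by 40, which suppresses pitch doubling between adjacent frames.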

【００９２】特徴量Ｄ計算回路６０４０の構成を図８に
示す。図８において、入力端子７０１０からフレーム単
位に、聴感重み付け信号入力端子７０８０からピッチを
入力し、ピッチ計算回路７０２０でピッチＣＰを計算
し、出力端子７０７０で出力する。また、７０２０で計
算されたピッチＣＰと遅延器７０３０に格納された過去
のフレームのピッチＰＰとを用いてピッチ比計算回路７
０４０でピッチ比ＤＲを計算し、これを出力端子７０６
０により出力する。また、遅延器７０３０に格納された
過去のフレームのピッチＰＰも出力端子７０５０より出
力される。FIG. 8 shows the configuration of the feature amount D calculation circuit 6040. In FIG. 8, the perceptual weighting signal is input frame by frame from the input terminal 7010 and the pitch is input from the input terminal 7080; the pitch calculation circuit 7020 calculates the pitch CP and outputs it at the output terminal 7070. The pitch ratio calculation circuit 7040 calculates the pitch ratio DR from the pitch CP calculated in 7020 and the pitch PP of the past frame stored in the delay unit 7030, and outputs it from the output terminal 7060. The pitch PP of the past frame stored in the delay unit 7030 is also output from the output terminal 7050.

【００９３】適応コードブック回路５０００は、第１の
発明の適応コードブック回路５００と基本的に同じであ
るが、過去の信号からのピッチの探索範囲を、提案型ピ
ッチ抽出回路６０００により得られたピッチＣＰＰの近
傍とすることを特徴とする。The adaptive codebook circuit 5000 is basically the same as the adaptive codebook circuit 500 of the first invention, except that the range searched for the pitch in the past signal is limited to the vicinity of the pitch CPP obtained by the proposed pitch extraction circuit 6000.
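
Limiting the search to the vicinity of CPP can be sketched as below. The half-width `delta` and the lag limits `min_lag` and `max_lag` are hypothetical values; the specification does not state how wide the neighbourhood is.

```python
def pitch_search_range(cpp, delta=4, min_lag=20, max_lag=147):
    """Return the candidate lags within +/-delta of cpp.

    The centre is rounded to an integer lag and clipped to
    [min_lag, max_lag]; the range itself is clipped to the same limits.
    """
    center = min(max(int(round(cpp)), min_lag), max_lag)
    lo = max(min_lag, center - delta)
    hi = min(max_lag, center + delta)
    return range(lo, hi + 1)
```

An adaptive codebook search would then evaluate only these lags instead of every lag between `min_lag` and `max_lag`, which reduces computation and keeps the selected delay close to the corrected pitch.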

【００９４】以上により、請求項８の“前記特徴量とし
て、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピッ
チの少なくとも一種以上を特徴量として含めることを特
徴とする請求項６記載の音声符号化装置”に関わる実施
例の説明を終える。This completes the description of the embodiment relating to claim 8, "the speech coding apparatus according to claim 6, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【００９５】請求項８の“前記特徴量として、ピッチ予
測ゲイン、短期予測ゲイン、レベル、ピッチの少なくと
も一種以上を特徴量として含めることを特徴とする請求
項５記載の音声符号化装置”に関わる実施例を図１３に
示す。FIG. 13 shows an embodiment relating to claim 8, "the speech coding apparatus according to claim 5, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【００９６】ここでの発明では、請求項８の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項６記載の音声符号化装置”に関
わる実施例である図７のピッチ抽出回路と本実施例のピ
ッチ抽出回路の構成が異なるので、これについて図１３
を用いて説明する。In this embodiment, the pitch extraction circuit differs in configuration from the pitch extraction circuit of FIG. 7, which is the embodiment relating to claim 8, "the speech coding apparatus according to claim 6, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount." This difference is described with reference to FIG. 13.

【００９７】提案型ピッチ抽出回路は、聴感重み付け回
路２３０からフレーム単位で聴感重み付け信号とモード
判別回路４０００よりモード判別情報を受け取り、適応
コードブック回路５０００に抽出したピッチＣＰＰを出
力する。The proposed pitch extraction circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 on a frame-by-frame basis and the mode discrimination information from the mode discrimination circuit 4000, and outputs the extracted pitch CPP to the adaptive codebook circuit 5000.

【００９８】提案型ピッチ抽出回路の構成を図１３に示
す。FIG. 13 shows the configuration of the proposed pitch extraction circuit.

【００９９】図１３において、入力端子１３０１０から
モード判別情報を入力し、入力端子１３０２０から聴感
重み付け信号を入力する。In FIG. 13, mode discrimination information is input from an input terminal 13010, and a perceptual weighting signal is input from an input terminal 13020.

【０１００】特徴量Ｄ計算回路１３０４０では特徴量と
して、例えば現フレームのピッチＣＰを計算し出力す
る。The feature amount D calculation circuit 13040 calculates and outputs, for example, the pitch CP of the current frame as a feature amount.

【０１０１】特徴量Ｄ補正計算回路１３０５０では、入
力端子１３０１０からの現在のモード情報と、遅延器１
３０３０に格納された過去の一つ前のフレームのモード
情報に応じて、現フレームのピッチＣＰを補正した値Ｃ
ＰＰを出力する。According to the current mode information from the input terminal 13010 and the mode information of the immediately preceding frame stored in the delay unit 13030, the feature amount D correction calculation circuit 13050 outputs a value CPP obtained by correcting the pitch CP of the current frame.

【０１０２】以上により、請求項８の“前記特徴量とし
て、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピッ
チの少なくとも一種以上を特徴量として含めることを特
徴とする請求項５記載の音声符号化装置”に関わる実施
例の説明を終える。This completes the description of the embodiment relating to claim 8, "the speech coding apparatus according to claim 5, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１０３】請求項８の“前記特徴量として、ピッチ予
測ゲイン、短期予測ゲイン、レベル、ピッチの少なくと
も一種以上を特徴量として含めることを特徴とする請求
項７記載の音声符号化装置”に関わる実施例を図１４に
示す。FIG. 14 shows an embodiment relating to claim 8, "the speech coding apparatus according to claim 7, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１０４】ここでの発明では、請求項８の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項６記載の音声符号化装置、に関
わる実施例である図７の特徴量Ｄ計算回路６０４０と本
実施例の特徴量Ｄ計算回路の構成が異なるので、これに
ついて図１４を用いて説明する。In this embodiment, the feature amount D calculation circuit differs in configuration from the feature amount D calculation circuit 6040 of FIG. 7, which is the embodiment relating to claim 8, "the speech coding apparatus according to claim 6, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount." This difference is described with reference to FIG. 14.

【０１０５】特徴量Ｄ計算回路の構成を図１４に示す。
図１４において、入力端子１４０１０からフレーム単位
に、聴感重み付け信号を入力し、ピッチ計算回路１４０
２０でピッチＣＰを計算し、出力端子１４０７０で出力
する。また、１４０２０で計算されたピッチＣＰと遅延
器１４０３０に格納された２つ前の過去のフレームのピ
ッチＰＰＰとを用いてピッチ比計算回路１４０４０でピ
ッチ比ＤＲを計算し、これを出力端子１４０６０により
出力する。また、遅延器１４０３０に格納された過去の
フレームのピッチＰＰも出力端子１４０５０より出力さ
れる。FIG. 14 shows the configuration of the feature amount D calculation circuit. In FIG. 14, the perceptual weighting signal is input frame by frame from the input terminal 14010; the pitch calculation circuit 14020 calculates the pitch CP and outputs it at the output terminal 14070. The pitch ratio calculation circuit 14040 calculates the pitch ratio DR from the pitch CP calculated in 14020 and the pitch PPP of the frame two frames before, stored in the delay unit 14030, and outputs it from the output terminal 14060. The pitch PP of the past frame stored in the delay unit 14030 is also output from the output terminal 14050.
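
The role of the delay unit in this variant, which holds the pitch of both the previous frame (PP) and the frame two frames before (PPP), can be sketched with a two-element delay line. The class name `PitchHistory`, the initial pitch value, and the definition DR = CP / PPP are assumptions for illustration only.

```python
from collections import deque


class PitchHistory:
    """Two-frame delay line for pitch values (stands in for the delay unit)."""

    def __init__(self, initial_pitch=40):
        # _past[0] is the pitch two frames before, _past[1] one frame before.
        self._past = deque([initial_pitch, initial_pitch], maxlen=2)

    def update(self, cp):
        """Return (PP, PPP, DR) for current pitch cp, then store cp."""
        ppp, pp = self._past[0], self._past[1]
        dr = cp / ppp  # assumed definition: ratio against two frames before
        self._past.append(cp)  # the oldest value is dropped automatically
        return pp, ppp, dr
```

Calling `update` once per frame yields the past pitch values before the current pitch is stored, so the ratio always refers to frames that precede the current one.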

【０１０６】以上により、請求項８の“前記特徴量とし
て、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピッ
チの少なくとも一種以上を特徴量として含めることを特
徴とする請求項７記載の音声符号化装置”に関わる実施
例の説明を終える。This completes the description of the embodiment relating to claim 8, "the speech coding apparatus according to claim 7, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１０７】請求項７に関わる実施例を図１５に示す。FIG. 15 shows an embodiment relating to claim 7.

【０１０８】ここでの発明では、請求項８の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として求める
ことを特徴とする請求項７記載の音声符号化装置”に関
わる実施例である図７のピッチ抽出回路が、本実施例の
提案型ピッチ抽出回路に対しその構成が異なるので、本
実施例ではこれらの構成について図１５を用いて説明す
る。In this embodiment, the proposed pitch extraction circuit differs in configuration from the pitch extraction circuit of FIG. 7, which is the embodiment relating to claim 8, "the speech coding apparatus according to claim 7, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is obtained as the feature amount." These configurations are therefore described with reference to FIG. 15.

【０１０９】提案型ピッチ抽出回路は、聴感重み付け回
路２３０からフレーム単位で聴感重み付け信号とモード
判別回路４０００よりモード判別情報と提案型ピッチ抽
出回路からピッチを受け取り、適応コードブック回路５
０００と提案型ピッチ抽出回路に抽出したピッチＣＰを
出力する。The proposed pitch extraction circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 frame by frame, the mode discrimination information from the mode discrimination circuit 4000, and the pitch from the proposed pitch extraction circuit, and outputs the extracted pitch CP to the adaptive codebook circuit 5000 and to the proposed pitch extraction circuit.

【０１１０】提案型ピッチ抽出回路の構成を図１５に示
す。FIG. 15 shows the configuration of the proposed pitch extraction circuit.

【０１１１】図１５において、入力端子１５０１０から
モード判別情報を入力し、入力端子１５０２０から聴感
重み付け信号を入力する。In FIG. 15, mode discrimination information is input from an input terminal 15010, and a perceptual weighting signal is input from an input terminal 15020.

【０１１２】特徴量Ｄ計算回路１５０４０では特徴量と
して、例えば現フレームのピッチＣＰ、ピッチ比ＤＲを
計算し出力する。The feature amount D calculation circuit 15040 calculates and outputs, for example, the pitch CP of the current frame and the pitch ratio DR as feature amounts.

【０１１３】特徴量Ｄ補正計算回路１５０５０では、入
力端子１５０１０からの現在のモード情報と、遅延器１
５０３０に格納された過去の一つ前のフレームのモード
情報に応じて、１５０４０の出力値ピッチ比ＤＲをあら
かじめ定められた閾値と比較して、現フレームのピッチ
ＣＰをピッチ比ＤＲで補正した値ＣＰＰを出力する。According to the current mode information from the input terminal 15010 and the mode information of the immediately preceding frame stored in the delay unit 15030, the feature amount D correction calculation circuit 15050 compares the pitch ratio DR output from 15040 with a predetermined threshold and outputs a value CPP obtained by correcting the pitch CP of the current frame with the pitch ratio DR.

【０１１４】特徴量Ｄ計算回路１５０４０の構成を図１
６に示す。図１６において、入力端子１６０１０からフ
レーム単位に、聴感重み付け信号を入力し、ピッチ計算
回路１６０２０でピッチＣＰを計算し、出力端子１６０
７０で出力する。また、１６０２０で計算されたピッチ
ＣＰと遅延器２６０３０に格納された２つ前の過去のフ
レームのピッチＰＰとを用いてピッチ比計算回路１６０
４０でピッチ比ＤＲを計算し、これを出力端子１６０６
０により出力する。FIG. 16 shows the configuration of the feature amount D calculation circuit 15040. In FIG. 16, the perceptual weighting signal is input frame by frame from the input terminal 16010; the pitch calculation circuit 16020 calculates the pitch CP and outputs it at the output terminal 16070. The pitch ratio calculation circuit 16040 calculates the pitch ratio DR from the pitch CP calculated in 16020 and the pitch PP of the frame two frames before, stored in the delay unit 26030, and outputs it from the output terminal 16060.

【０１１５】以上により、請求項７に関わる実施例の説
明を終える。With the above, the description of the embodiment according to claim 7 is completed.

【０１１６】請求項６に関わる実施例を図１７に示す。FIG. 17 shows an embodiment relating to claim 6.

【０１１７】ここでの発明では、請求項７に関わる実施
例である図１５の特徴量Ｄ計算回路１５０４０が、本実
施例の提案型ピッチ抽出回路に対しその構成が異なるの
で、本実施例ではこれらの構成について図１７を用いて
説明する。In this embodiment, the configuration differs from that of the feature amount D calculation circuit 15040 of FIG. 15, which is the embodiment relating to claim 7, so the configuration is described with reference to FIG. 17.

【０１１８】特徴量Ｄ抽出計算回路の構成を図１７に示
す。図１７において、入力端子１７０１０からフレーム
単位に、聴感重み付け信号と入力端子１７０８０からピ
ッチを入力し、ピッチ計算回路１７０２０でピッチＣＰ
を計算し、出力端子１７０７０で出力する。また、１７
０２０で計算されたピッチＣＰと遅延器１７０３０に格
納された過去のフレームのピッチＰＰとを用いてピッチ
比計算回路１７０４０でピッチ比ＤＲを計算し、これを
出力端子１７０６０により出力する。FIG. 17 shows the configuration of the feature amount D extraction calculation circuit. In FIG. 17, the perceptual weighting signal is input frame by frame from the input terminal 17010 and the pitch is input from the input terminal 17080; the pitch calculation circuit 17020 calculates the pitch CP and outputs it at the output terminal 17070. The pitch ratio calculation circuit 17040 calculates the pitch ratio DR from the pitch CP calculated in 17020 and the pitch PP of the past frame stored in the delay unit 17030, and outputs it from the output terminal 17060.

【０１１９】以上により、請求項６に関わる実施例の説
明を終える。With the above, the description of the embodiment according to claim 6 is completed.

【０１２０】請求項５に関わる実施例を図１８に示す。FIG. 18 shows an embodiment relating to claim 5.

【０１２１】ここでの発明では、請求項８の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項６記載の音声符号化装置”に関
わる実施例である図６の提案型ピッチ抽出回路６０００
と本実施例の提案型ピッチ抽出回路が異なるため、この
構成についてのみ図１８を用いて説明する。In this embodiment, the proposed pitch extraction circuit differs from the proposed pitch extraction circuit 6000 of FIG. 6, which is the embodiment relating to claim 8, "the speech coding apparatus according to claim 6, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount." Only this configuration is described, with reference to FIG. 18.

【０１２２】提案型ピッチ抽出回路は、聴感重み付け回
路２３０からフレーム単位で聴感重み付け信号とモード
判別回路４０００よりモード判別情報と提案型ピッチ抽
出回路よりピッチを受け取り、適応コードブック回路６
０００と提案型ピッチ抽出回路に抽出したピッチＣＰを
出力する。The proposed pitch extraction circuit receives the perceptual weighting signal from the perceptual weighting circuit 230 frame by frame, the mode discrimination information from the mode discrimination circuit 4000, and the pitch from the proposed pitch extraction circuit, and outputs the extracted pitch CP to the adaptive codebook circuit 6000 and to the proposed pitch extraction circuit.

【０１２３】提案型ピッチ抽出回路の構成を図１８に示
す。FIG. 18 shows the configuration of the proposed pitch extraction circuit.

【０１２４】図１８において、入力端子１８０１０から
モード判別情報を入力し、入力端子１８０２０から聴感
重み付け信号を入力し、入力端子１８０７０よりピッチ
を入力する。In FIG. 18, mode discrimination information is input from an input terminal 18010, a perceptual weighting signal is input from an input terminal 18020, and a pitch is input from an input terminal 18070.

【０１２５】特徴量Ｄ計算回路１８０４０では特徴量と
して、例えば現フレームのピッチＣＰ、過去のフレーム
のピッチＰＰを出力する。The feature amount D calculation circuit 18040 outputs, for example, the pitch CP of the current frame and the pitch PP of the past frame as feature amounts.

【０１２６】特徴量Ｄ補正計算回路１８０５０では、入
力端子１８０１０からの現在のモード情報と、遅延器１
８０３０に格納された過去の一つ前のフレームのモード
情報に応じて、１８０４０の出力値過去のフレームのピ
ッチＰＰをあらかじめ定められた閾値と比較して、現フ
レームのピッチＣＰを過去のフレームのピッチＰＰで補
正した値ＣＰＰを出力する。According to the current mode information from the input terminal 18010 and the mode information of the immediately preceding frame stored in the delay unit 18030, the feature amount D correction calculation circuit 18050 compares the pitch PP of the past frame output from 18040 with a predetermined threshold and outputs a value CPP obtained by correcting the pitch CP of the current frame with the pitch PP of the past frame.

【０１２７】特徴量Ｄ計算回路１８０４０の構成を図１
９に示す。図１９において、入力端子１９０１０からフ
レーム単位に、聴感重み付け信号を入力し、ピッチ計算
回路１９０２０でピッチＣＰを計算し、出力端子１９０
７０で出力する。また、１９０２０で計算されたピッチ
ＣＰと遅延器１９０３０に格納された過去のフレームの
ピッチＰＰを出力端子１９０６０により出力する。FIG. 19 shows the configuration of the feature amount D calculation circuit 18040. In FIG. 19, the perceptual weighting signal is input frame by frame from the input terminal 19010; the pitch calculation circuit 19020 calculates the pitch CP and outputs it at the output terminal 19070. The pitch CP calculated in 19020 and the pitch PP of the past frame stored in the delay unit 19030 are output from the output terminal 19060.

【０１２８】請求項５に関わる実施例の説明を終える。The description of the embodiment relating to claim 5 is finished.

【０１２９】請求項９に関わる実施例を図２０に示す。FIG. 20 shows an embodiment relating to claim 9.

【０１３０】ここでの発明では、請求項４の“前記特徴
量として、ピッチ予測ゲイン、短期予測ゲイン、レベ
ル、ピッチの少なくとも一種以上を特徴量として含める
ことを特徴とする請求項１記載の音声符号化装置”に関
わる実施例である図１の提案型モード判別回路２０００
が、本実施例の提案型モード判別回路２００００に対し
その構成が異なるので、本実施例ではこれと提案型ＲＭ
Ｓ抽出回路３００００の構成について図２０を用いて説
明する。In this embodiment, the proposed mode discrimination circuit 20000 differs in configuration from the proposed mode discrimination circuit 2000 of FIG. 1, which is the embodiment relating to claim 4, "the speech coding apparatus according to claim 1, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount." The configurations of this circuit and of the proposed RMS extraction circuit 30000 are therefore described with reference to FIG. 20.

【０１３１】モード判別回路２００００は、聴感重み付
け回路２３０からフレーム単位で聴感重み付け信号を受
け取り、ピッチ予測ゲインＰＧを計算し、これを、あら
かじめ定められた複数のしきいと比較して、モード判別
を行ない、モード情報を出力する。モード判別回路２０
０００は、モード判別結果を適応コードブック回路５０
０、提案型ＲＭＳ抽出回路３００００及び音源量子化回
路３５０へ出力する。The mode discrimination circuit 20000 receives the perceptual weighting signal from the perceptual weighting circuit 230 frame by frame, calculates the pitch prediction gain PG, compares it with a plurality of predetermined thresholds to perform mode discrimination, and outputs mode information. The mode discrimination circuit 20000 outputs the mode discrimination result to the adaptive codebook circuit 500, the proposed RMS extraction circuit 30000, and the excitation quantization circuit 350.

【０１３２】提案型ＲＭＳ抽出回路３００００は、フレ
ーム分割回路１１０からフレーム単位で音声信号とモー
ド判別回路２００００よりモード判別情報とＲＭＳコー
ドブック４００００より、幾つかのＲＭＳコードベクト
ルを受け取り、一つのＲＭＳコードベクトルを出力す
る。The proposed RMS extraction circuit 30000 receives the speech signal from the frame division circuit 110 frame by frame, the mode discrimination information from the mode discrimination circuit 20000, and several RMS code vectors from the RMS codebook 40000, and outputs one RMS code vector.

【０１３３】提案型ＲＭＳ抽出回路３００００の構成を
図２１に示す。FIG. 21 shows the configuration of the proposed RMS extraction circuit 30000.

【０１３４】図２１において、入力端子３１０１０から
モード判別情報を、入力端子３１０２０からフレーム単
位での音声信号を、入力端子３１０８０からＲＭＳコー
ドベクトル信号入力する。In FIG. 21, mode discrimination information is input from the input terminal 31010, an audio signal in frame units is input from the input terminal 31020, and an RMS code vector signal is input from the input terminal 31080.

【０１３５】ＲＭＳ計算回路３１０４０ではフレーム単
位でのＲＭＳ値Ｒを計算する。The RMS calculation circuit 31040 calculates the RMS value R in frame units.
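
The frame-wise RMS value R can be computed with the standard root-mean-square formula, as in the following minimal sketch. The function name `frame_rms` and the representation of a frame as a Python sequence of samples are assumptions for illustration.

```python
import math


def frame_rms(frame):
    """Root-mean-square value R of one frame of samples."""
    if not frame:
        raise ValueError("frame must contain at least one sample")
    # Mean of the squared samples, then the square root.
    return math.sqrt(sum(x * x for x in frame) / len(frame))
```

For a 5 ms frame at an 8 kHz sampling rate, for instance, `frame` would hold 40 samples.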

【０１３６】ＲＭＳ補正計算回路３１０５０では、入力
端子３１０１０からの現在のモード情報と、遅延器３１
０３０に格納された過去の一つ前のフレームのモード情
報に応じて、３１０４０の出力値Ｒをあらかじめ定めら
れた閾値と比較して、現フレームのＲＭＳ値を補正した
値ＩＲを出力する。According to the current mode information from the input terminal 31010 and the mode information of the immediately preceding frame stored in the delay unit 31030, the RMS correction calculation circuit 31050 compares the output value R of 31040 with a predetermined threshold and outputs a value IR obtained by correcting the RMS value of the current frame.

【０１３７】ＲＭＳ量子化ベクトル選択回路３１０６０
では、ＲＭＳコードブック４００００の予め格納された
コードベクトルの中から、ＲＭＳ補正計算回路３１０５
０の出力値ＩＲに近いベクトルを選択し、これを出力す
る。The RMS quantization vector selection circuit 31060 selects, from the code vectors stored in advance in the RMS codebook 40000, a vector close to the output value IR of the RMS correction calculation circuit 31050, and outputs it.
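
Selecting the code vector closest to IR amounts to a nearest-neighbour search over the codebook. The sketch below treats each entry as a scalar RMS level and uses absolute difference as the distance; both choices, and the name `select_rms_code`, are assumptions, since the specification does not state the codebook contents or the distance measure.

```python
def select_rms_code(ir, codebook):
    """Return (index, value) of the codebook entry closest to ir.

    Ties are resolved in favour of the lower index.
    """
    index = min(range(len(codebook)), key=lambda i: abs(codebook[i] - ir))
    return index, codebook[index]
```

In a coder, the index would be what is transmitted, and the decoder would look up the same value in its copy of the codebook.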

【０１３８】以上により、請求項９に関わる実施例の説
明を終える。With the above, the description of the embodiment according to claim 9 is completed.

【０１３９】請求項１２の“前記特徴量として、ピッチ
予測ゲイン、短期予測ゲイン、レベル、ピッチの少なく
とも一種以上を特徴量として含めることを特徴とする請
求項９記載の音声符号化装置”に関わる実施例を図２２
に示す。FIG. 22 shows an embodiment relating to claim 12, "the speech coding apparatus according to claim 9, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１４０】ここでの発明では、請求項９に関わる実施
例である図２０の提案型ＲＭＳ抽出回路３００００が、
本実施例の提案型ＲＭＳ抽出回路に対しその構成が異な
るので、本実施例ではこの構成について図２２を用いて
説明する。In this embodiment, the proposed RMS extraction circuit differs in configuration from the proposed RMS extraction circuit 30000 of FIG. 20, which is the embodiment relating to claim 9, so this configuration is described with reference to FIG. 22.

【０１４１】図２２において、入力端子３２０１０から
モード判別情報を、入力端子３２０２０からフレーム単
位での音声信号を、入力端子３２０８０からＲＭＳコー
ドベクトル信号入力する。In FIG. 22, mode discrimination information is input from the input terminal 32010, an audio signal in frame units is input from the input terminal 32020, and an RMS code vector signal is input from the input terminal 32080.

【０１４２】ＲＭＳ計算回路３２０４０ではフレーム単
位でのＲＭＳ値Ｒを計算する。The RMS calculation circuit 32040 calculates the RMS value R in frame units.

【０１４３】ＲＭＳ補正計算回路３２０５０では、入力
端子３２０１０からの現在のモード情報と、遅延器３２
０３０に格納された過去の一つ前のフレームのモード情
報と、遅延器３２０９０に格納された過去のフレームの
ＲＭＳ値に応じて、３２０４０の出力値Ｒをあらかじめ
定められた閾値と比較して、現フレームのＲＭＳ値を補
正した値ＩＲを出力する。According to the current mode information from the input terminal 32010, the mode information of the immediately preceding frame stored in the delay unit 32030, and the RMS value of the past frame stored in the delay unit 32090, the RMS correction calculation circuit 32050 compares the output value R of 32040 with a predetermined threshold and outputs a value IR obtained by correcting the RMS value of the current frame.

【０１４４】ＲＭＳ量子化ベクトル選択回路３２０６０
では、ＲＭＳコードブック４００００の予め格納された
コードベクトルの中から、ＲＭＳ補正計算回路３２０５
０の出力値ＩＲに近いベクトルを選択し、これを出力す
る。The RMS quantization vector selection circuit 32060 selects, from the code vectors stored in advance in the RMS codebook 40000, a vector close to the output value IR of the RMS correction calculation circuit 32050, and outputs it.

【０１４５】以上により、請求項１２の“前記特徴量と
して、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピ
ッチの少なくとも一種以上を特徴量として含めることを
特徴とする請求項９記載の音声符号化装置”に関わる実
施例の説明を終える。This completes the description of the embodiment relating to claim 12, "the speech coding apparatus according to claim 9, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１４６】請求項１０に関わる実施例を図２３に示
す。FIG. 23 shows an embodiment relating to claim 10.

【０１４７】ここでの発明では、請求項９に関わる実施
例である図２０の提案型ＲＭＳ抽出回路が、本実施例の
提案型ＲＭＳ抽出回路に対しその構成が異なるので、本
実施例ではこれの構成について図２３を用いて説明す
る。In this embodiment, the proposed RMS extraction circuit differs in configuration from the proposed RMS extraction circuit of FIG. 20, which is the embodiment relating to claim 9, so this configuration is described with reference to FIG. 23.

【０１４８】図２３において、入力端子３３０１０から
モード判別情報を、入力端子３３０２０からフレーム単
位での音声信号を、入力端子３３０８０からＲＭＳコー
ドベクトル信号入力する。In FIG. 23, mode discrimination information is input from the input terminal 33010, an audio signal in frame units is input from the input terminal 33020, and an RMS code vector signal is input from the input terminal 33080.

【０１４９】ＲＭＳ計算回路３３０４０ではフレーム単
位でのＲＭＳ値Ｒを計算する。The RMS calculation circuit 33040 calculates the RMS value R in frame units.

【０１５０】ＲＭＳ補正計算回路３３０５０では、入力
端子３３０１０からの現在のモード情報と、遅延器３３
０３０に格納された過去の一つ前のフレームのモード情
報と、ＲＭＳ計算回路４３０９０に格納されたＲＭＳ比
ＲＲに応じて、３３０４０の出力値Ｒをあらかじめ定め
られた閾値と比較して、現フレームのＲＭＳ値を補正し
た値ＩＲを出力する。According to the current mode information from the input terminal 33010, the mode information of the immediately preceding frame stored in the delay unit 33030, and the RMS ratio RR held in the RMS calculation circuit 43090, the RMS correction calculation circuit 33050 compares the output value R of 33040 with a predetermined threshold and outputs a value IR obtained by correcting the RMS value of the current frame.

【０１５１】ＲＭＳ比計算回路４３０９０では、ＲＭＳ
計算回路３３０４０の出力値Ｒと遅延器３３０９０に格
納された過去のフレームのＲＭＳ値との比を計算し、こ
れを出力する。The RMS ratio calculation circuit 43090 calculates the ratio between the output value R of the RMS calculation circuit 33040 and the RMS value of the past frame stored in the delay unit 33090, and outputs it.
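
The RMS ratio calculation with its delay unit can be sketched as a small stateful helper. The definition RR = R / (past RMS), the initial stored value, and the small floor that avoids division by zero after a silent frame are assumptions; the class name `RmsRatio` is likewise hypothetical.

```python
class RmsRatio:
    """Computes RR against the RMS value of the previous frame."""

    def __init__(self, initial_rms=1.0, floor=1e-9):
        self._prev = initial_rms  # plays the role of the delay unit
        self._floor = floor       # guards against a zero denominator

    def update(self, r):
        """Return RR for the current RMS value r, then store r."""
        rr = r / max(self._prev, self._floor)
        self._prev = r
        return rr
```

A ratio well above 1 would then indicate an onset and a ratio well below 1 a decay, which is the information the RMS correction stage can use when comparing against its thresholds.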

【０１５２】ＲＭＳ量子化ベクトル選択回路３３０６０
では、ＲＭＳコードブック４００００の予め格納された
コードベクトルの中から、ＲＭＳ補正計算回路３３０５
０の出力値ＩＲに近いベクトルを選択し、これを出力す
る。The RMS quantization vector selection circuit 33060 selects, from the code vectors stored in advance in the RMS codebook 40000, a vector close to the output value IR of the RMS correction calculation circuit 33050, and outputs it.

【０１５３】以上により、請求項１０に関わる実施例の
説明を終える。With the above, the description of the embodiment according to claim 10 is completed.

【０１５４】請求項１２の“前記特徴量として、ピッチ
予測ゲイン、短期予測ゲイン、レベル、ピッチの少なく
とも一種以上を特徴量として含めることを特徴とする請
求項１０記載の音声符号化装置”に関わる実施例を図２
４に示す。FIG. 24 shows an embodiment relating to claim 12, "the speech coding apparatus according to claim 10, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１５５】ここでの発明では、請求項１０に関わる実
施例である図２４の提案型ＲＭＳ抽出回路が、本実施例
の提案型ＲＭＳ抽出回路に対しその構成が異なるので、
本実施例ではこの構成について図２４を用いて説明す
る。In this embodiment, the proposed RMS extraction circuit differs in configuration from the proposed RMS extraction circuit of FIG. 24, which is given as the embodiment relating to claim 10, so this configuration is described with reference to FIG. 24.

【０１５６】図２４において、入力端子３４０１０から
モード判別情報を、入力端子３４０２０からフレーム単
位での音声信号を、入力端子３４０８０からＲＭＳコー
ドベクトル信号入力する。In FIG. 24, mode discrimination information is input from the input terminal 34010, an audio signal in frame units is input from the input terminal 34020, and an RMS code vector signal is input from the input terminal 34080.

【０１５７】ＲＭＳ計算回路３４０４０ではフレーム単
位でのＲＭＳ値Ｒを計算する。The RMS calculation circuit 34040 calculates the RMS value R in frame units.

【０１５８】ＲＭＳ補正計算回路３４０５０では、入力
端子３４０１０からの現在のモード情報と、遅延器３４
０３０に格納された過去の一つ前のフレームのモード情
報と、ＲＭＳ比計算回路４４０９０に格納されたＲＭＳ
比ＲＲに応じて、３４０４０の出力値Ｒをあらかじめ定
められた閾値と比較して、現フレームのＲＭＳ値を遅延
器３４０９０に格納された過去のフレームのＲＭＳ値で
補正した値ＩＲを出力する。According to the current mode information from the input terminal 34010, the mode information of the immediately preceding frame stored in the delay unit 34030, and the RMS ratio RR held in the RMS ratio calculation circuit 44090, the RMS correction calculation circuit 34050 compares the output value R of 34040 with a predetermined threshold and outputs a value IR obtained by correcting the RMS value of the current frame with the RMS value of the past frame stored in the delay unit 34090.
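
One possible form of a correction that uses the past RMS value is sketched below. The specification does not state the rule: the condition (unchanged mode and RR outside a threshold band), the threshold of 2.0, and the equal-weight blend of current and past RMS are all assumptions chosen only to make the data flow concrete.

```python
def correct_rms(r, prev_rms, rr, mode, prev_mode, rr_threshold=2.0, weight=0.5):
    """Return a corrected RMS value IR under a hypothetical rule.

    When the mode is unchanged and the RMS ratio rr lies outside
    [1 / rr_threshold, rr_threshold], the current value r is blended
    with the past value prev_rms; otherwise r is passed through.
    """
    out_of_range = rr > rr_threshold or rr < 1.0 / rr_threshold
    if mode == prev_mode and out_of_range:
        return weight * r + (1.0 - weight) * prev_rms
    return r
```

Under these assumptions a sudden fourfold level jump inside a steady mode is halved toward the past level, while a jump that coincides with a mode change is left untouched.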

【０１５９】ＲＭＳ比計算回路３４０９０では、ＲＭＳ
計算回路３４０４０の出力値Ｒと遅延器３４０９０に格
納された過去のフレームのＲＭＳ値との比を計算し、こ
れを出力する。The RMS ratio calculation circuit 34090 calculates the ratio between the output value R of the RMS calculation circuit 34040 and the RMS value of the past frame stored in the delay unit 34090, and outputs it.

【０１６０】ＲＭＳ量子化ベクトル選択回路３４０６０
では、ＲＭＳコードブック４００００の予め格納された
コードベクトルの中から、ＲＭＳ補正計算回路３４０５
０の出力値ＩＲに近いベクトルを選択し、これを出力す
る。The RMS quantization vector selection circuit 34060 selects, from the code vectors stored in advance in the RMS codebook 40000, a vector close to the output value IR of the RMS correction calculation circuit 34050, and outputs it.

【０１６１】以上により、請求項１２の“前記特徴量と
して、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピ
ッチの少なくとも一種以上を特徴量として含めることを
特徴とする請求項１０記載の音声符号化装置”に関わる
実施例の説明を終える。This completes the description of the embodiment relating to claim 12, "the speech coding apparatus according to claim 10, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１６２】請求項１１に関わる実施例を図２５に示
す。FIG. 25 shows an embodiment relating to claim 11.

【０１６３】ここでの発明では、請求項１１に関わる実
施例である図２４の提案型ＲＭＳ抽出回路が、本実施例
の提案型ＲＭＳ抽出回路に対しその構成が異なるので、
本実施例ではこれの構成について図２５を用いて説明す
る。In this embodiment, the proposed RMS extraction circuit differs in configuration from the proposed RMS extraction circuit of FIG. 24, which is given as the embodiment relating to claim 11, so this configuration is described with reference to FIG. 25.

【０１６４】図２５において、入力端子３５０１０から
モード判別情報を、入力端子３５０２０からフレーム単
位での音声信号を、入力端子３５０８０からＲＭＳコー
ドベクトル信号入力する。In FIG. 25, mode discrimination information is input from the input terminal 35010, an audio signal in frame units is input from the input terminal 35020, and an RMS code vector signal is input from the input terminal 35080.

【０１６５】ＲＭＳ計算回路３５０４０ではフレーム単
位でのＲＭＳ値Ｒを計算する。The RMS calculation circuit 35040 calculates the RMS value R in frame units.

【０１６６】ＲＭＳ補正計算回路３５０５０では、入力
端子３５０１０からの現在のモード情報と、遅延器３５
０３０に格納された過去の一つ前のフレームのモード情
報と、ＲＭＳ比計算回路４５０９０に格納されたＲＭＳ
比ＲＲに応じて、３５０４０の出力値Ｒをあらかじめ定
められた閾値と比較して、現フレームのＲＭＳ値を補正
した値ＩＲを出力する。According to the current mode information from the input terminal 35010, the mode information of the immediately preceding frame stored in the delay unit 35030, and the RMS ratio RR held in the RMS ratio calculation circuit 45090, the RMS correction calculation circuit 35050 compares the output value R of 35040 with a predetermined threshold and outputs a value IR obtained by correcting the RMS value of the current frame.

【０１６７】ＲＭＳ比計算回路４５０９０では、ＲＭＳ
計算回路３５０４０の出力値Ｒと遅延器５５０９０に格
納された過去の２つ前のフレームのＲＭＳ値との比を計
算し、これを出力する。The RMS ratio calculation circuit 45090 calculates the ratio between the output value R of the RMS calculation circuit 35040 and the RMS value of the frame two frames before, stored in the delay unit 55090, and outputs it.

【０１６８】ＲＭＳ量子化ベクトル選択回路３５０６０
では、ＲＭＳコードブック４００００の予め格納された
コードベクトルの中から、ＲＭＳ補正計算回路３５０５
０の出力値ＩＲに近いベクトルを選択し、これを出力す
る。The RMS quantization vector selection circuit 35060 selects, from the code vectors stored in advance in the RMS codebook 40000, a vector close to the output value IR of the RMS correction calculation circuit 35050, and outputs it.

【０１６９】以上により、請求項１１に関わる実施例の
説明を終える。With the above, the description of the embodiment according to claim 11 is completed.

【０１７０】請求項１２の“前記特徴量として、ピッチ
予測ゲイン、短期予測ゲイン、レベル、ピッチの少なく
とも一種以上を特徴量として含めることを特徴とする請
求項１１記載の音声符号化装置”に関わる実施例を図２
６に示す。FIG. 26 shows an embodiment relating to claim 12, "the speech coding apparatus according to claim 11, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount."

【０１７１】ここでの発明では、請求項１１に関わる実
施例である図２５の提案型ＲＭＳ抽出回路が、本実施例
の提案型ＲＭＳ抽出回路に対しその構成が異なるので、
本実施例ではこの構成について図２６を用いて説明す
る。In this embodiment, the proposed RMS extraction circuit differs in configuration from the proposed RMS extraction circuit of FIG. 25, which is the embodiment relating to claim 11, so this configuration is described with reference to FIG. 26.

【０１７２】図２６において、入力端子３６０１０から
モード判別情報を、入力端子３６０２０からフレーム単
位での音声信号を、入力端子３６０８０からＲＭＳコー
ドベクトル信号入力する。In FIG. 26, mode discrimination information is input from the input terminal 36010, an audio signal in frame units is input from the input terminal 36020, and an RMS code vector signal is input from the input terminal 36080.

【０１７３】ＲＭＳ計算回路３６０４０ではフレーム単
位でのＲＭＳ値Ｒを計算する。The RMS calculation circuit 36040 calculates the RMS value R in frame units.
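
As an illustration of the per-frame RMS value R computed by the RMS calculation circuit 36040, the following is a minimal sketch. The function name `frame_rms` and the 8 kHz sampling rate used to size the example frame are assumptions for illustration and are not taken from the specification.

```python
import math


def frame_rms(frame):
    """Return the root-mean-square level of one frame of samples."""
    if not frame:
        raise ValueError("frame must contain at least one sample")
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Hypothetical frame: 40 samples, i.e. 5 ms at an assumed 8 kHz sampling rate.
frame = [3.0, -3.0] * 20
R = frame_rms(frame)
```

Because R is computed over a frame of only 5 ms to 10 ms, it can fluctuate noticeably from frame to frame, which is what the correction described next is meant to address.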

【０１７４】ＲＭＳ補正計算回路３６０５０では、入力
端子３６０１０からの現在のモード情報と、遅延器３６
０３０に格納された過去の一つ前のフレームのモード情
報と、ＲＭＳ比計算回路４６０９０に格納されたＲＭＳ
比ＲＲに応じて、３６０４０の出力値Ｒをあらかじめ定
められた閾値と比較して、現フレームのＲＭＳ値を遅延
器３６０９０に格納された過去の１つ前のフレームのＲ
ＭＳ値で補正した値ＩＲを出力する。The RMS correction calculation circuit 36050 compares the output value R of the RMS calculation circuit 36040 with a predetermined threshold value, according to the current mode information from the input terminal 36010, the mode information of the immediately preceding frame stored in the delay unit 36030, and the RMS ratio RR stored in the RMS ratio calculation circuit 46090, and outputs a value IR obtained by correcting the RMS value of the current frame with the RMS value of the immediately preceding frame stored in the delay unit 36090.
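
The correction step can be sketched as follows. This paragraph does not state the concrete decision rule, so everything specific in the sketch is an assumption: the threshold table keyed by mode continuity and ratio stability, the stability range for RR, and the weighted average used as the correction are all hypothetical choices made only to show how the four inputs (current mode, previous mode, RR, previous-frame RMS) could interact.

```python
def correct_rms(r, r_prev, mode, prev_mode, rr, thresholds, weight=0.5):
    """Return a corrected RMS value IR for the current frame.

    r          -- RMS value R of the current frame
    r_prev     -- RMS value of the immediately preceding frame
    mode       -- mode information of the current frame
    prev_mode  -- mode information of the preceding frame
    rr         -- RMS ratio RR
    thresholds -- hypothetical table: (mode unchanged?, RR stable?) -> threshold
    weight     -- hypothetical smoothing weight applied to the current frame
    """
    # Assumption: RR counts as "stable" when it lies within a factor of two.
    stable = 0.5 <= rr <= 2.0
    threshold = thresholds[(mode == prev_mode, stable)]
    if r < threshold:
        # Assumption: the correction is a weighted average with the previous frame.
        return weight * r + (1.0 - weight) * r_prev
    return r


# Hypothetical thresholds: correct only when the mode and the level are steady.
thresholds = {
    (True, True): 1000.0,
    (True, False): 0.0,
    (False, True): 0.0,
    (False, False): 0.0,
}
ir_steady = correct_rms(100.0, 200.0, mode=1, prev_mode=1, rr=1.2, thresholds=thresholds)
ir_changed = correct_rms(100.0, 200.0, mode=1, prev_mode=2, rr=1.2, thresholds=thresholds)
```

With these assumed settings, a frame whose mode matches the previous frame is smoothed toward the previous RMS value, while a frame at a mode transition passes through unchanged.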

【０１７５】ＲＭＳ比較計算回路４６０９０では、ＲＭ
Ｓ計算回路３６０４０の出力値Ｒと遅延器５６０９０に
格納された過去の２つ前のフレームのＲＭＳ値との比を
計算し、これを出力する。The RMS ratio calculation circuit 46090 calculates and outputs the ratio between the output value R of the RMS calculation circuit 36040 and the RMS value of the frame two frames earlier, stored in the delay unit 56090.
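
A minimal sketch of the ratio calculation with a two-frame delay is given below. The class name, the direction of the ratio (current value divided by delayed value), the small guard against division by zero, and returning `None` until two past frames are available are assumptions made for illustration.

```python
from collections import deque


class RmsRatioCalculator:
    """Ratio of the current RMS value to the RMS value two frames earlier."""

    def __init__(self, delay_frames=2, eps=1e-9):
        # The deque plays the role of the delay unit holding past RMS values.
        self.history = deque(maxlen=delay_frames)
        self.eps = eps  # hypothetical guard against division by zero

    def update(self, r):
        """Return RR for this frame, or None until the delay line is full."""
        rr = None
        if len(self.history) == self.history.maxlen:
            rr = r / max(self.history[0], self.eps)
        self.history.append(r)
        return rr


calc = RmsRatioCalculator()
ratios = [calc.update(r) for r in [1.0, 2.0, 4.0, 1.0]]
```

The third frame is compared with the first and the fourth with the second, so `ratios` holds two undefined entries followed by the ratios 4.0/1.0 and 1.0/2.0.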

【０１７６】ＲＭＳ量子化ベクトル選択回路３６０６０
では、ＲＭＳコードブック４００００の予め格納された
コードベクトルの中から、ＲＭＳ補正計算回路３６０５
０の出力値ＩＲに近いベクトルを選択し、これを出力す
る。The RMS quantization vector selection circuit 36060 selects, from the code vectors stored in advance in the RMS codebook 40000, a vector close to the output value IR of the RMS correction calculation circuit 36050, and outputs it.
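
The selection of a code vector close to IR can be sketched as a nearest-neighbour search. Treating each code vector as a single scalar level and using squared error as the closeness measure are assumptions; the text says only that a vector close to IR is selected. The codebook contents below are hypothetical.

```python
def select_rms_codevector(ir, codebook):
    """Return (index, value) of the codebook entry nearest to ir."""
    if not codebook:
        raise ValueError("codebook must not be empty")
    # Assumption: closeness is measured by squared error.
    index = min(range(len(codebook)), key=lambda i: (codebook[i] - ir) ** 2)
    return index, codebook[index]


# Hypothetical four-entry RMS codebook.
rms_codebook = [10.0, 50.0, 200.0, 800.0]
index, quantized = select_rms_codevector(150.0, rms_codebook)
```

In an encoder of this kind, the selected index would be what is transmitted, and the decoder would look up the same entry in an identical codebook.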

【０１７７】以上により、請求項１２の“前記特徴量と
して、ピッチ予測ゲイン、短期予測ゲイン、レベル、ピ
ッチの少なくとも一種以上を特徴量として含めることを
特徴とする請求項１１記載の音声符号化装置”に関わる
実施例の説明を終える。This concludes the description of the embodiment relating to claim 12, "the speech coding apparatus according to claim 11, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature quantity."

【０１７８】[0178]

【発明の効果】以上説明したように、本発明によれば、
音声符号化装置において、低遅延とするために、フレー
ム長を５ｍｓ−１０ｍｓ以下と短くしても、モード判別
あるいはピッチ抽出、レベル抽出の時間的変動による音
質劣化を起こすことなく、良好な音質を得ることが可能
となりこの利点は極めて大きなものである。As described above, according to the present invention, a speech coding apparatus can obtain good sound quality, without the degradation caused by temporal variation in mode discrimination, pitch extraction, or level extraction, even when the frame length is shortened to 5 ms-10 ms or less in order to achieve low delay. This advantage is extremely significant.

[Brief description of drawings]

【図１】本発明の一実施例の構成図。FIG. 1 is a configuration diagram of an embodiment of the present invention.

【図２】提案型モード判別回路の構成図。FIG. 2 is a block diagram of a proposed mode discrimination circuit.

【図３】提案型モード判別回路の構成図。FIG. 3 is a block diagram of a proposed mode discrimination circuit.

【図４】特徴量計算回路Ｂの構成図。FIG. 4 is a configuration diagram of a feature amount calculation circuit B.

【図５】特徴量計算回路Ｃの構成図。FIG. 5 is a configuration diagram of a feature amount calculation circuit C.

【図６】本発明の一実施例の構成図。FIG. 6 is a configuration diagram of an embodiment of the present invention.

【図７】提案型ピッチ抽出回路の構成図。FIG. 7 is a block diagram of a proposed pitch extraction circuit.

【図８】特徴量Ｄ抽出計算回路の構成図。FIG. 8 is a configuration diagram of a feature amount D extraction calculation circuit.

【図９】特徴量計算回路Ｃの構成図。FIG. 9 is a configuration diagram of a feature amount calculation circuit C.

【図１０】提案型モード判別回路の構成図。FIG. 10 is a block diagram of a proposed mode discrimination circuit.

【図１１】特徴量計算回路Ｂの構成図。FIG. 11 is a configuration diagram of a feature quantity calculation circuit B.

【図１２】提案型モード判別回路の構成図。FIG. 12 is a block diagram of a proposed mode discrimination circuit.

【図１３】ピッチ抽出回路の構成図。FIG. 13 is a configuration diagram of a pitch extraction circuit.

【図１４】特徴量Ｄ計算回路の構成図。FIG. 14 is a configuration diagram of a feature amount D calculation circuit.

【図１５】ピッチ抽出回路の構成図。FIG. 15 is a configuration diagram of a pitch extraction circuit.

【図１６】特徴量Ｄ抽出計算回路の構成図。FIG. 16 is a configuration diagram of a feature amount D extraction calculation circuit.

【図１７】提案型ピッチ抽出回路の構成図。FIG. 17 is a block diagram of a proposed pitch extraction circuit.

【図１８】提案型ピッチ抽出回路の構成図。FIG. 18 is a configuration diagram of a proposed pitch extraction circuit.

【図１９】特徴量Ｄ抽出計算回路の構成図。FIG. 19 is a configuration diagram of a feature amount D extraction calculation circuit.

【図２０】本発明の一実施例の構成図。FIG. 20 is a configuration diagram of an embodiment of the present invention.

【図２１】提案型ＲＭＳ抽出回路の構成図。FIG. 21 is a block diagram of a proposed RMS extraction circuit.

【図２２】提案型ＲＭＳ抽出回路の構成図。FIG. 22 is a configuration diagram of a proposed RMS extraction circuit.

【図２３】提案型ＲＭＳ抽出回路の構成図。FIG. 23 is a block diagram of a proposed RMS extraction circuit.

【図２４】提案型ＲＭＳ抽出回路の構成図。FIG. 24 is a block diagram of a proposed RMS extraction circuit.

【図２５】提案型ＲＭＳ抽出回路の構成図。FIG. 25 is a block diagram of a proposed RMS extraction circuit.

【図２６】提案型ＲＭＳ抽出回路の構成図。FIG. 26 is a block diagram of a proposed RMS extraction circuit.

【図２７】特徴量計算回路Ｂの構成図。FIG. 27 is a configuration diagram of a feature quantity calculation circuit B.

[Explanation of symbols]

１１０フレーム分割回路１２０サブフレーム分割回路２００スペクトルパラメータ計算回路２１０スペクトルパラメータ量子化回路２１１ＬＳＰコードブック２３０重み付け回路２３５減算回路２４０応答信号計算回路３１０インパルス応答計算回路３５０音源量子化回路３５１不均一パルス数型スパース音源コードブック３５５ゲインコードブック３６０重み付け信号計算回路３６５ゲイン量子化回路４００マルチプレクサ５００、５５０適応コードブック回路２０００提案型モード判別回路２０１０フレーム単位の入力端子２０２０スペクトルパラメータの入力端子２０３０フレーム単位の特徴量計算回路２０４０特徴量計算回路Ｂ２０５０モード判別回路２０６０モード情報格納遅延器２０７０モード情報出力端子３０３０特徴量計算回路Ａ３０４０特徴量計算回路Ｂ３０５０特徴量計算回路Ｃ３０６０モード判別回路３０７０遅延器４０００モード判別回路４０２０ＲＭＳ計算回路４０３０遅延器４０４０ＲＭＳ比計算回路５０００適応コードブック回路５０３０短期予測ゲイン計算回路５０４０遅延器５０５０短期予測ゲイン比計算回路６０００提案型ピッチ抽出回路６０３０遅延器６０４０特徴量Ｄ計算回路６０５０特徴量Ｄ補正計算回路７０２０ピッチ計算回路７０３０遅延器７０４０ピッチ比計算回路８０３０短期予測ゲイン計算回路８０４０遅延器８０５０遅延器８０８０短期予測ゲイン比計算回路９０３０特徴量計算回路Ａ９０４０特徴量計算回路Ｂ９０５０特徴量計算回路Ｃ９０６０モード判別回路９０７０遅延器１００２０ＲＭＳ計算回路１００３０遅延器１００４０ＲＭＳ比計算回路１１０２０ＲＭＳ計算回路１１０３０遅延器１１０４０ＲＭＳ比計算回路１２０３０特徴量計算回路Ａ１２０４０特徴量計算回路Ｂ１２０５０特徴量計算回路Ｃ１２０６０モード判別回路１２０７０遅延器１３０３０遅延器１３０４０特徴量Ｄ計算回路１３０５０特徴量Ｄ補正計算回路１４０２０ピッチ計算回路１４０３０遅延器１４０４０ピッチ比計算回路１５０３０遅延器１５０４０特徴量Ｄ計算回路１５０５０特徴量Ｄ抽出計算回路１６０２０ピッチ計算回路１６０３０遅延器１６０４０ピッチ計算回路１７０２０ピッチ計算回路１７０３０遅延器１７０４０ピッチ比計算回路１８０３０遅延器１８０４０特徴量Ｄ計算回路１８０５０特徴量Ｄ抽出計算回路１９０２０ピッチ計算回路１９０３０遅延器２００００提案型モード判別回路２１０３０遅延器２４０３０遅延器２６０３０遅延器３００００提案型ＲＭＳ抽出回路３１０３０遅延器３１０４０ＲＭＳ計算回路３１０５０ＲＭＳ補正計算回路３１０６０ＲＭＳ量子化ベクトル選択回路３２０３０遅延器３２０４０ＲＭＳ計算回路３２０５０ＲＭＳ補正計算回路３２０６０ＲＭＳ量子化ベクトル選択回路３２０９０遅延器３３０３０遅延器３３０４０ＲＭＳ計算回路３３０５０ＲＭＳ補正計算回路３３０６０ＲＭＳ量子化ベクトル選択回路３３０９０遅延器３４０３０遅延器３４０４０ＲＭＳ計算回路３４０５０ＲＭＳ補正計算回路３４０６０ＲＭＳ量子化ベクトル選択回路３４０９０遅延器３５０３０遅延器３５０４０ＲＭＳ計算回路３５０５０ＲＭＳ補正計算回路３５０６０ＲＭＳ量子化ベクトル選択回路３５０９０遅延器３６０３０遅延器３６０４０ＲＭＳ計算回路３６０５０ＲＭＳ補正計算回路３６０６０ＲＭＳ量子化ベクトル選択回路３６０９０遅延器４００００ＲＭＳコードブック４３０９０ＲＭＳ比計算回路４４０９０ＲＭＳ比計算回路４５０９０ＲＭＳ比計算回路４６０９０ＲＭＳ比計算回路５５０９０遅延器５６０９０遅延器
110 frame division circuit; 120 subframe division circuit; 200 spectrum parameter calculation circuit; 210 spectrum parameter quantization circuit; 211 LSP codebook; 230 weighting circuit; 235 subtraction circuit; 240 response signal calculation circuit
310 impulse response calculation circuit; 350 excitation quantization circuit; 351 non-uniform pulse number type sparse excitation codebook; 355 gain codebook; 360 weighting signal calculation circuit; 365 gain quantization circuit
400 multiplexer; 500, 550 adaptive codebook circuit
2000 proposed mode discrimination circuit; 2010 frame-unit input terminal; 2020 spectrum parameter input terminal; 2030 frame-unit feature amount calculation circuit; 2040 feature amount calculation circuit B; 2050 mode discrimination circuit; 2060 mode information storage delay unit; 2070 mode information output terminal
3030 feature amount calculation circuit A; 3040 feature amount calculation circuit B; 3050 feature amount calculation circuit C; 3060 mode discrimination circuit; 3070 delay unit
4000 mode discrimination circuit; 4020 RMS calculation circuit; 4030 delay unit; 4040 RMS ratio calculation circuit
5000 adaptive codebook circuit; 5030 short-term prediction gain calculation circuit; 5040 delay unit; 5050 short-term prediction gain ratio calculation circuit
6000 proposed pitch extraction circuit; 6030 delay unit; 6040 feature amount D calculation circuit; 6050 feature amount D correction calculation circuit
7020 pitch calculation circuit; 7030 delay unit; 7040 pitch ratio calculation circuit
8030 short-term prediction gain calculation circuit; 8040 delay unit; 8050 delay unit; 8080 short-term prediction gain ratio calculation circuit
9030 feature amount calculation circuit A; 9040 feature amount calculation circuit B; 9050 feature amount calculation circuit C; 9060 mode discrimination circuit; 9070 delay unit
10020 RMS calculation circuit; 10030 delay unit; 10040 RMS ratio calculation circuit
11020 RMS calculation circuit; 11030 delay unit; 11040 RMS ratio calculation circuit
12030 feature amount calculation circuit A; 12040 feature amount calculation circuit B; 12050 feature amount calculation circuit C; 12060 mode discrimination circuit; 12070 delay unit
13030 delay unit; 13040 feature amount D calculation circuit; 13050 feature amount D correction calculation circuit
14020 pitch calculation circuit; 14030 delay unit; 14040 pitch ratio calculation circuit
15030 delay unit; 15040 feature amount D calculation circuit; 15050 feature amount D extraction calculation circuit
16020 pitch calculation circuit; 16030 delay unit; 16040 pitch calculation circuit
17020 pitch calculation circuit; 17030 delay unit; 17040 pitch ratio calculation circuit
18030 delay unit; 18040 feature amount D calculation circuit; 18050 feature amount D extraction calculation circuit
19020 pitch calculation circuit; 19030 delay unit
20000 proposed mode discrimination circuit; 21030 delay unit; 24030 delay unit; 26030 delay unit
30000 proposed RMS extraction circuit
31030 delay unit; 31040 RMS calculation circuit; 31050 RMS correction calculation circuit; 31060 RMS quantization vector selection circuit
32030 delay unit; 32040 RMS calculation circuit; 32050 RMS correction calculation circuit; 32060 RMS quantization vector selection circuit; 32090 delay unit
33030 delay unit; 33040 RMS calculation circuit; 33050 RMS correction calculation circuit; 33060 RMS quantization vector selection circuit; 33090 delay unit
34030 delay unit; 34040 RMS calculation circuit; 34050 RMS correction calculation circuit; 34060 RMS quantization vector selection circuit; 34090 delay unit
35030 delay unit; 35040 RMS calculation circuit; 35050 RMS correction calculation circuit; 35060 RMS quantization vector selection circuit; 35090 delay unit
36030 delay unit; 36040 RMS calculation circuit; 36050 RMS correction calculation circuit; 36060 RMS quantization vector selection circuit; 36090 delay unit
40000 RMS codebook; 43090 RMS ratio calculation circuit; 44090 RMS ratio calculation circuit; 45090 RMS ratio calculation circuit; 46090 RMS ratio calculation circuit
55090 delay unit; 56090 delay unit

Claims

[Claims]

1. A speech coding apparatus comprising a frame division unit that divides an audio signal into predetermined frame units and a mode determination unit that calculates a feature amount from the audio signal and performs mode determination, the apparatus encoding the audio signal based on the determination result, wherein the mode determination of the current frame is performed using at least one type of feature amount obtained from the current frame and at least one past frame, and mode determination information obtained from at least one past frame.

2. The speech coding apparatus according to claim 1, wherein a temporal change ratio of at least one type of feature amount is included as the feature amount.

3. The speech coding apparatus according to claim 1, wherein, for the feature amounts of two frames taken from the current frame and at least one past frame, the ratio of the two feature amounts is included as the feature amount.

4. The speech coding apparatus according to claim 1, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount.

5. A speech coding apparatus comprising a frame division unit that divides an audio signal into predetermined frame units, a mode determination unit that calculates a feature amount from the audio signal and performs mode determination, and a pitch extraction unit that extracts a pitch from the audio signal, the apparatus encoding the audio signal based on the determination result, wherein the pitch extraction unit corrects the pitch of the current frame using at least one type of feature amount obtained from the current frame and at least one past frame, and mode determination information obtained from at least one past frame.

6. The speech coding apparatus according to claim 5, wherein a temporal change ratio of at least one type of feature amount is included as the feature amount.

7. The speech coding apparatus according to claim 5, wherein, for the feature amounts of two frames taken from the current frame and at least one past frame, the ratio of the two feature amounts is included as the feature amount.

8. The speech coding apparatus according to claim 5, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount.

9. A speech coding apparatus comprising a frame division unit that divides an audio signal into predetermined frame units, a mode determination unit that calculates a feature amount from the audio signal and performs mode determination, and a level extraction unit that extracts a level from the audio signal, the apparatus encoding the audio signal based on the determination result, wherein the level extraction unit corrects the level of the current frame using at least one type of feature amount obtained from the current frame and at least one past frame, and mode determination information obtained from at least one past frame.

10. The speech coding apparatus according to claim 9, wherein a temporal change ratio of at least one type of feature amount is included as the feature amount.

11. The speech coding apparatus according to claim 9, wherein, for the feature amounts of two frames taken from the current frame and at least one past frame, the ratio of the two feature amounts is included as the feature amount.

12. The speech coding apparatus according to claim 9, 10 or 11, wherein at least one of a pitch prediction gain, a short-term prediction gain, a level, and a pitch is included as the feature amount.