JPH09114498A - Speech encoding device

Info

Publication number: JPH09114498A
Application number: JP7268756A
Authority: JP
Inventors: Hideyuki Takahashi (高橋 秀享)
Original assignee: Olympus Optical Co Ltd
Current assignee: Olympus Corp
Priority date: 1995-10-17
Filing date: 1995-10-17
Publication date: 1997-05-02

Abstract

PROBLEM TO BE SOLVED: To encode well even when a non-speech signal is input, by having the linear predictive analysis means (LPC analyzer) keep outputting the spectral parameter of a preceding frame while the non-speech signal persists. SOLUTION: A buffer memory 1 sends the input signal, frame by frame, to a subframe divider 7 and a speech discriminator 2. A switch control circuit 3 sets a variable i, which counts consecutive non-speech frames, to 0 when encoding starts. Each time the speech discriminator 2 judges a frame to be non-speech, i is incremented by one and compared with a predetermined number R (e.g. 10). When i is larger than R, the terminal of a changeover switch 4 is closed to side a, and the LPC analyzer 5 keeps outputting the spectral parameter of the preceding frame.

Description

Detailed Description of the Invention

Technical Field of the Invention

[0001] The present invention relates to a speech encoding device, and more specifically to a speech encoding device that digitally compresses a speech signal for recording or transmission.

Description of the Related Art

[0002] A widely used approach to compressing speech signals efficiently is to encode the speech signal using linear prediction parameters, which represent the spectral envelope, and excitation parameters, which correspond to the linear prediction residual signal. Because speech coding schemes based on linear prediction yield relatively high-quality synthesized speech at a low transmission rate, and helped by recent advances in hardware technology, many applied schemes are being actively researched and developed.

[0003] Among these, a scheme well known for giving good sound quality is the CELP (Code Excited Linear Predictive Coding) scheme using an adaptive codebook obtained by repeating past excitation signals, described in the paper by Kleijn et al. entitled "Improved speech quality and efficient vector quantization in SELP" (ICASSP '88, S4.4, pp. 155-158, 1988).

[0004] FIG. 7 is a block diagram showing the configuration of a code-excited linear predictive encoding device provided with the adaptive codebook described above.

[0005] As illustrated, an original speech signal sampled at, for example, 8 kHz (that is, 1/8 ms per sample) is input from the input terminal, and the speech signal of a predetermined frame interval (for example 20 ms, that is, 160 samples) is stored in buffer memory 51.

[0006] Buffer memory 51 sends the original speech signal, frame by frame, to LPC (Linear Predictive Coding) analyzer 55.

[0007] LPC analyzer 55 performs linear predictive analysis (LPC analysis) on the original speech signal, extracts the linear prediction parameter α, which is a spectral parameter representing the spectral characteristics, and sends it to synthesis filter 56 and multiplexer 68.

[0008] Subframe divider 57 divides the original speech signal, input frame by frame from buffer memory 51, into predetermined subframe intervals (for example 5 ms, that is, 40 samples). In the above example, four subframe signals, the first through fourth subframes, are thus created from the original speech signal of one frame.
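
The framing arithmetic above can be sketched as follows. This is a minimal illustration using the example figures from the text (8 kHz, 20 ms frames, 5 ms subframes); the function names and the choice to drop a trailing partial frame are assumptions made for the sketch, not part of the patent.

```python
SAMPLE_RATE_HZ = 8000   # example sampling rate from the text
FRAME_MS = 20           # example frame interval
SUBFRAME_MS = 5         # example subframe interval

FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000        # 160 samples
SUBFRAME_LEN = SAMPLE_RATE_HZ * SUBFRAME_MS // 1000  # 40 samples


def split_frames(signal, frame_len=FRAME_LEN):
    """Cut a sample sequence into complete frames.

    Assumption: a trailing partial frame is dropped.
    """
    count = len(signal) // frame_len
    return [signal[k * frame_len:(k + 1) * frame_len] for k in range(count)]


def split_subframes(frame, subframe_len=SUBFRAME_LEN):
    """Cut one frame into consecutive subframes."""
    return [frame[k:k + subframe_len] for k in range(0, len(frame), subframe_len)]
```

With these example values each 160-sample frame yields four 40-sample subframes.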

[0009] Next, the delay L and gain β of the adaptive codebook are determined by the following processing.

[0010] First, delay circuit 61 applies a delay corresponding to the pitch period to the input signal of synthesis filter 56 in the preceding subframe, that is, the excitation signal, to create adaptive code vectors.

[0011] For example, if the assumed pitch period is 40 to 167 samples, 128 signals delayed by 40 to 167 samples are created as adaptive code vectors and stored in adaptive codebook 62.
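
As a rough illustration of how the 128 delayed vectors could be formed, the sketch below slices a history of past excitation samples at each lag from 40 to 167. The function name and the list-based history are hypothetical. Because the smallest lag equals the 40-sample subframe length in this example, every vector can be read entirely from the past signal.

```python
def build_adaptive_codebook(past_excitation, subframe_len=40,
                            min_lag=40, max_lag=167):
    """Return {lag: vector} with one delayed copy of the past excitation per lag.

    past_excitation: list of earlier excitation samples, newest last.
    Assumption: min_lag >= subframe_len, so no sample of the current
    subframe is needed to fill a vector.
    """
    if len(past_excitation) < max_lag:
        raise ValueError("need at least max_lag past samples")
    if min_lag < subframe_len:
        raise ValueError("this sketch requires min_lag >= subframe_len")
    end = len(past_excitation)
    codebook = {}
    for lag in range(min_lag, max_lag + 1):
        start = end - lag  # go back `lag` samples from the present
        codebook[lag] = past_excitation[start:start + subframe_len]
    return codebook
```

The lag range 40 to 167 gives 167 - 40 + 1 = 128 entries, matching the count in the text.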

[0012] At this time switch 66 is open, and each adaptive code vector is multiplied in multiplier 63 by a gain whose value is varied, then passes through adder 67 and is input unchanged to synthesis filter 56.

[0013] Synthesis filter 56 performs synthesis using the linear prediction parameter α and sends the synthesized vector to subtractor 58. Subtractor 58 generates an error vector by taking the difference between the original speech vector and the synthesized vector, and sends the resulting error vector to perceptual weighting filter 59.

[0014] Perceptual weighting filter 59 applies to the error vector a weighting that takes auditory characteristics into account, and sends the result to error evaluator 60.

[0015] Error evaluator 60 computes the mean square of the error vector, searches for the adaptive code vector that minimizes this mean square value, and sends its delay L and gain β to multiplexer 68. In this way the delay L and gain β of adaptive codebook 62 are determined.
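
The search loop described in paragraphs [0012] to [0015] can be sketched as below. Several simplifications are assumptions of this sketch rather than the patent's method: the perceptual weighting filter is omitted, the synthesis filter starts from zero state and uses the convention y[n] = x[n] + Σ α_k·y[n-k], and the gain is computed in closed form by least squares instead of being stepped through trial values.

```python
def synthesize(excitation, alpha):
    """All-pole synthesis, y[n] = x[n] + sum_k alpha[k-1] * y[n-k], zero initial state."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(alpha, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out


def search_codebook(target, candidates, alpha):
    """Pick the candidate whose scaled synthesis is closest to the target.

    candidates: dict mapping a key (for example the lag L) to a code vector.
    Returns (key, gain, mean_square_error), or None if every candidate
    synthesizes to silence.
    """
    best = None
    for key, vector in candidates.items():
        synth = synthesize(vector, alpha)
        energy = sum(y * y for y in synth)
        if energy == 0:
            continue  # a silent candidate cannot match anything
        # Least-squares optimum gain for this candidate.
        gain = sum(t * y for t, y in zip(target, synth)) / energy
        mse = sum((t - gain * y) ** 2 for t, y in zip(target, synth)) / len(target)
        if best is None or mse < best[2]:
            best = (key, gain, mse)
    return best
```

For the adaptive codebook, the returned key plays the role of the delay L and the returned gain the role of β.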

[0016] Next, the index i and gain γ of stochastic codebook 64 are determined by the following processing.

[0017] Stochastic codebook 64 stores in advance, for example, 512 stochastic code vectors whose dimension corresponds to the subframe length (that is, 40 dimensions in the above example), each assigned an index. At this time switch 66 is closed.

[0018] First, the optimum adaptive code vector determined by the above processing is multiplied by the optimum gain β in multiplier 63 and then sent to adder 67.

[0019] Next, each stochastic code vector is multiplied in multiplier 65 by a gain whose value is varied, and is then input to adder 67. Adder 67 adds each stochastic code vector to the optimum adaptive code vector multiplied by the optimum gain β, and the result is input to synthesis filter 56.

[0020] The subsequent processing is the same as the processing that determines the adaptive codebook parameters. That is, synthesis filter 56 performs synthesis using the linear prediction parameter α and sends the synthesized vector to subtractor 58.

[0021] Subtractor 58 generates an error vector by taking the difference between the original speech vector and the synthesized vector, and sends the resulting error vector to perceptual weighting filter 59.

[0022] Perceptual weighting filter 59 applies to the error vector a weighting that takes auditory characteristics into account, and sends the result to error evaluator 60.

[0023] Error evaluator 60 computes the mean square of the error vector, searches for the stochastic code vector that minimizes this mean square value, and sends its index i and gain γ to multiplexer 68. In this way the index i and gain γ of stochastic codebook 64 are determined.
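
The second search stage can be illustrated in the same spirit. The sketch below assumes a zero-state synthesis filter, so that synthesizing β·v + γ·c equals the synthesis of β·v plus γ times the synthesis of c; this lets the fixed adaptive contribution be subtracted from the target once. The perceptual weighting is again omitted and γ is found in closed form, both of which are simplifications of this sketch and not taken from the patent.

```python
def synthesize(excitation, alpha):
    """All-pole synthesis, y[n] = x[n] + sum_k alpha[k-1] * y[n-k], zero initial state."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(alpha, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out


def search_stochastic(target, adaptive_vector, beta, codebook, alpha):
    """Return (index, gamma, mean_square_error) for the best stochastic vector.

    The adaptive contribution beta * adaptive_vector is held fixed, as in the
    text; only the stochastic index and its gain are searched.
    """
    fixed = synthesize([beta * v for v in adaptive_vector], alpha)
    residual = [t - f for t, f in zip(target, fixed)]
    best = None
    for index, code in enumerate(codebook):
        synth = synthesize(code, alpha)
        energy = sum(y * y for y in synth)
        if energy == 0:
            continue
        gamma = sum(r * y for r, y in zip(residual, synth)) / energy
        mse = sum((r - gamma * y) ** 2 for r, y in zip(residual, synth)) / len(residual)
        if best is None or mse < best[2]:
            best = (index, gamma, mse)
    return best
```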

[0024] Multiplexer 68 multiplexes the quantized linear prediction parameter α, the adaptive codebook delay L and gain β, and the stochastic codebook index i and gain γ.

[0025] Next, the operation of the speech decoding device corresponding to the speech encoding device described above is explained in detail with reference to FIG. 8. FIG. 8 is a block diagram showing the configuration of the decoding device corresponding to the code-excited linear predictive encoding device of FIG. 7.

[0026] In FIG. 8, demultiplexer 78 separates the received signal into the linear prediction parameter α, the adaptive codebook delay L and gain β, and the stochastic codebook index i and gain γ. It outputs the linear prediction parameter α to the synthesis filter, the adaptive codebook delay L and gain β to adaptive codebook 72 and multiplier 73 respectively, and the stochastic codebook index i and gain γ to stochastic codebook 74 and multiplier 75 respectively.

[0027] An adaptive code vector of adaptive codebook 72 is selected on the basis of the adaptive codebook delay L output from demultiplexer 78. Adaptive codebook 72 has the same contents as adaptive codebook 62 in the encoding device; that is, past excitation signals are input to adaptive codebook 72 via delay circuit 71. Multiplier 73 amplifies, by the received gain β, the adaptive code vector input via adaptive code gain interpolation circuit 76, and sends it to adder 79.

[0028] Likewise, a stochastic code vector of stochastic codebook 74 is selected on the basis of the stochastic codebook index i output from demultiplexer 78. Stochastic codebook 74 has the same contents as stochastic codebook 64 in the encoding device. Multiplier 75 amplifies, by the received gain γ, the stochastic code vector input via stochastic code gain interpolation circuit 77, and sends it to adder 79.

[0029] Adder 79 adds the amplified stochastic code vector and the amplified adaptive code vector, and sends the sum to synthesis filter 80 and delay circuit 71.

[0030] Synthesis filter 80 performs synthesis using the received linear prediction parameter α as its coefficients and outputs the result as a synthesized speech signal.
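
The decoder path of paragraphs [0026] to [0030] can be sketched for one subframe as follows. The function name, the list-based state, and the filter convention y[n] = x[n] + Σ α_k·y[n-k] are assumptions of this sketch, and the gain interpolation circuits 76 and 77 are left out for brevity.

```python
def decode_subframe(past_excitation, lag, beta, index, gamma,
                    stochastic_codebook, alpha, filter_memory):
    """Rebuild one subframe of speech from the received parameters.

    past_excitation: earlier excitation samples, newest last.
    filter_memory:   previous synthesis outputs, most recent first,
                     one entry per coefficient in alpha.
    Returns (speech, excitation, new_filter_memory).
    Assumption: lag >= subframe length, as in the 40..167 example.
    """
    code = stochastic_codebook[index]
    n = len(code)
    if lag < n or lag > len(past_excitation):
        raise ValueError("lag out of range for this sketch")
    start = len(past_excitation) - lag
    adaptive = past_excitation[start:start + n]

    # Adder 79: scaled adaptive vector plus scaled stochastic vector.
    excitation = [beta * a + gamma * c for a, c in zip(adaptive, code)]

    # Synthesis filter 80, carrying its state across subframes.
    memory = list(filter_memory)
    speech = []
    for x in excitation:
        y = x + sum(a * m for a, m in zip(alpha, memory))
        speech.append(y)
        memory = ([y] + memory)[:len(alpha)]
    return speech, excitation, memory
```

The caller would append the returned excitation to `past_excitation` before decoding the next subframe, which corresponds to the feedback through delay circuit 71.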

[0031] A speech encoding device based on linear predictive analysis as described above has the advantage of achieving high-quality coding at a relatively low bit rate. Such devices are designed on the premise of the quasi-periodic voiced sounds produced by humans, and an analysis length of about 20 ms per frame is considered appropriate.

Problems to be Solved by the Invention

[0032] However, conventional speech encoding devices such as the one described above cannot encode non-speech signals well, and in particular suffer a sharp drop in sound quality when background noise or the like is mixed in.

[0033] Expected applications of such speech encoding devices include mobile telephones and voice recorders. Because these are assumed to be used in a wide range of environments, including ones where background noise is mixed in, the sound quality degradation described above is a problem that must be solved in order to realize an attractive product.

[0034] The present invention has been made in view of the above circumstances, and its object is to provide a speech encoding device with good sound quality that can encode well even when a non-speech signal is input.

Means for Solving the Problems

[0035] To achieve the above object, a speech encoding device according to a first invention comprises: speech discrimination means for discriminating whether an input signal divided into predetermined frame intervals is a speech signal or a non-speech signal; linear predictive analysis means for outputting a spectral parameter of the input signal; control means for causing the linear predictive analysis means to keep outputting, as the spectral parameter of the input signal, the spectral parameter of a predetermined preceding frame when the discrimination result of the speech discrimination means has been non-speech for a predetermined number of consecutive frames; excitation signal generation means for generating an excitation signal corresponding to the linear prediction residual signal; and a synthesis filter for synthesizing speech from the excitation signal on the basis of the spectral parameter.

[0036] A speech encoding device according to a second invention comprises: speech discrimination means for discriminating whether an input signal divided into predetermined frame intervals is a speech signal or a non-speech signal; linear predictive analysis means for outputting a spectral parameter of the input signal; control means for, when the discrimination result of the speech discrimination means is a non-speech signal, buffering the input signal until a frame is next discriminated as a speech signal, within a range not exceeding a predetermined number of frames, and causing the linear predictive analysis means to perform linear predictive analysis on the buffered input signal collectively; excitation signal generation means for generating an excitation signal corresponding to the linear prediction residual signal; and a synthesis filter for synthesizing speech from the excitation signal on the basis of the spectral parameter.

[0037] Accordingly, in the speech encoding device according to the first invention, the speech discrimination means discriminates whether the input signal divided into predetermined frame intervals is a speech signal or a non-speech signal, and the linear predictive analysis means outputs the spectral parameter of the input signal. When the discrimination result of the speech discrimination means has been non-speech for a predetermined number of consecutive frames, the control means causes the linear predictive analysis means to keep outputting the spectral parameter of a predetermined preceding frame as the spectral parameter of the input signal. The excitation signal generation means generates an excitation signal corresponding to the linear prediction residual signal, and the synthesis filter synthesizes speech from the excitation signal on the basis of the spectral parameter.

[0038] In the speech encoding device according to the second invention, the speech discrimination means discriminates whether the input signal divided into predetermined frame intervals is a speech signal or a non-speech signal, and the linear predictive analysis means outputs the spectral parameter of the input signal. When the discrimination result of the speech discrimination means is a non-speech signal, the control means buffers the input signal until a frame is next discriminated as a speech signal, within a range not exceeding a predetermined number of frames, and causes the linear predictive analysis means to perform linear predictive analysis on the buffered input signal collectively. The excitation signal generation means generates an excitation signal corresponding to the linear prediction residual signal, and the synthesis filter synthesizes speech from the excitation signal on the basis of the spectral parameter.

Embodiments of the Invention

[0039] Embodiments of the present invention are described below with reference to the drawings. FIGS. 1 to 4 show a first embodiment of the present invention, and FIG. 1 is a block diagram showing the configuration of the speech encoding device.

[0040] As shown in FIG. 1, the output of buffer memory 1, which is connected to the input terminal, branches three ways. The first output is connected to subtractor 8 via subframe divider 7. The second output is connected to the input of changeover switch 4. The third output is connected, via speech discriminator 2 serving as the speech discrimination means, to switch control circuit 3, which serves as the control means that controls changeover switch 4.

[0041] Changeover switch 4 has one output terminal a connected to synthesis filter 6, and the other output terminal b connected to synthesis filter 6 via LPC analyzer 5, which serves as the linear predictive analysis means.

[0042] The output terminal of subtractor 8 is connected to the input terminal of error evaluator 10 via perceptual weighting filter 9, and the output terminal of error evaluator 10 is connected to adaptive codebook 12, stochastic codebook 14, and multipliers 13 and 15.

[0043] Adaptive codebook 12 is connected to the first input terminal of adder 17 via multiplier 13, and stochastic codebook 14 is connected to the second input terminal of adder 17 via multiplier 15 and switch 16.

[0044] The output terminal of adder 17 is connected to the input terminal of subtractor 8 via synthesis filter 6, and to adaptive codebook 12 via delay circuit 11.

[0045] Multiplexer 18 is connected to speech discriminator 2, LPC analyzer 5, and error evaluator 10.

[0046] In the speech encoding device described above, the excitation signal generation means, which generates an excitation signal corresponding to the linear prediction residual signal, comprises delay circuit 11, adaptive codebook 12, stochastic codebook 14, multipliers 13 and 15, switch 16, adder 17, and so on.

[0047] FIG. 2 is a block diagram showing the configuration of speech discriminator 2 in more detail.

[0048] The output signal of buffer memory 1 input to speech discriminator 2 branches two ways, one branch going to frame energy analysis circuit 2a and the other to initial frame energy analysis circuit 2b.

[0049] Frame energy analysis circuit 2a is connected to the first input terminal of adder 2c, which is its + terminal, and initial frame energy analysis circuit 2b is connected to the second input terminal of adder 2c, which is its - terminal. Initial frame energy analysis circuit 2b is also connected to threshold decision circuit 2d.

[0050] The output terminals of adder 2c and threshold decision circuit 2d are both connected to discrimination circuit 2e, whose output is sent to switch control circuit 3.

[0051] Next, the signal flow in the configuration shown in FIGS. 1 and 2 is described.

[0052] An original speech signal sampled at, for example, 8 kHz (that is, 1/8 ms per sample) is input from the input terminal, and the speech signal of a predetermined frame interval (for example 20 ms, that is, 160 samples) is stored in buffer memory 1.

[0053] Buffer memory 1 sends the input signal, frame by frame, to subframe divider 7 and speech discriminator 2.

[0054] Speech discriminator 2 discriminates whether the input signal of a frame is speech or non-speech, for example by the method described below.

[0055] In speech discriminator 2 configured as shown in FIG. 2, frame energy analysis circuit 2a calculates the frame energy Ef of the input frame signal by the following equation.

[0056]

[Equation 1] (equation image not reproduced in this text)

Here, s(n) is the input signal at sample n and N is the frame length.
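
Since Equation 1 itself is not available here, the following is only one plausible form, not the patent's formula: the mean-square value of the frame expressed in dB, Ef = 10·log10((1/N)·Σ s(n)²). The dB scale is an assumption suggested by the dB-based threshold described in the text; the function name and the floor that guards against log(0) are also hypothetical.

```python
import math


def frame_energy_db(frame, floor=1e-10):
    """Assumed frame energy: 10*log10 of the mean square of the samples.

    The floor keeps an all-zero frame from producing log10(0); it is an
    implementation detail of this sketch.
    """
    n = len(frame)
    if n == 0:
        raise ValueError("empty frame")
    mean_square = sum(s * s for s in frame) / n
    return 10.0 * math.log10(max(mean_square, floor))
```

Under this assumption the same function would serve for both Ef and the initial frame energy Eb.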

[0057] Initial frame energy analysis circuit 2b calculates the frame energy Eb at the start of encoding using an equation of the same form as Equation 1.

[0058] Threshold decision circuit 2d determines a threshold according to the magnitude of the background noise energy. For example, as shown in FIG. 3, the threshold is determined by a relationship in which the threshold, in dB, decreases as the background noise energy, in dB, increases. The result is sent to discrimination circuit 2e.

[0059] Adder 2c receives the frame energy Ef as a positive input and the initial frame energy Eb as a negative input and adds them, thereby subtracting the initial frame energy Eb from the frame energy Ef, and sends the difference to discrimination circuit 2e.

[0060] Discrimination circuit 2e compares the input difference with the threshold. If the difference is larger than the threshold, it judges the frame input signal to be a speech segment; otherwise it judges it to be a non-speech segment.
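
The decision rule of paragraphs [0058] to [0060] can be sketched as below. The text only says that the threshold falls as the background noise energy rises (FIG. 3), so the linear mapping and its constants (`base_db`, `slope`, `min_db`) are purely illustrative assumptions. Using the initial frame energy Eb as the background noise estimate follows the connection from circuit 2b to circuit 2d.

```python
def threshold_db(noise_db, base_db=12.0, slope=0.2, min_db=3.0):
    """Hypothetical FIG. 3 curve: threshold decreases as noise energy increases."""
    return max(min_db, base_db - slope * max(noise_db, 0.0))


def is_speech(frame_energy_db, initial_energy_db):
    """Discrimination circuit 2e: speech if (Ef - Eb) exceeds the threshold.

    A difference exactly equal to the threshold counts as non-speech,
    matching the 'larger than' wording in the text.
    """
    difference = frame_energy_db - initial_energy_db   # adder 2c
    return difference > threshold_db(initial_energy_db)  # circuits 2d and 2e
```

With these illustrative constants, a frame 10 dB above a 30 dB noise floor is classified as speech, while one only 4 dB above it is not.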

[0061] Returning to FIG. 1, subframe divider 7 divides the input signal of a frame into predetermined subframe intervals (for example 5 ms, that is, 40 samples). Four subframe signals, the first through fourth subframes, are thus created from the input signal of one frame.

[0062] LPC analyzer 5 performs linear predictive analysis (LPC analysis) on the input signal, extracts the linear prediction parameter α, which is a spectral parameter representing the spectral characteristics, and sends it to synthesis filter 6 and multiplexer 18.

[0063] Next, the operation of switch control circuit 3 is described with reference to the flowchart of FIG. 4. FIG. 4 is a flowchart showing the operation of the speech encoding device.

[0064] When encoding starts, the variable i, which counts consecutive non-speech frames, is set to 0 (step S1).

[0065] Next, it is determined whether the discrimination result of speech discriminator 2 is speech (v) or non-speech (uv) (step S2).

[0066] If the result in step S2 is non-speech, the variable i is incremented by 1 (step S3), and it is determined whether i is larger than a predetermined number R (for example 10) (step S4).

[0067] If i is larger than the predetermined number R (for example 10) in step S4, the terminal of changeover switch 4 is closed to side a (step S5) and the spectral parameter of the preceding frame continues to be used (step S6). The device then waits for processing of the next frame (step S7).

[0068] If, on the other hand, the result in step S2 is speech, the variable i is reset to 0 (step S8), the terminal of changeover switch 4 is closed to side b (step S9), and LPC analyzer 5 performs LPC analysis to update the spectral parameter (step S10). Processing then proceeds to step S7 and waits for the next frame.

[0069] If i is not larger than the predetermined number R (for example 10) in step S4, processing proceeds to step S8.
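
The flow of steps S1 to S10 can be sketched as a small state machine. One point is an interpretation and should be read as such: the text sends the "i not larger than R" branch to step S8, which also resets i; taken literally, i could then never exceed R. The sketch below instead assumes that this branch goes on to side b and LPC analysis while keeping the count, so that a long non-speech run eventually selects side a. The class and method names are hypothetical.

```python
class SwitchController:
    """Tracks consecutive non-speech frames and picks the switch side per frame."""

    def __init__(self, hold_after=10):
        self.hold_after = hold_after  # the predetermined number R
        self.count = 0                # variable i, cleared at start (S1)

    def step(self, speech):
        """Return 'b' to run LPC analysis, or 'a' to reuse the previous parameter."""
        if speech:                        # S2: speech (v)
            self.count = 0                # S8
            return "b"                    # S9, S10
        self.count += 1                   # S3
        if self.count > self.hold_after:  # S4
            return "a"                    # S5, S6
        # Assumption: short non-speech runs are still analyzed, count kept.
        return "b"
```

For example, with R = 2 the frame sequence speech, non-speech ×4, speech, non-speech produces the switch positions b, b, b, a, a, b, b.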

[0070] Returning again to FIG. 1, the delay L and gain β of adaptive codebook 12 and the index i and gain γ of the stochastic codebook are determined in the same way as described for the conventional example.

[0071] That is, first, the delay L and gain β of adaptive codebook 12 are determined by the following processing.

[0072] Delay circuit 11 applies a delay corresponding to the pitch period to the input signal of synthesis filter 6 in the preceding subframe, that is, the excitation signal, to create adaptive code vectors.

[0073] For example, if the assumed pitch period is 40 to 167 samples, 128 signals delayed by 40 to 167 samples are created as adaptive code vectors and stored in adaptive codebook 12.

[0074] At this time switch 16 is open, and each adaptive code vector is multiplied in multiplier 13 by a gain whose value is varied, then passes through adder 17 and is input unchanged to synthesis filter 6.

[0075] Synthesis filter 6 performs synthesis using the linear prediction parameter α and sends the synthesized vector to subtractor 8. Subtractor 8 generates an error vector by taking the difference between the original speech vector and the synthesized vector, and sends the resulting error vector to perceptual weighting filter 9.

【００７６】この聴感重み付けフィルタ９は、誤差ベク
トルに対して聴感特性を考慮した重み付け処理を行い、
誤差評価器１０に送出する。The perceptual weighting filter 9 weights the error vector in consideration of perceptual characteristics and sends the result to the error evaluator 10.

【００７７】誤差評価器１０は、誤差ベクトルの２乗平
均を計算し、その２乗平均値が最小となる適応コードベ
クトルを検索して、その遅れＬとゲインβをマルチプレ
クサ１８に送出する。このようにして、適応コードブッ
ク１２の遅延Ｌとゲインβが決定される。The error evaluator 10 calculates the mean square of the error vector, searches for an adaptive code vector having the smallest mean square value, and sends the delay L and the gain β to the multiplexer 18. Thus, the delay L and the gain β of the adaptive codebook 12 are determined.
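
The search for the delay L and gain β can be sketched as an exhaustive loop over candidate vectors and gain values. The sketch below rests on several assumptions that are not taken from the source: the synthesis filter is modeled as an all-pole recursion with the sign convention s[n] = e[n] + Σ α[k]·s[n-1-k], filter memory from earlier subframes is ignored, the gain is drawn from a small hypothetical table `gains`, and the perceptual weighting filter 9 is passed in as `weight` with an identity default.

```python
def synthesize(excitation, alpha):
    """All-pole synthesis with zero initial state (assumed convention)."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(alpha):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def mean_square(vec):
    return sum(x * x for x in vec) / len(vec)


def search_adaptive(target, codebook, alpha, gains, weight=lambda v: v):
    """Return (lag, gain, mse) minimizing the weighted mean-square error.

    codebook maps lag -> adaptive code vector; gains is a hypothetical
    table of candidate gain values tried for every vector.
    """
    best = None
    for lag, vec in codebook.items():
        for g in gains:
            synth = synthesize([g * x for x in vec], alpha)       # multiplier 13, filter 6
            err = weight([t - s for t, s in zip(target, synth)])  # subtractor 8, filter 9
            mse = mean_square(err)                                # error evaluator 10
            if best is None or mse < best[0]:
                best = (mse, lag, g)
    return best[1], best[2], best[0]
```

A practical encoder would usually compute the optimal gain in closed form for each lag rather than scanning a table; the table is kept here only because the text describes multiplying by a varied gain value.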

【００７８】続いて、確率コードブック１４のインデッ
クスｉとゲインγは、以下の処理によって決定される。Subsequently, the index i and the gain γ of the probability codebook 14 are determined by the following processing.

【００７９】確率コードブック１４は、サブフレーム長
に対応する次元数（すなわち、上述の例では４０次元）
の確率コードベクトルが、例えば５１２種類予め格納さ
れており、各々にインデックスが付与されている。な
お、このときにはスイッチ１６は閉じた状態となってい
る。The probability codebook 14 stores in advance, for example, 512 probability code vectors whose number of dimensions corresponds to the subframe length (that is, 40 dimensions in the above example), and each vector is assigned an index. At this time, the switch 16 is closed.

【００８０】まず、上記処理によって決定された最適な
適応コードベクトルを、乗算器１３で最適ゲインβを乗
じた後に、加算器１７に送出する。First, the optimum adaptive code vector determined by the above processing is multiplied by the optimum gain β in the multiplier 13, and then sent to the adder 17.

【００８１】次に、各確率コードベクトルを乗算器１５
でゲイン値を可変して乗じた後に、加算器１７に入力す
る。加算器１７は上記最適ゲインβを乗じた最適な適応
コードベクトルと各確率コードベクトルの加算を行い、
その結果が合成フィルタ６に入力される。Next, each probability code vector is multiplied by a variable gain value in the multiplier 15 and input to the adder 17. The adder 17 adds the optimal adaptive code vector multiplied by the optimal gain β to each probability code vector, and the result is input to the synthesis filter 6.

【００８２】この後の処理は、上記適応コードブックパ
ラメータの決定処理と同様に行われる。すなわち、合成
フィルタ６は線形予測パラメータαを用いて合成処理を
行い、合成ベクトルを減算器８に送出する。Subsequent processing is performed in the same manner as the adaptive codebook parameter determination processing. That is, the synthesis filter 6 performs the synthesis process using the linear prediction parameter α, and sends the synthesized vector to the subtractor 8.

【００８３】減算器８は原音声ベクトルと合成ベクトル
との減算を行うことにより誤差ベクトルを生成し、得ら
れた誤差ベクトルを聴感重み付けフィルタ９に送出す
る。The subtracter 8 generates an error vector by subtracting the original speech vector and the synthetic vector, and sends the obtained error vector to the perceptual weighting filter 9.

【００８４】聴感重み付けフィルタ９は、誤差ベクトル
に対して聴感特性を考慮した重み付け処理を行い、誤差
評価器１０に送出する。The perceptual weighting filter 9 performs a weighting process on the error vector in consideration of the perceptual characteristic, and sends it to the error evaluator 10.

【００８５】誤差評価器１０は、誤差ベクトルの２乗平
均を計算して、その２乗平均値が最小となる確率コード
ベクトルを検索して、そのインデックスｉとゲインγを
マルチプレクサ１８に送出する。このようにして、確率
コードブック１４のインデックスｉとゲインγが決定さ
れる。The error evaluator 10 calculates the mean square of the error vector, searches for the probability code vector having the smallest mean square value, and sends the index i and the gain γ to the multiplexer 18. Thus, the index i and the gain γ of the probability codebook 14 are determined.
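
The second search stage can be sketched in the same way: the adaptive contribution is held fixed at the gain β found earlier, and each probability code vector is added to it before synthesis. As before, this is only an illustration under stated assumptions: the all-pole sign convention, the zero initial filter state, the hypothetical gain table `gains`, and the omission of the perceptual weighting filter 9 are simplifications, and the function names are not from the source.

```python
def synthesize(excitation, alpha):
    """All-pole synthesis with zero initial state (assumed convention)."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(alpha):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def search_probability_codebook(target, adaptive_vec, beta,
                                code_vectors, alpha, gains):
    """Return (index, gamma) minimizing the mean-square error.

    adaptive_vec and beta are the results of the adaptive codebook search;
    code_vectors is the list of stored probability code vectors.
    """
    best = None
    for index, code in enumerate(code_vectors):
        for gamma in gains:
            # Adder 17: fixed adaptive part plus scaled code vector.
            excitation = [beta * a + gamma * c
                          for a, c in zip(adaptive_vec, code)]
            synth = synthesize(excitation, alpha)
            mse = sum((t - s) ** 2 for t, s in zip(target, synth)) / len(target)
            if best is None or mse < best[0]:
                best = (mse, index, gamma)
    return best[1], best[2]
```

With 512 stored vectors, as in the example of the text, the returned index fits in 9 bits.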

【００８６】上記マルチプレクサ１８は、量子化された
線形予測パラメータα、適応コードブックの遅れＬとゲ
インβ、確率コードブックのインデックスｉとゲインγ
の各々をマルチプレクスして伝送する。The multiplexer 18 multiplexes and transmits the quantized linear prediction parameter α, the delay L and gain β of the adaptive codebook, and the index i and gain γ of the probability codebook.

【００８７】なお、上述したような音声符号化装置に対
応する音声復号化装置の復号化動作は、上記従来例にお
いて説明したものと同様である。The decoding operation of the speech decoding apparatus corresponding to the speech coding apparatus as described above is the same as that described in the conventional example.

【００８８】また、音声判別器２からマルチプレクサ１
８に音声／非音声の判別結果ｖ／ｕｖを出力して、伝送
する符号化パラメータにｖ／ｕｖの情報も入れるように
し、これに対応する復号化装置に、この符号化装置と同
様のスイッチ制御回路および切替スイッチを設けて、ｖ
／ｕｖの情報に基づいて切替スイッチの制御を行うよう
にすれば、より高効率に符号化可能な可変ビットレート
符号化装置／復号化装置を構成することができる。Further, if the voice/non-voice discrimination result v/uv is output from the voice discriminator 2 to the multiplexer 18 so that the v/uv information is included in the transmitted encoding parameters, and the corresponding decoding device is provided with a switch control circuit and a changeover switch similar to those of this encoding device so that the changeover switch is controlled based on the v/uv information, a variable-bit-rate encoding/decoding device capable of more efficient encoding can be configured.

【００８９】このような第１の実施形態によれば、入力
信号が音声信号であるか否かを判別して、非音声信号が
所定フレーム数にわたって連続した場合に、ＬＰＣ分析
器に所定の先行フレームにおける線形予測パラメータを
継続して出力させることにより、非音声信号における線
形予測パラメータの切り替えに起因する符号化音声の歪
みが減少するために、背景雑音等の非音声信号が混入し
ても、良好に音声信号を符号化することができる高品質
な音声符号化装置となる。According to the first embodiment described above, it is determined whether or not the input signal is a voice signal, and when a non-voice signal continues for a predetermined number of frames, the LPC analyzer is made to keep outputting the linear prediction parameter of a predetermined preceding frame. This reduces the distortion of the coded speech caused by switching of the linear prediction parameter during non-voice signals, so that a high-quality speech coding apparatus is obtained that can encode a voice signal well even when a non-voice signal such as background noise is mixed in.
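
The control behavior summarized in this paragraph can be sketched as a small state machine. The class name `SwitchController`, the `analyze` callback standing in for the LPC analyzer 5, and the choice to hold the most recently computed parameter are assumptions made for illustration; the comparison "larger than R" follows the step S4 test described earlier.

```python
class SwitchController:
    """Hypothetical model of the switch control circuit 3."""

    def __init__(self, analyze, max_run=10):
        self.analyze = analyze   # stand-in for the LPC analyzer 5
        self.max_run = max_run   # predetermined number R
        self.run = 0             # consecutive non-voice frames (variable i)
        self.params = None       # last spectrum parameter that was output

    def process(self, frame, is_voice):
        if is_voice:
            self.run = 0
        else:
            self.run += 1
        if self.run > self.max_run and self.params is not None:
            # Non-voice has lasted more than R frames: keep outputting
            # the parameter of the preceding frame.
            return self.params
        self.params = self.analyze(frame)
        return self.params
```

Holding the parameter only after the run exceeds R means short pauses between words are still analyzed frame by frame, while long stretches of background noise are coded with a stable spectral envelope.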

【００９０】図５，図６は本発明の第２の実施形態を示
したものであり、図５は音声符号化装置の構成を示すブ
ロック図である。この第２の実施形態において、上述の
第１の実施形態と同様である部分については説明を省略
し、主として異なる点についてのみ説明する。FIG. 5 and FIG. 6 show the second embodiment of the present invention, and FIG. 5 is a block diagram showing the configuration of the speech coding apparatus. In the second embodiment, a description of the same parts as those in the first embodiment will be omitted, and only different points will be mainly described.

【００９１】この第２実施形態の音声符号化装置は、上
記図１に示したものとほぼ同様であるが、図５に示すよ
うに、入力端子には上記バッファメモリ１と同様の機能
を果たす第１バッファメモリ２１が接続されている。The speech coding apparatus of the second embodiment is almost the same as that shown in FIG. 1, but, as shown in FIG. 5, a first buffer memory 21 having the same function as the buffer memory 1 is connected to the input terminal.

【００９２】この第１バッファメモリ２１の出力端は３
つに分岐されていて、第１の出力端はサブフレーム分割
器７を介して減算器８に接続され、第２の出力端は第２
バッファメモリ２３に接続され、第３の出力端は音声判
別手段たる音声判別器２を介して上記第２バッファメモ
リ２３の制御を行う制御手段たるバッファ制御回路２２
に接続されている。The output of the first buffer memory 21 branches three ways: the first output is connected to the subtractor 8 via the subframe divider 7, the second output is connected to the second buffer memory 23, and the third output is connected, via the voice discriminator 2 serving as the voice discriminating means, to the buffer control circuit 22 serving as the control means that controls the second buffer memory 23.

【００９３】上記第２バッファメモリ２３は、線形予測
分析手段たるＬＰＣ分析器５を介して合成フィルタ６に
接続されている。The second buffer memory 23 is connected to the synthesis filter 6 via the LPC analyzer 5 which is a linear predictive analysis means.

【００９４】その他の部分は上記図１と同様である。The other parts are the same as in FIG. 1.

【００９５】次に、上記図５に示したような構成におけ
る信号の流れを説明する。Next, the signal flow in the configuration shown in FIG. 5 will be described.

【００９６】入力端子から例えば８ｋＨｚ（すなわち、
１サンプル当たり１／８ｍｓ）でサンプリングされた原
音声信号を入力して、予め定められたフレーム間隔（例
えば２０ｍｓ、すなわち１６０サンプル）の音声信号を
第１バッファメモリ２１に格納する。An original audio signal sampled at, for example, 8 kHz (that is, 1/8 ms per sample) is input from the input terminal, and the audio signal of a predetermined frame interval (for example, 20 ms, that is, 160 samples) is stored in the first buffer memory 21.

【００９７】第１バッファメモリ２１は、フレーム単位
で入力信号をサブフレーム分割器７と音声判別器２に送
出する。この音声判別器２は、フレームの入力信号が音
声か非音声かを、例えば上記第１の実施形態に説明した
ような方法で判別する。The first buffer memory 21 sends the input signal to the subframe divider 7 and the voice discriminator 2 in frame units. The voice discriminator 2 discriminates whether the input signal of the frame is voice or non-voice by, for example, the method described in the first embodiment.

【００９８】サブフレーム分割器７は、フレームの入力
信号を予め定められたサブフレーム間隔（例えば５ｍ
ｓ、つまり４０サンプル）に分割する。すなわち、１フ
レームの入力信号から、第１サブフレームから第４サブ
フレームまでの４つのサブフレーム信号が作成される。The subframe divider 7 divides the input signal of the frame into predetermined subframe intervals (for example, 5 ms, that is, 40 samples). That is, four subframe signals, from the first subframe to the fourth subframe, are created from the input signal of one frame.
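
The frame and subframe sizes quoted above follow directly from the sampling rate, as the short sketch below shows. The helper names `samples_for` and `split_into_subframes` are hypothetical and only illustrate the arithmetic and the division performed by the subframe divider 7.

```python
def samples_for(duration_ms, rate_hz=8000):
    """Number of samples in duration_ms at rate_hz (integer arithmetic)."""
    return rate_hz * duration_ms // 1000


def split_into_subframes(frame, subframe_len):
    """Cut one frame into consecutive, equal-length subframes."""
    if len(frame) % subframe_len != 0:
        raise ValueError("frame length must be a multiple of subframe_len")
    return [frame[i:i + subframe_len]
            for i in range(0, len(frame), subframe_len)]


frame_len = samples_for(20)      # 20 ms frame at 8 kHz
subframe_len = samples_for(5)    # 5 ms subframe at 8 kHz
subframes = split_into_subframes(list(range(frame_len)), subframe_len)
```

Here `frame_len` is 160, `subframe_len` is 40, and `subframes` holds four lists of 40 samples each.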

【００９９】ＬＰＣ分析器５は、入力信号に対して線形
予測分析（ＬＰＣ分析）を行って、スペクトル特性を表
す線形予測パラメータαを抽出し、合成フィルタ６およ
びマルチプレクサ１８に送出する。The LPC analyzer 5 performs a linear prediction analysis (LPC analysis) on the input signal, extracts a linear prediction parameter α representing a spectral characteristic, and sends it to the synthesis filter 6 and the multiplexer 18.

【０１００】次に、上記バッファ制御回路２２の動作を
図６を参照して説明する。図６は音声符号化装置の動作
を示すフローチャートである。Next, the operation of the buffer control circuit 22 will be described with reference to FIG. FIG. 6 is a flowchart showing the operation of the speech coding apparatus.

【０１０１】符号化が開始されると、非音声フレーム連
続数を示す変数ｉを０にセットする（ステップＳ２
１）。When encoding is started, a variable i indicating the number of consecutive non-voice frames is set to 0 (step S21).

【０１０２】次に、音声判別器２における判別結果が音
声（ｖ）であるか非音声（ｕｖ）であるかを判定する
（ステップＳ２２）。Next, it is determined whether the discrimination result by the voice discriminator 2 is voice (v) or non-voice (uv) (step S22).

【０１０３】このステップＳ２２における判別結果が非
音声である場合には、第２バッファメモリ２３にバッフ
ァリングを行い（ステップＳ２３）、変数ｉを１増分し
て（ステップＳ２４）、変数ｉが所定数Ｒ（例えば１
０）より小さいか否かを判定する（ステップＳ２５）。If the determination result in step S22 is non-voice, the frame is buffered in the second buffer memory 23 (step S23), the variable i is incremented by 1 (step S24), and it is determined whether or not the variable i is smaller than a predetermined number R (for example, 10) (step S25).

【０１０４】このステップＳ２５において変数ｉが所定
数Ｒ（例えば１０）より小さい場合には、第２バッファ
メモリ２３の内容を一括してＬＰＣ分析して（ステップ
Ｓ２６）、その後次のフレームの処理を待ち（ステップ
Ｓ２７）、一方、ステップＳ２５において変数ｉが所定
数Ｒ以上である場合には、上記ステップＳ２６を行うこ
となく上記ステップＳ２７へ行く。If the variable i is smaller than the predetermined number R (for example, 10) in step S25, the contents of the second buffer memory 23 are LPC-analyzed collectively (step S26), and the process then waits for the next frame (step S27). On the other hand, if the variable i is equal to or greater than the predetermined number R in step S25, the process goes to step S27 without performing step S26.

【０１０５】一方、上記ステップＳ２２における判別結
果が音声である場合には、非音声フレーム連続数を示す
変数ｉを０にリセットした後に（ステップＳ２８）、Ｌ
ＰＣ分析器５によりＬＰＣ分析を行ってスペクトルパラ
メータを更新する（ステップＳ２９）。その後、上記ス
テップＳ２７に進んで次のフレームの処理を待つ。On the other hand, if the determination result in step S22 is voice, the variable i indicating the number of consecutive non-voice frames is reset to 0 (step S28), and the LPC analyzer 5 then performs LPC analysis to update the spectrum parameter (step S29). The process then proceeds to step S27 and waits for the next frame.
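
Steps S21 to S29 can be sketched as follows. The class name `BufferController` and the `analyze` callback standing in for the LPC analyzer 5 are hypothetical. The sketch also assumes that the second buffer memory 23 is emptied when a voice frame arrives and that a voice frame is analyzed on its own; the text does not state either point explicitly.

```python
class BufferController:
    """Hypothetical model of the buffer control circuit 22."""

    def __init__(self, analyze, max_run=10):
        self.analyze = analyze   # stand-in for the LPC analyzer 5
        self.max_run = max_run   # predetermined number R
        self.run = 0             # S21: consecutive non-voice frames (variable i)
        self.buffer = []         # second buffer memory 23
        self.params = None       # current spectrum parameter

    def process(self, frame, is_voice):
        if is_voice:                                   # S22: voice
            self.run = 0                               # S28
            self.buffer = []                           # assumption: discard buffered noise
            self.params = self.analyze(list(frame))    # S29
        else:                                          # S22: non-voice
            self.buffer.extend(frame)                  # S23
            self.run += 1                              # S24
            if self.run < self.max_run:                # S25
                self.params = self.analyze(self.buffer)  # S26: collective analysis
        return self.params                             # S27: wait for the next frame
```

Because the analysis in S26 runs over all buffered non-voice samples, the parameter estimate is averaged over a longer window and changes more slowly from frame to frame; once i reaches R the last estimate is simply kept.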

【０１０６】適応コードブック１２の遅れＬ、ゲイン
β、確率コードブックのインデックスｉ、ゲインγは、
上記従来例において説明した方法と同様に決定される。The delay L and gain β of the adaptive codebook 12 and the index i and gain γ of the probability codebook are determined in the same manner as described in the conventional example.

【０１０７】マルチプレクサ１８は、量子化された線形
予測パラメータα、適応コードブックの遅れＬ、ゲイン
β、確率コードブックのインデックスｉ、ゲインγの各
々をマルチプレクスして伝送する。The multiplexer 18 multiplexes and transmits the quantized linear prediction parameter α, the adaptive codebook delay L, the gain β, the probability codebook index i, and the gain γ.

【０１０８】なお、上述したような音声符号化装置に対
応する音声復号化装置の復号化動作は、上記従来例にお
いて説明したものと同様である。The decoding operation of the speech decoding apparatus corresponding to the speech coding apparatus as described above is the same as that described in the conventional example.

【０１０９】また、この実施形態においても、伝送する
符号化パラメータにｖ／ｕｖの情報も入れるようにし
て、復号化装置もこれに対応した構成としても良い。Also in this embodiment, the v / uv information may be included in the coding parameter to be transmitted, and the decoding device may have a structure corresponding thereto.

【０１１０】このような第２の実施形態によれば、バッ
ファメモリを用いることで、上述の第１の実施形態と同
様に、良好に音声信号を符号化することができる高品質
な音声符号化装置となる。According to the second embodiment described above, the use of a buffer memory provides, as in the first embodiment, a high-quality speech coding apparatus that can encode a voice signal well.

【０１１１】なお、上記第１，第２の実施形態の音声判
別器における音声判別方法は、一例として述べたもので
あって、上述した手段に限るものではない。The voice discrimination methods in the voice discriminators of the first and second embodiments are described as an example, and are not limited to the above-mentioned means.

【０１１２】また、上記第１，第２の実施形態において
は、コード駆動線形予測符号化装置を一例として取り上
げて説明したが、線形予測パラメータと、線形予測残差
信号に相当する駆動音源信号のパラメータとで表現する
符号化装置であれば、当然にして、何れのものにも適用
することが可能である。In the first and second embodiments, the code-driven linear predictive coding apparatus has been described as an example. However, the linear predictive parameter and the drive excitation signal corresponding to the linear predictive residual signal are used. As long as the encoding device is expressed by parameters, it can be applied to any of them.

【０１１３】［付記］以上詳述したような本発明の上記
実施形態によれば、以下のごとき構成を得ることができ
る。[Additional Notes] According to the above-described embodiment of the present invention as described in detail above, the following configuration can be obtained.

【０１１４】（１）予め定められたフレーム間隔に分
割された入力信号が、音声信号か非音声信号かを判別す
る音声判別手段と、上記入力信号のスペクトルパラメー
タを出力する線形予測分析手段と、上記音声判別手段に
よる判別結果が非音声信号であることが所定フレーム数
にわたって連続した場合に、上記入力信号のスペクトル
パラメータとして、上記線形予測分析手段に所定の先行
フレームにおけるスペクトルパラメータを継続して出力
させる制御手段と、線形予測残差信号に相当する駆動音
源信号を生成する駆動音源信号生成手段と、上記スペク
トルパラメータに基づいて上記駆動音源信号から音声を
合成する合成フィルタと、を具備したことを特徴とする
音声符号化装置。(1) A speech coding apparatus comprising: voice discriminating means for discriminating whether an input signal divided at predetermined frame intervals is a voice signal or a non-voice signal; linear prediction analysis means for outputting a spectrum parameter of the input signal; control means for causing, when the discrimination result of the voice discriminating means has been a non-voice signal continuously for a predetermined number of frames, the linear prediction analysis means to keep outputting the spectrum parameter of a predetermined preceding frame as the spectrum parameter of the input signal; driving sound source signal generating means for generating a driving sound source signal corresponding to a linear prediction residual signal; and a synthesis filter for synthesizing speech from the driving sound source signal based on the spectrum parameter.

【０１１５】（２）予め定められたフレーム間隔に分
割された入力信号が、音声信号か非音声信号かを判別す
る音声判別手段と、上記入力信号のスペクトルパラメー
タを出力する線形予測分析手段と、上記音声判別手段に
よる判別結果が非音声信号である場合には、所定フレー
ム数を越えない範囲で次に音声信号であると判別される
までその入力信号をバッファリングして、上記線形予測
分析手段にその入力信号を一括して線形予測分析させる
制御手段と、線形予測残差信号に相当する駆動音源信号
を生成する駆動音源信号生成手段と、上記スペクトルパ
ラメータに基づいて上記駆動音源信号から音声を合成す
る合成フィルタと、を具備したことを特徴とする音声符
号化装置。(2) A speech coding apparatus comprising: voice discriminating means for discriminating whether an input signal divided at predetermined frame intervals is a voice signal or a non-voice signal; linear prediction analysis means for outputting a spectrum parameter of the input signal; control means for, when the discrimination result of the voice discriminating means is a non-voice signal, buffering the input signal until the input is next discriminated as a voice signal, within a range not exceeding a predetermined number of frames, and causing the linear prediction analysis means to perform linear prediction analysis on the buffered input signal collectively; driving sound source signal generating means for generating a driving sound source signal corresponding to a linear prediction residual signal; and a synthesis filter for synthesizing speech from the driving sound source signal based on the spectrum parameter.

【０１１６】（３）上記駆動音源信号生成手段は、遅
延回路と、適応コードブックと、確率コードブックとを
具備してなることを特徴とする上記（１）または（２）
に記載の音声符号化装置。(3) The speech coding apparatus according to (1) or (2) above, wherein the driving sound source signal generating means comprises a delay circuit, an adaptive codebook, and a probability codebook.

【０１１７】（４）上記音声判別手段は、上記フレー
ム間隔に分割された入力信号のフレームエネルギーを算
出するフレームエネルギー分析手段と、符号化を開始し
たときの上記入力信号のフレームエネルギーを算出する
初期フレームエネルギー分析手段と、非音声信号のエネ
ルギーの大きさに応じて閾値を決定する閾値決定手段
と、上記フレームエネルギー分析手段により算出された
フレームエネルギーと上記初期フレームエネルギー分析
手段により算出された初期フレームエネルギーを符号を
互いに逆にして加算することにより実質的に減算を行う
加算器と、この加算器による減算結果と上記閾値決定手
段により決定された閾値とを比較して、減算結果が閾値
より大きければフレーム入力信号は音声区間であると判
別し、そうでなければ非音声区間であると判別する判別
手段と、を具備してなることを特徴とする上記（１）ま
たは（２）に記載の音声符号化装置。(4) The speech coding apparatus according to (1) or (2) above, wherein the voice discriminating means comprises: frame energy analysis means for calculating the frame energy of the input signal divided at the frame intervals; initial frame energy analysis means for calculating the frame energy of the input signal at the start of encoding; threshold determining means for determining a threshold according to the magnitude of the energy of the non-voice signal; an adder that effectively performs subtraction by adding, with mutually opposite signs, the frame energy calculated by the frame energy analysis means and the initial frame energy calculated by the initial frame energy analysis means; and discriminating means for comparing the subtraction result of the adder with the threshold determined by the threshold determining means, discriminating the frame input signal as a voice section if the subtraction result is larger than the threshold, and as a non-voice section otherwise.
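
The discrimination rule of item (4) can be sketched as below. The class name `VoiceDiscriminator`, the use of mean-square amplitude as the frame energy, and the `threshold_for` callback are assumptions for illustration; in particular, the actual relation between non-voice energy and threshold is not reproduced here and must be supplied by the caller. The sketch also treats the first frame seen as the initial (non-voice) frame.

```python
def frame_energy(frame):
    """Mean-square amplitude of one frame (assumed energy measure)."""
    return sum(x * x for x in frame) / len(frame)


class VoiceDiscriminator:
    """Hypothetical model of the voice discriminating means in item (4)."""

    def __init__(self, threshold_for):
        # threshold_for maps the non-voice (initial) energy to a threshold.
        self.threshold_for = threshold_for
        self.initial_energy = None

    def is_voice(self, frame):
        energy = frame_energy(frame)
        if self.initial_energy is None:
            # Energy at the start of encoding, taken as the non-voice level.
            self.initial_energy = energy
        # Adder with opposite signs, which is effectively a subtraction.
        difference = energy + (-self.initial_energy)
        return difference > self.threshold_for(self.initial_energy)
```

A caller might pass, for example, `lambda noise: 2.0 * noise + 1.0` as `threshold_for`; that particular mapping is purely illustrative.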

【０１１８】上記（１）に記載の発明によれば、非音声
信号が入力しても良好に符号化することができる音質の
良い音声符号化装置となる。According to the invention described in (1) above, a speech coding apparatus with good sound quality can be obtained which can be well coded even if a non-speech signal is input.

【０１１９】上記（２）に記載の発明によれば、次に音
声信号であると判別されるまで入力信号をバッファリン
グすることにより、上記（１）に記載の発明と同様の効
果を奏することができる。According to the invention described in (2) above, buffering the input signal until it is next discriminated as a voice signal provides the same effect as the invention described in (1) above.

【０１２０】上記（３）に記載の発明によれば、上記
（１）または（２）に記載の発明と同様の効果を奏する
とともに、CELP(Code Excited Linear Predictive Coding)方式を用いることにより、より良好な音質を得るこ
とができる。According to the invention described in (3) above, the same effect as that of the invention described in (1) or (2) above is obtained, and in addition, better sound quality can be obtained by using the CELP (Code Excited Linear Predictive Coding) method.

【０１２１】上記（４）に記載の発明によれば、上記
（１）または（２）に記載の発明と同様の効果を奏する
とともに、非音声信号のエネルギーの大きさに応じて、
入力信号が音声信号か非音声信号かを良好に判別するこ
とができる。According to the invention described in (4) above, the same effect as that of the invention described in (1) or (2) above is obtained, and in addition, whether the input signal is a voice signal or a non-voice signal can be discriminated well according to the magnitude of the energy of the non-voice signal.

【０１２２】[0122]

【発明の効果】以上説明したように請求項１に記載の発
明によれば、非音声信号が入力しても良好に符号化する
ことができる音質の良い音声符号化装置となる。As described above, according to the invention as set forth in claim 1, it becomes a speech coding apparatus with good sound quality which can be well coded even if a non-speech signal is inputted.

【０１２３】また、請求項２に記載の発明によれば、次
に音声信号であると判別されるまで入力信号をバッファ
リングすることにより、請求項１に記載の発明と同様の
効果を奏することができる。According to the invention of claim 2, buffering the input signal until it is next discriminated as a voice signal provides the same effect as the invention of claim 1.

[Brief description of the drawings]

【図１】本発明の第１の実施形態の音声符号化装置の構
成を示すブロック図。FIG. 1 is a block diagram showing a configuration of a speech encoding apparatus according to a first embodiment of the present invention.

【図２】上記第１の実施形態の音声符号化装置における
音声判別器の構成を示すブロック図。FIG. 2 is a block diagram showing a configuration of a voice discriminator in the voice encoding device according to the first embodiment.

【図３】上記第１の実施形態において、音声判別器の閾
値決定回路により決定される閾値の背景雑音エネルギー
との関係の一例を示す線図。FIG. 3 is a diagram showing an example of a relationship between a threshold value determined by a threshold value determination circuit of the voice discriminator and background noise energy in the first embodiment.

【図４】上記第１の実施形態の音声符号化装置の動作を
示すフローチャート。FIG. 4 is a flowchart showing the operation of the speech encoding apparatus according to the first embodiment.

【図５】本発明の第２の実施形態の音声符号化装置の構
成を示すブロック図。FIG. 5 is a block diagram showing a configuration of a speech encoding apparatus according to a second embodiment of the present invention.

【図６】上記第２の実施形態の音声符号化装置の動作を
示すフローチャート。FIG. 6 is a flowchart showing the operation of the speech encoding apparatus according to the second embodiment.

【図７】従来の音声符号化装置の構成を示すブロック
図。FIG. 7 is a block diagram showing a configuration of a conventional speech encoding device.

【図８】上記図７の音声符号化装置に対応する音声復号
化装置の構成を示すブロック図。FIG. 8 is a block diagram showing a configuration of a speech decoding apparatus corresponding to the speech encoding apparatus of FIG. 7.

[Explanation of symbols]

２…音声判別器（音声判別手段） 2: Voice discriminator (voice discriminating means)
３…スイッチ制御回路（制御手段） 3: Switch control circuit (control means)
５…ＬＰＣ分析器（線形予測分析手段） 5: LPC analyzer (linear prediction analysis means)
６…合成フィルタ 6: Synthesis filter
１０…誤差評価器 10: Error evaluator
１１…遅延回路（駆動音源信号生成手段の一部） 11: Delay circuit (part of the driving sound source signal generating means)
１２…適応コードブック（駆動音源信号生成手段の一部） 12: Adaptive codebook (part of the driving sound source signal generating means)
１４…確率コードブック（駆動音源信号生成手段の一部） 14: Stochastic codebook (part of the driving sound source signal generating means)
１８…マルチプレクサ 18: Multiplexer
２２…バッファ制御回路（制御手段） 22: Buffer control circuit (control means)
α…線形予測パラメータ（スペクトルパラメータ） α: Linear prediction parameter (spectrum parameter)

Continuation of the front page: (51) Int.Cl.⁶ H03M 7/42; Office reference number 9382-5K; FI H03M 7/42

Claims

[Claims]

1. A speech coding apparatus comprising: voice discriminating means for discriminating whether an input signal divided at predetermined frame intervals is a voice signal or a non-voice signal; linear prediction analysis means for outputting a spectrum parameter of the input signal; control means for causing, when the discrimination result of the voice discriminating means has been a non-voice signal continuously for a predetermined number of frames, the linear prediction analysis means to keep outputting the spectrum parameter of a predetermined preceding frame as the spectrum parameter of the input signal; driving sound source signal generating means for generating a driving sound source signal corresponding to a linear prediction residual signal; and a synthesis filter for synthesizing speech from the driving sound source signal based on the spectrum parameter.

2. A speech coding apparatus comprising: voice discriminating means for discriminating whether an input signal divided at predetermined frame intervals is a voice signal or a non-voice signal; linear prediction analysis means for outputting a spectrum parameter of the input signal; control means for, when the discrimination result of the voice discriminating means is a non-voice signal, buffering the input signal until the input is next discriminated as a voice signal, within a range not exceeding a predetermined number of frames, and causing the linear prediction analysis means to perform linear prediction analysis on the buffered input signal collectively; driving sound source signal generating means for generating a driving sound source signal corresponding to a linear prediction residual signal; and a synthesis filter for synthesizing speech from the driving sound source signal based on the spectrum parameter.