JP3824706B2 - Speech encoding / decoding device

Info

Publication number: JP3824706B2
Application number: JP11397596A
Authority: JP
Inventors: 江原 宏幸 (Hiroyuki Ehara)
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1996-05-08
Filing date: 1996-05-08
Publication date: 2006-09-20
Anticipated expiration: 2016-05-08
Also published as: JPH09297598A

Description

[0001]
[Technical field of the invention]
The present invention relates to a CELP-type speech encoding/decoding device.
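[0002]
[Prior art]
In recent years, growing demand for digital mobile communication has made it necessary to lower the bit rate of speech coding, and many speech encoding devices have been developed. Among them, the CELP method separates a speech signal into vocal tract information and excitation information. The vocal tract information is represented by a digital filter built from linear prediction coefficients, and the excitation information is vector-quantized using an excitation codebook consisting of roughly several hundred to a thousand waveform patterns. CELP is widely used as a method that achieves high-quality speech even at low bit rates (4 kb/s to 8 kb/s).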
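[0003]
The CELP excitation is composed of excitation vectors selected from two kinds of codebooks: an adaptive codebook and a fixed codebook (also called a stochastic codebook or noise codebook). The adaptive codebook represents the periodic component contained in the excitation signal (particularly in vowel segments) and stores previously synthesized excitation signal waveforms. The fixed codebook, in contrast, is prepared in advance to represent the random waveform that remains after the periodic component is removed from the excitation signal (the random component of the excitation signal). Many kinds of fixed codebook have been proposed and used, including ones generated from random numbers, ones trained on large amounts of speech data, and ones composed of pulse trains. However, because a CELP speech encoding device uses previously generated excitation signals as the adaptive codebook, the excitation signal generated during a transmission path error remains stored in the adaptive codebook even after recovery from the error, so the effect of the error propagates.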
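[0004]
The adaptive codebook search section of a conventional CELP-based speech encoding device is described below. FIG. 5 shows a typical CELP speech encoding device. In FIG. 5, an input speech signal 1 is waveform-shaped by a preprocessor 2 and then output to a linear prediction analyzer 3 and an adder 4. The linear prediction analyzer 3 performs linear prediction analysis on the preprocessed input speech signal and outputs linear prediction coefficients to a synthesis filter 5. The synthesis filter 5 synthesizes speech from the excitation signal input from an adder 6 and the linear prediction coefficients input from the linear prediction analyzer 3, and outputs the result to the adder 4. The adder 4 computes the error between the synthesized signal input from the synthesis filter 5 and the preprocessed input speech signal input from the preprocessor 2, and outputs it to a perceptual weighter 7. The perceptual weighter 7 applies perceptual weighting to the error signal and outputs the result to an error minimizing means 8. The error minimizing means 8 determines the fixed code vector, adaptive code vector, fixed code vector gain, and adaptive code vector gain that minimize the perceptually weighted error input from the perceptual weighter 7. The fixed code vector is selected from a fixed codebook 9 and output to a fixed code vector gain multiplier 10, which multiplies it by the fixed code vector gain and outputs the product to the adder 6. The adaptive code vector is selected from an adaptive codebook 11 and output to an adaptive code vector gain multiplier 12, which multiplies it by the adaptive code vector gain and outputs the product to the adder 6. The adder 6 adds the vectors output from the fixed code vector gain multiplier 10 and the adaptive code vector gain multiplier 12 and outputs the sum to the synthesis filter 5 as the excitation vector. The excitation vector that the adder 6 generates from the error-minimizing combination of fixed code vector, adaptive code vector, fixed code vector gain, and adaptive code vector gain is appended to the adaptive codebook 11, which buffers past excitation signals. Information on the adaptive code vector, fixed code vector, adaptive code vector gain, and fixed code vector gain that produce this error-minimizing excitation vector is then transmitted to the decoder side.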
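As a rough illustration of the excitation path in FIG. 5 (adaptive codebook 11, multipliers 10 and 12, adder 6), the following Python sketch builds one excitation vector and appends it to the adaptive codebook buffer. The function names, the list representation, and the periodic repetition used when the lag is shorter than the frame are assumptions made for illustration; the patent does not specify them.

```python
from typing import List


def adaptive_vector(history: List[float], lag: int, n: int) -> List[float]:
    """Cut an n-sample adaptive code vector that starts `lag` samples back.

    Assumption: when lag < n, the last `lag` samples are repeated periodically.
    """
    segment = history[len(history) - lag:]
    return [segment[k % lag] for k in range(n)]


def build_excitation(history: List[float], lag: int, fixed_vec: List[float],
                     gain_a: float, gain_f: float) -> List[float]:
    """Adder 6: gain-scaled adaptive vector plus gain-scaled fixed vector."""
    adaptive = adaptive_vector(history, lag, len(fixed_vec))
    return [gain_a * a + gain_f * f for a, f in zip(adaptive, fixed_vec)]


def update_codebook(history: List[float], excitation: List[float]) -> List[float]:
    """Append the chosen excitation and drop the oldest samples (length stays constant)."""
    return (history + excitation)[len(excitation):]
```

Because the decoder runs the same update with the parameters it receives, a single corrupted frame leaves the two buffers different, which is the propagation problem described in the prior art above.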
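[0005]
[Problem to be solved by the invention]
However, in the conventional CELP speech encoding device described above, when a transmission path error occurs, the contents buffered in the adaptive codebook differ between the encoder side and the decoder side, so the error continues to have a large effect even after recovery from the transmission path error.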
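[0006]
The present invention solves this conventional problem. Its object is to provide a speech encoding/decoding device that allows the encoder side and the decoder side to obtain the same excitation vector even immediately after recovery from a transmission path error, and that mitigates the discrepancy between the excitation vectors generated on the encoder side and the decoder side immediately after recovery from a transmission path error.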
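[0007]
[Means for solving the problem]
To achieve the above object, in the present invention, information for monitoring the occurrence of transmission path errors is sent from the decoder side to the encoder side, and when a transmission path error is judged to have occurred, the search range of the adaptive codebook search is restricted in the frame or subframe following the one in which the error occurred. When transmission path errors occur consecutively, encoding continues with the fixed codebook alone, without the adaptive codebook, until the transmission path errors cease. In addition, the invention avoids using adaptive codebook contents generated while a transmission path error was present and uses adaptive codebook contents generated when no transmission path error was present.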
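[0008]
[Embodiments of the invention]
The invention according to claim 1 is a speech encoding/decoding device including a speech encoding device that has an adaptive codebook, which is a buffer of previously generated excitation vectors, and a search range limiting means. The search range limiting means determines whether a transmission path error has occurred on the basis of a transmission path error monitoring signal received from the speech decoding device and, when it judges that a transmission path error occurred in the immediately preceding frame or subframe, performs the adaptive codebook search with the portion generated in that frame or subframe excluded from the search range of the adaptive codebook. As a result, the encoder side and the decoder side can generate the same excitation vector even in the frame or subframe immediately after the transmission path error has cleared.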
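[0009]
In the invention according to claim 2, when the search range limiting means judges that transmission path errors have occurred over the immediately preceding several frames or subframes, it sets to zero the gain of the adaptive code vector generated in the frames or subframes in which the transmission path errors occurred consecutively, so that the excitation vector is generated from the fixed codebook alone. As a result, the encoder side and the decoder side can generate the same excitation vector even in the frame or subframe immediately after the transmission path error has cleared.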
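[0010]
The invention according to claim 3 has a codebook selection means that receives search range information from the search range limiting means and selects either the adaptive codebook or a fixed codebook. When the search range limiting means judges that transmission path errors over the immediately preceding several frames or subframes would exclude all of the excitation vectors stored in the adaptive codebook from the search range, it outputs the search range information to the codebook selection means, and the codebook selection means switches from the adaptive codebook to the fixed codebook so that the excitation vector is generated from fixed codebooks alone. As a result, the encoder side and the decoder side can generate the same excitation vector even in the frame or subframe immediately after the transmission path error has cleared.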
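[0011]
In the invention according to claim 4, the speech encoding device has means for transmitting to the speech decoding device information indicating which of the adaptive codebook and the fixed codebook is used, and the speech decoding device has means for selecting either the adaptive codebook or the fixed codebook on the basis of that information. As a result, the encoder side and the decoder side can generate the same excitation vector even in the frame or subframe immediately after the transmission path error has cleared.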
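[0012]
In the invention according to claim 5, the speech decoding device determines whether the adaptive code vector decoded using the received pitch information would be decoded using the excitation vector of a frame that was not correctly decoded because of a transmission path error, and, when the adaptive code vector would be generated using a frame that was not correctly decoded, generates the adaptive code vector using the received pitch information as is. This makes it possible to avoid a large discrepancy between the excitation vectors obtained on the encoder side and the decoder side.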
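[0013]
(Embodiment 1)
Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows the configuration of a CELP speech encoding device according to the first embodiment of the present invention. In FIG. 1, reference numeral 101 denotes an input speech signal. A preprocessor 102 receives the input speech signal 101 and outputs the preprocessed input speech signal to a linear prediction analyzer 103 and an adder 104. The linear prediction analyzer 103 performs linear prediction analysis on the preprocessed input speech signal and outputs linear prediction coefficients to a synthesis filter 105. The adder 104 computes the difference signal between the preprocessed speech signal and the output signal of the synthesis filter 105 and outputs it to a perceptual weighter 107. The synthesis filter 105 synthesizes a speech signal from the excitation vector output from an adder 106 and the linear prediction coefficients output from the linear prediction analyzer 103. The adder 106 adds the vectors output from a fixed code vector gain multiplier 112 and an adaptive code vector gain multiplier 114 and outputs the sum to the synthesis filter 105. The perceptual weighter 107 applies perceptual weighting to the error signal output from the adder 104 and outputs the result to an error minimizing means 108. The error minimizing means 108 determines, within the search range output from a search range limiter 109, the combination of fixed code vector, adaptive code vector, fixed code vector gain, and adaptive code vector gain that minimizes the power of the perceptually weighted error output from the perceptual weighter 107. The search range limiter 109 receives a transmission path error monitoring signal 110, determines the range over which the error minimizing means 108 searches an adaptive codebook 113, and outputs that range to the error minimizing means 108. The transmission path error monitoring signal 110 is used to detect the occurrence of transmission path errors. A fixed codebook 111 stores a predetermined number of fixed code vectors and outputs them to the fixed code vector gain multiplier 112, which multiplies the fixed code vector output from the fixed codebook 111 by the fixed code vector gain and outputs the product to the adder 106. The adaptive codebook 113 is a buffer of past excitation vectors output from the adder 106 (those finally selected by the error minimizing means 108); it cuts out part of the buffered signal sequence and outputs it as the adaptive code vector to the adaptive code vector gain multiplier 114, which multiplies the adaptive code vector by the adaptive code vector gain and outputs the product to the adder 106.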
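The role of the error minimizing means 108 when it is handed a restricted lag range can be sketched as a standard analysis-by-synthesis search. This is a minimal sketch, not the patent's implementation: the function names, the closed-form gain, and the `synth` callable (standing in for the synthesis filter 105 followed by the perceptual weighter 107) are assumptions introduced for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def search_adaptive_codebook(
    target: List[float],
    history: List[float],
    lags: Iterable[int],
    synth: Callable[[List[float]], List[float]],
) -> Tuple[Optional[int], float]:
    """Return (lag, gain) minimising ||target - gain * synth(candidate)||^2 over `lags`.

    Returns (None, 0.0) when no permitted lag reduces the error.
    """
    n = len(target)
    target_energy = dot(target, target)
    best_lag, best_gain, best_err = None, 0.0, target_energy
    for lag in lags:
        segment = history[len(history) - lag:]
        candidate = [segment[k % lag] for k in range(n)]  # repeat short lags
        y = synth(candidate)
        energy = dot(y, y)
        if energy == 0.0:
            continue
        corr = dot(target, y)
        err = target_energy - corr * corr / energy  # error at the optimal gain
        if err < best_err:
            best_lag, best_gain, best_err = lag, corr / energy, err
    return best_lag, best_gain
```

Passing a narrower `lags` range is all that is needed to keep the search away from part of the buffer; the search itself is unchanged.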
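[0014]
The operation of the CELP speech encoding device configured as above is described below. In FIG. 1, the input speech signal 101 is a digital signal consisting of a fixed number of samples, and speech encoding is performed on each block of that many samples. Such a block is called a frame or a subframe. The preprocessor 102 applies band limiting and gain adjustment to the input speech signal 101. The linear prediction analyzer 103 performs well-known linear prediction analysis on the preprocessed speech signal and computes linear prediction coefficients. The synthesis filter 105 builds a filter from the linear prediction coefficients computed by the linear prediction analyzer 103 and filters the excitation vector output from the adder 106 to synthesize speech. The adder 104 computes the difference signal between the preprocessed input speech signal and the speech signal synthesized by the synthesis filter 105. The perceptual weighter 107 applies perceptual weighting to the difference signal computed by the adder 104 and outputs the result to the error minimizing means 108. This perceptual weighting is generally performed with a filter formed by cascading linear prediction filters that use the linear prediction coefficients computed by the linear prediction analyzer 103 and perceptual weighting coefficients. The error minimizing means 108 adjusts the excitation vector input to the synthesis filter 105 by varying the combination of fixed code vector, fixed code vector gain, adaptive code vector, and adaptive code vector gain so that the power of the perceptually weighted difference signal (error signal) is minimized. In general, the optimal adaptive code vector is first taken from the adaptive codebook 113 and multiplied by the adaptive code vector gain in the multiplier 114 to determine the output to the adder 106; next, the fixed code vector that is optimal in combination with the adaptive code vector is taken from the fixed codebook 111 and multiplied by the fixed code vector gain in the multiplier 112 to determine the output to the adder 106. The search range limiter 109 restricts the search range of the adaptive codebook 113 when the optimal adaptive code vector is taken from it. From the transmission path error monitoring signal 110 input to it, the search range limiter 109 determines whether a transmission path error occurred in the immediately preceding frame or subframe. If so, it outputs to the error minimizing means 108 a search range of the adaptive codebook 113 such that the adaptive codebook search excludes, from the previously generated excitation signal stored in the adaptive codebook 113, the portion generated in the immediately preceding frame, and the optimal adaptive code vector is selected from the remainder. If transmission path errors are judged to have occurred in several consecutive immediately preceding frames or subframes, it determines the adaptive codebook search range so that the portion generated in those consecutive frames is excluded, and outputs it to the error minimizing means 108. However, if the run of transmission path errors lasts so long that all of the excitation signal stored in the adaptive codebook 113 would be excluded from the search range, it determines the search range of the error minimizing means 108 so that the adaptive code vector gain is set to zero and the excitation vector is generated from the fixed code vector alone.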
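The decision made by the search range limiter 109 can be sketched as follows. This is one possible reading, under stated assumptions: lags are counted back from the start of the current frame, all frames have the same length, and a lag is excluded if any sample of its adaptive code vector would come from an error frame. Under those assumptions the smallest permitted lag after E consecutive error frames of length N is (E + 1) * N. The `SearchRange` type and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SearchRange:
    lags: Optional[Tuple[int, int]]  # inclusive (min_lag, max_lag); None if unusable
    fixed_only: bool                 # True: adaptive gain forced to zero


def limit_search_range(error_frames: int, frame_len: int,
                       min_lag: int, max_lag: int) -> SearchRange:
    """Restrict the adaptive codebook lag range after `error_frames` consecutive errors."""
    if error_frames == 0:
        return SearchRange((min_lag, max_lag), False)
    # A vector of frame_len samples read at lag L ends at offset -L + frame_len - 1,
    # which must lie before the corrupted offsets [-error_frames * frame_len, -1].
    lowest_clean = (error_frames + 1) * frame_len
    lo = max(min_lag, lowest_clean)
    if lo > max_lag:
        # Nothing clean is left in the buffer: fall back to the fixed codebook alone.
        return SearchRange(None, True)
    return SearchRange((lo, max_lag), False)
```

For example, with 40-sample frames and lags from 20 to 147 (illustrative values), one error frame leaves lags 80 to 147, two leave 120 to 147, and three leave none, so the fixed-only fallback applies.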
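[0015]
With the speech encoding device configured as above, the decoding device must be given a means of transmitting the transmission path error monitoring signal 110 to the encoding device, but the decoding process itself is exactly the same as in a conventional decoding device, so a conventional one can be used unchanged. One possible form of the transmission path error monitoring signal 110 is a predetermined signal transmitted at fixed time intervals (shorter than the interval at which one frame of encoded parameters is transmitted). In that case, when the search range limiter 109 receives a signal that differs from the predetermined signal, it judges that a transmission path error occurred in the encoded information of the frame transmitted at that time.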
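A minimal sketch of this monitoring check is shown below. The pattern value `EXPECTED_WORD`, the representation of the monitoring signal as a list of integer words per frame, and both function names are hypothetical; the patent only states that a predetermined signal is sent at intervals shorter than one frame and compared on reception.

```python
from typing import Iterable, List

EXPECTED_WORD = 0b1010  # hypothetical predetermined monitoring pattern


def frame_has_error(received_words: Iterable[int],
                    expected: int = EXPECTED_WORD) -> bool:
    """A frame is treated as corrupted if any monitoring word received
    during its transmission differs from the predetermined pattern."""
    return any(word != expected for word in received_words)


def consecutive_error_frames(flags: List[bool]) -> int:
    """Count how many of the most recent frames in a row were flagged as errors."""
    count = 0
    for flagged in reversed(flags):
        if not flagged:
            break
        count += 1
    return count
```

The count returned by `consecutive_error_frames` is the kind of value a search range limiter would need in order to decide how much of the adaptive codebook to exclude.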
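[0016]
Thus, according to the first embodiment, the encoder receives transmission path error information from the decoder side, judges whether a transmission path error occurred in the immediately preceding frame or subframe, and, if one did, uses the search range limiter 109 to exclude the portion generated in that frame or subframe from the adaptive codebook search range. The encoder side and the decoder side can therefore generate the same excitation vector even in the frame or subframe immediately after the transmission path error has cleared.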
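[0017]
(Embodiment 2)
Next, a second embodiment of the present invention is described with reference to FIG. 2. In FIG. 2, reference numeral 201 denotes an input speech signal. A preprocessor 202 receives the input speech signal 201 and outputs the preprocessed input speech signal to a linear prediction analyzer 203 and an adder 204. The linear prediction analyzer 203 performs linear prediction analysis on the preprocessed input speech signal and outputs linear prediction coefficients to a synthesis filter 205. The adder 204 computes the difference signal between the preprocessed speech signal and the output signal of the synthesis filter 205 and outputs it to a perceptual weighter 207. The synthesis filter 205 synthesizes a speech signal from the excitation vector output from an adder 206 and the linear prediction coefficients output from the linear prediction analyzer 203. The adder 206 adds the vectors output from a fixed code vector gain multiplier 212 and a code vector gain multiplier 216 and outputs the sum to the synthesis filter 205. The perceptual weighter 207 applies perceptual weighting to the error signal output from the adder 204 and outputs the result to an error minimizing means 208. The error minimizing means 208 determines, within the search range output from a search range limiter 209, the combination of fixed code vector, adaptive code vector, fixed code vector gain, and adaptive code vector gain that minimizes the power of the perceptually weighted error output from the perceptual weighter 207. The search range limiter 209 receives a transmission path error monitoring signal 210, determines the range over which the error minimizing means 208 searches an adaptive codebook 214, and outputs that range to the error minimizing means 208 and a codebook selector 215. The transmission path error monitoring signal 210 is used to detect the occurrence of transmission path errors. A fixed codebook 211 stores a predetermined number of fixed code vectors and outputs them to the fixed code vector gain multiplier 212, which multiplies the fixed code vector output from the fixed codebook 211 by the fixed code vector gain and outputs the product to the adder 206. A fixed codebook 213 stores a predetermined number of fixed code vectors and outputs them to the codebook selector 215. The adaptive codebook 214 is a buffer of past excitation vectors output from the adder 206 (those finally selected by the error minimizing means 208); it cuts out part of the buffered signal sequence and outputs it as the adaptive code vector to the codebook selector 215. The codebook selector 215 receives the search range information from the search range limiter 209, selects only one of the vectors input from the fixed codebook 213 and the adaptive codebook 214, and outputs it to the code vector gain multiplier 216. The code vector gain multiplier 216 multiplies the code vector output from the codebook selector 215 by the code vector gain and outputs the product to the adder 206.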
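[0018]
The operation of the CELP speech encoding device configured as above is described below. In FIG. 2, the input speech signal 201 is a digital signal consisting of a fixed number of samples, and speech encoding is performed on each block of that many samples. Such a block is called a frame or a subframe. The preprocessor 202 applies band limiting and gain adjustment to the input speech signal 201. The linear prediction analyzer 203 performs well-known linear prediction analysis on the preprocessed speech signal and computes linear prediction coefficients. The synthesis filter 205 builds a filter from the linear prediction coefficients computed by the linear prediction analyzer 203 and filters the excitation vector output from the adder 206 to synthesize speech. The adder 204 computes the difference signal between the preprocessed input speech signal and the speech signal synthesized by the synthesis filter 205. The perceptual weighter 207 applies perceptual weighting to the difference signal computed by the adder 204 and outputs the result to the error minimizing means 208. This perceptual weighting is generally performed with a filter formed by cascading linear prediction filters that use the linear prediction coefficients computed by the linear prediction analyzer 203 and perceptual weighting coefficients. The error minimizing means 208 adjusts the excitation vector input to the synthesis filter 205 by varying the combination of fixed code vector, fixed code vector gain, adaptive code vector, and adaptive code vector gain so that the power of the perceptually weighted difference signal (error signal) is minimized. In general, the optimal adaptive code vector is first taken from the adaptive codebook 214 and multiplied by the code vector gain in the multiplier 216 to determine the output to the adder 206; next, the fixed code vector that is optimal in combination with the adaptive code vector is taken from the fixed codebook 211 and multiplied by the fixed code vector gain in the multiplier 212 to determine the output to the adder 206. The search range limiter 209 restricts the search range of the adaptive codebook 214 when the optimal adaptive code vector is taken from it. From the transmission path error monitoring signal 210 input to it, the search range limiter 209 determines whether a transmission path error occurred in the immediately preceding frame or subframe. If so, it outputs to the error minimizing means 208 and the codebook selector 215 a search range of the adaptive codebook 214 such that the adaptive codebook search excludes, from the previously generated excitation signal stored in the adaptive codebook 214, the portion generated in the immediately preceding frame, and the optimal adaptive code vector is selected from the remainder. If transmission path errors are judged to have occurred in several consecutive immediately preceding frames or subframes, it determines the adaptive codebook search range so that the portion generated in those consecutive frames is excluded, and outputs it to the error minimizing means 208. However, if the run of transmission path errors lasts so long that all of the excitation signal stored in the adaptive codebook 214 would be excluded from the search range, it outputs to the codebook selector 215 and the error minimizing means 208 a search range indicating that the excitation vector is to be generated using the fixed codebook 213 instead of the adaptive codebook 214. When the search range it receives indicates a fixed codebook search, the codebook selector 215 outputs the code vector input from the fixed codebook 213 to the code vector gain multiplier 216.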
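[0019]
With the speech encoding device configured as above, the decoding device needs a means of selecting either the adaptive codebook or the fixed codebook. A simple approach is for the encoding device to add information indicating which codebook is in use and transmit it to the decoding device. If no bits can be spared for this, a transmission path error monitoring means must be added so that the fixed and adaptive codebooks are switched when consecutive transmission path errors have occurred in the past.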
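The codebook selector 215 and the "which codebook" information sent to the decoder can be sketched as below. The 1-bit flag encoding (1 for the fixed codebook 213, 0 for the adaptive codebook 214) and the function names are assumptions for illustration; the patent only says that information identifying the codebook is added and transmitted.

```python
from typing import List, Tuple


def select_code_vector(fixed_only: bool, adaptive_vec: List[float],
                       fixed_vec_213: List[float]) -> Tuple[List[float], int]:
    """Selector 215: pass exactly one vector on to multiplier 216.

    Also returns a hypothetical 1-bit flag that the decoder could use
    to make the same choice.
    """
    if fixed_only:
        return fixed_vec_213, 1
    return adaptive_vec, 0


def excitation_206(chosen: List[float], gain_216: float,
                   fixed_vec_211: List[float], gain_212: float) -> List[float]:
    """Adder 206: output of multiplier 216 plus output of multiplier 212."""
    return [gain_216 * c + gain_212 * f for c, f in zip(chosen, fixed_vec_211)]
```

Because the bits normally spent on the adaptive code vector now index the second fixed codebook, the excitation still has two components during a long error run rather than one.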
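[0020]
One possible form of the transmission path error monitoring signal 210 is a predetermined signal transmitted at fixed time intervals (shorter than the interval at which one frame of encoded parameters is transmitted). In that case, when the search range limiter 209 receives a signal that differs from the predetermined signal, it judges that a transmission path error occurred in the encoded information of the frame transmitted at that time.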
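[0021]
Thus, according to the second embodiment, in a normal frame following an error frame, the same excitation vector can be obtained with no distortion between the excitation vector of the encoding device and that of the decoding device. Furthermore, sound quality can be improved over the first embodiment, which cancels the adaptive codebook and leaves the information allocated to the adaptive code vector unused.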
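[0022]
(Embodiment 3)
Next, a speech decoding device according to a third embodiment of the present invention is described. FIG. 3 shows the excitation waveform stored in the adaptive codebook of the speech decoding device. Reference numeral 301 denotes the excitation waveform stored in the adaptive codebook, and 302 the portion of the excitation waveform generated in a frame that was not correctly decoded because of a transmission path error. Pi is the lag value of the normal frame transmitted from the encoding device immediately after the frame with the transmission path error, Vpi is the segment of the adaptive code vector cut from the adaptive codebook according to the lag value Pi, and NVpi is the segment in which the excitation waveform is about to be generated (the current frame or subframe).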
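[0023]
In FIG. 3, the adaptive code vector designated by the lag value Pi transmitted from the encoding device (segment Vpi) contains the excitation vector of the frame that was not correctly decoded because of the transmission path error, so waveform distortion becomes large. Therefore, when the immediately preceding frame had a transmission path error or the like, the adaptive code vector is generated using a pitch that is an integer multiple (nPi) of the lag value Pi transmitted from the encoding device. The integer n is the smallest integer needed to avoid including the excitation portion 302 generated from erroneous information; in FIG. 3, n = 3, and Vp3 is cut out as the adaptive code vector.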
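Choosing the smallest integer n can be sketched as follows. The coordinate convention is an assumption made for this sketch: offsets are measured from the start of the current frame, so past samples have negative offsets and the corrupted portion 302 occupies the inclusive range `bad_start` to `bad_end`. The function names and the buffer-length bound are likewise hypothetical.

```python
from typing import Optional, Tuple


def read_span(lag: int, n_samples: int) -> Tuple[int, int]:
    """Inclusive range of past offsets read when cutting a vector at `lag`."""
    start = -lag
    end = min(start + n_samples - 1, -1)  # short lags repeat, never reading past -1
    return start, end


def smallest_clean_multiple(lag: int, n_samples: int, bad_start: int,
                            bad_end: int, buffer_len: int) -> Optional[int]:
    """Smallest n >= 1 such that the vector cut at n * lag avoids the corrupted span.

    Returns None if no multiple fits inside the buffer.
    """
    n = 1
    while n * lag <= buffer_len:
        start, end = read_span(n * lag, n_samples)
        if end < bad_start or start > bad_end:
            return n
        n += 1
    return None
```

With a 40-sample frame, the previous frame corrupted (offsets -40 to -1), and a received lag of 30 (illustrative values), the multiples 30 and 60 still touch the corrupted span, and 90 is the first that does not, giving n = 3.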
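[0024]
FIG. 4 shows the excitation waveform in the adaptive codebook of the decoding device in the frame or subframe immediately following that of FIG. 3. Here, the adaptive code vector Vpi+1 cut out according to the lag value Pi+1 transmitted from the encoding device still contains excitation waveform generated in the error frame. To avoid this, n = 4 can be used in nPi+1, so that Vp4 serves as the adaptive code vector. In a case such as FIG. 4, however, the proportion of Vpi+1 that comes from the excitation waveform generated in the error frame is small, so Vpi+1 may be used instead of Vp4; doing so requires branching on the proportion of Vpi+1 that comes from the error frame.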
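The branching on the corrupted proportion could look like the sketch below. The threshold of 0.25 is a hypothetical value chosen only for illustration (the patent gives no number), and the offset convention (negative offsets counted back from the start of the current frame, corrupted span given as an inclusive range) is also an assumption.

```python
def corrupted_fraction(lag: int, n_samples: int,
                       bad_start: int, bad_end: int) -> float:
    """Fraction of the n_samples-long adaptive code vector read from corrupted offsets."""
    hits = 0
    for k in range(n_samples):
        offset = -lag + (k % lag)  # periodic repetition when lag < n_samples
        if bad_start <= offset <= bad_end:
            hits += 1
    return hits / n_samples


def keep_received_lag(lag: int, n_samples: int, bad_start: int, bad_end: int,
                      threshold: float = 0.25) -> bool:
    """True: use the received lag as is; False: switch to an integer multiple."""
    return corrupted_fraction(lag, n_samples, bad_start, bad_end) <= threshold
```

For instance, with a 40-sample frame, a corrupted span at offsets -80 to -41, and a received lag of 45, only 5 of the 40 samples come from the error frame, so the received lag would be kept under this threshold.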
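[0025]
This integer-multiple pitch technique is effective in voiced segments with a clear pitch period: that is, when the adaptive code vector gain transmitted from the encoding device is close to 1.0, or when the lag value has changed little over the several frames before the error frame and is equal or nearly equal to the lag value in the normal frame immediately after the error frame. Also, because the adaptive code vector is cut from the adaptive codebook so as to avoid the portion generated when a past error occurred, the adaptive codebook of the decoding device must hold a longer span of excitation waveform than that of the encoding device.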
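[0026]
Thus, according to the third embodiment, where the adaptive code vector is effective, distortion of the adaptive code vector in the normal frame following an error frame can be suppressed.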
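[0027]
[Effects of the invention]
As described above, the present invention realizes a CELP speech encoding/decoding device in which the encoder side and the decoder side obtain the same excitation vector even immediately after recovery from a transmission path error, and in which the discrepancy between the excitation vectors generated on the encoder side and the decoder side immediately after recovery from a transmission path error is mitigated.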
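[Brief description of the drawings]
[FIG. 1] Block diagram showing the configuration of the speech encoding device according to the first embodiment of the present invention
[FIG. 2] Block diagram showing the configuration of the speech encoding device according to the second embodiment of the present invention
[FIG. 3] Schematic diagram of the adaptive codebook of the speech decoding device according to the third embodiment of the present invention
[FIG. 4] Schematic diagram of the adaptive codebook of the speech decoding device according to the third embodiment of the present invention
[FIG. 5] Block diagram showing the configuration of a typical CELP speech encoding device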
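[Explanation of reference numerals]
104 Adder
106 Adder
112 Fixed code vector gain multiplier
114 Adaptive code vector gain multiplier
204 Adder
206 Adder
212 Fixed code vector gain multiplier
216 Code vector gain multiplier
301 Adaptive codebook excitation waveform
302 Segment of the adaptive codebook excitation waveform generated in the error frame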
BACKGROUND OF THE INVENTION
The present invention relates to a CELP speech encoding / decoding device.
[0002]
[Prior art]
In recent years, due to an increase in demand for digital mobile communication, it is necessary to reduce the bit rate of speech coding, and a number of speech coding devices have been developed. Among them, the CELP method separates a voice signal into vocal tract information and sound source information, expresses the vocal tract information by a digital filter composed of linear prediction coefficients, and the sound source information has about hundreds to thousands of waveform patterns. It is vector-quantized using the excitation codebook composed of the above, and is widely used as a method capable of realizing high-quality speech even at a low bit rate (4 kb / s to 8 kb / s).
[0003]
A CELP excitation source is composed of excitation vectors selected from two types of codebooks: an adaptive codebook and a fixed codebook (stochastic codebook and noise codebook). Among these, the adaptive codebook expresses a periodic component included in a sound source signal (particularly a vowel part), and stores sound source signal waveforms synthesized in the past. On the other hand, the fixed codebook is facilitated in advance to represent a random waveform (random component of the sound source signal) after removing the periodic component from the sound source signal. Many types of fixed codebooks have been proposed and used, such as those created with random numbers, those created by learning using a large number of audio data, and those composed of pulse trains. In the CELP speech coding apparatus, since the excitation signal generated in the past is used as the adaptive codebook, the excitation signal generated at the time of error is stored as the adaptive codebook even after recovery from the error when a transmission path error occurs. Therefore, there is a problem that the influence of errors propagates.
[0004]
The adaptive codebook search unit in the conventional speech encoding apparatus based on the CELP method will be described below. FIG. 5 shows a general CELP speech coding apparatus. In FIG. 5, the input speech signal 1 is waveform-shaped by the preprocessor 2 and then output to the linear prediction analyzer 3 and the adder 4. The linear prediction analyzer 3 performs linear prediction analysis using the pre-processed input speech signal and outputs linear prediction coefficients to the synthesis filter 5. The synthesis filter 5 performs speech synthesis using the sound source signal input from the adder 6 and the linear prediction coefficient input from the linear prediction analyzer 3, and outputs the synthesized speech to the adder 4. The adder 4 calculates an error between the synthesized signal input from the synthesis filter and the pre-processed input speech signal input from the preprocessor 2, and outputs the error to the auditory weighter 7. The auditory weighter 7 performs auditory weighting on the error signal and outputs it to the error minimizing means 8. The error minimizing means 8 determines a fixed code vector, an adaptive code vector, a fixed code vector gain, and an adaptive code vector gain so that the perceptual weighting error input from the perceptual weighter 7 is minimized. The fixed code vector is selected from the fixed codebook 9 and output to the fixed code vector gain multiplier 10. The fixed code vector gain multiplier 10 multiplies the fixed code vector output from the fixed codebook 9 by the fixed code vector gain and outputs the result to the adder 6. The adaptive code vector is selected from the adaptive code book 11 and output to the adaptive code vector gain multiplier 12. The adaptive code vector gain multiplier 12 multiplies the adaptive code vector output from the adaptive code book 11 by the adaptive code vector gain and outputs the result to the adder 6. 
The adder 6 adds the vectors output from the fixed code vector gain multiplier 10 and the adaptive code vector gain multiplier 12 and outputs the result as a sound source vector to the synthesis filter 5. The excitation vector generated by the adder 6 by the combination of the fixed code vector, the adaptive code vector, the fixed code vector gain, and the adaptive code vector gain that minimizes the error is newly added to the adaptive codebook buffering the past excitation signal. Added. Then, information on the adaptive code vector, fixed code vector, adaptive code vector gain, and fixed code vector gain for generating the excitation vector that minimizes the error is transmitted to the decoder side.
[0005]
[Problems to be solved by the invention]
However, in the above-described conventional CELP speech coding apparatus, when a transmission line error occurs, the contents buffered in the adaptive codebook differ between the encoder side and the decoder side, and recovery from the transmission line error occurs. After that, there was a problem of being greatly affected by errors.
[0006]
The present invention solves the above-mentioned conventional problem, and enables the same excitation vector to be obtained on the encoder side and the decoder side even immediately after returning from a transmission path error, and immediately after returning from a transmission path error. It is an object of the present invention to provide a speech encoding / decoding device that can mitigate an error between excitation vectors generated on the encoder side and the decoder side.
[0007]
[Means for Solving the Problems]
In order to achieve the above object, the present invention sends information for monitoring the occurrence of a transmission path error from the decoder side to the encoder side, and determines that a transmission path error has occurred. This limits the search range of the adaptive codebook search in the next frame or subframe in which occurrence occurs. In addition, when transmission path errors occur continuously, the encoding process using only the fixed codebook is continued until the transmission path error is eliminated without using the adaptive book. Furthermore, the use of the adaptive codebook generated when a transmission path error occurs is avoided, and the adaptive codebook generated when there is no transmission path error is used.
[0008]
DETAILED DESCRIPTION OF THE INVENTION
The invention described in claim 1 of the present invention Determine whether a transmission path error has occurred based on the adaptive codebook that is a buffer of excitation vectors generated in the past and a transmission path error monitoring signal received from the speech decoding device, Transmission path error occurred in the previous frame or subframe And Judgment did In some cases, the part generated in the previous frame or subframe is excluded from the adaptive codebook search range. To perform adaptive codebook search Search range limiting means Speech encoding / decoding device provided with speech encoding device having Thus, the same excitation vector can be generated on the encoder side and the decoder side even in the frame or subframe immediately after the transmission path error is eliminated.
[0009]
The invention according to claim 2 of the present invention is The search range limiting means includes: If it is determined that a transmission path error has occurred over the previous few frames or subframes, the gain of the adaptive code vector generated in the frame or subframe in which the transmission path error has occurred is set to zero, The excitation vector is generated only from the fixed codebook, and the same excitation vector can be generated on the encoder side and the decoder side even in the frame or subframe immediately after the transmission path error is eliminated.
[0010]
The invention according to claim 3 of the present invention is It has codebook selection means for inputting search range information from the search range limitation means and selecting either an adaptive codebook or a fixed codebook, and the search range limitation means, A transmission path error occurred over the previous few frames or subframes. All the excitation vectors stored in the adaptive codebook are excluded from the search range. If you decide The search range information is output to the codebook selection means, and the codebook selection means Switch adaptive codebook to fixed codebook and generate excitation vector from fixed codebook only Rumo Thus, the same excitation vector can be generated on the encoder side and the decoder side even in the frame or subframe immediately after the transmission path error is eliminated.
[0011]
The invention according to claim 4 of the present invention is The speech coding apparatus has means for transmitting to the speech decoding apparatus information indicating which of the adaptive codebook and the fixed codebook is used, and the speech decoding apparatus comprises the adaptive codebook A means for selecting one of the adaptive codebook and the fixed codebook based on information indicating which codebook to use. Thus, even in a frame or subframe immediately after the transmission channel error is eliminated, the same excitation vector can be generated on the encoder side and the decoder side.
[0012]
The invention according to claim 5 of the present invention is The speech decoding apparatus determines whether or not the adaptive code vector decoded using the received pitch information is decoded using the excitation vector of the frame that was not correctly decoded due to a transmission path error, and is decoded correctly When the adaptive code vector is generated using a frame that has not been generated, the adaptive code vector is generated using the received pitch information as it is. Therefore, it is possible to avoid an increase in the error of the excitation vector obtained on the encoder side and the decoder side.
[0013]
(Embodiment 1)
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 shows the configuration of a CELP speech coding apparatus according to the first embodiment of the present invention. In FIG. 1, 101 is an input speech signal, 102 is a preprocessor that receives the input speech signal 101 as input, and outputs a preprocessed input speech signal to the linear predictor 103 and the adder 104, and 103 is a preprocessed input speech. A linear prediction analyzer that performs linear prediction analysis using the signal as input and outputs linear prediction coefficients to the synthesis filter 105; 104 calculates a difference signal using the pre-processed speech signal and the output signal of the synthesis filter 105 as inputs; An adder output to the perceptual weighting unit 107, 105 is a synthesis filter that synthesizes a speech signal with the sound source vector output from the adder 106 and the linear prediction coefficient output from the linear prediction analyzer 103 as inputs, and 106 is fixed. A synthesis filter by adding the respective vectors output from the code vector gain multiplier 112 and the adaptive code vector gain multiplier 114 An adder for outputting to 05, 107 is an auditory weighting unit for performing auditory weighting using the error signal output from the adder 104 as an input, and 108 is output from the auditory weighting unit 107 for output to the error minimizing means 108. 
Error minimizing means for determining a combination of a fixed code vector, adaptive code vector gain, and adaptive code vector gain that minimizes the perceptual weighting error power based on the search range output from the search range limiter 109; Is a search range limiter which receives a transmission line error monitoring signal as input, determines the search range of an adaptive code or the like by the error minimizing means 108, and outputs the search range to the error minimizing means 108, , 111 is a predetermined number of fixed code vectors to be output to the fixed code vector gain multiplier 112. , 112 is a fixed code vector gain multiplier that multiplies the fixed code vector output from the fixed codebook 111 by a fixed vector gain and outputs the result to the adder 106, and 113 is a past code output from the adder 106. And a part of the signal sequence stored in the buffer is cut out and output to the adaptive code vector gain multiplier 114 as an adaptive code vector. The adaptive code book 114 is an adaptive code vector gain multiplier that multiplies the adaptive code vector output from the adaptive code book 113 by the adaptive code vector gain and outputs the result to the adder 106.
[0014]
The operation of the CELP speech coding apparatus configured as described above will be described below. In FIG. 1, an input audio signal 101 is a digital signal having a predetermined number of samples, and the audio encoding process is performed for each audio signal having the predetermined number of samples. The audio signal block having the predetermined number of samples is called a frame or a subframe. The input audio signal 101 is subjected to band limitation and gain adjustment by the preprocessor 102. Using the preprocessed speech signal, the linear prediction analyzer 103 performs a known linear prediction analysis and calculates a linear prediction coefficient. The synthesis filter 105 configures a filter using the linear prediction coefficient calculated by the linear prediction analyzer 103, performs a filter process on the sound source vector output from the adder 6, and performs speech. The adder 104 calculates a difference signal between the pre-processed input audio signal and the audio signal synthesized by the synthesis filter 105. The auditory weighter 107 performs auditory weighting on the difference signal calculated by the adder 104 and outputs the result to the error minimizing means 108. This auditory weighting is generally performed using a filter in which a linear prediction coefficient calculated by the linear prediction analyzer 103 and a linear prediction filter using the auditory weighting coefficient are connected in cascade. The error minimizing means 108 converts the excitation vector input to the synthesis filter 105 into a fixed code vector, a fixed code vector gain, and an adaptive code vector so that the power of the differential signal (error signal) after auditory weighting is minimized. Adjust by changing the combination of adaptive code vector gains. 
In general, an optimum adaptive code vector is first extracted from the adaptive code book 113, and multiplied by an adaptive code vector gain by a multiplier 114 to determine an output to the adder 106. A fixed code vector that is optimal when combined with an adaptive code vector is extracted from the inside, and a multiplier 112 multiplies the fixed code vector by a fixed code vector gain to determine an output to the adder 106. The search range limiter 109 limits the search range of the adaptive codebook 113 when the optimum adaptive code vector is extracted from the adaptive codebook 113. The search range limiter 109 determines whether a transmission path error has occurred in the immediately preceding frame or subframe from the transmission path error monitoring signal 110 input to the search range limiter 109. If it is determined that a transmission path error has occurred in the immediately preceding frame or subframe, the portion of the excitation signal generated in the past stored in the adaptive codebook 113 is extracted from the search range. The search range of the adaptive codebook 113 is output to the error minimizing means 108 so that the adaptive codebook search is performed by selecting the optimal code vector. If it is determined that a transmission path error has occurred in the immediately preceding frame or subframe, the portion of the excitation signal generated in the past stored in the adaptive codebook 113 that is generated in the immediately preceding frame Is removed from the search range, the adaptive codebook search range is determined so as to perform the adaptive codebook search, and is output to the error minimizing means 108. However, if all of the excitation codebooks stored in the adaptive codebook 113 are excluded from the search range due to a long continuous transmission path error, the adaptive code vector gain is set to zero. 
The search range of the error minimizing means 108 is determined so that the excitation vector is generated only from the fixed code vector.
[0015]
When the speech encoding apparatus is configured as described above, it is necessary to add a means for transmitting the transmission path error monitoring signal 110 to the decoding apparatus. Decryption Since the decoding process in the conversion apparatus is exactly the same as the conventional one, the conventional one can be used as it is. Note that the transmission path error monitoring signal 110 may be a signal that transmits a predetermined signal at a constant time interval (shorter than the time interval for transmitting one frame of encoding parameters). Limiter 109 When a signal different from a predetermined signal is received, it is determined that a transmission path error has occurred in the encoded information of the frame transmitted at that time.
[0016]
As described above, according to the first embodiment, the encoder that has received the transmission path error information from the decoder side determines whether or not a transmission path error has occurred in the immediately preceding frame or subframe. In the case where the error occurs, a search range limiter 109 for excluding the portion generated in the immediately preceding frame or subframe from the search range of the adaptive codebook is provided. In addition, it is possible to generate the same excitation vector on the encoder side and the decoder side.
[0017]
(Embodiment 2)
Next, a second embodiment of the present invention will be described with reference to FIG. In FIG. 2, 201 is an input speech signal, 202 is a preprocessor that receives the input speech signal 201 as input and outputs the preprocessed input speech signal to the linear prediction analyzer 203 and the adder 204, and 203 is a preprocessed input. A linear prediction analyzer that performs linear prediction analysis with an audio signal as input and outputs linear prediction coefficients to the synthesis filter 205, and 204 calculates a difference signal by using the preprocessed audio signal and the output signal of the synthesis filter 205 as inputs. , An adder that outputs to the auditory weighting unit 207, a synthesis filter 205 that synthesizes a speech signal with the sound source vector output from the adder 206 and the linear prediction coefficient output from the linear prediction analyzer 203 as inputs, and 206 The respective vectors output from the fixed code vector gain multiplier 212 and the adaptive code vector gain multiplier 216 are added to form a synthesis field. An adder that outputs to the output unit 205, 207 performs auditory weighting using the error signal output from the adder 204 as input, and an auditory weighter that outputs to the error minimizing means 208, 208 outputs from the auditory weighter 207. A combination of a fixed code vector, an adaptive code vector, a fixed code vector gain, and an adaptive code vector gain that minimizes the error power after auditory weighting is determined based on the search range output from the search range limiter 209. The error minimizing means 209 receives the transmission line error monitoring signal 210 as input, determines the search range of the adaptive codebook 214 by the error minimizing means 208, and outputs it to the error minimizing means 208 and the codebook selector 215. 
Limiter 210, a transmission path error monitoring signal for detecting the occurrence of a transmission path error, 211 is a fixed code vector A fixed codebook for storing a predetermined number of fixed code vectors to be output to the fixed code vector gain multiplier 212, 212 is an adder by multiplying the fixed code vector output from the fixed codebook 211 by a fixed code vector gain Fixed code vector gain multiplier to be output to 206, 213 is a fixed code book for storing a fixed number of fixed code vectors to be output to the code book selector 215, and 214 is output from the adder 206 An adaptation comprising a buffer of past excitation vectors (finally determined by the error minimizing means 208), cutting out a part of the signal sequence stored in the buffer and outputting it as an adaptive code vector to the codebook selector 215 The code book 215 receives the search range information from the search range limiter 209, and either the fixed code book 213 and the adaptive code book 214. A codebook selector that selects only one of the input vectors and outputs the selected vector to the codevector gain multiplier 216. An adder 216 multiplies the codevector output from the codebook selector 215 by the code vector gain. A code vector gain multiplier to be output to 206.
[0018]
The operation of the CELP speech coding apparatus configured as described above will be described below. In FIG. 2, the input speech signal 201 is a digital signal, and the speech encoding process is performed on each block of a predetermined number of samples. Such a block is called a frame or a subframe. The input speech signal 201 is band-limited and gain-adjusted by the preprocessor 202. Using the preprocessed speech signal, the linear prediction analyzer 203 performs a known linear prediction analysis and calculates linear prediction coefficients. The synthesis filter 205 forms a filter from the linear prediction coefficients calculated by the linear prediction analyzer 203 and filters the excitation vector output from the adder 206 to synthesize speech. The adder 204 calculates the difference signal between the preprocessed input speech signal and the speech signal synthesized by the synthesis filter 205. The auditory weighter 207 applies auditory weighting to the difference signal calculated by the adder 204 and outputs the result to the error minimizing means 208. This auditory weighting is generally performed by a cascade of linear prediction filters built from the linear prediction coefficients calculated by the linear prediction analyzer 203 and auditory weighting coefficients. The error minimizing means 208 adjusts the excitation vector input to the synthesis filter 205, that is, the combination of fixed code vector, fixed code vector gain, adaptive code vector, and adaptive code vector gain, so that the power of the auditory-weighted difference signal (error signal) is minimized.
In general, the optimum adaptive code vector is first extracted from the adaptive codebook 214 and multiplied by the adaptive code vector gain in the multiplier 216 to determine its output to the adder 206. Next, the fixed code vector that is optimal in combination with that adaptive code vector is extracted from the fixed codebook 211 and multiplied by the fixed code vector gain in the multiplier 212 to determine its output to the adder 206. The search range limiter 209 limits the search range of the adaptive codebook 214 when the optimum adaptive code vector is extracted from it. From the transmission path error monitoring signal 210 that it receives, the search range limiter 209 determines whether a transmission path error occurred in the immediately preceding frame or subframe. If it determines that such an error occurred, it sets the search range so that, of the past excitation signal stored in the adaptive codebook 214, the portion generated in the immediately preceding frame or subframe is excluded from the adaptive codebook search used to select the optimum code vector, and it outputs this search range to the error minimizing means 208 and the codebook selector 215.
However, when transmission path errors continue for so long that all of the excitation signal stored in the adaptive codebook 214 is excluded from the search range, the search range limiter 209 outputs the search range to the codebook selector 215 and the error minimizing means 208 so that the excitation vector is generated using the fixed codebook 213 instead of the adaptive codebook 214. When the input search range indicates a fixed codebook search, the codebook selector 215 outputs the code vector input from the fixed codebook 213 to the code vector gain multiplier 216.
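The behavior of the search range limiter 209 and the fallback to the fixed codebook 213 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function names, the frame length of 40 samples, and the lag range of 20 to 143 are hypothetical values chosen for the example, and the treatment of lags shorter than one frame (repeating the most recent samples) is an assumption.

```python
from typing import List, Tuple


def allowed_lags(error_flags: List[bool], frame_len: int,
                 min_lag: int, max_lag: int) -> List[int]:
    """Return the lags whose adaptive code vector touches no errored frame.

    error_flags[k] is True if the frame k+1 frames back was hit by a
    transmission path error (index 0 = immediately preceding frame).
    Samples are counted backwards from the newest sample (distance 1).
    """
    allowed = []
    for lag in range(min_lag, max_lag + 1):
        oldest = lag
        # Assumption: lags shorter than a frame repeat the newest `lag` samples.
        newest = max(1, lag - frame_len + 1)
        first_frame = (newest - 1) // frame_len
        last_frame = (oldest - 1) // frame_len
        touched = range(first_frame, last_frame + 1)
        if not any(error_flags[f] for f in touched if f < len(error_flags)):
            allowed.append(lag)
    return allowed


def choose_search_range(error_flags: List[bool], frame_len: int = 40,
                        min_lag: int = 20,
                        max_lag: int = 143) -> Tuple[str, List[int]]:
    """Mimic limiter 209: restrict the adaptive search, or fall back to the
    fixed codebook (213) when every lag is excluded."""
    lags = allowed_lags(error_flags, frame_len, min_lag, max_lag)
    if lags:
        return "adaptive", lags
    return "fixed", []
```

With these assumed values, an error in only the immediately preceding frame leaves lags of 80 samples and longer available, while errors in all four buffered frames leave no usable lag and select the fixed codebook.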
[0019]
When the speech encoding apparatus is configured as described above, the decoding apparatus requires means for selecting either the adaptive codebook or the fixed codebook. A simple method is for the encoding device to add information indicating which codebook is used and transmit it to the decoding device. When no bits can be spared for this information, it is instead necessary to add transmission path error monitoring means and to switch between the fixed codebook and the adaptive codebook when consecutive transmission path errors have occurred in the past.
[0020]
Note that the transmission path error monitoring signal 210 may be a signal in which a predetermined signal is transmitted at constant time intervals (shorter than the interval at which the encoding parameters for one frame are transmitted). In that case, when the search range limiter 209 receives a signal that differs from the predetermined signal, it determines that a transmission path error has occurred in the encoded information of the frame transmitted at that time.
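The check described above can be sketched as a comparison of each received monitoring symbol against the predetermined one. The symbol value and the function name below are hypothetical and serve only to illustrate the idea.

```python
from typing import Iterable

# Hypothetical predetermined monitoring signal; the source does not specify one.
EXPECTED_SYMBOL = 0b10110010


def frame_has_error(received: Iterable[int],
                    expected: int = EXPECTED_SYMBOL) -> bool:
    """Flag the frame if any monitoring symbol received while the frame was
    being transmitted differs from the predetermined symbol."""
    return any(symbol != expected for symbol in received)
```

Because the monitoring symbols arrive more often than once per frame, several symbols are checked for each frame in this sketch.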
[0021]
Thus, according to the second embodiment, in a normal frame following an error frame, the same excitation vector can be obtained in the encoding device and the decoding device, with no mismatch between the two. In addition, when use of the adaptive codebook is suspended, sound quality can be improved compared with the first embodiment, in which the information allocated to the adaptive code vector goes unused.
[0022]
(Embodiment 3)
Next, a speech decoding apparatus according to the third embodiment of the present invention will be described. FIG. 3 shows the excitation waveform stored in the adaptive codebook of the speech decoding apparatus. In FIG. 3, 301 is the excitation waveform stored in the adaptive codebook; 302 is the section of a frame that was not correctly decoded due to a transmission path error; Pi is the lag value of the normal frame transmitted from the encoding device immediately after the frame with the transmission path error; Vpi is the section of the adaptive code vector extracted from the adaptive codebook based on the lag value Pi; and NVpi is the section (the current frame or subframe) in which the excitation waveform is about to be generated.
[0023]
In FIG. 3, the adaptive code vector (section Vpi) designated by the lag value Pi transmitted from the encoding device includes the excitation vector of a frame that was not correctly decoded due to a transmission path error, so its waveform distortion is large. Therefore, when the immediately preceding frame has a transmission path error, the adaptive code vector is generated using a pitch (nPi) that is an integer multiple of the lag value Pi transmitted from the encoding device. The integer n is the smallest integer for which the extracted section does not include the excitation section 302 generated from erroneous information. In FIG. 3, n = 3, and Vp3 is cut out as the adaptive code vector.
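The choice of the smallest integer n can be sketched as a loop over multiples of the received lag. This is a sketch under stated assumptions: positions are counted backwards from the newest sample in the decoder's adaptive codebook, the erroneous section 302 is given by hypothetical bounds `err_newest` and `err_oldest`, and the numbers in the usage note are illustrative values, not values taken from the figures.

```python
from typing import Optional


def overlap(lag: int, frame_len: int, err_newest: int, err_oldest: int) -> int:
    """Number of samples of the vector cut out at `lag` that fall inside the
    erroneous section [err_newest, err_oldest] (distances from newest sample)."""
    newest = max(1, lag - frame_len + 1)  # vector spans [newest, lag]
    lo = max(newest, err_newest)
    hi = min(lag, err_oldest)
    return max(0, hi - lo + 1)


def smallest_clean_multiple(lag: int, frame_len: int, err_newest: int,
                            err_oldest: int, buf_len: int) -> Optional[int]:
    """Smallest n such that the vector at lag n*lag avoids the erroneous
    section, or None if no such multiple fits in the buffer."""
    n = 1
    while n * lag <= buf_len:
        if overlap(n * lag, frame_len, err_newest, err_oldest) == 0:
            return n
        n += 1
    return None
```

For example, with an assumed section length of 40 samples, a received lag of 45, and an erroneous section covering the newest 80 samples, `smallest_clean_multiple(45, 40, 1, 80, 320)` yields 3, so the vector is cut out at a lag of 135 samples.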
[0024]
FIG. 4 shows the excitation waveform in the adaptive codebook of the decoding apparatus in the frame or subframe immediately following that of FIG. 3. Here, the adaptive code vector Vpi+1, cut out based on the lag value Pi+1 transmitted from the encoding device, still includes excitation waveform generated in the error frame. To avoid this, it suffices to set n = 4 in nPi+1 and use Vp4 as the adaptive code vector. In the case shown in FIG. 4, however, only a small proportion of Vpi+1 consists of excitation waveform generated in the error frame, so Vpi+1 may be used instead of Vp4; in that case, the decision must be made case by case according to the proportion of Vpi+1 that consists of excitation waveform generated in the error frame.
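One way to picture this case classification is to compute the fraction of the extracted vector that lies in the erroneous section and compare it with a threshold. The threshold of 0.25, the function names, and the numbers used below are hypothetical; the source does not state how the proportion should be evaluated.

```python
from typing import Optional


def contaminated_ratio(lag: int, frame_len: int,
                       err_newest: int, err_oldest: int) -> float:
    """Fraction of the vector cut out at `lag` that lies in the erroneous
    section [err_newest, err_oldest] (distances from the newest sample)."""
    newest = max(1, lag - frame_len + 1)  # vector spans [newest, lag]
    lo = max(newest, err_newest)
    hi = min(lag, err_oldest)
    bad = max(0, hi - lo + 1)
    return bad / (lag - newest + 1)


def choose_lag(lag: int, frame_len: int, err_newest: int, err_oldest: int,
               buf_len: int, max_ratio: float = 0.25) -> Optional[int]:
    """Keep the received lag if contamination is small enough; otherwise use
    the smallest integer multiple that avoids the erroneous section."""
    if contaminated_ratio(lag, frame_len, err_newest, err_oldest) <= max_ratio:
        return lag
    n = 2
    while n * lag <= buf_len:
        if contaminated_ratio(n * lag, frame_len, err_newest, err_oldest) == 0.0:
            return n * lag
        n += 1
    return None  # no clean multiple fits in the buffer
```

With an assumed section length of 40, a lag of 45, and an erroneous section at distances 41 to 120, only 5 of the 40 samples are contaminated, so the received lag is kept at the default threshold; with a stricter threshold of 0.1 the sketch moves to four times the lag.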
[0025]
Note that this method of using an integer multiple of the pitch is effective in voiced portions with a clear pitch period, and it is suited to cases where the adaptive code vector gain transmitted from the encoding device is close to 1.0, or where the lag value has changed little over the several frames before the error frame and is equal or nearly equal to the lag value of the normal frame immediately after the error frame. In addition, because the adaptive code vector is cut out from the adaptive codebook while avoiding the portion generated when the past error occurred, the adaptive codebook of the decoding device must store a longer excitation waveform than the adaptive codebook of the encoding device.
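The applicability conditions above can be sketched as a simple predicate. The tolerances `gain_tol` and `lag_tol` are hypothetical, since the source only says "close to 1.0" and "small" without giving values.

```python
from typing import Sequence


def use_integer_multiple(gain: float, prev_lags: Sequence[int], new_lag: int,
                         gain_tol: float = 0.2, lag_tol: int = 2) -> bool:
    """Return True when the integer-multiple pitch method looks applicable.

    gain      : adaptive code vector gain received for the current frame
    prev_lags : lag values of the several frames before the error frame
    new_lag   : lag value of the normal frame right after the error frame
    """
    gain_near_one = abs(gain - 1.0) <= gain_tol
    lag_stable = (
        len(prev_lags) > 0
        and max(prev_lags) - min(prev_lags) <= lag_tol
        and abs(new_lag - prev_lags[-1]) <= lag_tol
    )
    return gain_near_one or lag_stable
```

In this sketch either condition alone is sufficient, following the "or" in the text above.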
[0026]
As described above, according to the third embodiment, it is possible to suppress distortion of the adaptive code vector in a normal frame after an error frame in a portion where the adaptive code vector works effectively.
[0027]
[Effects of the Invention]
As described above, according to the present invention, in a CELP speech encoding / decoding device, the same excitation vector can be obtained on the encoder side and the decoder side even immediately after recovery from a transmission path error, or the mismatch between the excitation vectors generated on the encoder side and the decoder side immediately after recovery from a transmission path error can be reduced. An excellent speech encoding / decoding device can therefore be realized.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a speech encoding apparatus according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration of a speech encoding apparatus according to a second embodiment of the present invention.
FIG. 3 is a schematic diagram of the adaptive codebook of a speech decoding apparatus according to a third embodiment of the present invention.
FIG. 4 is a schematic diagram of the adaptive codebook of a speech decoding apparatus according to a third embodiment of the present invention.
FIG. 5 is a block diagram showing a configuration of a general CELP speech encoding apparatus.
[Explanation of symbols]
104 Adder
106 Adder
112 Fixed Code Vector Gain Multiplier
114 Adaptive Code Vector Gain Multiplier
204 Adder
206 Adder
212 Fixed Code Vector Gain Multiplier
216 Code Vector Gain Multiplier
301 Excitation waveform stored in adaptive codebook
302 Adaptive codebook excitation waveform section generated in error occurrence frame

Claims

A speech encoding / decoding device comprising a speech encoding apparatus that has: an adaptive codebook, which is a buffer of excitation vectors generated in the past; and search range limiting means that determines, based on a transmission path error monitoring signal received from a speech decoding apparatus, whether a transmission path error has occurred and, if it determines that a transmission path error occurred in the immediately preceding frame or subframe, performs the adaptive codebook search with the portion generated in the immediately preceding frame or subframe excluded from the search range of the adaptive codebook.

The speech encoding / decoding device according to claim 1, wherein, when the search range limiting means determines that transmission path errors have occurred consecutively over the last several frames or subframes, the gain of the adaptive code vector generated in the frames or subframes in which the transmission path errors occurred consecutively is set to zero and the excitation vector is generated only from the fixed codebook.

The speech encoding / decoding device according to claim 1, further comprising codebook selecting means that receives search range information from the search range limiting means and selects either the adaptive codebook or a fixed codebook, wherein, when the search range limiting means determines that all of the excitation vectors stored in the adaptive codebook are excluded from the search range due to transmission path errors over the last several frames or subframes, it outputs the search range information to the codebook selecting means, and the codebook selecting means switches from the adaptive codebook to the fixed codebook so that the excitation vector is generated only from the fixed codebook.

The speech encoding / decoding device according to claim 2, wherein the speech encoding apparatus has means for transmitting, to the speech decoding apparatus, information indicating which of the adaptive codebook and the fixed codebook is used, and the speech decoding apparatus has means for selecting either the adaptive codebook or the fixed codebook based on the information indicating which of the adaptive codebook and the fixed codebook is used.

The speech decoding apparatus determines whether or not the adaptive code vector decoded using the received pitch information is decoded using the excitation vector of the frame that was not correctly decoded due to a transmission path error, and is decoded correctly 5. The speech encoding / decoding method according to claim 4, wherein when the adaptive code vector is generated using a frame that has not been generated, the adaptive code vector is generated using the received pitch information as it is . apparatus.