JP3713288B2

JP3713288B2 - Speech decoder

Info

Publication number: JP3713288B2
Application number: JP06522794A
Authority: JP
Inventors: 政巳赤嶺; 公生三関; 進神庭; 皇天田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1994-04-01
Filing date: 1994-04-01
Publication date: 2005-11-09
Anticipated expiration: 2020-11-09
Also published as: JPH07271391A

Description

【０００１】
【産業上の利用分野】
本発明は、音声符号化パラメータを受けて元の音声を復号再生する音声復号装置に関する。
【０００２】
【従来の技術】
自動車電話等の移動体通信システムに用いられる音声符号化／復号装置は、低ビットレート化のほかに、移動端末がビルの谷間等のような電界強度の低い場所を通ることによって符号誤りが生じた場合でも良好な音声を再生できることが要求される。特公昭５８−４９８７４号公報に記載された音声合成装置は、このような要求に応えるものとして提案された従来の音声復号装置の例である。
【０００３】
図４は、この従来の音声復号装置のブロック図であり、１は入力端子、２は出力端子、３は再生中継復調回路、４は合成回路、５は瞬断時と瞬断回復時を検出する瞬断検出回路、６は分析パラメータの１組を遅延記憶する遅延記憶回路、７はスイッチである。
【０００４】
この従来装置において、入力端子１から入力された受信信号は再生中継復調回路３によって再生中継復調され、ディジタル信号の分析パラメータに変換されて遅延記憶回路６とスイッチ７に出力される。瞬断時でない場合、瞬断検出回路５は瞬断でないことを検出し、これを遅延記憶回路６とスイッチ７に出力する。遅延記憶回路６は瞬断検出回路５からの信号に基づいて、再生中継復調回路３からの分析パラメータを入力して順次遅延記憶する。スイッチ７は、瞬断検出回路５からの信号に基づいて再生中継復調回路３と合成回路４とを接続する。合成回路４は、スイッチ７を介して再生中継復調回路３の出力信号に基づいて音声信号を復元し、出力端子２に出力する。
【０００５】
入力端子１からの入力信号に瞬断が生じた場合は、瞬断検出回路５でそれを検出し、この情報を遅延記憶回路６とスイッチ７に出力する。遅延記憶回路６は、瞬断検出回路５からの瞬断情報により再生中継復調回路３からの入力信号を遅延記憶するのを止め、記憶内容を古い順に順次スイッチ７に繰り返し出力する。遅延記憶回路６は分析パラメータ１組の記憶容量を持っているので、ちょうど１組前の分析パラメータがスイッチ７に出力される。スイッチ７は、瞬断検出回路５からの瞬断情報に基づいて遅延記憶回路６と合成回路４を接続する。合成回路４は、この遅延記憶回路６からの入力信号に基づいて音声信号に復元し、出力端子２に出力する。瞬断が回復すると、瞬断検出回路５が瞬断回復時を検出し、瞬断回復情報を遅延記憶回路６とスイッチ７に出力する。このときは前述の瞬断でない場合の動作に戻る。
【０００６】
このように従来の音声復号装置では、電波の瞬断によって音声符号化パラメータに誤りが生じた場合、正常時に入力した１組の音声符号化パラメータに基づいて音声を再生しているので、符号誤りが生じた場合でも音質の大きな劣化を防止することができる。しかしながら、瞬断検出回路５の検出能力には限界があり、どのような検出法を用いたとしても、誤検出を起こすことがあり得る。従って、瞬断検出回路５の検出能力以上の誤りが発生した場合、再生音声の音質が大きく劣化することがある。
【０００７】
【発明が解決しようとする課題】
上述したように、従来の音声復号装置では瞬断検出回路の検出能力に限界があるため、瞬断を誤検出することがあり、瞬断によって符号誤りがあったフレームでも誤りなしと検出する可能性がある。このような場合、符号誤りがあるにも拘らず検出されなかったフレームの音声符号化パラメータを記憶してしまい、この音声符号化パラメータに基づいて音声を再生するため、復号音声の音質が劣化するという問題があった。
本発明は、音声符号化パラメータの誤りの誤検出による再生音声の音質劣化を防止できる音声復号装置を提供することを目的とする。
【０００８】
【課題を解決するための手段】
本発明は、上記の課題を解決するため、単位区間毎に入力される音声符号化パラメータに基づいて復号を行って音声信号を再生する音声復号装置において、前記入力される音声符号化パラメータの誤りを検出する誤り検出手段と、前記入力される音声符号化パラメータの少なくとも１組を前記誤り検出手段により誤りが検出されない複数の単位区間にわたり記憶する記憶手段と、前記記憶手段から読み出された過去の複数の単位区間にわたる音声符号化パラメータに基づいて音声符号化パラメータを再生するパラメータ再生手段と、
前記誤り検出手段により誤りが検出されない単位区間については前記入力される音声符号化パラメータに基づいて音声信号を再生し、前記誤り検出手段により誤りが検出された単位区間については前記パラメータ再生手段により再生された音声符号化パラメータに基づいて音声信号を再生する再生手段とを備えたことを特徴とする。
【０００９】
より具体的には、上記パラメータ再生手段として記憶手段から読み出された過去の複数の単位区間にわたる音声符号化パラメータの補間演算を行う補間手段を設け、誤り検出手段により誤りが検出された単位区間については補間手段により得られた音声符号化パラメータに基づいて音声信号を再生するようにしてもよい。
【００１０】
さらに、本発明においては記憶手段から読み出された過去の複数の単位区間にわたる音声符号化パラメータのうち最も信頼性の高い音声符号化パラメータを選択する選択手段を設け、誤り検出手段により誤りが検出された単位区間については、再生手段において該選択手段により選択された音声符号化パラメータに基づいて音声信号を再生するようにしてもよい。
【００１１】
【作用】
本発明に係る音声復号装置においては、誤り検出手段によって入力された音声符号化パラメータの符号誤りの有無が検出され、誤りが検出されない単位区間では、入力した音声符号化パラメータに基づいて再生手段で音声が再生されるとともに、記憶手段によって音声符号化パラメータの少なくとも１組が記憶される。このようにして、記憶手段では誤りが検出されない複数の単位区間の音声符号化パラメータが記憶され、さらにこの記憶手段から読み出された過去の複数の単位区間の符号化パラメータについて例えば補間演算が補間手段で行われる。そして、誤りが検出された単位区間では、この補間手段で得られた音声符号化パラメータに基づいて音声が再生される。
【００１２】
ここで、入力した音声符号化パラメータに誤りが生じているにも拘らず、誤り検出手段の誤検出により誤りなしと検出された場合でも、そのような誤検出が２つ以上の単位区間にわたり連続して発生する確率は非常に少ない。従って、誤りなしと検出された複数の単位区間の音声符号化パラメータの補間演算を行うと、いずれかの単位区間の音声符号化パラメータが誤っていても、他の誤りの生じていない単位区間の音声符号化パラメータによって、その誤りの影響が緩和される結果、補間演算により得られた音声符号化パラメータの信頼性が向上するので、一つの単位区間のみの音声符号化パラメータに基づいて音声を再生する従来の方式に比較して、音質の劣化は著しく少なくなる。
【００１３】
また、誤りなしと検出された複数の単位区間の音声符号化パラメータの信頼性を何らかの手段により判定し、その判定結果に基づいても最も信頼性の高い音声符号化パラメータを用いて音声の再生を行うようにしても、同様に再生音声の音質劣化は低減される。
【００１４】
【実施例】
以下、図面を参照して本発明の実施例を説明する。
（実施例１）
図１は、本発明の一実施例に係る音声復号装置のブロック図である。この実施例の音声復号装置は、入力端子１００に入力される音声符号化パラメータを復号する符号化パラメータ復号器１０１と、入力される音声符号化パラメータの誤りを検出する誤り検出回路１０２と、スイッチ１０３と、誤り検出回路１０２により“誤りなし”と検出されたフレーム（単位区間）の音声符号化パラメータを記憶する第１および第２の記憶回路１０４，１０５と、これらの記憶回路１０４，１０５から読み出された音声符号化パラメータの補間演算を行う補間回路１０６と、符号化パラメータ復号器１０１の出力と補間回路１０６の出力のいずれかを選択するスイッチ１０７と、このスイッチ１０７により選択された音声符号化パラメータから元の音声を合成して再生し、出力端子１０９に出力する音声合成器１０８により構成される。また、スイッチ１０３、補間回路１０６およびスイッチ１０７は、誤り検出回路１０２の出力によって制御される。
【００１５】
次に、本実施例に係る音声復号装置の動作について説明する。
入力端子１００には、図示しない送信側の音声符号化装置から送信された複数の音声符号化パラメータがフレーム単位で入力される。これらの音声符号化パラメータは、例えば音声符号化装置の符号化方式がＣＥＬＰ(Code Excited Linear Prediction)方式の場合を例にとると、ＬＰＣ分析パラメータ、合成フィルタの駆動信号、音声信号のピッチおよびゲインといったパラメータである。また、これらの音声符号化パラメータは、誤り訂正符号化されて伝送される。
【００１６】
入力端子１００に入力されたこれらの音声符号化パラメータは、符号化パラメータ復号器１０１によって復号され、さらにスイッチ１０３，１０７に出力される。
【００１７】
また、入力端子１００に入力された音声符号化パラメータは誤り検出回路１０２にも入力され、ここで符号誤りがあるかどうかが検出される。スイッチ１０３，１０７は、誤り検出回路１０２の出力によって制御される。すなわち、誤り検出回路１０２の出力が“誤りなし”を示している場合、スイッチ１０７は符号化パラメータ復号器１０１によって復号された音声符号化パラメータを選択して音声合成器１０８へ送る。また、このときスイッチ１０３はオンとなって、符号化パラメータ復号器１０１により復号された音声符号化パラメータを第１の記憶回路１０４に送る。第１の記憶回路１０４は、それまで記憶保持していた内容を第２の記憶回路１０５に出力した後、スイッチ１０３を介して入力した新たな音声符号化パラメータを記憶すると同時に、それまで記憶保持していた内容を補間回路１０６へ出力する。一方、第２の記憶回路１０５はそれまで記憶保持していた内容をクリアした後、第１の記憶回路１０４から入力された内容を記憶する。
【００１８】
このようにして、第１および第２の記憶回路１０４，１０５には、誤り検出回路１０２により符号誤りが検出されなかったフレームの音声符号化パラメータが新しい順に２フレーム分記憶保持される。
【００１９】
次に、誤り検出回路１０２の出力が“誤りあり”を示している場合には、スイッチ１０３がオフとなって、誤りが検出された音声符号化パラメータの記憶回路１０４への入力が阻止されるとともに、補間回路１０６によって第１および第２の記憶回路１０４，１０５の内容に基づいて音声符号化パラメータの補間演算が行われる。具体的には、例えば音声符号化パラメータに一つであるＬＰＣ分析パラメータとしてＬＳＰパラメータを用いた場合、補間回路１０６は次式に従ってＬＳＰパラメータの補間を行う。
【００２０】
ωｉ（ｎ）＝αωｉ（ｎ−１）＋（１−α）ωｉ（ｎ−２）（１）
ここで、１≦ｉ≦ＰでＰは分析次数、ωｉ（ｎ）は現在のフレームのＬＳＰパラメータ、ωｉ（ｎ−１），ωｉ（ｎ−２）はそれぞれ記憶回路１０４，１０５に記憶保持されているＬＳＰパラメータであり、“誤りなし”と検出されたフレームのＬＳＰパラメータのうちで１番目と２番目に新しいパラメータである。
【００２１】
また、αはωｉ（ｎ−１）とωｉ（ｎ−２）を加重平均するための０＜α＜１なる係数であり、より現実的には例えば０．８＜α＜１の範囲に設定される。このように、ωｉ（ｎ−１）に対する加重係数αをωｉ（ｎ−２）に対する加重係数（α−１）より大きくする理由は、ωｉ（ｎ−１）の方が補間演算結果として実際に要求される音声符号化パラメータに時間的に近いためである。
【００２２】
誤り検出回路１０２の出力が“誤りあり”を示した場合、補間回路１０６により上記のようにして補間演算された音声符号化パラメータがスイッチ１０７によって選択され、音声合成器１０８に入力される。音声合成器１０８は、入力された音声符号化パラメータに基づいて音声信号を再生し、出力端子１０２へ出力する。
【００２３】
このように本実施例では、誤りが検出されたフレームの音声符号化パラメータの少なくとも１組（上記例ではＬＳＰパラメータ）については、過去に誤りが検出されなかった２つのフレームのパラメータを加重平均して補間したパラメータで代用している。誤り検出回路１０２の検出能力には限界があるために、誤検出の可能性があることは前述した通りであるが、誤検出が重ねて発生する確率は非常に少ないため、本実施例のように２つのフレームの音声符号化パラメータを加重平均することにより補間したパラメータの信頼性は大きく向上し、これを音声信号の復号再生に用いることにより、符号誤りが生じたフレームでの再生音声の音質劣化を防ぐことができる。また、従来方式では音声符号化パラメータの補間をいわば１つのフレームのパラメータに基づいて行っていたのに対し、本実施例においては２つのフレームのパラメータに基づいて行っているため、補間精度が向上し、再生音声の品質がさらに向上する効果も期待できる。
【００２４】
なお、上記実施例では２つのフレームの音声符号化パラメータを記憶保持し、その内容に基づいて、誤ったフレームのパラメータを２次の補間によって求める構成としたが、記憶回路の数を増やして、より高次の補間を行う構成としてもよい。この場合、誤りの誤検出による影響をより小さくすると共に、補間精度をより高くする効果がある。
【００２５】
（実施例２）
図２は、本発明の他の実施例に係る復号装置の構成を示すブロック図である。図１に示した実施例と同一部分に同一符号を付して相違点を中心に説明すると、本実施例では音声符号化パラメータの補間を移動平均によって行っている点が図１の実施例と異なる。
【００２６】
すなわち、図２においては誤り検出回路１０２の出力が“誤りなし”を示したときに符号化パラメータ復号器１０１により復号された音声符号化パラメータが第１の記憶回路１０４で記憶保持される。補間回路２０２は係数乗算器２０３，２０４と加算器２０５および第２の記憶回路２０６からなり、記憶回路２０６は誤り検出回路１０２の出力が“誤りなし”を示したフレーム単位で内容を記憶保持するように制御される。この実施例の場合、音声符号化パラメータの補間値は次式により計算される。
【００２７】
Ωｉ＝βΩｉ＋（１−β）ωｉ（ｎ−１）（２）
ここで、ΩｉはＬＳＰパラメータの補間値で、加算器２０５の出力、ωｉ（ｎ−１）は誤り検出回路１０２の出力が“誤りなし”を示したフレームのＬＳＰパラメータ、（１−β）は乗算器２０３の係数、βは乗算器２０４の係数（移動平均の係数）である。
【００２８】
本実施例によれば、音声符号化パラメータの補間を過去の誤りなしの全フレームのパラメータを移動平均することにより行うため、補間パラメータの信頼性および補間精度が向上する。さらに、２個の係数乗算器２０３，２０４と加算器２０５および１個の記憶回路２０６で構成される補間回路２０２によって高次の補間を行うため、図１の実施例のような単純な加算平均により高次の補間を実現する場合に比較して回路規模を小さくできるという利点もある。
【００２９】
なお、以上の各実施例では音声符号化パラメータの補間演算をフレーム単位で行ったが、サブフレーム単位で行ってもよい。その場合、補間に用いるパラメータの相関が強くなるため、補間精度が向上する効果がある。
【００３０】
（実施例３）
図３は、本発明のさらに別の実施例に係る音声復号装置の構成を示すブロック図である。図１に示した実施例と同一部分に同一符号を付して相違点を中心に説明すると、本実施例では信頼性判定部３０１とスイッチ３０２を設け、音声符号化パラメータの補間を行う代わりに、第１および第２の記憶回路１０４，１０５に記憶保持された音声符号化パラメータを選択的に切り替えて音声信号の復号再生に用いる点が図１の実施例と異なっている。
【００３１】
第１および第２の記憶回路１０４，１０５は、誤り検出回路１０２の出力が“誤りなし”を示したときに符号化パラメータ復号器１０１により復号された音声符号化パラメータをそれぞれ記憶している。信頼性判定部３０１は、これら記憶回路１０４，１０５に記憶保持されている音声符号化パラメータのうち、過去の音声符号化パラメータとの連続性が良くなる方を信頼性の高いものと判定し、その判定結果に基づいてスイッチ３０２を制御する。そして、誤り検出回路１０２が“誤りあり”と検出したときには、スイッチ３０２で選択された音声符号化パラメータがスイッチ１０７で選択され、これに基づいて音声合成器１０８で音声が復号再生される。
【００３２】
なお、記憶回路１０４，１０５に記憶保持された音声符号化パラメータを選択する際には、上述した過去の音声符号化パラメータとの連続性を考慮する方法の他、音声符号化パラメータに含まれるフィルタ係数について、フィルタがより安定となるようなフィルタ係数を含む音声符号化パラメータをより信頼性が高いものとして判定してもよい。また、音声符号化パラメータに含まれる電力係数について、過去の音声符号化パラメータに含まれる電力係数からの変化がより滑らかに変化する電力係数を含む音声符号化パラメータをより信頼性が高いものとして判定してもよい。
【００３３】
このように本実施例によれば、誤りが検出されたフレームの音声符号化パラメータについては、複数の記憶回路１０４，１０５に記憶保持されている誤りが検出されなかったときの音声符号化パラメータのうち信頼性の最も高いパラメータを選択して音声の再生に用いることにより、符号誤りが生じたフレームの再生音声の音質劣化を防止することができる。
【００３４】
【発明の効果】
以上説明したように、本発明によれば音声符号化パラメータに誤りがあった場合には、過去の誤りなしと判定された複数の単位区間の音声符号化パラメータに基づいて音声を再生して出力するため、誤り検出の誤検出による音質劣化を防ぐとともに、符号誤り時の音声符号化パラメータの補間を精度よく行うことができるので、再生音質が向上するという効果が得られる。
【図面の簡単な説明】
【図１】本発明の一実施例に係る音声復号装置の構成を示すブロック図
【図２】本発明の他の実施例に係る音声復号装置の構成を示すブロック図
【図３】本発明のさらに別の実施例に係る音声復号装置の構成を示すブロック図
【図４】従来の音声復号装置の構成を示すブロック図
【符号の説明】
１００…入力端子
１０１…符号化パラメータ復号器
１０２…誤り検出回路
１０３…スイッチ
１０４…第１の記憶回路
１０５…第２の記憶回路
１０６…補間回路
１０７…スイッチ
１０８…音声合成器
１０９…出力端子
２０２…補間回路
２０３…係数乗算器
２０４…係数乗算器
２０５…加算器
２０６…記憶回路
３０１…信頼性判定部
３０２…スイッチ
[0001]
[Industrial application fields]
The present invention relates to a speech decoding apparatus that receives speech coding parameters and decodes and reproduces the original speech.
[0002]
[Prior art]
Speech coding/decoding devices used in mobile communication systems such as car telephones are required not only to operate at low bit rates but also to reproduce good-quality speech even when code errors occur because the mobile terminal passes through a place with low electric field strength, such as a gap between tall buildings. The speech synthesizer described in Japanese Patent Publication No. 58-49874 is an example of a conventional speech decoding apparatus proposed to meet this requirement.
[0003]
FIG. 4 is a block diagram of this conventional speech decoding apparatus, where 1 is an input terminal, 2 is an output terminal, 3 is a regenerative repeater demodulation circuit, 4 is a synthesis circuit, 5 is an instantaneous interruption detection circuit that detects the onset of and recovery from an instantaneous interruption, 6 is a delay storage circuit that delays and stores one set of analysis parameters, and 7 is a switch.
[0004]
In this conventional apparatus, the received signal input from the input terminal 1 is regenerated and demodulated by the regenerative repeater demodulation circuit 3, converted into analysis parameters in digital form, and output to the delay storage circuit 6 and the switch 7. When no instantaneous interruption is occurring, the instantaneous interruption detection circuit 5 detects this and reports it to the delay storage circuit 6 and the switch 7. Based on the signal from the instantaneous interruption detection circuit 5, the delay storage circuit 6 takes in the analysis parameters from the regenerative repeater demodulation circuit 3 and stores them sequentially with delay. The switch 7, based on the same signal, connects the regenerative repeater demodulation circuit 3 to the synthesis circuit 4. The synthesis circuit 4 restores the speech signal from the output of the regenerative repeater demodulation circuit 3 received via the switch 7 and outputs it to the output terminal 2.
[0005]
When an instantaneous interruption occurs in the input signal from the input terminal 1, the instantaneous interruption detection circuit 5 detects it and outputs this information to the delay storage circuit 6 and the switch 7. On receiving the interruption information, the delay storage circuit 6 stops storing the input from the regenerative repeater demodulation circuit 3 and repeatedly outputs its stored contents to the switch 7, oldest first. Because the delay storage circuit 6 has capacity for one set of analysis parameters, exactly the immediately preceding set of analysis parameters is output to the switch 7. Based on the interruption information, the switch 7 connects the delay storage circuit 6 to the synthesis circuit 4. The synthesis circuit 4 restores the speech signal from the input supplied by the delay storage circuit 6 and outputs it to the output terminal 2. When the interruption ends, the instantaneous interruption detection circuit 5 detects the recovery and outputs recovery information to the delay storage circuit 6 and the switch 7, and operation returns to the no-interruption case described above.
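The conventional behavior described above amounts to holding the last set of analysis parameters accepted before the interruption. The following is a minimal Python sketch of that idea, not the circuit of the cited publication: it assumes each frame arrives as a parameter tuple paired with a boolean interruption flag, and the names `conceal_by_hold` and `interrupted` are illustrative.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

Params = Sequence[float]


def conceal_by_hold(frames: Iterable[Tuple[Params, bool]]) -> List[Optional[Params]]:
    """Return the parameter set passed to the synthesizer for each frame.

    Each input item is (params, interrupted). While no interruption is
    flagged, the received parameters are used and remembered; during an
    interruption the remembered set is repeated.
    """
    held: Optional[Params] = None  # plays the role of delay storage circuit 6
    out: List[Optional[Params]] = []
    for params, interrupted in frames:
        if not interrupted:
            held = params      # switch 7 routes the demodulator output
            out.append(params)
        else:
            out.append(held)   # switch 7 routes the stored set instead
    return out
```

Note that if the flag fails to mark a corrupted frame, `held` is overwritten with bad data and that data is what gets repeated during the next interruption.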
[0006]
As described above, when an error occurs in the speech coding parameters because of an instantaneous interruption of the radio wave, the conventional speech decoding apparatus reproduces speech based on one set of speech coding parameters that was input under normal conditions, so a large degradation in sound quality can be prevented even when code errors occur. However, the detection capability of the instantaneous interruption detection circuit 5 is limited, and false detections can occur whatever detection method is used. Consequently, when errors beyond the detection capability of the instantaneous interruption detection circuit 5 occur, the quality of the reproduced speech may degrade severely.
[0007]
[Problems to be solved by the invention]
As described above, because the detection capability of the instantaneous interruption detection circuit in the conventional speech decoding apparatus is limited, an instantaneous interruption may be misdetected, and a frame containing code errors caused by the interruption may be judged error-free. In that case, the speech coding parameters of a frame whose code errors went undetected are stored and speech is reproduced from them, so the quality of the decoded speech degrades.
An object of the present invention is to provide a speech decoding apparatus capable of preventing deterioration in sound quality of reproduced speech due to erroneous detection of speech coding parameter errors.
[0008]
[Means for Solving the Problems]
In order to solve the above problem, the present invention provides a speech decoding apparatus that reproduces a speech signal by decoding speech coding parameters input for each unit interval, the apparatus comprising: error detection means for detecting errors in the input speech coding parameters; storage means for storing at least one set of the input speech coding parameters over a plurality of unit intervals in which no error is detected by the error detection means; parameter reproduction means for reproducing speech coding parameters based on the speech coding parameters of a plurality of past unit intervals read from the storage means; and
reproduction means that, for a unit interval in which no error is detected by the error detection means, reproduces the speech signal based on the input speech coding parameters and, for a unit interval in which an error is detected by the error detection means, reproduces the speech signal based on the speech coding parameters reproduced by the parameter reproduction means.
[0009]
More specifically, interpolation means that performs an interpolation operation on the speech coding parameters of a plurality of past unit intervals read from the storage means may be provided as the parameter reproduction means, so that, for a unit interval in which an error is detected by the error detection means, the speech signal is reproduced based on the speech coding parameters obtained by the interpolation means.
[0010]
Furthermore, the present invention may be provided with selection means that selects the most reliable speech coding parameters from among the speech coding parameters of a plurality of past unit intervals read from the storage means, so that, for a unit interval in which an error is detected by the error detection means, the reproduction means reproduces the speech signal based on the speech coding parameters selected by the selection means.
[0011]
[Action]
In the speech decoding apparatus according to the present invention, the error detection means detects whether the input speech coding parameters contain code errors. In a unit interval in which no error is detected, the reproduction means reproduces speech based on the input speech coding parameters, and the storage means stores at least one set of the speech coding parameters. In this way, the storage means holds the speech coding parameters of a plurality of unit intervals in which no error was detected, and the interpolation means performs, for example, an interpolation operation on the coding parameters of the plurality of past unit intervals read from the storage means. In a unit interval in which an error is detected, speech is reproduced based on the speech coding parameters obtained by the interpolation means.
[0012]
Here, even when an error has actually occurred in the input speech coding parameters but the error detection means mistakenly reports no error, the probability that such a missed detection occurs in two or more consecutive unit intervals is very small. Therefore, when an interpolation operation is performed on the speech coding parameters of a plurality of unit intervals reported as error-free, even if the parameters of one of those unit intervals are erroneous, the influence of that error is mitigated by the parameters of the other, error-free unit intervals. As a result, the reliability of the speech coding parameters obtained by the interpolation operation improves, and the degradation in sound quality is markedly smaller than in the conventional method, which reproduces speech based on the speech coding parameters of only one unit interval.
[0013]
Likewise, degradation in the quality of the reproduced speech is also reduced if the reliability of the speech coding parameters of the plurality of unit intervals reported as error-free is judged by some means and speech is reproduced using the speech coding parameters judged most reliable.
[0014]
[Examples]
Embodiments of the present invention will be described below with reference to the drawings.
(Example 1)
FIG. 1 is a block diagram of a speech decoding apparatus according to one embodiment of the present invention. The speech decoding apparatus of this embodiment comprises: a coding parameter decoder 101 that decodes the speech coding parameters input to an input terminal 100; an error detection circuit 102 that detects errors in the input speech coding parameters; a switch 103; first and second storage circuits 104 and 105 that store the speech coding parameters of frames (unit intervals) detected as “no error” by the error detection circuit 102; an interpolation circuit 106 that performs an interpolation operation on the speech coding parameters read from the storage circuits 104 and 105; a switch 107 that selects either the output of the coding parameter decoder 101 or the output of the interpolation circuit 106; and a speech synthesizer 108 that synthesizes and reproduces the original speech from the speech coding parameters selected by the switch 107 and outputs it to an output terminal 109. The switch 103, the interpolation circuit 106, and the switch 107 are controlled by the output of the error detection circuit 102.
[0015]
Next, the operation of the speech decoding apparatus according to this embodiment will be described.
A plurality of speech coding parameters transmitted from a transmitting-side speech coding apparatus (not shown) are input to the input terminal 100 frame by frame. Taking as an example the case where the speech coding apparatus uses the CELP (Code Excited Linear Prediction) method, these speech coding parameters are the LPC analysis parameters, the excitation signal of the synthesis filter, and the pitch and gain of the speech signal. These speech coding parameters are error-correction coded before transmission.
[0016]
These speech coding parameters input to the input terminal 100 are decoded by the coding parameter decoder 101 and then output to the switches 103 and 107.
[0017]
The speech coding parameters input to the input terminal 100 are also input to the error detection circuit 102, which detects whether they contain code errors. The switches 103 and 107 are controlled by the output of the error detection circuit 102. That is, when the output of the error detection circuit 102 indicates “no error”, the switch 107 selects the speech coding parameters decoded by the coding parameter decoder 101 and sends them to the speech synthesizer 108. At the same time, the switch 103 is turned on and sends the speech coding parameters decoded by the coding parameter decoder 101 to the first storage circuit 104. The first storage circuit 104 outputs the contents it has been holding to the second storage circuit 105, then stores the new speech coding parameters input via the switch 103 and, at the same time, outputs the contents it had been holding to the interpolation circuit 106. Meanwhile, the second storage circuit 105 clears the contents it has been holding and then stores the contents input from the first storage circuit 104.
[0018]
In this way, the first and second storage circuits 104 and 105 hold the speech coding parameters of the two most recent frames in which the error detection circuit 102 detected no code error, newest first.
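The update of the two storage circuits behaves like a two-stage shift register that is clocked only on frames reported as error-free. The sketch below illustrates this under stated assumptions: the class name `TwoFrameStore` and its fields are hypothetical, and a parameter set is modeled as a plain sequence of floats.

```python
from typing import Optional, Sequence

Params = Sequence[float]


class TwoFrameStore:
    """Holds the two most recent parameter sets reported as error-free."""

    def __init__(self) -> None:
        self.first: Optional[Params] = None   # storage circuit 104 (newest)
        self.second: Optional[Params] = None  # storage circuit 105 (second newest)

    def update(self, params: Params, error_detected: bool) -> None:
        """Shift in a new frame only when no error was detected."""
        if error_detected:
            return  # switch 103 off: stored contents stay untouched
        self.second = self.first  # 104 hands its old contents to 105
        self.first = params       # 104 takes the new frame
```

A frame flagged as erroneous never enters the store, so both slots always come from frames the detector accepted.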
[0019]
Next, when the output of the error detection circuit 102 indicates “error”, the switch 103 is turned off, which blocks the speech coding parameters in which the error was detected from entering the storage circuit 104, and the interpolation circuit 106 performs an interpolation operation on the speech coding parameters based on the contents of the first and second storage circuits 104 and 105. Specifically, when LSP parameters are used as the LPC analysis parameters, which are one of the speech coding parameters, the interpolation circuit 106 interpolates the LSP parameters according to the following equation.
[0020]
$$\omega_i(n) = \alpha\,\omega_i(n-1) + (1-\alpha)\,\omega_i(n-2) \qquad (1)$$
Here, $1 \le i \le P$, where $P$ is the analysis order; $\omega_i(n)$ is the LSP parameter of the current frame; and $\omega_i(n-1)$ and $\omega_i(n-2)$ are the LSP parameters held in the storage circuits 104 and 105, respectively, that is, the most recent and second most recent LSP parameters among the frames detected as “no error”.
[0021]
Here, $\alpha$ is a coefficient satisfying $0 < \alpha < 1$ used for the weighted average of $\omega_i(n-1)$ and $\omega_i(n-2)$; in practice it is set, for example, in the range $0.8 < \alpha < 1$. The weighting coefficient $\alpha$ for $\omega_i(n-1)$ is made larger than the weighting coefficient $(1-\alpha)$ for $\omega_i(n-2)$ because $\omega_i(n-1)$ is closer in time to the speech coding parameter actually required as the interpolation result.
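Equation (1) can be sketched in a few lines of Python. This is an illustration only: the function name `interpolate_lsp` is hypothetical, and the default `alpha=0.9` is an assumed value picked from the 0.8 to 1 range mentioned above.

```python
from typing import List, Sequence


def interpolate_lsp(newest: Sequence[float], second_newest: Sequence[float],
                    alpha: float = 0.9) -> List[float]:
    """Weighted average of two stored LSP vectors, following equation (1).

    newest        -- LSP vector held in storage circuit 104
    second_newest -- LSP vector held in storage circuit 105
    alpha         -- weight of the newest vector (assumed default)
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if len(newest) != len(second_newest):
        raise ValueError("both LSP vectors must have the same order P")
    return [alpha * w1 + (1.0 - alpha) * w2
            for w1, w2 in zip(newest, second_newest)]
```

For example, `interpolate_lsp([0.30, 0.60], [0.20, 0.50])` gives values close to 0.29 and 0.59, each lying between the two stored values and nearer to the newer one.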
[0022]
When the output of the error detection circuit 102 indicates “error”, the speech coding parameters interpolated by the interpolation circuit 106 as described above are selected by the switch 107 and input to the speech synthesizer 108. The speech synthesizer 108 reproduces the speech signal based on the input speech coding parameters and outputs it to the output terminal 109.
[0023]
As described above, in this embodiment, at least one set of the speech coding parameters of a frame in which an error is detected (the LSP parameters in the above example) is replaced by parameters interpolated by weighted averaging of the parameters of two past frames in which no error was detected. As noted earlier, the detection capability of the error detection circuit 102 is limited, so missed detections are possible; however, the probability that missed detections occur in succession is very small. Weighted averaging of the speech coding parameters of two frames, as in this embodiment, therefore greatly improves the reliability of the interpolated parameters, and using them for decoding and reproducing the speech signal prevents degradation in the quality of the reproduced speech in frames where code errors have occurred. In addition, whereas the conventional method in effect interpolates the speech coding parameters from the parameters of a single frame, this embodiment interpolates them from the parameters of two frames, so the interpolation accuracy improves and a further improvement in the quality of the reproduced speech can be expected.
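To see how the weighted average softens a missed detection, suppose, as an illustrative assumption not stated in the text, that exactly one of the two stored values carries an undetected error $e_i$. Substituting into equation (1) gives the deviation of the concealed parameter from its error-free value in each case:

```latex
\begin{aligned}
\text{error in } \omega_i(n-2):\quad
  &\alpha\,\omega_i(n-1) + (1-\alpha)\bigl(\omega_i(n-2) + e_i\bigr)
   - \bigl[\alpha\,\omega_i(n-1) + (1-\alpha)\,\omega_i(n-2)\bigr]
   = (1-\alpha)\,e_i \\
\text{error in } \omega_i(n-1):\quad
  &\alpha\bigl(\omega_i(n-1) + e_i\bigr) + (1-\alpha)\,\omega_i(n-2)
   - \bigl[\alpha\,\omega_i(n-1) + (1-\alpha)\,\omega_i(n-2)\bigr]
   = \alpha\,e_i
\end{aligned}
```

Either way the deviation is a fraction of $e_i$ rather than the full $e_i$ that would reach the synthesizer if the corrupted value were reused unchanged. With $\alpha = 0.9$, for instance, it is $0.1\,e_i$ in the first case and $0.9\,e_i$ in the second.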
[0024]
In the above embodiment, the speech coding parameters of two frames are stored and the parameters of the erroneous frame are obtained from them by second-order interpolation; however, the number of storage circuits may be increased to perform higher-order interpolation. Doing so further reduces the influence of missed error detections and further increases the interpolation accuracy.
[0025]
(Example 2)
FIG. 2 is a block diagram showing the configuration of a speech decoding apparatus according to another embodiment of the present invention. Parts identical to those of the embodiment shown in FIG. 1 are given the same reference numerals, and the description focuses on the differences. This embodiment differs from the embodiment of FIG. 1 in that the speech coding parameters are interpolated by a moving average.
[0026]
That is, in FIG. 2, the speech coding parameters decoded by the coding parameter decoder 101 when the output of the error detection circuit 102 indicates “no error” are stored and held in the first storage circuit 104. The interpolation circuit 202 consists of coefficient multipliers 203 and 204, an adder 205, and a second storage circuit 206; the storage circuit 206 is controlled so that it stores and holds its contents for each frame in which the output of the error detection circuit 102 indicates “no error”. In this embodiment, the interpolated value of the speech coding parameter is calculated by the following equation.
[0027]
$$\Omega_i = \beta\,\Omega_i + (1-\beta)\,\omega_i(n-1) \qquad (2)$$
Here, $\Omega_i$ is the interpolated value of the LSP parameter and is the output of the adder 205; $\omega_i(n-1)$ is the LSP parameter of a frame for which the output of the error detection circuit 102 indicated “no error”; $(1-\beta)$ is the coefficient of the multiplier 203; and $\beta$ is the coefficient of the multiplier 204 (the moving-average coefficient).
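Equation (2) reads as a recursive update if the value on the right-hand side is taken to be the previous average held in the storage circuit 206, which is the interpretation assumed here. The sketch below follows that reading. The class name `RecursiveLspAverager`, the restriction of `beta` to the open interval (0, 1), and the seeding of the average with the first error-free frame are all assumptions that the text does not specify.

```python
from typing import List, Optional, Sequence


class RecursiveLspAverager:
    """Running average of error-free LSP vectors, following equation (2)."""

    def __init__(self, beta: float = 0.5) -> None:
        if not 0.0 < beta < 1.0:
            raise ValueError("beta must satisfy 0 < beta < 1")
        self.beta = beta  # coefficient of multiplier 204
        self.state: Optional[List[float]] = None  # contents of storage circuit 206

    def update(self, lsp: Sequence[float]) -> List[float]:
        """Fold one error-free frame into the average and return the new value."""
        if self.state is None:
            # Assumption: seed the average with the first error-free frame.
            self.state = list(lsp)
        else:
            self.state = [self.beta * avg + (1.0 - self.beta) * w
                          for avg, w in zip(self.state, lsp)]
        return self.state
```

`update` would be called once per frame reported as “no error”; on an error frame the decoder would read `state` without updating it.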
[0028]
According to this embodiment, the speech coding parameters are interpolated by taking a moving average of the parameters of all past error-free frames, so the reliability and the interpolation accuracy of the interpolated parameters improve. Furthermore, because high-order interpolation is performed by the interpolation circuit 202, which consists of the two coefficient multipliers 203 and 204, the adder 205, and the single storage circuit 206, the circuit scale can be made smaller than when high-order interpolation is realized by simple averaging as in the embodiment of FIG. 1.
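The claim that a single storage circuit covers all past error-free frames can be checked by unrolling the recursion of equation (2). The superscript notation is introduced here for illustration only: $\Omega_i^{(m)}$ denotes the average after the $m$-th error-free frame, $\omega_i^{(m)}$ that frame's LSP parameter, and $\Omega_i^{(0)}$ the initial contents of the storage circuit 206.

```latex
\begin{aligned}
\Omega_i^{(m)} &= \beta\,\Omega_i^{(m-1)} + (1-\beta)\,\omega_i^{(m)} \\
               &= \beta^{2}\,\Omega_i^{(m-2)}
                  + (1-\beta)\bigl[\omega_i^{(m)} + \beta\,\omega_i^{(m-1)}\bigr] \\
               &= \beta^{m}\,\Omega_i^{(0)}
                  + (1-\beta)\sum_{k=0}^{m-1} \beta^{k}\,\omega_i^{(m-k)}
\end{aligned}
```

Assuming $0 < \beta < 1$, every past error-free frame contributes with a weight that decays geometrically with its age, so the effective interpolation order grows with the number of frames while the hardware stays at two multipliers, one adder, and one storage circuit.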
[0029]
In each of the above embodiments, the speech coding parameter interpolation calculation is performed in units of frames, but may be performed in units of subframes. In that case, since the correlation of parameters used for interpolation becomes strong, there is an effect of improving the interpolation accuracy.
[0030]
(Example 3)
FIG. 3 is a block diagram showing the configuration of a speech decoding apparatus according to still another embodiment of the present invention. Parts identical to those of the embodiment shown in FIG. 1 are given the same reference numerals, and the description focuses on the differences. This embodiment differs from the embodiment of FIG. 1 in that a reliability determination unit 301 and a switch 302 are provided and, instead of interpolating the speech coding parameters, the speech coding parameters held in the first and second storage circuits 104 and 105 are selectively switched and used for decoding and reproducing the speech signal.
[0031]
The first and second storage circuits 104 and 105 each store speech coding parameters decoded by the coding parameter decoder 101 when the output of the error detection circuit 102 indicated “no error”. The reliability determination unit 301 judges that, of the speech coding parameters held in the storage circuits 104 and 105, the set with better continuity with past speech coding parameters is the more reliable, and controls the switch 302 according to that judgment. When the error detection circuit 102 detects “error”, the speech coding parameters selected by the switch 302 are selected by the switch 107, and the speech synthesizer 108 decodes and reproduces speech based on them.
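The text does not define how continuity is measured. One plausible reading, offered here purely as an assumption, is to compare each stored set against a reference vector of past parameters using a squared distance and pick the closer one. The names `squared_distance`, `select_by_continuity`, and `reference` are hypothetical.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared element-wise differences between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_by_continuity(stored_first: Sequence[float],
                         stored_second: Sequence[float],
                         reference: Sequence[float]) -> Sequence[float]:
    """Pick the stored set closer to the reference (role of unit 301 and switch 302).

    stored_first  -- contents of storage circuit 104
    stored_second -- contents of storage circuit 105
    reference     -- past parameters used as the continuity yardstick (assumed)
    Ties go to stored_first, the more recent of the two.
    """
    d1 = squared_distance(stored_first, reference)
    d2 = squared_distance(stored_second, reference)
    return stored_first if d1 <= d2 else stored_second
```

With this rule, a stored frame that slipped past the error detector and jumps far from the recent trajectory tends to lose to the other stored frame.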
[0032]
When selecting between the speech coding parameters held in the storage circuits 104 and 105, besides the method described above that considers continuity with past speech coding parameters, the filter coefficients included in the speech coding parameters may be examined, and the speech coding parameters whose filter coefficients make the filter more stable may be judged more reliable. Alternatively, the power coefficient included in the speech coding parameters may be examined, and the speech coding parameters whose power coefficient changes more smoothly from the power coefficient included in past speech coding parameters may be judged more reliable.
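Both alternative criteria can be sketched briefly. For the stability criterion, the sketch substitutes a simple yes/no test for the graded "more stable" comparison in the text: it assumes the filter coefficients are LSP frequencies in radians and relies on the known property that the LPC synthesis filter is stable when they are strictly increasing within (0, π). For the power criterion, it assumes "smoother" means a smaller absolute change from a past power value. The function names are hypothetical.

```python
import math
from typing import Sequence


def lsp_is_stable(lsp: Sequence[float]) -> bool:
    """True if the LSP frequencies are strictly increasing inside (0, pi)."""
    bounded = all(0.0 < w < math.pi for w in lsp)
    ordered = all(a < b for a, b in zip(lsp, lsp[1:]))
    return bounded and ordered


def select_by_power(power_first: float, power_second: float,
                    past_power: float) -> int:
    """Return 0 or 1: index of the stored frame whose power changes less
    from past_power. Ties go to index 0, the more recent frame."""
    change_first = abs(power_first - past_power)
    change_second = abs(power_second - past_power)
    return 0 if change_first <= change_second else 1
```

A decoder could, for instance, prefer whichever stored frame passes `lsp_is_stable` and fall back to `select_by_power` when both or neither pass; that combination is a design choice of this sketch, not something the text prescribes.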
[0033]
As described above, according to this embodiment, for a frame in which an error is detected, the most reliable parameters are selected from among the speech coding parameters held in the plurality of storage circuits 104 and 105, which were stored when no error was detected, and are used for speech reproduction. This prevents degradation in the quality of the reproduced speech in frames where code errors have occurred.
[0034]
[Effects of the Invention]
As described above, according to the present invention, when the speech coding parameters contain an error, speech is reproduced and output based on the speech coding parameters of a plurality of past unit intervals judged to be error-free. This prevents the degradation in sound quality caused by missed error detections and allows the speech coding parameters to be interpolated accurately when code errors occur, so the quality of the reproduced speech improves.
[Brief description of the drawings]
FIG. 1 is a block diagram showing the configuration of a speech decoding apparatus according to one embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a speech decoding apparatus according to another embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of a speech decoding apparatus according to still another embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of a conventional speech decoding apparatus.
[Description of Symbols]
100 ... Input terminal
101 ... Coding parameter decoder
102 ... Error detection circuit
103 ... Switch
104 ... First storage circuit
105 ... Second storage circuit
106 ... Interpolation circuit
107 ... Switch
108 ... Speech synthesizer
109 ... Output terminal
202 ... Interpolation circuit
203 ... Coefficient multiplier
204 ... Coefficient multiplier
205 ... Adder
206 ... Storage circuit
301 ... Reliability determination unit
302 ... Switch

Claims

A speech decoding apparatus that reproduces a speech signal by decoding speech coding parameters input for each unit interval, comprising:
error detection means for detecting an error in the input speech coding parameters;
storage means for storing at least one set of the input speech coding parameters over a plurality of unit intervals in which no error is detected by the error detection means;
selection means for selecting the most reliable speech coding parameters from among the speech coding parameters of a plurality of past unit intervals read from the storage means, by judging as more reliable the speech coding parameters that include a power coefficient whose change from the power coefficient included in past speech coding parameters is smoother; and
reproduction means for reproducing the speech signal based on the input speech coding parameters for a unit interval in which no error is detected by the error detection means, and based on the speech coding parameters selected by the selection means for a unit interval in which an error is detected by the error detection means.