JP3567750B2

JP3567750B2 - Compressed audio reproduction method and compressed audio reproduction device

Info

Publication number: JP3567750B2
Application number: JP22560998A
Authority: JP
Inventors: 信一小畑; 藤井　　由紀夫
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1998-08-10
Filing date: 1998-08-10
Publication date: 2004-09-22
Anticipated expiration: 2018-08-10
Also published as: JP2000059231A

Description

【発明の属する技術分野】
ディジタルオーディオ再生装置、圧縮音声デコード処理装置（ＭＰＥＧ−１ａｕｄｉｏ，ＭＰＥＧ−２ａｕｄｉｏ）等、特に、圧縮音声データにエラーが生じたときの、エラーデータ補償方法及び装置に関する。
【従来の技術】
従来の圧縮音声再生方法においては、圧縮音声ストリーム中にエラーが検出された場合の対処法としてのエラー補償処理は、圧縮の基本単位であるフレームについて、前のデータを繰り返す、或いはエラー部分をミュートするという処理を行っている。この一例としては、「１９９２年電子情報通信学会秋季大会、Ｂ−５７１、ＭＰＥＧ／Ａｕｄｉｏ符号化方式における伝送エラー補償法のハードウェアによる評価：北畠他３，ＮＥＣ」に発表がされており、近年でもエラー補償法についてはあまり変化がない。
【発明が解決しようとする課題】
例えば、ＩＳＯ／ＩＥＣ１１１７２−３に示されたＭＰＥＧ−１ａｕｄｉｏｌａｙｅｒ２でＦｓ＝４８ｋＨｚの場合には、圧縮の基本単位である１フレームは１１５２サンプル分で２４ｍｓ、１フレームは更に細かなブロック（３８４サンプル分：８ｍｓ）に分かれており、時間軸データから圧縮データ（周波数軸データ）への変換の最短変換長は１ブロックとなっている。誤り検出ワードはこの１フレームを単位につけられており、エラーが発生した場合は、それが圧縮データであるが故に、エラーの影響は大抵の場合１フレーム全体におよび、１部に抑えこむことはできない。この区間を前フレームの繰り返しや、エラー補償処理（レベル０化）を行ってしまうと、音源データの特性によっては違和感が知覚される。しかし、エラー補償値の算出のために細かな聴覚特性を利用した演算を導入すると、デコード処理ハードウェアの規模やソフト演算処理量の大幅な増大を招いてしまう。
【課題を解決するための手段】
圧縮音声の１フレームの最後のブロックの振幅係数データを記憶しておき、エラーが発生したフレームの次のフレームの最初のブロックの振幅係数と比較する。この両端の振幅係数になだらかにつながるようにエラーフレームの各ブロックに対する振幅係数を設定する。振幅係数などに代表される信号レベルに影響を与えるデータ以外は前フレームの同ブロックを繰り返しとする。
【発明の実施の形態】
以下に本発明の実施形態の一例を示す。
図１は圧縮音声再生処理方法の処理手順の一例である。
この圧縮音声再生処理に入力される圧縮音声ストリームとしては図２の（１）に示したようなものがある。図２の（１）の圧縮音声ストリームは、圧縮音声の圧縮基本単位であるフレーム１０１を単位に形成されており、フレームの時間長さは３２ｍｓでオーディオブロックはフレームの時間長さの１／６である。
フレーム１０１内はフレーム１０１の先頭を示すＳＹＮＣ１０２、ストリーム補助情報であるＢＳＩ１０３、１フレーム時間内を６分割したオーディオブロックＡＢＬＫ（０）１０４、ＡＢＬＫ（１）１０５、ＡＢＬＫ（２）１０６、ＡＢＬＫ（３）１０７、ＡＢＬＫ（４）１０８、ＡＢＬＫ（５）１０９、さらに誤りを検出するための誤り検出コードＥＤＣ１１０とで構成されている。オーディオブロックＡＢＬＫ（ｂｎ）内は更にオーディオブロック情報ＡＢＩ１１２、振幅値データＥＸＰ１１３、正規化サンプル値ＭＡＮＴ１１４とで構成されている（ｂｎはブロック番号）。
この圧縮音声フレーム１０１を再生処理すると再生後データ１１５となり、ＡＢＬＫ（０）１０４に対するデコード部分は第０再生データ区分１１６に、ＡＢＬＫ（１）１０５に対するデコード部分は第１再生データ区分１１７に、ＡＢＬＫ（２）１０６に対するデコード部分は第２再生データ区分１１８に、ＡＢＬＫ（３）１０７に対するデコード部分は第３再生データ区分１１９に、ＡＢＬＫ（４）１０８に対するデコード部分は第４再生データ区分１２０に、ＡＢＬＫ（５）１０９に対するデコード部分は第５再生データ区分１２１になる。
図１の説明に戻ると、ステップ１の再生処理開始から再生処理が始まり、まずステップ２の音声ストリームＡＦＬＭ（ｆｎ＋１）入力で圧縮音声ストリームの（ｆｎ＋１）番目フレームを入力する（ｆｎは整数）。ステップ３のフレームエラー検出及び検出結果保存により、誤り検出コードＥＤＣ１１０などを使い誤りの有無を検出し、その結果を何番目が誤りのあるフレームかがわかる形で保存する。ステップ４のＡＦＬＭ（ｆｎ）エラー有無確認で（ｆｎ）番目のフレームがエラーだったかどうかを確認し、エラーなしならばステップ５へ、エラーありならばステップ１０へ進む。ステップ５のＡＦＬＭ（ｆｎ）の振幅データ再現ではＡＦＬＭ（ｆｎ）内に含まれた情報から振幅データに関連するデータＥＸＰ１１３をもとに振幅値を再現する。
ステップ６のサンプル逆量子化では上記ＥＸＰ１１３をもとに再現された振幅値とＡＦＬＭ（ｆｎ）内に含まれたＭＡＮＴ１１４からサンプル値ＲＥＣＯＮ＿ＳＡＭＰを再現する。ＥＸＰ１１３、ＭＡＮＴ１１４、サンプル値ＲＥＣＯＮ＿ＳＡＭＰの関係の一例としては下記数式３などがあり、（ＢＡＳＥ）の（ＥＸＰ）乗にＭＡＮＴを掛け合わせるという形で表現される。下記数式４は数式３での（ＢＡＳＥ）が２の場合の具体例で、（ＢＡＳＥ）が少数や負の数の例も考えられる。
【数３】
ＲＥＣＯＮ＿ＳＡＭＰ＝（ＢＡＳＥ）＾（ＥＸＰ）＊ＭＡＮＴ
【数４】
ＲＥＣＯＮ＿ＳＡＭＰ＝２＾（ＥＸＰ）＊ＭＡＮＴ
ステップ６に続いてステップ７のフレーム内最終ブロックデータ保存で（ｆｎ）番目のフレーム内の最終ブロックＡＢＬＫ（５）が保存される。ステップ８のデコード継続判断でデコードを続けるかどうかが判断され、続ける場合にはステップ２から処理を続け、終了する場合にはステップ９の再生処理終了で再生を終了する。
ステップ４において（ｆｎ）番目のフレームがエラーだった場合はステップ１０の補償用データ呼び戻しで、既に過去に通ったステップ７によって保存されている（ｆｎ−１）番目フレームの最終ブロックデータＡＢＬＫ（５）を呼び戻す。ステップ１１の振幅データ補償値の算出で、（ｆｎ−１）番目フレームの最終ブロックデータＡＢＬＫ（５）と（ｆｎ＋１）番目フレームの先頭ブロックデータＡＢＬＫ（０）とからエラーフレームである（ｆｎ）番目のフレーム内の各ブロックデータＡＢＬＫ（ｂｎ）を作り出す。この作り出されたブロックデータＡＢＬＫ（ｂｎ）をもとにステップ６以降の処理を続ける。
ステップ１１内を更に詳しく示したのが図２の処理フローで、ステップ１０からの処理の続きはステップ１２のＡＦＬＭ（ｆｎ−１）内のＡＢＬＫ（５）の振幅データ入力で（ｆｎ−１）番目フレームの最終ブロックデータＡＢＬＫ（５）の振幅データＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）を入力する（ｂａｎｄは帯域番号）。続いてステップ１３のＡＦＬＭ（ｆｎ＋１）内のＡＢＬＫ（０）の振幅データ入力で（ｆｎ＋１）番目フレームの先頭ブロックデータＡＢＬＫ（０）の振幅データＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）を入力する。ステップ１４のブロック内振幅値補償データ生成で、エラーフレームであるＡＦＬＭ（ｆｎ）内のＡＢＬＫ（ｂｎ）内の補償値を各帯域ごとに生成する。
ステップ１４に続いてはステップ１５の全ブロック終了判断で、最終ブロックまで処理が終わったかどうかを判断し、まだ終わっていなければステップ１４の処理を次のブロックＡＢＬＫに対して行い、全部終わっていれば、ステップ１６の振幅値補償データ保存で、所定の場所に振幅値補償データを置く。続いてステップ６へと処理を進め以下、上記の説明と同様である。
【数５】

【数６】

ステップ１４内での補償値の生成の仕方に関しては、更に具体的に数式５にその一例が示してある。これは振幅データＥＸＰ（ｆｎ，ｂｎ，ｂａｎｄ）を振幅データＥＸＰ（ｆｎ−１，５，ｂａｎｄ）と振幅データＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）から作り出す式で、ＡＦＬＭ（ｆｎ）内のＡＢＬＫ（ｂｎ）内の振幅データＥＸＰ各帯域ごとにそれぞれなだらかに隣のフレームの振幅データＥＸＰへとつながるように加重平均が取られている。
ここで、例えば具体的に、ブロック２については、数式５は数式６と等価になる。
この効果を図で示したのが図５であり、ブロック内のある決まった帯域に対する振幅データＥＸＰに着目して表現してある。従って、例えば圧縮処理時に帯域分割処理で３２個の帯域に分けられて、その１帯域に対して１つの振幅データＥＸＰが決められているのならば、図５のような処理が３２の各帯域ごとに行われることとなる。
図５の（１）に補償値生成前、図５の（２）に補償値生成後を示している。図５の（１）において、（ｆｎ−１）番目フレームのＡＢＬＫ（ｂｎ）内のある決まった帯域ｂａｎｄの振幅データはＥＸＰ（ｆｎ−１，０，ｂａｎｄ）１２６、ＥＸＰ（ｆｎ−１，１，ｂａｎｄ）１２７、ＥＸＰ（ｆｎ−１，２，ｂａｎｄ）１２８、ＥＸＰ（ｆｎ−３，０，ｂａｎｄ）１２９、ＥＸＰ（ｆｎ−１，４、ｂａｎｄ）１３０、ＥＸＰ（ｆｎ−１，５，ｂａｎｄ）１３１で、（ｆｎ＋１）番目フレームのＡＢＬＫ（ｂｎ）内のある決まった帯域ｂａｎｄの振幅データはＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）１３２、ＥＸＰ（ｆｎ＋１，１，ｂａｎｄ）１３３、ＥＸＰ（ｆｎ＋１，２，ｂａｎｄ）１３４、ＥＸＰ（ｆｎ＋１，３，ｂａｎｄ）１３５、ＥＸＰ（ｆｎ＋１，４，ｂａｎｄ）１３６、ＥＸＰ（ｆｎ＋１，５，ｂａｎｄ）１３７である。
エラーフレームである（ｆｎ）番目フレームの振幅データについては数式３の処理を行った結果、図５（２）に示したように（ｆｎ）番目フレームのＡＢＬＫ（ｂｎ）内のある決まった帯域ｂａｎｄの振幅データＥＸＰ（ｆｎ，０，ｂａｎｄ）１３８、ＥＸＰ（ｆｎ，１，ｂａｎｄ）１３９、ＥＸＰ（ｆｎ，２，ｂａｎｄ）１４０、ＥＸＰ（ｆｎ，０，ｂａｎｄ）１４１、ＥＸＰ（ｆｎ，４，ｂａｎｄ）１４２、ＥＸＰ（ｆｎ，５，ｂａｎｄ）１４３は、なだらかにＥＸＰ（ｆｎ−１，５，ｂａｎｄ）１３１やＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）１３２につながるように生成される。
このようにして、エラーフレームに対するエラー補償処理が行われる。これにより、エラー区間と有効フレーム区間とのつながりがよりスムーズになる。また扱うデータはエラーの生じているフレームの長さより、かなり短い区間に対するデータで且つ、更にサンプルを再現する前の部分データで済むため、処理規模の大幅な増大とはならない。またサンプルを再現する前の部分データであるため直行逆変換をする前の周波数領域でのエラー補償となり、時間域での滑らかなつながりだけでなく、周波数域でのスペクトル分布の滑らかなつながりが実現できる。
ここで従来型のフレーム繰り返しの例である図６、従来型のミュートの例である図７と図５（２）を比較すれば効果は明らかで、図６の例ではスペクトル変動の状況によってはＥＸＰ１３１からＥＸＰ（ｆｎ，０，ｂａｎｄ）１４６への分布の急変が違和感を生み出しかねない状態であり、図７の例では明らかにスペクトル分布の急変が生じている。
もちろん時間域では窓がけ及びオーバーラップ処理によって滑らかにつなぐことはできるが、そこから更に聴感を向上させるには、波形のみならずスペクトル分布も滑らかにつなぐことが有効である。
次に、エラーが２フレーム以上連続した場合の、実施例の効果を図８，図９に示す。
まず、図８は過去のＡＦＬＭ（ｆｎ−１）が有効データフレームであり、ＡＦＬＭ（ｆｎ）とＡＦＬＭ（ｆｎ＋１）がエラーフレームであるときの例である。この場合にも、本発明はその威力を発揮する。つまり、エラーフレームであるＡＦＬＭ（ｆｎ＋１）内の振幅データＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）１５０を暫定的に０と解釈すれば、図８の例のように滑らかにミュートさせることができる。また、この減衰のさせかたが緩やかすぎると判断されるならば、数式５の代わりに数式７のような処理で補償データを作ることも可能である。数式５ではブロックがすすむごとに前フレームからの影響部分が１／２のべき乗で減っていくものとなる。振幅データＥＸＰ（ｆｎ，３，ｂａｎｄ）の場合の具体例が数式８である。この場合、減衰の仕方が線形ではなく、対数的になるので聴感特性との親和性が高くなる。
【数７】

【数８】

また図９では、過去のＡＦＬＭ（ｆｎ−１）とＡＦＬＭ（ｆｎ）がエラーフレームであり、ＡＦＬＭ（ｆｎ＋１）が有効データフレームエラーフレームであるときの例である。これも図８の例と同様に、エラーフレームであるＡＦＬＭ（ｆｎ−１）内の振幅データＥＸＰ（ｆｎ＋１，５，ｂａｎｄ）１５１を０と解釈すれば、図８の例のように滑らかなフェードインをさせることができる。
図１０は図１の処理方法を実現する圧縮音声再生装置の一例で、音声ストリーム２００を入力し、ストリーム中の同期ワードＳＹＮＣ１０２を検出して音声フレーム１０１の同期処理を行い、音声フレーム２０２内の誤り検出ワードＥＤＣ１１０を利用して誤り検出をし、誤りがあるかないかを示すフレームエラーフラグ２０５を出力するフレーム同期検出及び回路誤り検出２０１と、同期処理誤り検出後の音声フレームデータ２０２からデコード処理のために必要な情報を抜き出して保持するフレーム内情報抜き出し回路２０３と、フレーム内情報抜き出し回路２０３から振幅値コード２０８を入力してデータ振幅値をデコードする振幅値デコード回路２０９と、振幅値デコード回路２０９から出力されるデコードされた振幅値２１０を入力し、上記フレーム内情報抜き出し回路２０３から正規化サンプル値２０４を入力して、逆量子化処理を行う逆量子化回路２１４と、逆量子化回路２１４から出力されるサンプル再構成値２１６を各帯域について入力し、帯域合成して、最終的な時間領域信号２１８に変換して出力する帯域合成回路２１７とで構成されており、
更に本発明の特徴部分として、フレーム同期および誤り検出回路２０１からのフレームエラーフラグ２０５を入力して保持する誤り履歴保持回路２０６と、振幅値デコード回路２０９でデコードされた振幅値のうち、圧縮音声フレーム内での最終オーディオデータブロックに対応する振幅値２２４を入力して保持する過去フレーム最終ブロック保持回路２２０と、（ｆｎ）番目圧縮音声フレームに対する逆量子化を行うときに（ｆｎは整数）、（ｆｎ − １）番目圧縮音声フレームの最終オーディオデータブロックに対応する振幅値２１１を過去フレーム最終ブロック保持回路２２０から入力し、また上記振幅値デコード手段から（ｆｎ＋１）番目圧縮音声フレームの先頭オーディオデータブロックに対応する振幅値２２６を入力して、振幅値補償データ２１３を生成して出力する補償値生成回路２１１と、誤り履歴保持回路２０６から（ｆｎ）番目圧縮音声フレームに対するフレームエラーフラグ２１９を入力して、
フレームエラーフラグ２１９がエラーでないことを示す場合は、上記振幅値デコード回路２０９が出力する（ｆｎ）番目圧縮音声フレームの各オーディオデータブロックに対応する振幅値２１０を、上記フレームエラーフラグがエラーであることを示す場合は補償値生成回路２１１が出力する補償値としての各オーディオデータブロックに対応する振幅値２１３を選び、逆量子化回路２１４の入力２０７として出力する。
この装置における振幅値補償データの生成に関連したデータ変化タイミングの一例が図１１に示してある。同期検出、誤り検出、情報抜き出し用フレームデータ１５２と、保持最終ブロックデータ１５５と、サンプル逆量子化、帯域合成用フレーム１５６とが示してあり、エラーフレームＡＦＬＭ（ｆｎ）内の振幅値ＥＸＰをＡＦＬＭ（ｆｎ−１）内の振幅データＥＸＰ（ｆｎ−１，５，ｂａｎｄ）１５３とＡＦＬＭ（ｆｎ＋１）内の振幅データＥＸＰ（ｆｎ＋１，０，ｂａｎｄ）１５４から生成する場合のタイミングの一例である。
このタイミングにより補償値が算出でき、ＡＦＬＭ（ｆｎ−１）内の振幅データＥＸＰ（ｆｎ−１，５，ｂａｎｄ）１５３とＡＦＬＭ（ｆｎ）内の補償後振幅データＥＸＰ（ｆｎ，０，ｂａｎｄ）１５８、またＡＦＬＭ（ｆｎ）内の補償後振幅データＥＸＰ（ｆｎ，５，ｂａｎｄ）１５９とＡＦＬＭ（ｆｎ−１）内の振幅データＥＸＰ（ｆｎ−１，０，ｂａｎｄ）１６０が滑らかにつながる。
この構成により、図１の例で示した補償方法を圧縮音声再生装置において実現する事ができる。
このようにして、エラーフレームに対するエラー補償処理が行われ、時間域での滑らかなつながりだけでなく、周波数域でのスペクトル分布の滑らかなつながりが実現できるため、聴感上の劣化を抑制できる。
【発明の効果】
エラーの検出された圧縮音声フレームのエラー補償処理が、より細やかに行われ、フレーム全体が訂正不能な状態なのに対して、フレーム内のブロックごとのデータ補償を行うことができる。これにより、１フレームのみ（２フレーム以上連続しない）のエラーの場合には、前フレーム完全繰り返しよりもエラー補償フレームからの復帰部分が周波数軸上で滑らかにつながる。また、１フレーム完全ミュートに比べて、エラー部分の音圧レベルの抑えすぎにならないので、データ補償処理発生時の聴感の劣化の抑制が期待できる。また、この処理のために保持するべきデータは１ブロック分の振幅係数で済むため、ハードウェアの増加も少ない。
【図面の簡単な説明】
【図１】本発明を適用した圧縮音声エラー補償処理方法の一実施例を示すフローチャート。
【図２】本発明のエラー補償処理詳細部分の一例を示すフローチャート。
【図３】本発明を適用した圧縮音声エラー補償処理方法に対する入力としての圧縮音声ストリームの構造とデコード結果音声の一例を示す図。
【図４】本発明を適用した圧縮音声エラー補償処理方法がエラー補償処理を行う状況の圧縮音声フレームとデコード結果音声の一例を示す図。
【図５】本発明を適用した圧縮音声エラー補償処理方法で生成される補償値の、隣接フレーム内のデータとの関連を示した一例を示す図。
【図６】従来の補償処理であるフレーム全体繰り返しの一例を示す図。
【図７】従来の補償処理であるフレーム全体ミュートの一例を示す図。
【図８】本発明を適用した圧縮音声エラー補償処理方法で生成される補償値の別の一例で、有効フレームからエラーフレームが２つ以上連続する場合の例を示す図。
【図９】本発明を適用した圧縮音声エラー補償処理方法で生成される補償値の別の一例で、エラーフレームが２つ以上連続した後に有効フレームが来た場合の例を示す図。
【図１０】本発明を適用した圧縮音声エラー補償処理装置の一実施例を示すフローチャート。
【図１１】本発明を適用した圧縮音声エラー補償処理装置における補償値生成のためのタイミング。
【符号の説明】
２…音声フレーム入力、３…フレームエラー検出，結果保存、４…逆量子化用フレームエラー有無確認、５…振幅値データ再現、６…サンプル逆量子化、７…最終ブロックデータ保存、１０…（ｆｎ − １）番フレーム最終ブロック呼び戻し、１１…振幅データ補償値の算出。
[Technical field of the invention]
The present invention relates to a digital audio reproducing device, a compressed audio decoding device (MPEG-1 audio, MPEG-2 audio), and the like, and particularly to an error data compensation method and device when an error occurs in compressed audio data.
[Prior art]
In conventional compressed audio reproduction methods, when an error is detected in the compressed audio stream, error compensation is performed frame by frame (the frame being the basic unit of compression), either by repeating the previous data or by muting the erroneous part. An example is reported in "1992 Autumn Meeting of the Institute of Electronics, Information and Communication Engineers, B-571, Hardware Evaluation of a Transmission Error Compensation Method for the MPEG/Audio Coding System: Kitabatake and three others, NEC", and error compensation methods have changed little in recent years.
[Problems to be solved by the invention]
For example, in MPEG-1 audio layer 2 as specified in ISO/IEC 11172-3 with Fs = 48 kHz, one frame, the basic unit of compression, covers 1152 samples (24 ms). Each frame is further divided into finer blocks (384 samples, 8 ms each), and one block is the shortest transform length for converting time-axis data into compressed (frequency-axis) data. The error detection word is attached per frame, and because the data is compressed, an error usually affects the whole frame and cannot be confined to a part of it. If this interval is filled by repeating the previous frame or by muting (setting the level to 0), the result can sound unnatural, depending on the characteristics of the source material. On the other hand, computing the error compensation value with calculations based on detailed auditory characteristics would greatly increase the scale of the decoding hardware or the amount of software computation.
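The frame and block durations quoted above follow directly from the sample counts and the sampling frequency. The short Python sketch below only reproduces that arithmetic; the function name `duration_ms` is illustrative and does not appear in the patent.

```python
def duration_ms(samples: int, sample_rate_hz: int) -> float:
    """Playback time of `samples` PCM samples, in milliseconds."""
    return 1000.0 * samples / sample_rate_hz


FS = 48_000                          # sampling frequency from the text
frame_ms = duration_ms(1152, FS)     # one layer-2 frame: 24.0 ms
block_ms = duration_ms(384, FS)      # one block: 8.0 ms
blocks_per_frame = 1152 // 384       # 3 blocks per frame
```

A single corrupted frame therefore affects 24 ms of audio, which is the interval the compensation has to bridge in this example.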
[Means for Solving the Problems]
The amplitude coefficient data of the last block of each frame of the compressed audio is stored and, when an error occurs, compared with the amplitude coefficients of the first block of the frame following the error frame. The amplitude coefficient for each block of the error frame is then set so that it connects smoothly to the amplitude coefficients at both ends. All data other than that affecting the signal level (of which the amplitude coefficient is representative) is repeated from the same block of the previous frame.
[Embodiments of the invention]
Hereinafter, an example of an embodiment of the present invention will be described.
FIG. 1 is an example of a processing procedure of a compressed audio reproduction processing method.
An example of the compressed audio stream input to this reproduction process is shown in FIG. 3(1). The compressed audio stream of FIG. 3(1) is formed in units of frames 101, the basic compression unit of the compressed audio; the frame duration is 32 ms, and each audio block lasts 1/6 of the frame duration.
The frame 101 consists of SYNC 102, which marks the head of the frame 101; BSI 103, which is stream auxiliary information; six audio blocks ABLK(0) 104, ABLK(1) 105, ABLK(2) 106, ABLK(3) 107, ABLK(4) 108, and ABLK(5) 109, obtained by dividing one frame time into six; and an error detection code EDC 110 for detecting errors. Each audio block ABLK(bn) in turn consists of audio block information ABI 112, amplitude value data EXP 113, and normalized sample values MANT 114 (bn is the block number).
When the compressed audio frame 101 is reproduced, it becomes the post-reproduction data 115. The decoded part for ABLK(0) 104 becomes the 0th reproduced data section 116, that for ABLK(1) 105 the first reproduced data section 117, that for ABLK(2) 106 the second reproduced data section 118, that for ABLK(3) 107 the third reproduced data section 119, that for ABLK(4) 108 the fourth reproduced data section 120, and that for ABLK(5) 109 the fifth reproduced data section 121.
Returning to FIG. 1, reproduction begins at step 1 (start of reproduction processing). In step 2 (input of audio stream AFLM(fn+1)), the (fn+1)-th frame of the compressed audio stream is input (fn is an integer). In step 3 (frame error detection and storage of the result), the presence or absence of an error is detected using the error detection code EDC 110 or the like, and the result is stored in a form that identifies which frame is erroneous. In step 4 (AFLM(fn) error check), it is checked whether the (fn)-th frame had an error; if not, the process proceeds to step 5, and if so, to step 10. In step 5 (amplitude data reproduction for AFLM(fn)), the amplitude value is reproduced from the amplitude-related data EXP 113 contained in AFLM(fn).
In step 6 (sample inverse quantization), the sample value RECON_SAMP is reproduced from the amplitude value reproduced from EXP 113 and from MANT 114 contained in AFLM(fn). One example of the relationship between EXP 113, MANT 114, and RECON_SAMP is Equation 3 below, in which BASE raised to the power EXP is multiplied by MANT. Equation 4 below is the specific case of Equation 3 in which BASE is 2; a fractional or negative BASE is also conceivable.
(Equation 3)
RECON_SAMP = (BASE)^(EXP) * MANT
(Equation 4)
RECON_SAMP = 2^(EXP) * MANT
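As a minimal illustration of Equations 3 and 4, the Python sketch below reconstructs sample values from an amplitude exponent and a normalized sample value. The function names and the use of floating-point numbers are assumptions made for illustration; the patent does not prescribe a number format.

```python
from typing import List, Sequence


def reconstruct_sample(exp: int, mant: float, base: float = 2.0) -> float:
    """Equation 3: RECON_SAMP = BASE^EXP * MANT (Equation 4 when base is 2)."""
    return (base ** exp) * mant


def reconstruct_block(exps: Sequence[int], mants: Sequence[float],
                      base: float = 2.0) -> List[float]:
    """Apply Equation 3 to paired (EXP, MANT) values, for example one per band."""
    return [reconstruct_sample(e, m, base) for e, m in zip(exps, mants)]


samples = reconstruct_block([0, 1, 2], [0.5, -0.25, 1.0])  # [0.5, -0.5, 4.0]
```

Because EXP alone sets the scale of the output, replacing only EXP in an error frame changes the signal level without touching the normalized sample values.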
Following step 6, in step 7 (storage of the last block data in the frame), the last block ABLK(5) of the (fn)-th frame is stored. In step 8 (decode continuation decision), it is decided whether to continue decoding; if so, processing continues from step 2, and if not, reproduction ends at step 9 (end of reproduction processing).
If the (fn)-th frame had an error in step 4, then in step 10 (recall of compensation data) the last block data ABLK(5) of the (fn-1)-th frame, already stored by an earlier pass through step 7, is recalled. In step 11 (calculation of the amplitude data compensation value), each block data ABLK(bn) of the (fn)-th frame, the error frame, is created from the last block data ABLK(5) of the (fn-1)-th frame and the first block data ABLK(0) of the (fn+1)-th frame. Processing then continues from step 6 using the created block data ABLK(bn).
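The control flow of steps 2 to 11 amounts to decoding with a one-frame lookahead: frame fn is processed only after frame fn+1 has been read and checked. The Python sketch below is one possible reading of that flow, not the patented implementation. The `Frame` type, the `amplitude_stream` generator, and the `compensate` callback are hypothetical names, and passing `None` for a missing or erroneous neighbour is an assumption made here so that the callback can decide how to treat it.

```python
from dataclasses import dataclass
from itertools import chain
from typing import Callable, Iterable, Iterator, List, Optional

Block = List[float]  # one amplitude value per band


@dataclass
class Frame:
    ok: bool            # outcome of the error check (step 3)
    exps: List[Block]   # amplitude data per block; unused when ok is False


def amplitude_stream(
    frames: Iterable[Frame],
    compensate: Callable[[Optional[Block], Optional[Block]], List[Block]],
) -> Iterator[List[Block]]:
    """Yield, frame by frame, the amplitude data handed to step 6."""
    prev_last: Optional[Block] = None   # saved in step 7, recalled in step 10
    current: Optional[Frame] = None     # frame fn, waiting for frame fn+1
    for nxt in chain(frames, [None]):   # trailing None flushes the final frame
        if current is not None:
            if current.ok:                       # step 4 -> step 5
                exps = current.exps
            else:                                # step 4 -> steps 10 and 11
                next_first = nxt.exps[0] if nxt is not None and nxt.ok else None
                exps = compensate(prev_last, next_first)
            yield exps
            # Assumption: only error-free frames contribute a saved last block.
            prev_last = current.exps[-1] if current.ok else None
        current = nxt
```

The lookahead costs one frame of latency, which is what allows the first block of frame fn+1 to be available when frame fn turns out to be erroneous.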
FIG. 2 shows the processing inside step 11 in more detail. Continuing from step 10, in step 12 (input of the amplitude data of ABLK(5) in AFLM(fn-1)), the amplitude data EXP(fn-1, 5, band) of the last block data ABLK(5) of the (fn-1)-th frame is input (band is the band number). Then, in step 13 (input of the amplitude data of ABLK(0) in AFLM(fn+1)), the amplitude data EXP(fn+1, 0, band) of the first block data ABLK(0) of the (fn+1)-th frame is input. In step 14 (generation of in-block amplitude value compensation data), the compensation value for ABLK(bn) in the error frame AFLM(fn) is generated for each band.
Following step 14, step 15 (all-blocks-done decision) checks whether processing has reached the last block. If not, step 14 is repeated for the next block ABLK; if all blocks are done, step 16 (storage of amplitude value compensation data) places the amplitude value compensation data at a predetermined location. The process then proceeds to step 6 and continues as described above.
(Equation 5)

(Equation 6)

Equation 5 gives a more specific example of how the compensation value is generated in step 14. It produces the amplitude data EXP(fn, bn, band) from the amplitude data EXP(fn-1, 5, band) and EXP(fn+1, 0, band): for each band of the amplitude data EXP in ABLK(bn) of AFLM(fn), a weighted average is taken so that the value connects smoothly to the amplitude data EXP of the adjacent frames.
For example, for block 2, Equation 5 is equivalent to Equation 6.
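The bodies of Equations 5 and 6 are not reproduced in this text, so the exact weights are unknown. The Python sketch below shows one weighted average that satisfies the stated property (a smooth transition from the last block of frame fn-1 to the first block of frame fn+1). The linear weights (6 - bn)/7 and (bn + 1)/7 are an assumption chosen for illustration, and `compensate_exps` is a hypothetical name.

```python
from typing import List, Sequence

BLOCKS_PER_FRAME = 6  # audio blocks per frame in the embodiment


def compensate_exps(prev_last: Sequence[float], next_first: Sequence[float],
                    blocks: int = BLOCKS_PER_FRAME) -> List[List[float]]:
    """Return per-block, per-band amplitude data for an error frame.

    prev_last[band]  stands for EXP(fn-1, 5, band)
    next_first[band] stands for EXP(fn+1, 0, band)
    The weights below are assumed, not taken from Equation 5.
    """
    steps = blocks + 1                    # intervals between the two anchors
    frame = []
    for bn in range(blocks):
        w_next = (bn + 1) / steps         # grows as the block nears frame fn+1
        w_prev = 1.0 - w_next             # shrinks correspondingly
        frame.append([w_prev * p + w_next * n
                      for p, n in zip(prev_last, next_first)])
    return frame


# Two bands: one falling from 14 to 0, one rising from 0 to 7.
exps = compensate_exps([14.0, 0.0], [0.0, 7.0])
```

Under these assumed weights, block 2 would use 4/7 of the earlier value and 3/7 of the later one; whether that matches Equation 6 cannot be confirmed from the text.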
FIG. 5 illustrates this effect, focusing on the amplitude data EXP for one particular band in the block. Therefore, if, for example, band division during compression splits the signal into 32 bands and one amplitude data EXP is determined per band, the processing shown in FIG. 5 is performed for each of the 32 bands.
FIG. 5(1) shows the state before the compensation values are generated, and FIG. 5(2) the state after. In FIG. 5(1), the amplitude data of a particular band in ABLK(bn) of the (fn-1)-th frame are EXP(fn-1, 0, band) 126, EXP(fn-1, 1, band) 127, EXP(fn-1, 2, band) 128, EXP(fn-1, 3, band) 129, EXP(fn-1, 4, band) 130, and EXP(fn-1, 5, band) 131, and the amplitude data of the same band in ABLK(bn) of the (fn+1)-th frame are EXP(fn+1, 0, band) 132, EXP(fn+1, 1, band) 133, EXP(fn+1, 2, band) 134, EXP(fn+1, 3, band) 135, EXP(fn+1, 4, band) 136, and EXP(fn+1, 5, band) 137.
For the amplitude data of the (fn)-th frame, the error frame, applying Equation 5 yields, as shown in FIG. 5(2), the amplitude data EXP(fn, 0, band) 138, EXP(fn, 1, band) 139, EXP(fn, 2, band) 140, EXP(fn, 3, band) 141, EXP(fn, 4, band) 142, and EXP(fn, 5, band) 143 of a particular band in ABLK(bn) of the (fn)-th frame, generated so that they connect smoothly to EXP(fn-1, 5, band) 131 and EXP(fn+1, 0, band) 132.
In this way, error compensation is performed for the error frame, and the transition between the error interval and the valid frame intervals becomes smoother. The data handled covers an interval considerably shorter than the erroneous frame and is only partial data from before sample reconstruction, so the processing scale does not increase significantly. Moreover, because this is partial data from before sample reconstruction, the compensation takes place in the frequency domain, before the inverse orthogonal transform, so that not only a smooth connection in the time domain but also a smooth connection of the spectral distribution in the frequency domain can be achieved.
The effect is clear when FIG. 5(2) is compared with FIG. 6, an example of conventional frame repetition, and FIG. 7, an example of conventional muting. In the example of FIG. 6, depending on how the spectrum is changing, the abrupt change in distribution from EXP 131 to EXP(fn, 0, band) 146 can produce an unnatural impression, and in the example of FIG. 7 an abrupt change in the spectral distribution clearly occurs.
Of course, in the time domain, smooth connection can be achieved by windowing and overlap processing, but in order to further improve the audibility, it is effective to smoothly connect not only the waveform but also the spectrum distribution.
Next, FIGS. 8 and 9 show the effects of the embodiment when the error continues for two or more frames.
First, FIG. 8 shows an example in which the past frame AFLM(fn-1) is a valid data frame and AFLM(fn) and AFLM(fn+1) are error frames. The present invention is effective in this case as well. That is, if the amplitude data EXP(fn+1, 0, band) 150 in the error frame AFLM(fn+1) is provisionally interpreted as 0, the output can be muted smoothly, as in the example of FIG. 8. If this attenuation is judged to be too gradual, the compensation data can instead be created by processing such as Equation 7 in place of Equation 5. In Equation 7, the contribution from the previous frame decreases by a power of 1/2 with each successive block. Equation 8 is the specific case for the amplitude data EXP(fn, 3, band). In this case the attenuation is logarithmic rather than linear, so it matches auditory characteristics better.
(Equation 7)

(Equation 8)

FIG. 9 shows an example in which the past frames AFLM(fn-1) and AFLM(fn) are error frames and AFLM(fn+1) is a valid data frame. As in the example of FIG. 8, if the amplitude data EXP(fn-1, 5, band) 151 in the error frame AFLM(fn-1) is interpreted as 0, a smooth fade-in can be achieved.
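The handling of consecutive error frames can be sketched for a single band as follows. Treating an erroneous neighbour as 0 comes from the text; the linear ramp weights and the halving form used for the faster fade are assumptions, since the bodies of Equations 5 and 7 are not reproduced here. `linear_ramp` and `halving_decay` are hypothetical names.

```python
from typing import List


def linear_ramp(start: float, end: float, blocks: int = 6) -> List[float]:
    """Assumed linear transition between the two neighbouring amplitude values."""
    steps = blocks + 1
    return [start + (end - start) * (bn + 1) / steps for bn in range(blocks)]


def halving_decay(start: float, blocks: int = 6) -> List[float]:
    """Assumed reading of the faster fade: the previous frame's contribution
    is halved with every successive block."""
    return [start * 0.5 ** (bn + 1) for bn in range(blocks)]


# FIG. 8 case: frame fn+1 is also erroneous, so its first block counts as 0.
fade_out = linear_ramp(start=7.0, end=0.0)   # [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]

# FIG. 9 case: frame fn-1 was erroneous, so its last block counts as 0.
fade_in = linear_ramp(start=0.0, end=7.0)    # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Faster fade-out when the linear ramp is judged too gradual.
fast_fade = halving_decay(8.0)               # [4.0, 2.0, 1.0, 0.5, 0.25, 0.125]
```

In both cases the same routine serves as for a single error frame; only the value substituted for the unavailable neighbour changes.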
FIG. 10 is an example of a compressed audio reproducing apparatus that implements the processing method of FIG. 1. It comprises: a frame synchronization detection and error detection circuit 201, which receives the audio stream 200, detects the synchronization word SYNC 102 in the stream to synchronize to the audio frame 101, performs error detection using the error detection word EDC 110 in the audio frame 202, and outputs a frame error flag 205 indicating whether an error is present; an intra-frame information extraction circuit 203, which extracts and holds the information needed for decoding from the audio frame data 202 after synchronization and error detection; an amplitude value decoding circuit 209, which receives the amplitude value code 208 from the intra-frame information extraction circuit 203 and decodes the data amplitude values; an inverse quantization circuit 214, which receives the decoded amplitude values 210 output from the amplitude value decoding circuit 209 and the normalized sample values 204 from the intra-frame information extraction circuit 203 and performs inverse quantization; and a band synthesis circuit 217, which receives the sample reconstruction values 216 output from the inverse quantization circuit 214 for each band, synthesizes the bands, and converts the result into the final time-domain signal 218 for output.
In addition, as the characteristic part of the present invention, the apparatus includes: an error history holding circuit 206, which receives and holds the frame error flag 205 from the frame synchronization and error detection circuit 201; a past-frame last-block holding circuit 220, which receives and holds, from among the amplitude values decoded by the amplitude value decoding circuit 209, the amplitude value 224 corresponding to the last audio data block in the compressed audio frame; and a compensation value generation circuit 211, which, when inverse quantization is performed on the (fn)-th compressed audio frame (fn is an integer), receives the amplitude value 211 corresponding to the last audio data block of the (fn-1)-th compressed audio frame from the past-frame last-block holding circuit 220 and the amplitude value 226 corresponding to the first audio data block of the (fn+1)-th compressed audio frame from the amplitude value decoding means, and generates and outputs the amplitude value compensation data 213.
The frame error flag 219 for the (fn)-th compressed audio frame is input from the error history holding circuit 206. If the frame error flag 219 indicates no error, the amplitude values 210 output by the amplitude value decoding circuit 209 for each audio data block of the (fn)-th compressed audio frame are selected; if it indicates an error, the amplitude values 213 output by the compensation value generation circuit 211 as compensation values for each audio data block are selected. The selected values are output as the input 207 of the inverse quantization circuit 214.
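The selection between decoded and compensated amplitude values can be modelled in software as a simple two-way switch driven by the stored error flag. The Python sketch below is a loose software analogue of that part of FIG. 10, not a description of the circuit itself. The class `ErrorHistory`, its `depth` and `age` parameters, and the function `select_amplitudes` are hypothetical.

```python
from collections import deque


class ErrorHistory:
    """Keeps the most recent frame error flags (a stand-in for circuit 206)."""

    def __init__(self, depth: int = 3) -> None:
        self._flags = deque(maxlen=depth)   # oldest flags drop off automatically

    def push(self, flag: bool) -> None:
        """Record the flag of the frame that was just checked (frame fn+1)."""
        self._flags.append(bool(flag))

    def flag_for(self, age: int) -> bool:
        """age 0 is the newest frame (fn+1), age 1 is frame fn, and so on."""
        return self._flags[-1 - age]


def select_amplitudes(frame_error: bool, decoded, compensated):
    """Choose the amplitude data passed on to inverse quantization."""
    return compensated if frame_error else decoded


history = ErrorHistory()
for flag in (False, True, False):           # frames fn-1, fn, fn+1
    history.push(flag)
chosen = select_amplitudes(history.flag_for(1), "decoded", "compensated")
```

Here `chosen` is the compensated data, because the flag stored for frame fn indicates an error.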
FIG. 11 shows an example of the data change timing related to the generation of amplitude value compensation data in this apparatus. It shows the frame data 152 used for synchronization detection, error detection, and information extraction; the held last-block data 155; and the frame 156 used for sample inverse quantization and band synthesis. It is an example of the timing when the amplitude value EXP in the error frame AFLM(fn) is generated from the amplitude data EXP(fn-1, 5, band) 153 in AFLM(fn-1) and the amplitude data EXP(fn+1, 0, band) 154 in AFLM(fn+1).
With this timing the compensation values can be calculated, so that the amplitude data EXP(fn-1, 5, band) 153 in AFLM(fn-1) connects smoothly to the compensated amplitude data EXP(fn, 0, band) 158 in AFLM(fn), and the compensated amplitude data EXP(fn, 5, band) 159 in AFLM(fn) connects smoothly to the amplitude data EXP(fn+1, 0, band) 160 in AFLM(fn+1).
With this configuration, the compensation method shown in the example of FIG. 1 can be realized in the compressed audio reproduction device.
In this way, the error compensation processing for the error frame is performed, and not only the smooth connection in the time domain but also the smooth connection of the spectrum distribution in the frequency domain can be realized, so that the deterioration in audibility can be suppressed.
[Effects of the invention]
Error compensation for a compressed audio frame in which an error has been detected is performed at a finer granularity: although the frame as a whole is uncorrectable, data compensation can be carried out block by block within the frame. As a result, for an error confined to a single frame (not continuing over two or more frames), the return from the error-compensated frame connects more smoothly on the frequency axis than with complete repetition of the previous frame. In addition, compared with muting the whole frame, the sound pressure level of the error portion is not suppressed excessively, so degradation of perceived quality when data compensation occurs can be expected to be reduced. Furthermore, since the data to be held for this processing is only the amplitude coefficients of one block, the increase in hardware is small.
[Brief description of the drawings]
FIG. 1 is a flowchart showing an embodiment of a compressed audio error compensation processing method to which the present invention is applied.
FIG. 2 is a flowchart showing an example of the detailed portion of the error compensation processing of the present invention.
FIG. 3 is a diagram showing an example of the structure of a compressed audio stream input to the compressed audio error compensation processing method to which the present invention is applied, and of the decoded audio.
FIG. 4 is a diagram showing an example of the compressed audio frames and decoded audio in a situation where the compressed audio error compensation processing method to which the present invention is applied performs error compensation.
FIG. 5 is a diagram showing an example of the relation between the compensation values generated by the compressed audio error compensation processing method to which the present invention is applied and the data in adjacent frames.
FIG. 6 is a diagram showing an example of whole-frame repetition, a conventional compensation process.
FIG. 7 is a diagram showing an example of whole-frame muting, a conventional compensation process.
FIG. 8 is a diagram showing another example of compensation values generated by the compressed audio error compensation processing method to which the present invention is applied, for the case where two or more error frames follow a valid frame.
FIG. 9 is a diagram showing another example of compensation values generated by the compressed audio error compensation processing method to which the present invention is applied, for the case where a valid frame follows two or more consecutive error frames.
FIG. 10 is a flowchart showing an embodiment of a compressed audio error compensation processing apparatus to which the present invention is applied.
FIG. 11 shows the timing of compensation value generation in the compressed audio error compensation processing apparatus to which the present invention is applied.
[Explanation of symbols]
2: audio frame input; 3: frame error detection and result storage; 4: check for frame error before inverse quantization; 5: amplitude value data reproduction; 6: sample inverse quantization; 7: last block data storage; 10: recall of the last block of the (fn-1)-th frame; 11: calculation of the amplitude data compensation value.

Claims

A compressed audio reproduction method in which a compressed audio frame is input, the compressed audio frame being a basic unit of compression and comprising a synchronization word indicating the beginning of the compressed audio frame, an error detection word for detecting whether the compressed audio frame contains an error, and a plurality of audio information blocks holding audio information; the head of the compressed audio frame is determined from the synchronization word included in the compressed audio frame; whether the compressed audio frame contains an error is determined from the error detection word included in the compressed audio frame; and sample values are dequantized from the amplitude values and normalized sample values included in the audio information blocks of the compressed audio frame, and an audio signal is output, wherein:
a history of the compressed audio frames is retained, and when an error is detected in the (fn)-th compressed audio frame (fn is an integer), error compensation information for the audio information blocks of the (fn)-th compressed audio frame is generated from the last audio information block of the (fn-1)-th compressed audio frame and the first audio information block of the (fn+1)-th compressed audio frame;
the error compensation information for the audio information blocks of the (fn)-th compressed audio frame is generated from the amplitude value of the last audio information block of the (fn-1)-th compressed audio frame and the amplitude value of the first audio information block of the (fn+1)-th compressed audio frame; and
if the (fn+1)-th compressed audio frame also has an error, the amplitude value of the first audio information block of the (fn+1)-th compressed audio frame is set to zero when generating the error compensation information for the audio information blocks of the (fn)-th compressed audio frame.

A compressed audio reproduction method in which a compressed audio frame is input, the compressed audio frame being a basic unit of compression and comprising a synchronization word indicating the beginning of the compressed audio frame, an error detection word for detecting whether the compressed audio frame contains an error, and a plurality of audio information blocks holding audio information; the head of the compressed audio frame is determined from the synchronization word included in the compressed audio frame; whether the compressed audio frame contains an error is determined from the error detection word included in the compressed audio frame; and sample values are dequantized from the amplitude values and normalized sample values included in the audio information blocks of the compressed audio frame, and an audio signal is output, wherein:
a history of the compressed audio frames is retained, and when an error is detected in the (fn)-th compressed audio frame (fn is an integer), error compensation information for the audio information blocks of the (fn)-th compressed audio frame is generated from the last audio information block of the (fn-1)-th compressed audio frame and the first audio information block of the (fn+1)-th compressed audio frame;
the error compensation information for the audio information blocks of the (fn)-th compressed audio frame is generated from the amplitude value of the last audio information block of the (fn-1)-th compressed audio frame and the amplitude value of the first audio information block of the (fn+1)-th compressed audio frame; and
if the (fn-1)-th compressed audio frame also has an error, the amplitude value of the last audio information block of the (fn-1)-th compressed audio frame is set to zero when generating the error compensation information for the audio information blocks of the (fn)-th compressed audio frame.

The compressed audio reproduction method according to claim 1 or 2,
wherein the amplitude value of each audio information block of the error compensation information comes closer to the amplitude value of the first audio information block of the (fn+1)th compressed audio frame as that block becomes temporally closer to the (fn+1)th compressed audio frame.
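The amplitude handling recited in claims 1 to 3 can be illustrated with a minimal sketch. It assumes a single scalar amplitude per audio information block, six blocks per frame (as in the embodiment), and a linear ramp between the two boundary amplitudes; the claims only require that the amplitude approach that of the following frame, so the linear form, the function name `compensation_amplitudes`, and its parameters are illustrative assumptions rather than the claimed implementation.

```python
from typing import List


def compensation_amplitudes(prev_last: float,
                            next_first: float,
                            n_blocks: int = 6,
                            prev_bad: bool = False,
                            next_bad: bool = False) -> List[float]:
    """Return one concealment amplitude per block of the errored frame fn.

    prev_last  : amplitude of the last block of frame fn-1
    next_first : amplitude of the first block of frame fn+1
    prev_bad / next_bad : True if that neighbouring frame also has an error
    """
    # Claims 1 and 2: a neighbouring frame that is itself in error
    # contributes an amplitude of zero.
    start = 0.0 if prev_bad else prev_last
    end = 0.0 if next_bad else next_first

    # Assumption: linear ramp. Block b sits (b + 1) steps away from the
    # previous frame's last block, out of (n_blocks + 1) steps in total,
    # so later blocks lie closer to the next frame's amplitude (claim 3).
    return [start + (end - start) * (b + 1) / (n_blocks + 1)
            for b in range(n_blocks)]


if __name__ == "__main__":
    print(compensation_amplitudes(0.0, 7.0))
    print(compensation_amplitudes(7.0, 3.0, next_bad=True))
```

With boundary amplitudes 0 and 7 the six blocks rise in equal steps toward 7; when the following frame is also flagged as erroneous, its amplitude is replaced by zero and the blocks fade out from the previous frame's level instead.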

A compressed audio reproducing apparatus comprising: a compressed audio frame input unit that receives a compressed audio frame, which is the basic unit of compression, the compressed audio frame comprising a synchronization word indicating the head of the frame, an error detection word for detecting whether the frame contains an error, and a plurality of audio information blocks holding audio information; an error determination unit that determines, from the error detection word contained in the compressed audio frame, whether the compressed audio frame contains an error; and an audio signal output unit that dequantizes sample values from the amplitude value and the normalized sample value contained in each audio information block of the compressed audio frame and outputs an audio signal,
the apparatus further comprising: a compressed audio frame holding unit that holds a history of the compressed audio frames input from the compressed audio frame input unit; and an error compensation information generation unit that, when the error determination unit detects an error in the (fn)th compressed audio frame (fn being an integer), generates error compensation information for the audio information blocks of the (fn)th compressed audio frame from the last audio information block of the (fn-1)th compressed audio frame and the first audio information block of the (fn+1)th compressed audio frame,
wherein the error compensation information generation unit generates the error compensation information for the audio information blocks of the (fn)th compressed audio frame from the amplitude value of the last audio information block of the (fn-1)th compressed audio frame and the amplitude value of the first audio information block of the (fn+1)th compressed audio frame, and
wherein, if the (fn+1)th compressed audio frame also contains an error, the error compensation information generation unit sets the amplitude value of the first audio information block of the (fn+1)th compressed audio frame to zero and generates the error compensation information for the audio information blocks of the (fn)th compressed audio frame.

A compressed audio reproducing apparatus comprising: a compressed audio frame input unit that receives a compressed audio frame, which is the basic unit of compression, the compressed audio frame comprising a synchronization word indicating the head of the frame, an error detection word for detecting whether the frame contains an error, and a plurality of audio information blocks holding audio information; an error determination unit that determines, from the error detection word contained in the compressed audio frame, whether the compressed audio frame contains an error; and an audio signal output unit that dequantizes sample values from the amplitude value and the normalized sample value contained in each audio information block of the compressed audio frame and outputs an audio signal,
the apparatus further comprising: a compressed audio frame holding unit that holds a history of the compressed audio frames input from the compressed audio frame input unit; and an error compensation information generation unit that, when the error determination unit detects an error in the (fn)th compressed audio frame (fn being an integer), generates error compensation information for the audio information blocks of the (fn)th compressed audio frame from the last audio information block of the (fn-1)th compressed audio frame and the first audio information block of the (fn+1)th compressed audio frame,
wherein the error compensation information generation unit generates the error compensation information for the audio information blocks of the (fn)th compressed audio frame from the amplitude value of the last audio information block of the (fn-1)th compressed audio frame and the amplitude value of the first audio information block of the (fn+1)th compressed audio frame, and
wherein, if the (fn-1)th compressed audio frame also contains an error, the error compensation information generation unit sets the amplitude value of the last audio information block of the (fn-1)th compressed audio frame to zero and generates the error compensation information for the audio information blocks of the (fn)th compressed audio frame.

The compressed audio reproducing apparatus according to claim 4 or 5,
wherein the amplitude value of each audio information block of the error compensation information comes closer to the amplitude value of the first audio information block of the (fn+1)th compressed audio frame as that block becomes temporally closer to the (fn+1)th compressed audio frame.
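As a rough illustration of how the units of claims 4 to 6 could cooperate, the following sketch processes a short sequence of frames. A plain list stands in for the compressed audio frame holding unit, a boolean flag stands in for the result of the error determination unit, and the function `conceal` plays the role of the error compensation information generation unit. The `Frame` type, the linear ramp, and the treatment of a missing neighbour at either end of the sequence as amplitude zero are all assumptions made for this example and are not taken from the claims.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    ok: bool            # hypothetical flag: True if the error detection word matched
    amps: List[float]   # one amplitude value per audio information block


def ramp(start: float, end: float, n_blocks: int) -> List[float]:
    # Assumed linear transition from `start` towards `end` over n_blocks blocks.
    return [start + (end - start) * (b + 1) / (n_blocks + 1)
            for b in range(n_blocks)]


def conceal(frames: List[Frame], n_blocks: int = 6) -> List[List[float]]:
    """Return the per-block amplitudes to use for every frame in `frames`.

    `frames` acts as the frame history; looking at frames[fn + 1] implies the
    decoder runs at least one frame behind its input.
    """
    out: List[List[float]] = []
    for fn, frame in enumerate(frames):
        if frame.ok:
            out.append(list(frame.amps))  # error-free frame: use as decoded
            continue
        prev = frames[fn - 1] if fn > 0 else None
        nxt = frames[fn + 1] if fn + 1 < len(frames) else None
        # An erroneous neighbour contributes zero (claims 4 and 5); a missing
        # neighbour is also treated as zero here (assumption for this sketch).
        start = prev.amps[-1] if prev is not None and prev.ok else 0.0
        end = nxt.amps[0] if nxt is not None and nxt.ok else 0.0
        out.append(ramp(start, end, n_blocks))
    return out


if __name__ == "__main__":
    good = Frame(True, [7.0] * 6)
    bad = Frame(False, [0.0] * 6)
    for amps in conceal([good, bad, bad, good]):
        print(amps)
```

For two consecutive erroneous frames between two good frames, the first erroneous frame fades out toward zero and the second fades in from zero toward the following good frame, which matches the zero-substitution behaviour recited in claims 4 and 5.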