JP3917767B2

JP3917767B2 - Voice fluctuation correction control method, voice playback apparatus, and voice relay apparatus

Info

Publication number: JP3917767B2
Application number: JP31481398A
Authority: JP
Inventors: 善中島; 篤新村
Original assignee: 日立情報通信エンジニアリング株式会社
Priority date: 1998-11-05
Filing date: 1998-11-05
Publication date: 2007-05-23
Anticipated expiration: 2018-11-05
Also published as: JP2000151694A

Description

【０００１】
【発明の属する技術分野】
本発明は、音声通信システムに係わり、特にインターネット等のパケット通信網を利用して音声データを送受信するシステムにおいて、音声データを受信再生する際での音声ゆらぎ補正制御方法、更には、その音声ゆらぎ補正制御方法に係る音声再生装置および音声中継装置に関するものである。
【０００２】
【従来の技術】
近年、音声電話をＬＡＮ（Local Area Network）に統合する技術が提案され、電話機間の通話にインターネット、またはイントラネットを利用する通話システムの構築やサービスが次第に実施されつつあるのが現状である。ところで、これまでの構内通信では、音声電話は専らＰＢＸ（ Private Branch Exchange：構内交換機）によっている一方では、データ通信にはＬＡＮを用いて行われており、両通信網は相互に独立したものとなっている。そこで、音声もＬＡＮのデータとして送受信し得るように、音声がＬＡＮの通信プロトコルによる通信データに適応したディジタルデータに変換されるようにすれば、ＬＡＮを利用しての音声通話が可能となり、初めて音声電話網とデータ通信網の統合が実現されることになる。
【０００３】
しかしながら、ＬＡＮで利用される通信プロトコルはインターネットプロトコル（ＩＰ： Internet Protocol）が標準となっている。一方、インターネットでは、ＯＳＩ参照モデルのトランスポート層プロトコルとして、ＴＣＰ（Transmission Control Protocol）およびＵＤＰ（ User Datagram Protocol）が用いられているのが実情である。ＴＣＰはコネクション型の通信接続を行うもので、パケット順序制御、フロー制御、エラー検出・回復、再送、輻輳制御などの機能を有していることから、したがって、ＴＣＰによる場合には、信頼性のある通信サービスが提供されるものとなっている。一方、ＵＤＰはコネクションレス型のプロトコルとされ、ＴＣＰのような機能を有しない分、リアルタイム性が要求される通信に使用されるものとなっている。例えば音声通話を実現するには、一部分の音声データが欠落しても再送することの重要性は低く、連続的に発生している音声データは順次送信されなければならないものとなっている。即ち、音声のように、リアルタイム性が要求される通信にはＵＤＰが用いられているものである。問題は、インターネット通信は離散的に発生するトラフィックを送受信するには適しているが、音声のように、連続的に発生するデータはデータパケット各々がＵＤＰにより順次連続的に高速送信されてから、必ずしも同一伝送経路を介し伝送されるとは限らないというものである。換言すれば、データパケット各々がＵＤＰにより順次連続的に高速送信されてから、様々な伝送経路を介し受信側で受信されるまでの時間は一定ではなく、受信されるまでの時間にはバラツキが生じ、したがって、送信順に必ずしも受信されるとは限らなく、何等かの制御が施されなければ、受信順に順次再生出力されたとしても、連続性ある音声として必ずしも再生され得ないというものである。
【０００４】
そのような音声の不連続的再生を防止すべく、これまでにあっては、受信再生側には再生バッファが設けられた上、順次受信される音声データパケット各々はその再生バッファ上に送信順に並べ替えられた状態として一時蓄積されつつ、連続再生可能となる時間を待って、初めて再生出力されていたものである。図５には従来技術に係る音声再生装置が示されているが、これについて簡単ながら説明すれば以下のようである。
【０００５】
即ち、パケット通信網からの音声データパケット各々は受信部５００で順次受信された上、受信された音声データパケット中のディジタル音声データ（符号化圧縮状態）は再生バッファ５０１上にパケット送信順に並べ替えされた状態として順次一時蓄積されるものとなっている。一方、そのような一時蓄積に並行して、連続再生可能なディジタル音声データが再生バッファ５０１上に一時蓄積される一定時間を待って、再生バッファ５０１からはディジタル音声データが再生データ読み出し部５０２により順次一定周期で読み出されるものとなっている。順次読み出されたディジタル音声データは、その後、復号化伸長された状態としてＤ／Ａ変換部５０３で順次アナログ音声信号に変換された上、音声出力部（スピーカやヘッドフォン等）５０４から音声として再生出力されているものである。このように、インターネットやＬＡＮ、ＡＴＭネットワーク、フレームリレーネットワーク等のパケット通信網（非同期通信網）上では、音声データパケット各々はその送信順序通りに受信側で受信されるとは限らなく、受信音声データパケット各々がその送信順序通りに連続的に並べ替えされた状態として一時蓄積された上、読み出し再生されるまでにある程度の待ち時間が要されていたものである。
【０００６】
【発明が解決しようとする課題】
以上のように、これまでにあっては、受信再生側に一定容量の再生バッファが用意された上、順次受信される音声データパケット各々は再生バッファにパケット送信順に並べ替えされた状態として順次一時蓄積される一方では、その再生バッファから一定再生待ち時間後に順次読み出し再生されているが、このような方法による場合、場合如何によっては不具合を生じるものとなっている。というのは、パケット通信網が一定再生待ち時間を必要としないような良好な通信状態であっても、読み出し再生までに一定時間待つ必要があり、これとは逆に、一定再生待ち時間を上回るような受信遅れが起こった場合には、再生上、空白時間が生じてしまうからである。
【０００７】
本発明の目的は、送信音声データパケット各々が受信されるに際し、その受信順序が送信順序とは異なる状態として受信、即ち、ばらついた状態として受信される場合に、そのバラツキ程度に応じて動的に最適な再生待ち時間が随時設定された上、受信音声データパケット各々が良好な音声として再生出力され得る音声ゆらぎ補正制御方法を供するにある。また、音声中継装置で電話網へ中継する際に、通信網の状態により動的に最適な再生待ち時間で音声再生し、送信することにより良質な音声送信を実現するものである。
【０００８】
上記目的を達成するため本発明は、通信網を介して、符号化され且つ送信時刻情報が付加されたディジタル音声データを含む音声パケットを受信する受信部と、ディジタル音声データをブロック番号が付与されたブロック単位で書き込み及び読み出しが成される再生バッファと、前記受信部から前記ディジタル音声データを読み出し、ディジタル音声データを音声ゆらぎ補正を行うように再生バッファに蓄積する音声ゆらぎ補正制御部と、前記再生バッファからディジタル音声データを取り出して音声再生出力を行う音声出力部とを備える音声再生装置の音声ゆらぎ補正制御方法であって、
音声ゆらぎ補正制御部が、
前記再生バッファから音声を再生する最大再生待ち時間及び最小再生待ち時間を設定しておき、
前記再生バッファ上の現在の再生ブロック番号と書き込みブロック番号との間隔ｄを算出し、
前記送信時刻情報を基に設定した再生バッファへの書き込みブロック位置を、
前記間隔ｄが最大再生待ち時間より大のときが連続して発生した回数が予め設定した値以上の回数連続してカウントしたとき、データ書き込みブロック位置を再生ブロック位置に近づく方向に移動し、
前記間隔ｄが最小再生待ち時間より小のときが連続して発生した回数が予め設定した値以上の回数連続してカウントしたとき、データ書き込みブロック位置を再生ブロック位置から離れる方向に移動するように補正することを第１の特徴とする。
【０００９】
また本発明は、通信網を介して、符号化され且つ送信時刻情報が付加されたディジタル音声データを含む音声パケットを受信する受信部と、ディジタル音声データをブロック番号が付与されたブロック単位で書き込み及び読み出しが成される再生バッファと、前記受信部から前記ディジタル音声データを読み出し、ディジタル音声データを音声ゆらぎ補正を行うように再生バッファに蓄積する音声ゆらぎ補正制御部と、前記再生バッファからディジタル音声データを取り出して音声再生出力を行う音声出力部とを備える音声再生装置の音声ゆらぎ補正制御方法であって、
前記音声ゆらぎ補正制御部に、
前記ディジタル音声データから送信時刻情報Ｔを抽出する手段と、
前記再生バッファから音声を再生する最大再生待ち時間Ｔmax及び最小再生待ち時間Ｔminを設定する手段と、
前記再生バッファ上の現在の再生ブロック番号と書き込みブロック番号との間隔ｄを算出する手段と、
前記間隔ｄが最大再生待ち時間Ｔmaxより大のときが連続して発生した回数をカウントする補正値減少カウンタＡＪＭと、
前記間隔ｄが最小再生待ち時間Ｔminより小のときが連続して発生した回数をカウントする補正値増加カウンタＡＪＰと、
ディジタル音声データを再生バッファに書き込むブロック番号を補正する値である補正値ＡＪを設定する手段とを設け、
前記音声ゆらぎ補正制御部が、
前記ディジタル音声データから抽出した送信時刻情報Ｔを基に前記再生バッファにディジタル音声データを書き込むブロック番号を設定する第１工程と、
前記補正値増加カウンタＡＪＰが、予め設定した値以上の回数連続してカウントしたとき、データ書き込みブロック位置を再生ブロック位置から離れるように補正値ＡＪを変更する第２工程と、
前記補正値減少カウンタＡＪＭが、予め設定した値以上の回数連続してカウントしたとき、データ書き込みブロック位置を再生ブロック位置に近づけるように補正値ＡＪを変更する第３工程と、
前記第２及び第３工程により変更した補正値ＡＪを用いて前記第１工程により設定した再生バッファへの書き込みブロック番号を訂正する第４工程と、
前記算出した再生バッファへの書き込みブロック番号のブロックにディジタル音声データを書き込む第５工程とを実行することにより、ディジタル音声データの再生バッファへの書き込み位置の補正することを第２の特徴とし、
該第２の特徴の音声ゆらぎ補正制御方法において、前記音声ゆらぎ補正制御部に、前記間隔ｄが最小再生待ち時間以上である場合が連続して発生した回数をカウントする最小値比較カウンタＣＭinを設け、該最小値比較カウンタＣＭinが、予め設定した補正最高値ＡＪＷmaxを超えた値をカウントしたとき、データ書き込みブロック位置を再生ブロック位置に近づけるように補正値ＡＪを変更する工程を含むことを第３の特徴とする。
【００１０】
更に本発明は、パケット通信網を介して、符号化され且つ送信時刻情報が付加されたディジタル音声データを含む音声パケットを受信する受信部と、ディジタル音声データをブロック番号が付与されたブロック単位で書き込み及び読み出しが成される再生バッファと、前記受信部から前記ディジタル音声データを読み出し、ディジタル音声データを音声ゆらぎ補正を行うように再生バッファに蓄積する音声ゆらぎ補正制御部と、前記再生バッファからディジタル音声データを取り出して音声再生出力を行う音声出力部とを備える音声再生装置であって、
音声ゆらぎ補正制御部は、
前記再生バッファから音声を再生する最大再生待ち時間及び最小再生待ち時間を設定する設定手段と、
前記再生バッファ上の現在の再生ブロック番号と書き込みブロック番号との間隔ｄを算出する手段と、
前記間隔ｄが最大再生待ち時間より大のときが連続して発生した回数をカウントする補正値減少カウンタと、
間隔ｄが最小再生待ち時間より小のときが連続して発生した回数をカウントする補正値増加カウンタとを備え、
前記補正値増加カウンタが、前記予め設定した値以上の回数連続してカウントしたとき、前記送信時刻情報を基に設定した再生バッファへの書き込みブロック位置を、データ書き込みブロック位置を再生ブロック位置から離れる方向に移動し、
前記補正値減少カウンタが、前記予め設定した値以上の回数連続してカウントしたとき、前記送信時刻情報を基に設定した再生バッファへの書き込みブロック位置を、データ書き込みブロック位置を再生ブロック位置に近づく方向に移動するように補正することを第４の特徴とする。
【００１１】
また本発明は、パケット通信網を介して、符号化され且つ送信時刻情報が付加されたディジタル音声データを含む音声パケットを受信する受信部と、ディジタル音声データをブロック番号が付与されたブロック単位で書き込み及び読み出しが成される再生バッファと、前記受信部から前記ディジタル音声データを読み出し、ディジタル音声データを音声ゆらぎ補正を行うように再生バッファに蓄積する音声ゆらぎ補正制御部と、前記再生バッファからディジタル音声データを取り出して音声再生出力を行う音声出力部とを備える音声再生装置であって、
前記音声ゆらぎ補正制御部が、
前記ディジタル音声データから送信時刻情報Ｔを抽出する手段と、
前記再生バッファから音声を再生する最大再生待ち時間Ｔmax及び最小再生待ち時間Ｔminを設定する手段と、
前記再生バッファ上の現在の再生ブロック番号と書き込みブロック番号との間隔ｄを算出する手段と、
前記間隔ｄが最小再生待ち時間Ｔmin以上である場合が連続して発生した回数をカウントする最小値比較カウンタＣＭinと、
前記間隔ｄが最大再生待ち時間Ｔmaxより大のときが連続して発生した回数をカウントする補正値減少カウンタＡＪＭと、
前記間隔ｄが最小再生待ち時間Ｔminより小のときが連続して発生した回数をカウントする補正値増加カウンタＡＪＰと、
ディジタル音声データを再生バッファに書き込むブロック番号を補正する値である補正値ＡＪを設定する手段とを備え、
前記ディジタル音声データから抽出した送信時刻情報Ｔを基に前記再生バッファにディジタル音声データを書き込むブロック番号を設定する第１工程と、
前記補正値増加カウンタＡＪＰが、予め設定した値以上の回数連続してカウントしたとき、データ書き込みブロック位置を再生ブロック位置から離れるように補正値ＡＪを変更する第２工程と、
前記補正値減少カウンタＡＪＭが、予め設定した値以上の回数連続してカウントしたとき、データ書き込みブロック位置を再生ブロック位置に近づけるように補正値ＡＪを変更する第３工程と、
前記第２及び第３工程により変更した補正値ＡＪを用いて前記第１工程により設定した再生バッファへの書き込みブロック番号を訂正する第４工程と、
前記算出した再生バッファへの書き込みブロック番号のブロックにディジタル音声データを書き込む第５工程とを実行することを第５の特徴とし、
該第５の特徴の音声再生装置において、前記音声ゆらぎ補正制御部が、前記間隔ｄが最小再生待ち時間以上である場合が連続して発生した回数をカウントする最小値比較カウンタＣＭinを有し、該最小値比較カウンタＣＭinが、予め設定した補正最高値ＡＪＷmaxを超えた値をカウントしたとき、データ書き込みブロック位置を再生ブロック位置に近づけるように補正値ＡＪを変更する工程を実行することを第６の特徴とする。
【００１２】
更に本発明は、前記第５又は６の特徴の音声再生装置において、前記パケット通信網が、インターネット、又はローカルエリアネットワーク、又はＡＴＭネットワーク、又はフレームリレーネットワークであることを第７の特徴とする。
また本発明は、前記特徴５〜７何れかに記載の音声出力部に替えて回線交換網へパケット音声データを出力する回線応答部を備えたことを第８の特徴とし、該第８の特徴の音声中継装置において、前記交換回線網が、公衆アナログ電話網、又は公衆ＩＳＤＮ、又は内線電話網であることを第９の特徴とする。
【００１３】
更に本発明は、前記第２又は３の特徴の音声ゆらぎ補正制御方法において、前記ディジタル音声データが、３０ミリ秒間隔のサンプリング時間で８ビット８ＫＨｚのμ法則を利用してサンプリングしたデータであり、前記再生バッファのブロック単位が、２４０バイトのディジタル音声データであり、前記最大再生待時間を３００ミリ秒、最小再生待ち時間を６０ミリ秒としたとき、前記補正値増加カウンタＡＪＰの予め設定した値及び前記補正値減少カウンタＡＪＭの予め設定した値が、２回であることを第１０の特徴とする。
【００１４】
更に本発明は、前記第請求項５又は６記載の音声再生装置において、前記ディジタル音声データが、３０ミリ秒間隔のサンプリング時間で８ビット８ＫＨｚのμ法則を利用してサンプリングしたデータであり、前記再生バッファのブロック単位が、２４０バイトのディジタル音声データであり、前記最大再生待時間を３００ミリ秒、最小再生待ち時間を６０ミリ秒としたとき、前記補正値増加カウンタＡＪＰの予め設定した値及び前記補正値減少カウンタＡＪＭの予め設定した値が、２回であることを第１１の特徴とする。
【００１７】
【発明の実施の形態】
本発明は、パケット通信網等の非同期通信網を介して、音声データを受信して音声再生する音声再生装置と再生音声を交換回線網に送信する音声中継装置に適用されるもので、特にインターネットを利用して音声データを通信するインターネット電話に適用される。
【００１８】
（実施例１）：以下、本発明の第１の実施形態について図面に基づき詳細に説明する。図１は本実施の形態に係る音声再生装置のブロック構成図である。図１の１００はパケット通信網からデータパケットを受信する受信部であり、１０５にデータを受信したことを通知する。１０５は本発明の特徴をなすゆらぎ補正制御部で、受信部１００からの通知をもとに再生バッファ１０１への書き込み制御をする。１０１は再生バッファで１０５から出力されたディジタル音声データを一時的に保存する。１０２は再生データ読み出し部であり、連続再生可能なディジタル音声データが１０１に蓄積されるのを待ってからディジタル音声データを読み出し、１０３へ出力する。１０３はディジタル音声データを音声データに変換して１０４に出力するＤ／Ａ変換部である。１０４はスピーカやヘッドホン等の音声出力部である。
【００１９】
以上のように構成された音声再生装置の再生バッファ１０１、再生データ読み出し部１０２、ゆらぎ補正制御部１０５について以下、図２と図３を参照しつつ詳細に説明する。
【００２０】
図２はゆらぎ補正制御部１０５のバッファ制御処理を説明する図である。図のブロックの並びは図１における再生バッファ１０１を表しており、図１のゆらぎ補正制御部１０５が音声データを最適位置に書込んでいき、再生データ読み出し部１０２が図２の左側のブロックから読み出し、再生していくことを表している。また、Ｂ０は音声再生する単位ブロックである。この値は予め一定に設定されたサンプリング時間毎にディジタル音声データ化された音声データの大きさであり、例えば、３０ｍ秒間隔のサンプリング時間で８ビット８ｋＨｚのμ法則を用いた場合は２４０バイトのディジタル音声データとなり、これが音声再生単位ブロックの大きさとなる。Ｂ１は現在再生してる再生ブロック単位のブロック番号を、Ｂ２は受信した音声データを書き込む再生ブロック単位のブロック番号をそれぞれ示している。ｄは書き込みブロック番号Ｂ２と再生ブロック番号Ｂ１の間隔を表す値で、すなわち受信した音声データを書き込んでから再生されるまで待ち合わせる音声データのブロック数を表している。図２の斜線部分が音声データを受信してから再生されるまでに待ち合わせている音声データを示している。
【００２１】
この再生されるまでに待ち合わせるブロック数と音声再生には次のような関係がある。即ち、待ち合わせるブロック数が多いほど、音声データを受信してから再生されるまでの時間が長くなってしまうが、通信状態によって到着時間が大きくゆらいでも音声が途切れること無く連続に再生することが出来る。一方、再生されるまでに待ち合わせるブロック数が少ないほど、音声データを受信してから再生されるまでの時間が短くなり、少ない遅延時間で再生することが出来るが、再生を待ち合わせる時間以上の遅れで音声データが到着する場合は音声が途切れて再生されるという関係がある。この音声データを受信してから再生するまでに待ち合わせる間隔の算出、即ち、再生ブロック番号Ｂ１に対する書き込みブロック番号Ｂ２を決定する処理をゆらぎ補正制御部１０５で行う。なお、Ｔｍｉｎは最小再生待ち時間を表し、この値は書き込みブロック番号Ｂ２と再生ブロック番号Ｂ１との間隔ｄ値がとる最小値を表す。Ｔｍａｘは最大再生待ち時間を表し、間隔ｄ値がとる最大値を表す。このＴｍｉｎ、Ｔｍａｘの値はゆらぎ補正制御処理のため参照する値であり、音声データ受信前に予め設定しておく値である。例えば再生単位ブロックＢ０を３０ｍ秒間隔でサンプリングしたディジタル音声データとした場合、最小再生待ち時間を６０ｍ秒、最大３００ｍ秒の再生遅延時間を許容すると設定した場合は、Ｔｍｉｎ＝２、Ｔｍａｘ＝１０として設定される。
【００２２】
図３はゆらぎ補正制御部１０５の動作を説明するフローチャートである。図３では、音声が到着する度に、当該フローを実行し、再生バッファ上の現在再生ブロック番号と書き込みブロック番号との間隔ｄを求め、この間隔ｄを最大値Ｔｍａｘ、最小値Ｔｍｉｎと比較し、特定時間範囲である場合の発生頻度によって、再生バッファに書き込む位置を補正する動作を説明している。
【００２３】
図３で処理のために使用している作業用変数を以下に説明すれば、補正値ＡＪは音声データを書き込むブロック番号を補正する値で、整数値をとる。音声データ到着が連続して遅れる状態が続くと、この値は増加していき、現在再生ブロックより時間的に離れた再生バッファに音声データを書き込むように補正される。一方、音声データが連続して早く到着する状態が続くと、この値は減少していき、現在再生ブロックに時間的に近い再生バッファに書き込むように補正される。
【００２４】
送信時刻情報Ｔは、音声送信装置にて音声を符号化したデータ量に依存した時刻を表す情報であり、音声データとともに付加して送信する。インターネットでの音声、映像のようなリアルタイムトラフィックを送受信する場合、ＲＴＰ（Real Time Protocol）が用いられる。ＲＴＰは主にＵＤＰの上位層で規定されるプロトコルであり、リアルタイムデータを制御する目的で時刻印やシーケンス番号等の情報を定義している。このプロトコルの時刻情報部を利用して音声送信時の時刻情報を送信する。例えば３０ｍ秒間隔のサンプリング時間で８ビット８ｋＨz のμ法則を用いた場合は、３０ｍ秒毎に２４０バイトの音声データが生成されるが、このときの送信時刻情報Ｔは１パケット毎に２４０づつ加算された値が設定される。
【００２５】
最小値比較カウンタＣＭｉｎは、間隔ｄが最小再生待ち時間Ｔｍｉｎ以上である場合が連続して発生した回数をカウントしておく値で、このカウンタ値ＣＭｉｎが予め設定しておく比較値以上連続してカウントされたとき、音声データはＴｍｉｎより大きな再生待ち合せ時間で定常的に再生されているので、より早く再生するよう補正するため、データ書き込みブロック位置を再生ブロック位置側に近づけるように補正値ＡＪを変更する。ＣＭｉｎと比較するために予め設定しておく比較値は補正最高値ＡＪＷｍａｘで、例えば１０回連続でＣＭｉｎがカウントされたときに補正値ＡＪを変更するよう設定する場合は、ＡＪＷｍａｘ＝１０となる。
【００２６】
補正値増加カウンタＡＪＰは、間隔ｄが最小再生待ち時間Ｔｍｉｎより小のときが連続して発生した回数をカウントする値であって、このカウンタ値が予め設定しておく比較値以上連続で発生したときに、再生ブロック位置に時間的に離れる方向に音声データを書き込むように補正値ＡＪを変更する。ＡＪＰと比較するために予め設定しておく比較値は補正値増加変更カウンタＡＪＰＣｈで、例えば２回連続でＡＪＰがカウントされたときに補正値ＡＪを変更するように設定する場合は、ＡＪＰＣｈ＝２となる。
【００２７】
補正値減少カウンタＡＪＭは、差ｄが最大再生待ち時間Ｔｍａｘより大のときが連続して発生した回数をカウントする値であって、このカウンタ値が予め設定しておく比較値以上連続で発生したときに、再生ブロック位置から時間的に近い再生バッファに音声データを書き込むように補正値ＡＪを変更する。ＡＪＭと比較するために予め設定しておく比較値は補正値減少変更カウンタＡＪＭＣｈで、例えば２回連続でＡＪＭがカウントされたときに補正値ＡＪを変更するように設定する場合は、ＡＪＭＣｈ＝２となる。
【００２８】
音声再生装置では、新たな音声パケットが到着する度に、図３のフローを実行し、補正値ＡＪと送信時刻情報Ｔを元に、再生バッファへ書き込む音声データの書き込み位置を決定するとともに、送信時刻情報Ｔから算出する再生バッファ上の書き込み位置と現在の再生位置との差、即ち、再生待ち間隔に基づいて補正値ＡＪを算出する。送信時刻情報Ｔは、例えば、音声送信装置側で音声データを符号化したデータ量に依存する値を基準値として、音声をサンプリングした時間を刻印した値であり、音声送信装置はこの値をディジタル音声データとともに付加して送信する。
【００２９】
以下、図３について動作を詳細に説明する。新しい音声データを受信する（Ｓ１）と受信データから送信時刻情報Ｔを取得する（Ｓ２）。次に現在の再生ブロック番号Ｂ１を取得する（Ｓ３）。そして、送信時刻情報Ｔと補正値ＡＪを元に書き込みブロック番号Ｂ２を算出する（Ｓ４）。この補正値ＡＪは直前に到着した受信データ処理により更新されている（初期値は０）。送信時刻情報Ｔは３０ｍ秒間隔サンプリングで８ビット８ｋＨz のμ法則の場合、２４０の整数倍の値であり、Ｔを２４０で割り、補正値ＡＪを加えることで、補正された書き込みブロック番号Ｂ２を算出している。
【００３０】
次に、書き込みブロック番号Ｂ２から現在の再生ブロック番号Ｂ１の値を引き、間隔ｄを算出する（Ｓ５）。次に、最小値比較カウンタＣＭｉｎを加算（インクリメント：＋１更新）しておく（Ｓ６）。このＣＭｉｎは間隔ｄがＴｍｉｎより大きい場合が連続して発生する状態のときに、より早く音声再生するよう、書き込みブロック番号を補正するために用いる値である。次に、書き込み再生位置ブロック間隔ｄが最小再生待ち時間Ｔｍｉｎ以下かを判定する（Ｓ７）。判定が偽であったら、次に間隔ｄを最大再生待ち時間Ｔｍａｘと比較するために▲２▼に分岐し、真であったら、間隔ｄがＴｍｉｎより大きい場合が連続しなかったので、最小値比較カウンタＣＭｉｎを初期化する（Ｓ８）。
【００３１】
次に、補正値増加カウンタＡＪＰは補正値増加変更カウンタＡＪＰＣｈ以上であれば（Ｓ９）補正値ＡＪを加算（インクリメント）し（Ｓ１０）、補正値増加カウンタＡＪＰを初期化する（Ｓ１１）。その後、補正値増加カウンタＡＪＰの加算（インクリメント）と補正値減少カウンタＡＪＭの初期化をする（Ｓ１２）。その後、▲３▼に分岐し、書き込みブロック番号Ｂ２へ受信データを書き込む（Ｓ２２）。間隔ｄがＴｍｉｎ以下であった場合、当該受信データの処理はこれで終了し、▲１▼に戻り次受信データ到着を待つ。
【００３２】
ステップ（Ｓ７）の判定が偽のとき、書き込み再生位置ブロック間隔ｄが最大再生待ち時間Ｔｍａｘ以上かを判定する（Ｓ１３）。偽のとき、即ち、間隔ｄがＴｍｉｎとＴｍａｘとの間であった場合、処理▲３▼へ分岐し、書き込みブロック番号Ｂ２へ受信データを書き込む（Ｓ２２）、補正値ＡＪは変更せずに▲１▼に戻る。真のとき補正値減少カウンタＡＪＭが補正値減少変更カウンタＡＪＭＣｈ以上かを判定して（Ｓ１４）、補正値ＡＪを減算（デクリメント：−１更新）し（Ｓ１５）、補正値減少カウンタＡＪＭを初期化する（Ｓ１６）。その後、補正値減少カウンタＡＪＭの加算（インクリメント）と補正値増加カウンタＡＪＰを初期化する（Ｓ１７）。
【００３３】
ステップ（Ｓ７）と（Ｓ１３）の判定がともに偽のとき、即ち、間隔ｄがＴｍｉｎとＴｍａｘとの間であった場合、最小値比較カウンタＣＭｉｎが補正最高値ＡＪＷｍａｘ以上かを判定する（Ｓ１８）。真なら補正値ＡＪを減算（デクリメント）（Ｓ１９）、補正値減少カウンタＡＪＭを初期化（Ｓ２０）し、最小値比較カウンタＣＭｉｎを初期化する（Ｓ２１）。補正最高値ＡＪＷｍａｘは予め設定しておく値で、ステップ（Ｓ１８）の判定で、間隔ｄが最小再生待ち時間Ｔｍｉｎより大きい場合が、補正最高値ＡＪＷｍａｘの回数分連続して起こったら補正値ＡＪを減算（デクリメント）する。即ち、受信データの到着時間が一定時間幅に定常的に受信するような、通信網が安定している状態のときは、再生待ち時間を少なくするように、書き込みブロック番号を補正するため、補正値ＡＪを減算（デクリメント）する。
【００３４】
その後、書き込みブロック番号Ｂ２へ受信データを書き込む（Ｓ２２）、▲１▼に戻り次受信データを待機する。
【００３５】
以上の到着時間ゆらぎ補正制御部の処理制御により、通信網状態によって音声データの到着時間にゆらぎが生じても音声再生待ち時間を動的に補正制御することで、通信網が大きな再生待ち時間を必要としないような良好な通信状態のときは再生待ち時間を最小時間に、逆に音声データ到着が遅れる通信状態のときは大きな再生待ち時間に変化させられる。
【００３６】
さて、以上に説明したゆらぎ補正制御部の動作を踏まえ、ゆらぎ補正制御部を含む音声再生装置の動作について説明する。
【００３７】
受信部１００はパケット通信網から自音声再生装置宛てのディジタル音声データを受信すると、ゆらぎ補正制御部１０５へ受信したことを通知する。１０５は通知を受けると受信データから送信時刻情報Ｔを取得し、この値から上記で説明したゆらぎ補正制御方法により、最適な再生待ち時間を算出する。この算出した最適再生待ち時間を基にして１０５は、再生バッファ１０１の最適な位置へディジタル音声データを書き込む。この書き込む位置の動的な最適化により、通信状態に応じた再生待ち時間を待つように調整する。１０２は再生バッファ１０１のディジタル音声データを逐次読み出し、１０３へ出力する。Ｄ／Ａ変換部１０３は１０２から受け取ったディジタル音声データを逐次、Ｄ／Ａ変換して音声出力部１０４へ出力する。１０４からは音声として再生出力する。
【００３８】
以上のような本発明の第一の実施形態によれば、上記の音声データ到着時間ゆらぎ補正制御部を含む音声再生装置によって、インターネットに代表するデータ到着時間にゆらぎが発生する非同期通信網での音声通信をする際、通信網状態に対応した最適な再生待ち時間で音声再生が実現される。
【００３９】
（実施例２）：以下本発明の第２の実施形態について図面に基づき説明すれば、図４は本実施の形態に係る音声中継装置のブロック構成図である。図４で、４００はパケット通信網からデータパケットを受信する受信部であり、４０５にデータを受信したことを通知する。４０５は本発明の特徴をなすゆらぎ補正制御部で、受信部４００からの通知をもとに再生バッファ４０１への書き込み制御をする。４０１は再生バッファで４０５から出力されたディジタル音声データを一時的に保存する。４０２は再生データ読み出し部であり、連続再生可能なディジタル音声データが４０１に蓄積されるのを待ってからディジタル音声データを読み出し、４０３へ出力する。４０３はディジタル音声データを音声データに変換して４０４に出力するＤ／Ａ変換部である。４０４は回線対応部で４０３から出力された音声を電話回線網へ送信する。
【００４０】
以上のように構成された音声中継装置について、その動作について説明すれば、受信部４００はパケット通信網から自音声中継装置宛てのディジタル音声データを受信すると、ゆらぎ補正制御部４０５へ受信したことを通知する。４０５は通知を受けると受信データから送信時刻情報Ｔを取得し、この値から上記で説明したゆらぎ補正制御方法により、最適な再生待ち時間を算出する。この算出した最適再生待ち時間を基にして４０５は、再生バッファ４０１の最適な位置へディジタル音声データを書き込む。この書き込む位置の動的な最適化により、通信状態に応じた再生待ち時間を待つように調整する。４０２は再生バッファ４０１のディジタル音声データを逐次読み出し、４０３へ出力する。Ｄ／Ａ変換部４０３は４０２から受け取ったディジタル音声データを逐次、Ｄ／Ａ変換して回線対応部４０４へ出力する。４０４からは電話回線網へ再生出力される。
【００４１】
以上の動作により、本発明の第２の実施形態によれば、上記の音声データ到着時間ゆらぎ補正制御部を含む音声中継装置によって、インターネットに代表するデータ到着時間にゆらぎが発生する非同期通信網での音声通信の中継をする際、通信網状態に対応した再生待ち時間で音声再生し、電話回線網へ送信することで最適な音声中継が実現される。
【００４２】
なお、以上の説明から判るように、再生待ち時間を小さくするように、音声データの書き込み位置が変更された上、書き込みが行われる場合は、未再生音声データ上に上書きされる結果として、その未再生音声データは無効化されることになる。例えば３０ｍ秒間隔のサンプリングの場合、一回の書き込み位置補正で３０ｍ秒分の音声データが無効化されるというものである。したがって、仮に、連続的に書き込み位置の変更が行われれば、連続的に音声データの無効化が生じる結果として、聴覚上、音声の途切れが感じられることになる。しかしながら、本発明による場合には、指定回数分、再生待ち時間を小さくする事象が発生した場合に、初めて書き込み位置補正処理が行われており、この結果、時間的に離散した状態として音声データが無効化されていることから、聴覚上、殆ど問題とはされないものとなっている。
【００４３】
一方、以上とは逆に、再生待ち時間を大きくするように、音声データの書き込み位置が変更された上、書き込みが行われる場合には、その書き込み位置と既に音声データが書き込みされている位置との間には大きなアドレス差が生じる結果として、その間での音声の再生に際しては、音声が全く再生されないか（音声再生上の途切れに相当）、または既に再生済の音声データが再生されることになる（音声再生上の不連続再生に相当）。このような不具合は、直前音声データの繰返し再生による補間方法によって、ある程度軽減され得るものとなっている。したがって、仮に、連続的に書き込み位置の変更が行われれば、音声データの連続性が失われる結果として、聴覚上、同一音声の補間再生により不自然さを感じることになる。しかしながら、本発明による場合、指定回数分、再生待ち時間を大きくする事象が発生した場合に、初めて書き込み位置補正処理が行われており、この結果、時間的に離散した状態として再生待ち時間が遅らされていることから、聴覚上、殆ど問題とはならないものとなっている。
【００４４】
【発明の効果】
以上、説明したように、本発明によれば、非同期通信網（インターネット等）を介し音声通信が行われた上、受信ディジタル音声データが音声再生装置で音声として再生される際に、通信網の状態により動的に最適な再生待ち時間が設定された上、音声が再生されることによって、良質な音声通話を実現する。また、非同期通信網を介しての音声通信を行うために受信したディジタル音声データを音声中継装置で電話網へ送信する際に、通信網の状態により動的に最適な再生待ち時間で音声再生をすることにより良質な音声送信を実現する。
【図面の簡単な説明】
【図１】図１は、本発明による音声再生装置のブロック構成を示す図
【図２】図２は、その音声再生装置におけるゆらぎ補正制御処理を説明するための図
【図３】図３は、そのゆらぎ補正制御処理のフローを示す図
【図４】図４は、本発明による音声中継装置のブロック構成を示す図
【図５】図５は、従来技術に係る音声再生装置のブロック構成を示す図
【符号の説明】
１００，４００…受信部、１０１，４０１…再生バッファ、１０２，４０２…再生データ読み出し部、１０３，４０３…Ｄ／Ａ変換部、１０４…音声出力部、１０５，４０５…ゆらぎ補正制御部、４０４…回線対応部、Ｂ０…再生単位ブロック、Ｂ１…再生位置、Ｂ２…書き込み位置
[0001]
[Technical field to which the invention belongs]
The present invention relates to a voice communication system, and more particularly to a voice fluctuation correction control method used when receiving and reproducing voice data in a system that transmits and receives voice data over a packet communication network such as the Internet. It further relates to an audio reproduction device and a voice relay apparatus that use this voice fluctuation correction control method.
[0002]
[Prior art]
In recent years, techniques for integrating voice telephony into a LAN (Local Area Network) have been proposed, and telephone systems and services that use the Internet or an intranet for calls between telephones are gradually being deployed. Until now, in-house voice telephony has relied exclusively on a PBX (Private Branch Exchange), while data communication has used the LAN, so the two networks have been independent of each other. If voice is converted into digital data that fits the communication data of the LAN protocol, so that it can be sent and received as LAN data, voice calls over the LAN become possible, and the voice telephone network and the data communication network can be integrated for the first time.
[0003]
The standard communication protocol used on a LAN is the Internet Protocol (IP). On the Internet, TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are used as the transport layer protocols of the OSI reference model. TCP is a connection-oriented protocol with functions such as packet order control, flow control, error detection and recovery, retransmission, and congestion control, so it provides a reliable communication service. UDP, by contrast, is a connectionless protocol that lacks these functions, and is therefore used for communication that requires real-time performance. In a voice call, for example, retransmitting a lost portion of voice data matters little, whereas the continuously generated voice data must be sent out in sequence. For this reason UDP is used for real-time communication such as voice. The problem is that Internet communication is well suited to traffic generated in discrete bursts, but when continuously generated data such as voice is sent as a rapid sequence of UDP data packets, the packets do not necessarily travel over the same transmission path. In other words, the time from when each data packet is sent until it is received via its particular path is not constant but varies. As a result, packets are not necessarily received in the order in which they were sent, and unless some control is applied, reproducing them in the order received does not necessarily yield continuous voice.
[0004]
To prevent such discontinuous reproduction of voice, a reproduction buffer has conventionally been provided on the receiving and reproducing side. Each voice data packet received is temporarily stored in the reproduction buffer, rearranged into transmission order, and is reproduced and output only after waiting until continuous reproduction becomes possible. FIG. 5 shows an audio reproduction device according to the prior art, which is briefly described below.
[0005]
That is, voice data packets from the packet communication network are received in sequence by the receiving unit 500, and the digital audio data (in encoded, compressed form) in each received packet is temporarily stored in the reproduction buffer 501, rearranged into packet transmission order. In parallel with this temporary storage, after waiting a fixed time for continuously reproducible digital audio data to accumulate in the reproduction buffer 501, the reproduction data reading unit 502 reads the digital audio data from the reproduction buffer 501 sequentially at a constant cycle. The digital audio data read out is then decoded and decompressed, converted into an analog audio signal by the D/A conversion unit 503, and reproduced as sound from the audio output unit 504 (a speaker, headphones, or the like). Thus, on a packet communication network (asynchronous communication network) such as the Internet, a LAN, an ATM network, or a frame relay network, voice data packets are not necessarily received in the order in which they were sent, so a certain waiting time has been required while the received packets are rearranged into transmission order and temporarily stored before being read out and reproduced.
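As a rough illustration of the prior-art arrangement of FIG. 5, the following Python sketch reorders packets by sequence number and starts reading only after a fixed number of block periods. The class name `FixedDelayBuffer`, the per-packet sequence number, and the tick-based timing are assumptions made for this example and do not appear in the patent.

```python
from typing import Dict, Optional


class FixedDelayBuffer:
    """Prior-art style reproduction buffer: reorder, wait a fixed time, then read."""

    def __init__(self, wait_blocks: int) -> None:
        self.wait_blocks = wait_blocks      # fixed reproduction waiting time, in blocks
        self.slots: Dict[int, bytes] = {}   # sequence number -> audio block
        self.ticks = 0                      # read cycles elapsed since the first packet
        self.next_seq = 0                   # next block to hand to the decoder
        self.started = False

    def receive(self, seq: int, block: bytes) -> None:
        # Storing by sequence number rearranges packets into transmission order.
        if seq >= self.next_seq:            # a block that arrives too late is dropped
            self.slots[seq] = block
        self.started = True

    def read(self) -> Optional[bytes]:
        """Called once per block period by the reading unit; None means silence."""
        if not self.started:
            return None
        self.ticks += 1
        if self.ticks <= self.wait_blocks:  # still inside the fixed waiting time
            return None
        block = self.slots.pop(self.next_seq, None)
        self.next_seq += 1
        return block
```

In this sketch the waiting time never changes, which is the limitation the following paragraphs address.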
[0006]
[Problems to be solved by the invention]
As described above, a reproduction buffer of fixed capacity has conventionally been provided on the receiving and reproducing side. Each received voice data packet is temporarily stored in the reproduction buffer, rearranged into packet transmission order, and is read out and reproduced after a fixed reproduction waiting time. This method causes problems in some situations. Even when the packet communication network is in a good state that needs no such waiting time, reproduction still has to wait the fixed time. Conversely, when a reception delay exceeds the fixed reproduction waiting time, a blank period occurs in the reproduced voice.
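The trade-off described above can be made concrete with a small Python sketch. The function name `fixed_wait_outcome` and the idea of expressing each packet's lateness as a whole number of block periods are assumptions for illustration only.

```python
from typing import List, Tuple


def fixed_wait_outcome(delays_blocks: List[int], wait_blocks: int) -> Tuple[int, int]:
    """Return (added latency in blocks, number of gaps) for a fixed waiting time.

    delays_blocks[i] is how many block periods late packet i arrives
    (a hypothetical measure used only in this sketch).
    """
    # A packet that is later than the fixed waiting time misses its read slot.
    gaps = sum(1 for delay in delays_blocks if delay > wait_blocks)
    # The listener always pays the full waiting time, even on a quiet network.
    return wait_blocks, gaps
```

With a waiting time of 3 blocks, a quiet network such as `[0, 0, 1, 0]` still incurs 3 blocks of latency with no gaps, while a congested one such as `[0, 4, 5, 1]` incurs the same latency and two gaps.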
[0007]
An object of the present invention is to provide a voice fluctuation correction control method in which, when transmitted voice data packets are received in an order different from the transmission order, that is, with variation in their arrival, an optimal reproduction waiting time is set dynamically according to the degree of that variation, so that each received voice data packet can be reproduced and output as good-quality voice. A further object is to realize high-quality voice transmission when a voice relay apparatus relays to a telephone network, by reproducing and sending the voice with a reproduction waiting time that is dynamically optimized according to the state of the communication network.
[0008]
In order to achieve the above object, the present invention provides a voice fluctuation correction control method for an audio reproduction device comprising: a receiving unit that receives, via a communication network, voice packets containing digital audio data that is encoded and has transmission time information added; a reproduction buffer in which the digital audio data is written and read in units of blocks to which block numbers are assigned; a voice fluctuation correction control unit that reads the digital audio data from the receiving unit and stores it in the reproduction buffer so as to correct voice fluctuation; and an audio output unit that takes the digital audio data out of the reproduction buffer and reproduces it as sound, wherein
the voice fluctuation correction control unit
sets in advance a maximum reproduction waiting time and a minimum reproduction waiting time for reproducing voice from the reproduction buffer,
calculates an interval d between the current reproduction block number and the write block number on the reproduction buffer, and
corrects the write block position in the reproduction buffer, set on the basis of the transmission time information, such that
when the interval d has been greater than the maximum reproduction waiting time a preset number of times or more in succession, the data write block position is moved toward the reproduction block position, and
when the interval d has been smaller than the minimum reproduction waiting time a preset number of times or more in succession, the data write block position is moved away from the reproduction block position; this is the first feature.
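The consecutive-count rule of the first feature might be sketched in Python as below. The names `JitterState`, `update_offset`, and `threshold` are hypothetical, and resetting both run counters whenever d falls between the two limits is an assumption about what "in succession" implies.

```python
from dataclasses import dataclass


@dataclass
class JitterState:
    offset: int = 0      # shift applied to the write block position, in blocks
    over_run: int = 0    # consecutive packets with d > t_max
    under_run: int = 0   # consecutive packets with d < t_min


def update_offset(state: JitterState, d: int, t_min: int, t_max: int,
                  threshold: int) -> int:
    """Update the write-position shift from one observed interval d."""
    if d > t_max:
        state.over_run += 1
        state.under_run = 0
        if state.over_run >= threshold:
            state.offset -= 1        # move the write position toward playback
            state.over_run = 0
    elif d < t_min:
        state.under_run += 1
        state.over_run = 0
        if state.under_run >= threshold:
            state.offset += 1        # move the write position away from playback
            state.under_run = 0
    else:
        # Assumption: an in-range interval breaks both runs.
        state.over_run = 0
        state.under_run = 0
    return state.offset
```

Requiring several consecutive out-of-range intervals before shifting means a single late or early packet does not move the write position.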
[0009]
In addition, the present invention provides a voice fluctuation correction control method for an audio reproduction device comprising: a receiving unit that receives, via a communication network, voice packets containing digital audio data that is encoded and has transmission time information added; a reproduction buffer in which the digital audio data is written and read in units of blocks to which block numbers are assigned; a voice fluctuation correction control unit that reads the digital audio data from the receiving unit and stores it in the reproduction buffer so as to correct voice fluctuation; and an audio output unit that takes the digital audio data out of the reproduction buffer and reproduces it as sound, wherein
the voice fluctuation correction control unit is provided with:
means for extracting transmission time information T from the digital audio data;
means for setting a maximum reproduction waiting time Tmax and a minimum reproduction waiting time Tmin for reproducing voice from the reproduction buffer;
means for calculating an interval d between the current reproduction block number and the write block number on the reproduction buffer;
a correction value decrease counter AJM that counts the number of consecutive times the interval d is greater than the maximum reproduction waiting time Tmax;
a correction value increase counter AJP that counts the number of consecutive times the interval d is smaller than the minimum reproduction waiting time Tmin; and
means for setting a correction value AJ, which is a value that corrects the block number at which digital audio data is written to the reproduction buffer,
and the voice fluctuation correction control unit executes:
a first step of setting the block number at which the digital audio data is written to the reproduction buffer, on the basis of the transmission time information T extracted from the digital audio data;
a second step of changing the correction value AJ so as to move the data write block position away from the reproduction block position when the correction value increase counter AJP has counted a preset number of times or more in succession;
a third step of changing the correction value AJ so as to move the data write block position toward the reproduction block position when the correction value decrease counter AJM has counted a preset number of times or more in succession;
a fourth step of correcting the write block number set in the first step, using the correction value AJ changed in the second and third steps; and
a fifth step of writing the digital audio data to the block having the calculated write block number in the reproduction buffer, thereby correcting the position at which the digital audio data is written to the reproduction buffer; this is the second feature.
In the voice fluctuation correction control method of the second feature, the voice fluctuation correction control unit is further provided with a minimum value comparison counter CMin that counts the number of consecutive times the interval d is equal to or greater than the minimum reproduction waiting time, and the method includes a step of changing the correction value AJ so as to move the data write block position toward the reproduction block position when the minimum value comparison counter CMin has counted a value exceeding a preset correction maximum value AJWmax; this is the third feature.
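One possible reading of the second and third features is sketched below in Python. The class name `FluctuationCorrector`, the dictionary used as a reproduction buffer, and the default parameter values are assumptions for illustration. Following the flow described for FIG. 3, the AJ applied to a packet is the value left by the processing of earlier packets, and the exact placement of the counter resets is a simplification rather than a transcription of the flowchart.

```python
from typing import Dict


class FluctuationCorrector:
    """Sketch of write-position correction using AJ, AJP, AJM and CMin."""

    def __init__(self, t_min: int = 2, t_max: int = 10, ajp_ch: int = 2,
                 ajm_ch: int = 2, ajw_max: int = 10, block_bytes: int = 240) -> None:
        self.t_min, self.t_max = t_min, t_max        # waiting-time limits, in blocks
        self.ajp_ch, self.ajm_ch = ajp_ch, ajm_ch    # preset counts for AJP and AJM
        self.ajw_max = ajw_max                       # preset limit for CMin
        self.block_bytes = block_bytes
        self.aj = 0      # correction value AJ
        self.ajp = 0     # consecutive d < Tmin
        self.ajm = 0     # consecutive d > Tmax
        self.cmin = 0    # consecutive d >= Tmin
        self.buffer: Dict[int, bytes] = {}           # block number -> audio block

    def receive(self, t: int, payload: bytes, play_block: int) -> int:
        # Steps 1 and 4: block number from T, corrected by the current AJ.
        write_block = t // self.block_bytes + self.aj
        d = write_block - play_block
        if d < self.t_min:
            self.cmin = 0
            self.ajm = 0
            self.ajp += 1
            if self.ajp >= self.ajp_ch:      # step 2: move away from playback
                self.aj += 1
                self.ajp = 0
        elif d > self.t_max:
            self.ajp = 0
            self.cmin += 1
            self.ajm += 1
            if self.ajm >= self.ajm_ch:      # step 3: move toward playback
                self.aj -= 1
                self.ajm = 0
        else:
            self.ajp = 0
            self.ajm = 0
            self.cmin += 1
            if self.cmin > self.ajw_max:     # third feature: stable, so tighten
                self.aj -= 1
                self.cmin = 0
        self.buffer[write_block] = payload   # step 5
        return write_block
```

A caller would pass the transmission time information T carried in each packet together with the block number currently being reproduced; the returned value is the block in which the payload was stored.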
[0010]
Furthermore, the present invention provides an audio reproduction device comprising: a receiving unit that receives, via a packet communication network, voice packets containing digital audio data that is encoded and has transmission time information added; a reproduction buffer in which the digital audio data is written and read in units of blocks to which block numbers are assigned; a voice fluctuation correction control unit that reads the digital audio data from the receiving unit and stores it in the reproduction buffer so as to correct voice fluctuation; and an audio output unit that takes the digital audio data out of the reproduction buffer and reproduces it as sound, wherein
the voice fluctuation correction control unit comprises:
setting means for setting a maximum reproduction waiting time and a minimum reproduction waiting time for reproducing voice from the reproduction buffer;
means for calculating an interval d between the current reproduction block number and the write block number on the reproduction buffer;
a correction value decrease counter that counts the number of consecutive times the interval d is greater than the maximum reproduction waiting time; and
a correction value increase counter that counts the number of consecutive times the interval d is smaller than the minimum reproduction waiting time,
and, when the correction value increase counter has counted the preset number of times or more in succession, corrects the write block position in the reproduction buffer, set on the basis of the transmission time information, by moving the data write block position away from the reproduction block position, and
when the correction value decrease counter has counted the preset number of times or more in succession, corrects that write block position by moving the data write block position toward the reproduction block position; this is the fourth feature.
[0011]
In addition, the present invention provides a receiving unit for receiving audio packets including digital audio data encoded and added with transmission time information via a packet communication network, and the digital audio data in units of blocks to which block numbers are assigned. A reproduction buffer for writing and reading; an audio fluctuation correction control unit for reading the digital audio data from the receiving unit and storing the digital audio data in the reproduction buffer so as to perform audio fluctuation correction; An audio reproduction device including an audio output unit that extracts audio data and performs audio reproduction output;
The voice fluctuation correction control unit
Means for extracting transmission time information T from the digital audio data;
Means for setting a maximum reproduction waiting time Tmax and a minimum reproduction waiting time Tmin for reproducing sound from the reproduction buffer;
Means for calculating an interval d between a current reproduction block number and a writing block number on the reproduction buffer;
A minimum value comparison counter CMin for counting the number of times that the interval d is equal to or greater than the minimum reproduction waiting time Tmin;
A correction value decrease counter AJM that counts the number of times that the interval d is greater than the maximum reproduction waiting time Tmax,
A correction value increase counter AJP that counts the number of times that the interval d is smaller than the minimum reproduction waiting time Tmin,
Means for setting a correction value AJ, which is a value for correcting a block number for writing digital audio data to the reproduction buffer,
A first step of setting a block number for writing digital audio data to the reproduction buffer based on transmission time information T extracted from the digital audio data;
A second step of changing the correction value AJ so that the data write block position is separated from the reproduction block position when the correction value increase counter AJP continuously counts a predetermined number of times or more;
A third step of changing the correction value AJ so as to bring the data writing block position closer to the reproduction block position when the correction value decrease counter AJM continuously counts a predetermined number of times or more;
A fourth step of correcting the write block number to the reproduction buffer set in the first step using the correction value AJ changed in the second and third steps;
A fifth step of writing the digital audio data into the block of the reproduction buffer having the corrected write block number; executing these steps is the fifth feature.
As a sixth feature, in the audio reproduction device of the fifth feature, the audio fluctuation correction control unit includes a minimum value comparison counter CMin that counts the number of consecutive times the interval d is equal to or greater than the minimum reproduction waiting time, and executes a sixth step of changing the correction value AJ so that the data writing block position is brought closer to the reproduction block position when the minimum value comparison counter CMin counts a value exceeding a preset maximum correction value AJWmax.
[0012]
Furthermore, as a seventh feature, in the audio reproducing device according to the fifth or sixth feature, the packet communication network is the Internet, a local area network, an ATM network, or a frame relay network.
As an eighth feature, the present invention provides a voice relay apparatus comprising a line-corresponding unit that outputs the voice data to a circuit-switched network in place of the audio output unit described in any one of the fifth to seventh features. As a ninth feature, in the voice relay apparatus of the eighth feature, the circuit-switched network is a public analog telephone network, a public ISDN network, or an extension telephone network.
[0013]
Furthermore, as a tenth feature, in the audio fluctuation correction control method according to the second or third feature, when the digital audio data is data sampled with 8-bit, 8 kHz μ-law at a sampling time interval of 30 milliseconds, the block unit of the reproduction buffer is 240 bytes of digital audio data, the maximum reproduction waiting time is 300 milliseconds, and the minimum reproduction waiting time is 60 milliseconds, the preset value of the correction value increase counter AJP and the preset value of the correction value decrease counter AJM are each two.
[0014]
Furthermore, as an eleventh feature, in the audio reproducing apparatus according to the fifth or sixth feature, when the digital audio data is data sampled with 8-bit, 8 kHz μ-law at a sampling time interval of 30 milliseconds, the block unit of the reproduction buffer is 240 bytes of digital audio data, the maximum reproduction waiting time is 300 milliseconds, and the minimum reproduction waiting time is 60 milliseconds, the preset value of the correction value increase counter AJP and the preset value of the correction value decrease counter AJM are each two.
[0017]
DETAILED DESCRIPTION OF THE INVENTION
The present invention is applied to an audio reproducing apparatus that receives audio data via an asynchronous communication network such as a packet communication network and reproduces it as audio, and to an audio relay apparatus that transmits the reproduced audio to a circuit-switched network. It is applied, for example, to Internet telephones that exchange voice data over such a network.
[0018]
(Embodiment 1): Hereinafter, a first embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 is a block configuration diagram of an audio reproducing apparatus according to the present embodiment. In FIG. 1, reference numeral 100 denotes a receiving unit that receives a data packet from the packet communication network and notifies the fluctuation correction control unit 105 that data has been received. Reference numeral 105 denotes the fluctuation correction control unit that characterizes the present invention; it controls writing to the reproduction buffer 101 based on the notification from the receiving unit 100. The reproduction buffer 101 temporarily stores the digital audio data output from 105. Reference numeral 102 denotes a reproduction data reading unit, which waits until digital audio data that can be reproduced continuously has been stored in 101, then reads out the digital audio data and outputs it to 103. Reference numeral 103 denotes a D/A converter that converts the digital audio data into an analog audio signal and outputs it to 104. Reference numeral 104 denotes an audio output unit such as a speaker or headphones.
[0019]
The playback buffer 101, playback data reading unit 102, and fluctuation correction control unit 105 of the audio playback apparatus configured as described above will be described in detail below with reference to the drawings.
[0020]
FIG. 2 is a diagram for explaining the buffer control processing of the fluctuation correction control unit 105. The row of blocks in the figure represents the reproduction buffer 101 of FIG. 1. The figure shows that the fluctuation correction control unit 105 of FIG. 1 writes the audio data at the optimum position, and that the reproduction data reading unit 102 reads and reproduces the blocks in order starting from the leftmost block. B0 is the unit block for audio reproduction. Its size is the size of the voice data converted into digital voice data during a preset sampling time. For example, if 8-bit, 8 kHz μ-law is used with a sampling time of 30 ms, 240 bytes of digital audio data are produced, and this is the size of the audio reproduction unit block. B1 indicates the block number of the playback unit block currently being played back, and B2 indicates the block number of the playback unit block to which the received audio data is written. d is a value representing the interval between the writing block number B2 and the reproduction block number B1, that is, the number of blocks of audio data that wait between the writing of the received audio data and its reproduction. The hatched portion in FIG. 2 indicates the audio data waiting between reception and reproduction.
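As a rough illustration of the block size arithmetic above, the following Python sketch computes the number of bytes in one reproduction unit block B0. The function name and its parameters are hypothetical and are not part of the described apparatus.

```python
def block_size_bytes(sample_rate_hz: int, bits_per_sample: int, block_ms: int) -> int:
    """Bytes of audio produced during one block of block_ms milliseconds."""
    bytes_per_second = sample_rate_hz * bits_per_sample // 8
    return bytes_per_second * block_ms // 1000


# 8-bit, 8 kHz mu-law with a 30 ms sampling time, as in the example in the text.
print(block_size_bytes(8000, 8, 30))  # 240
```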
[0021]
The number of blocks waiting for reproduction and the quality of audio reproduction are related as follows. As the number of waiting blocks increases, the time from reception of audio data to its playback becomes longer, but the audio can be played continuously without interruption even if the arrival time varies greatly with the communication state. Conversely, the smaller the number of blocks waiting for playback, the shorter the time from reception of audio data to its playback, so the audio can be played back with less delay, but the audio is interrupted when audio data arrives late. The fluctuation correction control unit 105 calculates the interval to wait between reception of the audio data and its reproduction, that is, it determines the writing block number B2 relative to the reproduction block number B1. Tmin represents the minimum reproduction waiting time, which is the minimum value that the interval d between the writing block number B2 and the reproduction block number B1 should take. Tmax represents the maximum reproduction waiting time, which is the maximum value that the interval d should take. The values of Tmin and Tmax are referred to in the fluctuation correction control process and are set in advance, before audio data is received. For example, when the reproduction unit block B0 is digital audio data sampled at intervals of 30 msec, and the minimum reproduction waiting time is set to 60 msec and the maximum reproduction waiting time to 300 msec, Tmin = 2 and Tmax = 10 are set.
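The conversion from waiting times in milliseconds to the block counts Tmin and Tmax can be sketched as follows. This is only an illustration under the assumption that each waiting time is a whole multiple of the block duration; the helper name is hypothetical.

```python
def wait_time_to_blocks(wait_ms: int, block_ms: int = 30) -> int:
    """Convert a reproduction waiting time in milliseconds to a number of blocks."""
    if wait_ms % block_ms != 0:
        # Assumption of this sketch: waiting times are multiples of the block duration.
        raise ValueError("waiting time must be a multiple of the block duration")
    return wait_ms // block_ms


t_min = wait_time_to_blocks(60)   # minimum reproduction waiting time of 60 ms
t_max = wait_time_to_blocks(300)  # maximum reproduction waiting time of 300 ms
print(t_min, t_max)  # 2 10
```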
[0022]
FIG. 3 is a flowchart for explaining the operation of the fluctuation correction control unit 105. The flow of FIG. 3 is executed each time voice data arrives. It obtains the interval d between the current reproduction block number and the write block number on the reproduction buffer, compares this interval d with the maximum value Tmax and the minimum value Tmin, and corrects the position written in the reproduction buffer according to how frequently d falls in each range.
[0023]
The work variables used in the processing of FIG. 3 are described below. The correction value AJ is a value for correcting the block number to which the audio data is written, and takes an integer value. If the audio data keeps arriving late, this value increases, and the audio data is written to a position in the reproduction buffer that is further in time from the current reproduction block. Conversely, if the audio data keeps arriving early, this value decreases, and the audio data is written to a position in the reproduction buffer that is closer in time to the current reproduction block.
[0024]
The transmission time information T is information representing time in units that depend on the amount of data obtained by encoding the voice in the voice transmission device, and is transmitted together with the voice data. When real-time traffic such as voice and video is sent over the Internet, RTP (Real-time Transport Protocol) is used. RTP is a protocol defined mainly as an upper layer of UDP, and defines information such as time stamps and sequence numbers for the purpose of controlling real-time data. The time information at the time of voice transmission is carried in the time information field of this protocol. For example, when 8-bit, 8 kHz μ-law is used with a sampling time of 30 ms, 240 bytes of audio data are generated every 30 ms, and the transmission time information T is set to a value that is incremented by 240 for each packet.
[0025]
The minimum value comparison counter CMin counts the number of consecutive times the interval d is equal to or greater than the minimum reproduction waiting time Tmin. When CMin reaches or exceeds a preset comparison value, the audio data is being reproduced steadily with a reproduction waiting time larger than Tmin, so the correction value AJ is changed so that the data writing block position moves closer to the reproduction block position and reproduction takes place earlier. The comparison value set in advance for comparison with CMin is the maximum correction value AJWmax. For example, to change the correction value AJ when CMin has counted ten consecutive times, AJWmax = 10 is set.
[0026]
The correction value increase counter AJP counts the number of consecutive times the interval d is smaller than the minimum reproduction waiting time Tmin. When this counter reaches or exceeds a preset comparison value, the correction value AJ is changed so that the audio data is written at a position further in time from the reproduction block position. The comparison value set in advance for comparison with AJP is the correction value increase change counter AJPCh. For example, to change the correction value AJ when AJP has counted twice in succession, AJPCh = 2 is set.
[0027]
The correction value decrease counter AJM counts the number of consecutive times the interval d is greater than the maximum reproduction waiting time Tmax. When this counter reaches or exceeds a preset comparison value, the correction value AJ is changed so that the audio data is written at a position in the reproduction buffer closer in time to the reproduction block position. The comparison value set in advance for comparison with AJM is the correction value decrease change counter AJMCh. For example, to change the correction value AJ when AJM has counted twice in succession, AJMCh = 2 is set.
[0028]
The audio playback device executes the flow of FIG. 3 every time a new audio packet arrives, and determines the position in the playback buffer at which the audio data is written based on the correction value AJ and the transmission time information T. The correction value AJ is calculated based on the difference between the write position on the reproduction buffer calculated from the transmission time information T and the current reproduction position, that is, the reproduction waiting interval. The transmission time information T is, for example, a value stamping the time at which the voice was sampled, using as its unit a value that depends on the amount of data encoded on the voice transmission apparatus side; it is attached to the audio data and transmitted.
[0029]
Hereinafter, the operation will be described in detail with reference to FIG. 3. When new voice data is received (S1), the transmission time information T is acquired from the received data (S2). Next, the current reproduction block number B1 is acquired (S3). Then, the writing block number B2 is calculated based on the transmission time information T and the correction value AJ (S4). This correction value AJ is the value updated during processing of the received data that arrived immediately before (its initial value is 0). With 8-bit, 8 kHz μ-law sampling at 30 ms intervals, the transmission time information T is an integral multiple of 240, so the corrected writing block number B2 is calculated by dividing T by 240 and adding the correction value AJ.
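The calculation in step S4 can be sketched in Python as follows. This is a minimal illustration, assuming the 240-byte block of the example; the function and constant names are hypothetical.

```python
BLOCK_SIZE = 240  # bytes per block for 8-bit, 8 kHz mu-law at 30 ms (example in the text)


def write_block_number(t: int, aj: int, block_size: int = BLOCK_SIZE) -> int:
    """Step S4: divide the transmission time information T by the block size, then add AJ."""
    return t // block_size + aj


print(write_block_number(720, 0))  # 3
print(write_block_number(720, 2))  # 5
```

A positive AJ pushes the write position further ahead of the reproduction position, and a negative AJ pulls it closer.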
[0030]
Next, the value of the current reproduction block number B1 is subtracted from the writing block number B2 to calculate the interval d (S5). Next, the minimum value comparison counter CMin is incremented (+1 update) (S6). CMin is used to correct the writing block number so that the voice is reproduced earlier when the interval d stays larger than Tmin. Next, it is determined whether the block interval d between the writing and reproduction positions is equal to or smaller than the minimum reproduction waiting time Tmin (S7). If the determination is false, the process branches to (2), where the interval d is compared with the maximum reproduction waiting time Tmax. If it is true, the interval d is not greater than Tmin, so the minimum value comparison counter CMin is initialized (S8).
[0031]
Next, if the correction value increase counter AJP is equal to or greater than the correction value increase change counter AJPCh (S9), the correction value AJ is incremented (S10) and the correction value increase counter AJP is initialized (S11). Thereafter, the correction value increase counter AJP is incremented and the correction value decrease counter AJM is initialized (S12). The process then branches to (3) and the received data is written to the write block number B2 (S22). This completes the processing of the received data for the case where the interval d is equal to or less than Tmin, and the process returns to (1) to wait for the next received data.
[0032]
When the determination in step (S7) is false, it is determined whether the block interval d between the writing and reproduction positions is equal to or greater than the maximum reproduction waiting time Tmax (S13). When it is false, that is, when the interval d is between Tmin and Tmax, the process branches to (3), the received data is written to the write block number B2 (S22) without the correction value AJ being changed by this comparison, and the process returns to (1). When it is true, it is determined whether the correction value decrease counter AJM is equal to or greater than the correction value decrease change counter AJMCh (S14); if so, the correction value AJ is decremented (-1 update) (S15) and the correction value decrease counter AJM is initialized (S16). Thereafter, the correction value decrease counter AJM is incremented and the correction value increase counter AJP is initialized (S17).
[0033]
When both the determinations in steps (S7) and (S13) are false, that is, when the interval d is between Tmin and Tmax, it is determined whether the minimum value comparison counter CMin is equal to or greater than the maximum correction value AJWmax (S18). If true, the correction value AJ is decremented (S19), the correction value decrease counter AJM is initialized (S20), and the minimum value comparison counter CMin is initialized (S21). The maximum correction value AJWmax is set in advance. That is, when the determination in step (S18) shows that the interval d has remained greater than the minimum reproduction waiting time Tmin for AJWmax consecutive arrivals, the correction value AJ is decremented. In other words, when the communication network is stable, so that received data keeps arriving within a certain time width, the correction value AJ is decremented so that the write block number is corrected to reduce the reproduction waiting time.
[0034]
Thereafter, the received data is written to the write block number B2 (S22), and the process returns to (1) to wait for the next received data.
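The flow of steps S4 to S22 described above can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the apparatus itself: the class and method names are hypothetical; the comparisons use "equal to or smaller" and "equal to or greater" as in the step descriptions; steps S12 and S17 are taken to run after S9 to S11 and S14 to S16 regardless of their outcome; and the S18 check is placed before the write for the case where d lies between Tmin and Tmax. The default values are the example settings from the text.

```python
from dataclasses import dataclass


@dataclass
class JitterCorrector:
    t_min: int = 2       # Tmin in blocks (60 ms at 30 ms per block)
    t_max: int = 10      # Tmax in blocks (300 ms)
    ajp_ch: int = 2      # AJPCh
    ajm_ch: int = 2      # AJMCh
    ajw_max: int = 10    # AJWmax
    block_size: int = 240
    aj: int = 0          # correction value AJ (initial value 0)
    ajp: int = 0         # correction value increase counter AJP
    ajm: int = 0         # correction value decrease counter AJM
    c_min: int = 0       # minimum value comparison counter CMin

    def on_packet(self, t: int, b1: int) -> int:
        """Return the write block number B2 for a packet with time information t."""
        b2 = t // self.block_size + self.aj      # S4
        d = b2 - b1                              # S5
        self.c_min += 1                          # S6
        if d <= self.t_min:                      # S7: data is arriving late
            self.c_min = 0                       # S8
            if self.ajp >= self.ajp_ch:          # S9
                self.aj += 1                     # S10
                self.ajp = 0                     # S11
            self.ajp += 1                        # S12
            self.ajm = 0
        elif d >= self.t_max:                    # S13: data is arriving early
            if self.ajm >= self.ajm_ch:          # S14
                self.aj -= 1                     # S15
                self.ajm = 0                     # S16
            self.ajm += 1                        # S17
            self.ajp = 0
        elif self.c_min >= self.ajw_max:         # S18: stable for a long time
            self.aj -= 1                         # S19
            self.ajm = 0                         # S20
            self.c_min = 0                       # S21
        return b2                                # S22: caller writes the data at B2


# Three consecutive late packets (d = 1) raise AJ by one.
corrector = JitterCorrector()
for k in range(3):
    corrector.on_packet(t=240 * k, b1=k - 1)
print(corrector.aj)  # 1
```

Note that a change to AJ affects the write position of the next packet, because B2 is computed in S4 before the counters are examined.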
[0035]
With the processing control of the arrival time fluctuation correction control unit described above, the voice reproduction waiting time is corrected dynamically even when the arrival time of the voice data fluctuates with the state of the communication network. In a good communication state that does not require a large reproduction waiting time, the reproduction waiting time can be reduced toward the minimum; conversely, in a communication state in which voice data arrives late, it can be increased.
[0036]
Now, based on the operation of the fluctuation correction control unit described above, the operation of the audio reproduction device including the fluctuation correction control unit will be described.
[0037]
When the receiving unit 100 receives digital audio data addressed to its own audio reproduction device from the packet communication network, it notifies the fluctuation correction control unit 105 of the reception. On receiving the notification, 105 acquires the transmission time information T from the received data and calculates the optimum reproduction waiting time from this value by the fluctuation correction control method described above. Based on the calculated optimum reproduction waiting time, 105 writes the digital audio data to the optimum position in the reproduction buffer 101. By dynamically optimizing the writing position, the reproduction waiting time is adjusted to suit the communication state. The reproduction data reading unit 102 sequentially reads the digital audio data in the reproduction buffer 101 and outputs it to 103. The D/A converter 103 sequentially D/A converts the digital audio data received from 102 and outputs it to the audio output unit 104, from which it is reproduced and output as sound.
[0038]
According to the first embodiment of the present invention described above, an audio reproduction device including the voice data arrival time fluctuation correction control unit realizes voice reproduction with an optimum reproduction waiting time corresponding to the communication network state when voice communication is performed over an asynchronous communication network, typified by the Internet, in which the data arrival time fluctuates.
[0039]
(Embodiment 2): A second embodiment of the present invention will be described below with reference to the drawings. FIG. 4 is a block configuration diagram of the voice relay apparatus according to the present embodiment. In FIG. 4, reference numeral 400 denotes a receiving unit that receives a data packet from the packet communication network and notifies the fluctuation correction control unit 405 that data has been received. Reference numeral 405 denotes the fluctuation correction control unit that characterizes the present invention; it controls writing to the reproduction buffer 401 based on the notification from the receiving unit 400. The reproduction buffer 401 temporarily stores the digital audio data output from 405. The reproduction data reading unit 402 waits until digital audio data that can be reproduced continuously has been stored in 401, then reads the digital audio data and outputs it to 403. The D/A conversion unit 403 converts the digital audio data into an analog audio signal and outputs it to 404. Reference numeral 404 denotes a line-corresponding unit, which transmits the voice output from 403 to the telephone line network.
[0040]
The operation of the voice relay apparatus configured as described above will now be described. When the receiving unit 400 receives digital voice data addressed to its own voice relay apparatus from the packet communication network, it notifies the fluctuation correction control unit 405 of the reception. On receiving the notification, 405 obtains the transmission time information T from the received data and calculates an optimum reproduction waiting time from this value by the fluctuation correction control method described above. Based on the calculated optimum reproduction waiting time, 405 writes the digital audio data to the optimum position in the reproduction buffer 401. By dynamically optimizing the writing position, the reproduction waiting time is adjusted to suit the communication state. The reproduction data reading unit 402 sequentially reads the digital audio data in the reproduction buffer 401 and outputs it to 403. The D/A conversion unit 403 sequentially D/A converts the digital audio data received from 402 and outputs it to the line-corresponding unit 404, from which it is output to the telephone line network.
[0041]
With the above operation, according to the second embodiment of the present invention, a voice relay apparatus including the voice data arrival time fluctuation correction control unit described above realizes optimum voice relay: when relaying voice communication over an asynchronous communication network, typified by the Internet, in which the data arrival time fluctuates, it reproduces the voice with a reproduction waiting time corresponding to the state of the communication network and transmits it to the telephone line network.
[0042]
As can be seen from the above description, when the writing position of the audio data is changed so as to reduce the reproduction waiting time, the new data overwrites audio data that has not yet been reproduced, and that unreproduced audio data is invalidated. For example, with sampling at intervals of 30 milliseconds, 30 milliseconds of audio data are invalidated by one writing position correction. Therefore, if the writing position were changed continuously, audio data would be invalidated continuously and the sound would be heard as intermittent. In the present invention, however, the write position correction is performed only after the event calling for a shorter reproduction waiting time has occurred the specified number of times. As a result, audio data is invalidated only at isolated points in time, and this is hardly perceived as a problem by the listener.
[0043]
On the other hand, when the writing position of the audio data is changed so as to increase the reproduction waiting time, a large gap arises between the new writing position and the position where audio data has already been written. As a result, either there is no audio data to reproduce (an interruption in audio playback) or audio data that has already been played back is reproduced again (discontinuous playback). Such a problem can be alleviated to some extent by interpolation based on repeated reproduction of the immediately preceding audio data. If the writing position were changed continuously, the continuity of the audio data would be lost and the repeated interpolation of the same audio would sound unnatural. According to the present invention, however, the write position correction is performed only after the event calling for a longer reproduction waiting time has occurred the specified number of times. As a result, the reproduction waiting time is lengthened only at isolated points in time, and this is hardly a problem for the listener.
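The interpolation by repeated reproduction of the immediately preceding audio data mentioned above could look roughly like the following Python sketch. The text does not specify how the interpolation is implemented, so everything here is an assumption for illustration: the reproduction buffer is modelled as a dictionary keyed by block number, and the function name and the `fallback` parameter are hypothetical.

```python
def read_with_repeat(buffer: dict, start: int, count: int, fallback: bytes = b"") -> list:
    """Read `count` blocks from block number `start`, repeating the previous
    block whenever a block has not been written."""
    out = []
    last = fallback  # used only if the very first block is missing
    for n in range(start, start + count):
        block = buffer.get(n)
        if block is None:
            block = last  # gap: repeat the immediately preceding audio block
        out.append(block)
        last = block
    return out


# Block 2 was skipped when the write position moved away from the reproduction position.
blocks = {0: b"a", 1: b"b", 3: b"d"}
print(read_with_repeat(blocks, 0, 4))  # [b'a', b'b', b'b', b'd']
```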
[0044]
[Effects of the Invention]
As described above, according to the present invention, when voice communication is performed via an asynchronous communication network (such as the Internet) and the received digital voice data is played back as voice by the voice playback device, an optimal reproduction waiting time is set dynamically according to the state of the communication network before the voice is reproduced, thereby realizing a high-quality voice call. In addition, when digital voice data received via an asynchronous communication network is transmitted to the telephone network by the voice relay device, the voice is reproduced dynamically with an optimal playback waiting time that depends on the state of the communication network, thereby realizing high-quality voice transmission.
[Brief description of the drawings]
FIG. 1 is a block diagram of an audio playback device according to the present invention.
FIG. 2 is a diagram for explaining fluctuation correction control processing in the audio reproduction device;
FIG. 3 is a flowchart showing the fluctuation correction control process;
FIG. 4 is a block diagram of a voice relay device according to the present invention.
FIG. 5 is a diagram showing a block configuration of an audio reproducing apparatus according to the prior art.
[Explanation of symbols]
100, 400 ... Receiving unit, 101, 401 ... Reproduction buffer, 102, 402 ... Reproduction data reading unit, 103, 403 ... D/A conversion unit, 104 ... Audio output unit, 105, 405 ... Fluctuation correction control unit, 404 ... Line-corresponding unit, B0 ... reproduction unit block, B1 ... reproduction position, B2 ... write position

Claims

A receiving unit that receives a voice packet including digital voice data encoded and added with transmission time information via a communication network, and a digital voice data is written and read in units of blocks to which block numbers are assigned. A playback buffer, a voice fluctuation correction control unit that reads out the digital voice data from the receiving unit and stores the digital voice data in the playback buffer so as to perform voice fluctuation correction, and takes out the digital voice data from the playback buffer and outputs the voice. An audio fluctuation correction control method for an audio reproduction device including an audio output unit for performing reproduction output,
The voice fluctuation correction control unit
Set the maximum playback waiting time and the minimum playback waiting time for playing back audio from the playback buffer,
Calculating an interval d between the current reproduction block number and the writing block number on the reproduction buffer;
Setting a write block position in the reproduction buffer based on the transmission time information,
Moving the data write block position in a direction closer to the reproduction block position when the number of consecutive times that the interval d is greater than the maximum reproduction waiting time reaches or exceeds a preset value, and
Moving the data write block position in a direction away from the reproduction block position when the number of consecutive times that the interval d is smaller than the minimum reproduction waiting time reaches or exceeds a preset value. An audio fluctuation correction control method for an audio reproduction device, characterized by performing correction in this manner.

A receiving unit that receives a voice packet including digital voice data encoded and added with transmission time information via a communication network, and a digital voice data is written and read in units of blocks to which block numbers are assigned. A playback buffer, a voice fluctuation correction control unit that reads out the digital voice data from the receiving unit and stores the digital voice data in the playback buffer so as to perform voice fluctuation correction, and takes out the digital voice data from the playback buffer and outputs the voice. An audio fluctuation correction control method for an audio reproduction device including an audio output unit for performing reproduction output,
In the voice fluctuation correction control unit,
Means for extracting transmission time information T from the digital audio data;
Means for setting a maximum reproduction waiting time Tmax and a minimum reproduction waiting time Tmin for reproducing sound from the reproduction buffer;
Means for calculating an interval d between a current reproduction block number and a writing block number on the reproduction buffer;
A correction value decrease counter AJM that counts the number of times that the interval d is greater than the maximum reproduction waiting time Tmax,
A correction value increase counter AJP that counts the number of times that the interval d is smaller than the minimum reproduction waiting time Tmin,
Means for setting a correction value AJ, which is a value for correcting the block number for writing the digital audio data to the reproduction buffer,
The voice fluctuation correction control unit
A first step of setting a block number for writing digital audio data to the reproduction buffer based on transmission time information T extracted from the digital audio data;
A second step of changing the correction value AJ so that the data write block position is separated from the reproduction block position when the correction value increase counter AJP continuously counts a predetermined number of times or more;
A third step of changing the correction value AJ so as to bring the data writing block position closer to the reproduction block position when the correction value decrease counter AJM continuously counts a predetermined number of times or more;
A fourth step of correcting the write block number to the reproduction buffer set in the first step using the correction value AJ changed in the second and third steps;
A fifth step of writing the digital audio data into the block of the reproduction buffer having the corrected write block number, whereby the writing position of the digital audio data in the reproduction buffer is corrected. An audio fluctuation correction control method for an audio reproduction device, characterized by executing these steps.

The audio fluctuation correction control method for an audio reproduction device according to claim 2, wherein the voice fluctuation correction control unit is provided with a minimum value comparison counter CMin that counts the number of consecutive times the interval d is equal to or longer than the minimum reproduction waiting time, the method further comprising a step of changing the correction value AJ so that the data writing block position approaches the reproduction block position when the minimum value comparison counter CMin counts a value exceeding a preset maximum correction value AJWmax.

An audio reproducing apparatus comprising: a receiving unit that receives, via a packet communication network, voice packets containing encoded digital voice data to which transmission time information has been added; a reproduction buffer to and from which digital voice data is written and read in units of blocks to which block numbers are assigned; a voice fluctuation correction control unit that reads the digital voice data from the receiving unit and stores it in the reproduction buffer so as to perform voice fluctuation correction; and an audio output unit that extracts the digital voice data from the reproduction buffer and outputs it as reproduced audio,
Wherein the voice fluctuation correction control unit comprises:
Setting means for setting a maximum reproduction waiting time and a minimum reproduction waiting time for reproducing sound from the reproduction buffer;
Means for calculating an interval d between a current reproduction block number and a writing block number on the reproduction buffer;
A correction value decrease counter that counts the number of times that the interval d is consecutively greater than the maximum reproduction waiting time; and
A correction value increase counter that counts the number of times that the interval d is smaller than the minimum reproduction waiting time,
Wherein, when the correction value increase counter has counted consecutively to a preset value or more, the write block position in the reproduction buffer, set based on the transmission time information, is corrected so as to move in the direction away from the reproduction block position, and
When the correction value decrease counter has counted consecutively to a predetermined value or more, the write block position in the reproduction buffer, set based on the transmission time information, is corrected so as to move in the direction approaching the reproduction block position.
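As an illustration only, the two counters of this claim can be sketched in Python as follows. The names `interval` and `DriftDetector`, the modular distance on a ring of blocks, and the resetting of a counter whenever its condition is not met (one reading of "consecutively") are assumptions of this sketch and are not stated in the claim.

```python
def interval(play_block: int, write_block: int, num_blocks: int) -> int:
    """Interval d, in blocks, from the reproduction block forward to the write block.

    Assumption: the reproduction buffer is treated as a ring of num_blocks blocks.
    """
    return (write_block - play_block) % num_blocks


class DriftDetector:
    """Hypothetical helper holding the increase and decrease counters."""

    def __init__(self, min_wait: int, max_wait: int, preset: int) -> None:
        self.min_wait = min_wait      # minimum reproduction waiting time, in blocks
        self.max_wait = max_wait      # maximum reproduction waiting time, in blocks
        self.preset = preset          # consecutive count that triggers a correction
        self.increase_count = 0       # counts d < min_wait
        self.decrease_count = 0       # counts d > max_wait

    def observe(self, d: int) -> int:
        """Return +1 (move write position away), -1 (move closer) or 0 (no change)."""
        if d < self.min_wait:
            self.increase_count += 1
            self.decrease_count = 0
        elif d > self.max_wait:
            self.decrease_count += 1
            self.increase_count = 0
        else:
            # Assumption: an in-range interval breaks both consecutive runs.
            self.increase_count = 0
            self.decrease_count = 0

        if self.increase_count >= self.preset:
            self.increase_count = 0
            return +1
        if self.decrease_count >= self.preset:
            self.decrease_count = 0
            return -1
        return 0
```

Requiring several consecutive out-of-range intervals before correcting means that a single delayed or early packet does not shift the write position.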

An audio reproducing apparatus comprising: a receiving unit that receives, via a packet communication network, voice packets containing encoded digital voice data to which transmission time information has been added; a reproduction buffer to and from which digital voice data is written and read in units of blocks to which block numbers are assigned; a voice fluctuation correction control unit that reads the digital voice data from the receiving unit and stores it in the reproduction buffer so as to perform voice fluctuation correction; and an audio output unit that extracts the digital voice data from the reproduction buffer and outputs it as reproduced audio,
Wherein the voice fluctuation correction control unit comprises:
Means for extracting transmission time information T from the digital audio data;
Means for setting a maximum reproduction waiting time Tmax and a minimum reproduction waiting time Tmin for reproducing sound from the reproduction buffer;
Means for calculating an interval d between a current reproduction block number and a writing block number on the reproduction buffer;
A minimum value comparison counter CMin for counting the number of times that the interval d is equal to or greater than the minimum reproduction waiting time Tmin;
A correction value decrease counter AJM that counts the number of times that the interval d is greater than the maximum reproduction waiting time Tmax,
A correction value increase counter AJP that counts the number of times that the interval d is smaller than the minimum reproduction waiting time Tmin,
Means for setting a correction value AJ, which is a value for correcting a block number for writing digital audio data to the reproduction buffer,
And executes: a first step of setting a block number for writing digital audio data to the reproduction buffer based on the transmission time information T extracted from the digital audio data;
A second step of changing the correction value AJ so that the data write block position moves away from the reproduction block position when the correction value increase counter AJP has counted consecutively a predetermined number of times or more;
A third step of changing the correction value AJ so that the data write block position moves closer to the reproduction block position when the correction value decrease counter AJM has counted consecutively a predetermined number of times or more;
A fourth step of correcting the write block number for the reproduction buffer set in the first step, using the correction value AJ changed in the second and third steps; and
A fifth step of writing the digital audio data into the block of the reproduction buffer having the write block number.
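The five steps of this claim can be sketched, under stated assumptions, as a single routine called once per received packet. The class name `FluctuationCorrector`, the mapping of the transmission time T to a block number by integer division, the ring-shaped buffer, the one-block step applied to AJ, and the point at which the interval d is evaluated are all assumptions made for illustration and are not taken from the claim.

```python
class FluctuationCorrector:
    """Hypothetical sketch of the first to fifth steps for one reproduction buffer."""

    def __init__(self, num_blocks: int, block_ms: int,
                 t_min_blocks: int, t_max_blocks: int, preset: int) -> None:
        self.blocks = [None] * num_blocks   # reproduction buffer, one slot per block
        self.block_ms = block_ms            # duration of audio held in one block
        self.t_min = t_min_blocks           # Tmin expressed in blocks
        self.t_max = t_max_blocks           # Tmax expressed in blocks
        self.preset = preset                # consecutive count that triggers a change
        self.aj = 0                         # correction value AJ, in blocks
        self.ajp = 0                        # correction value increase counter AJP
        self.ajm = 0                        # correction value decrease counter AJM

    def receive(self, t_ms: int, payload: bytes, play_block: int) -> int:
        n = len(self.blocks)
        # First step: block number derived from transmission time T (assumed mapping).
        base = (t_ms // self.block_ms) % n
        # Interval d between the reproduction block and the write block.
        # Note: a packet arriving behind the reproduction position wraps to a
        # large d here; handling that case is outside this sketch.
        d = (base + self.aj - play_block) % n
        self.ajp = self.ajp + 1 if d < self.t_min else 0
        self.ajm = self.ajm + 1 if d > self.t_max else 0
        # Second step: move the write position away from the reproduction position.
        if self.ajp >= self.preset:
            self.aj += 1
            self.ajp = 0
        # Third step: move the write position closer to the reproduction position.
        if self.ajm >= self.preset:
            self.aj -= 1
            self.ajm = 0
        # Fourth step: correct the write block number with AJ.
        write_block = (base + self.aj) % n
        # Fifth step: write the audio data into that block.
        self.blocks[write_block] = payload
        return write_block
```

For example, with `t_min_blocks=2` and `preset=2`, two packets in a row that land only one block ahead of the reproduction position raise AJ by one, so the second packet is written one block later than its transmission time alone would indicate.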

The audio reproducing apparatus according to claim 5, wherein the voice fluctuation correction control unit has a minimum value comparison counter CMin for counting the number of times that the interval d is equal to or longer than the minimum reproduction waiting time, and executes a step of changing the correction value AJ so as to bring the data writing block position closer to the reproduction block position when the minimum value comparison counter CMin has counted a value exceeding a preset maximum correction value AJWmax.
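A minimal sketch of the CMin rule of this claim follows. The function name `apply_cmin_rule`, the one-block decrement of AJ, and the restart of CMin both after a correction and whenever d falls below the minimum waiting time are assumptions; the claim states only that AJ is changed once CMin has counted beyond AJWmax. No lower bound on AJ is applied here because the claim does not state one.

```python
def apply_cmin_rule(d: int, t_min: int, cmin: int, aj: int, ajw_max: int):
    """Update CMin for one interval d and return the new (cmin, aj) pair."""
    if d >= t_min:
        cmin += 1          # interval is at or above the minimum waiting time
    else:
        cmin = 0           # assumption: a short interval restarts the count
    if cmin > ajw_max:
        aj -= 1            # assumption: bring the write position one block closer
        cmin = 0           # assumption: start counting again after the correction
    return cmin, aj
```

With `ajw_max=4`, the fifth interval in a row that satisfies d >= Tmin reduces AJ by one block, which gradually shortens the reproduction delay once reception has been stable.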

The audio reproducing apparatus according to claim 5, wherein the packet communication network is the Internet, a local area network, an ATM network, or a frame relay network.

8. A voice relay apparatus comprising, in place of the voice output unit according to claim 5, a line response unit that outputs packet voice data to a circuit-switched network.

9. The voice relay apparatus according to claim 8, wherein the circuit-switched network is a public analog telephone network, a public ISDN, or an extension telephone network.

The voice fluctuation correction control method according to claim 2, wherein, when the digital audio data is data sampled using 8-bit, 8 kHz μ-law coding at a sampling time interval of 30 milliseconds, the block unit of the reproduction buffer is 240 bytes of digital audio data, the maximum reproduction waiting time is 300 milliseconds, and the minimum reproduction waiting time is 60 milliseconds,
The preset value of the correction value increase counter AJP and the preset value of the correction value decrease counter AJM are each two.
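The figures in this claim are mutually consistent, as the following short check shows. The constant names are illustrative only; the values are those recited in the claim, with one byte per sample following from 8-bit μ-law coding.

```python
SAMPLE_RATE_HZ = 8000      # 8 kHz sampling
BYTES_PER_SAMPLE = 1       # 8-bit mu-law: one byte per sample
BLOCK_MS = 30              # audio carried per block, in milliseconds
MAX_WAIT_MS = 300          # maximum reproduction waiting time
MIN_WAIT_MS = 60           # minimum reproduction waiting time

# Bytes of audio in one 30 ms block.
block_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * BLOCK_MS // 1000

# Waiting times expressed as a number of blocks.
max_wait_blocks = MAX_WAIT_MS // BLOCK_MS
min_wait_blocks = MIN_WAIT_MS // BLOCK_MS

print(block_bytes, max_wait_blocks, min_wait_blocks)  # prints: 240 10 2
```

The 240-byte block therefore holds exactly 30 milliseconds of audio, and the waiting times of 300 and 60 milliseconds correspond to 10 and 2 blocks respectively.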

The audio reproducing apparatus according to claim 5, wherein, when the digital audio data is data sampled using 8-bit, 8 kHz μ-law coding at a sampling time interval of 30 milliseconds, the block unit of the reproduction buffer is 240 bytes of digital audio data, the maximum reproduction waiting time is 300 milliseconds, and the minimum reproduction waiting time is 60 milliseconds,
The preset value of the correction value increase counter AJP and the preset value of the correction value decrease counter AJM are each two.