JP2006100882A

JP2006100882A - Voice packet playback method

Info

Publication number: JP2006100882A
Application number: JP2004280972A
Authority: JP
Inventors: Nobuyasu Shiga; 伸靖志賀
Original assignee: Iwatsu Electric Co Ltd
Current assignee: Iwatsu Electric Co Ltd
Priority date: 2004-09-28
Filing date: 2004-09-28
Publication date: 2006-04-13

Abstract

Problem: To provide a voice packet playback method that can play back voice while reducing the unnatural artifacts in voice reproduced from received voice packets.
Solution: A playback request signal for reproducing a voice signal from received voice packets is generated by controlling the frequency division ratio of an audio playback timing signal according to correction amount information. The correction amount information corresponds to the clock frequency difference between the audio playback timing signal, which serves as the reference for reproducing voice from the received voice packets, and the reception timing of those packets. The received voice packets are temporarily stored in an audio buffer, and the interval at which the stored packets are read out for playback is controlled by the playback request signal, so that the playback timing of the frame data of the voice packets changes in units of the playback clock. The audio sample data is thereby played back as is, discarded, or inserted in units of the playback clock to generate the playback data.
Selected drawing: Figure 2

Description

The present invention relates to a voice packet playback method for systems that carry voice communication as packets over a network.

In conventional voice packet playback methods of this type, voice packets received from the network are stored in an audio buffer that absorbs variation in arrival time so that the voice can be played back at regular intervals based on the audio sample clock. Playback starts once a fixed number of packets has accumulated in the audio buffer. If the packet reception interval lengthens or packets are lost and the buffer runs out of packets, substitute packets are inserted. If the reception interval shortens and the buffer overflows, packets are discarded (see Patent Document 1).
Patent Document 1: JP 2001-45065 A

In such conventional methods, a clock frequency difference between the reception timing and the playback timing of the voice packets causes packets to accumulate in, or drain from, the audio buffer over time, even under good conditions with no packet loss. The audio buffer therefore overflows or underflows periodically, and each time voice data is discarded or inserted a whole packet at a time, which makes the played-back voice sound noticeably unnatural.
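To give a sense of scale, the following sketch estimates how long a clock frequency difference takes to accumulate one whole packet of timing error, compared with one sample of error. The function names and the example figures (20 ms packets, 8 kHz sampling, 100 ppm offset) are illustrative assumptions and do not come from this document.

```python
def seconds_until_one_packet_drift(packet_ms: float, ppm_offset: float) -> float:
    """Time for a clock offset of `ppm_offset` parts per million to
    accumulate a timing error equal to one packet duration."""
    if ppm_offset == 0:
        return float("inf")  # clocks match, so the buffer level never drifts
    drift_per_second = abs(ppm_offset) * 1e-6  # seconds of error gained per second
    return (packet_ms / 1000.0) / drift_per_second


def seconds_until_one_sample_drift(sample_rate_hz: float, ppm_offset: float) -> float:
    """Time for the same offset to accumulate an error of one sample period."""
    if ppm_offset == 0:
        return float("inf")
    return (1.0 / sample_rate_hz) / (abs(ppm_offset) * 1e-6)
```

Under these assumed values, a whole 20 ms packet would have to be discarded or inserted roughly every 200 s, whereas a single-sample correction would be needed roughly every 1.25 s. Correcting per sample therefore means smaller, more frequent adjustments.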

An object of the present invention is to provide a voice packet playback method that can play back voice while reducing the unnatural artifacts in voice reproduced from received voice packets.

To solve this problem, the voice packet playback method according to the present invention comprises:
a timing generation unit that outputs an audio playback timing signal serving as the reference for reproducing voice from received voice packets;
a frequency difference detection unit that detects the clock frequency difference between the reception timing of the received voice packets and the audio playback timing signal, and outputs correction amount information corresponding to that difference;
a correction control unit that controls the frequency division ratio of the audio playback timing signal according to the correction amount information, and generates a playback request signal for reproducing a voice signal from the received voice packets;
an audio buffer that temporarily stores the received voice packets and whose read-out interval for the stored packets is controlled by the playback request signal, so that the playback timing of the voice packets can be changed in units of the playback clock;
an audio decoding unit that converts the frame data of the voice packets into audio sample data under the control of the playback request signal; and
an audio playback unit that, based on the playback request signal, generates playback data by playing back the audio sample data as is, or by discarding or inserting it, in units of the playback clock.

According to the present invention, voice data can be discarded and inserted in units of the playback clock during playback, so an improvement in the quality of the played-back voice can be expected.

FIG. 1 is a block diagram showing Embodiment 1 of the present invention. FIG. 2 is a time chart for explaining the operation of the method of the present invention.
[LAN interface 1]
The LAN interface 1 transfers packet data received from the IP network to the packet reception control unit 2.
[Packet reception control unit 2]
As shown in FIG. 3, the packet reception control unit 2 consists of a packet data reception unit 2-1, a reception buffer control unit 2-2, and a reception timing information generation unit 2-3.
The packet data reception unit 2-1 extracts the reception timing from each voice packet received from the LAN interface 1 and outputs the packet's sequence number, timestamp, and encoded audio frame data a.
The reception buffer control unit 2-2 transfers the encoded audio frame data a to the audio buffer 3.
The reception timing information generation unit 2-3 outputs to the frequency difference detection unit 5 a transfer completion timing signal indicating that the transfer to the audio buffer 3 has completed, together with the sequence number and timestamp recorded in the voice packet.
[Audio buffer 3]
The audio buffer 3 accumulates the encoded audio frame data a transferred from the packet reception control unit 2.
Once the amount needed to absorb jitter has accumulated, the stored encoded audio frame data b is transferred to the audio decoding unit 7 under the control of the playback request signal e from the correction control unit 6.
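The buffering behaviour described above can be sketched as follows. This is a minimal illustration, not the document's implementation: the class name, the `prefill_frames` parameter, and the choice to return `None` when no frame is available are all assumptions made for the example.

```python
from collections import deque


class AudioBuffer:
    """Holds encoded frames until a prefill level is reached, then
    releases one frame per playback request."""

    def __init__(self, prefill_frames: int):
        self.prefill_frames = prefill_frames  # hypothetical jitter-absorption depth
        self.frames = deque()
        self.started = False

    def push(self, frame: bytes) -> None:
        """Store one encoded frame (data a) from the packet reception control unit."""
        self.frames.append(frame)
        if len(self.frames) >= self.prefill_frames:
            self.started = True  # enough data is buffered to absorb jitter

    def on_playback_request(self):
        """Return the next frame (data b) for the decoder, or None if
        playback has not started yet or the buffer is empty."""
        if not self.started or not self.frames:
            return None
        return self.frames.popleft()
```

In this sketch the timing of `on_playback_request` calls is entirely up to the caller, which mirrors the description: the buffer itself does not decide when to release data, the playback request signal e does.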

[Timing generation unit 4]
The timing generation unit 4 divides a reference clock from a playback-side reference clock source (not shown) to generate the audio playback timing signal f.
[Frequency difference detection unit 5]
The frequency difference detection unit 5 detects the clock frequency difference between the transfer completion timing signal of the received voice packets and the audio playback timing signal f, which serves as the reference, and outputs correction amount information corresponding to this difference. Specifically, it monitors the progression of the time information attached to the voice packets, which are built according to a voice coding scheme common to the sending and receiving ends. By computing the change in time of the received packets and of the played-back packets over a fixed period, it calculates the clock frequency difference between sender and receiver and produces the corresponding correction amount information.
[Correction control unit 6]
The correction control unit 6 divides the audio playback timing signal f from the timing generation unit 4 to generate the playback period used as the reference for audio decoding and audio playback. As shown in FIG. 5, it uses a division ratio control circuit 6-1 and a division counter 6-2 to output a playback request signal e whose division ratio is varied, based on the correction amount information from the frequency difference detection unit 5, so that the timing is corrected by one playback clock's worth of data relative to the reference playback period. In this way the correction control unit 6 generates, from the correction amount information corresponding to the clock frequency difference, the playback request signal e that controls the packet transfer interval from reception to playback, that is, the audio playback interval.
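A variable-ratio divider of the kind described for the correction control unit 6 might be modelled as below. The generator name and its parameters (`base_ratio`, `correction_every`, `step`) are hypothetical, and the form of the correction amount information (a correction every k-th period, by one tick) is an assumption; the document does not specify its encoding.

```python
def playback_request_ticks(base_ratio, periods, correction_every=0, step=0):
    """Yield the tick index of timing signal f at which each request e fires.

    base_ratio       ticks of f per nominal playback period
    periods          number of playback periods to generate
    correction_every apply a correction on every k-th period (0 = never)
    step             -1 shortens that period by one tick, +1 lengthens it
    """
    tick = 0
    for period in range(1, periods + 1):
        ratio = base_ratio
        if correction_every and period % correction_every == 0:
            ratio += step  # division ratio changed for this one period only
        tick += ratio
        yield tick
```

For example, with a base ratio of 160 and a one-tick shortening on every second period, `list(playback_request_ticks(160, 4, correction_every=2, step=-1))` gives `[160, 319, 479, 638]`, so every other request arrives one playback clock early.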

[Audio decoding unit 7]
In synchronization with the playback request signal e, the audio decoding unit 7 converts the encoded audio frame data b transferred from the audio buffer 3 into decoded audio data c, a time series at the sampling rate actually in use, and outputs it.
[Audio playback unit 8]
In response to changes in the period of the playback request signal e, the audio playback unit 8 either plays back the decoded audio data c as is, or discards or inserts one playback clock's worth of data, and outputs playback audio data d. That is, in step with the period at which packets are taken from the audio buffer 3 and based on the correction amount information, the audio playback unit 8 either plays back the audio sample data sequence as is or discards or inserts one playback clock's worth of data, yielding the playback audio data d1, d2, and d3.
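The three cases handled by the audio playback unit 8 can be illustrated with a short function. The document does not say which sample is discarded or what value an inserted sample takes, so dropping the last sample and repeating the last sample are assumptions made purely for this sketch, as are the function name and the `action` strings.

```python
def apply_sample_correction(samples, action):
    """Return playback data d for one frame of decoded audio data c.

    action: "pass" plays the frame unchanged, "discard" removes one sample,
    "insert" adds one sample.
    """
    if action == "pass" or not samples:
        return list(samples)
    if action == "discard":
        return list(samples[:-1])  # assumption: drop the last sample
    if action == "insert":
        return list(samples) + [samples[-1]]  # assumption: repeat the last sample
    raise ValueError(f"unknown action: {action}")
```

A frame therefore changes length by at most one sample per correction, in contrast to the conventional approach of dropping or inserting a whole packet.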

An outline of the operation of the method of the present invention follows.
In FIG. 1, which shows an embodiment of the present invention, the frequency difference detection unit 5 monitors the progression of the time information attached to the voice packets, which are built according to a voice coding scheme common to the sending and receiving ends. By computing the change in time of the received packets and of the played-back packets over a fixed period, it calculates the clock frequency difference between sender and receiver and produces the corresponding correction amount information. Based on this correction amount information, the correction control unit 6 generates the playback request signal e that controls the packet transfer interval from reception to playback, that is, the audio playback interval.

Based on the correction amount information, the audio buffer 3 temporarily changes the period at which packets are taken from it, in units of the audio sample clock.

The audio decoding unit 7 converts each packet taken from the buffer into an audio sample data sequence according to the voice packet coding scheme.

In step with the period at which packets are taken from the audio buffer 3 and based on the correction amount information, the audio playback unit 8 discards or inserts one sample of data in the audio sample data sequence and outputs the playback audio data d.

FIG. 4 is a block diagram showing a configuration example of the frequency difference detection unit 5 used in the present invention.
The counter circuit 501 counts continuously on the audio playback timing signal f and writes the count value 515 at the moment the transfer completion timing signal 211 arrives into the memory 502.
The memory 502 stores each count value 515 in sequence together with the sequence number 212 and timestamp 213 of the received packet.
The difference circuit 503 takes the difference between the count values 520 and 521 of two packets read from the memory 502 to obtain the packet reception interval 522.
The memory 504 stores the reception intervals 522 obtained by the difference circuit 503 in sequence.
The arithmetic circuit 505 reads the individual packet reception intervals 523 stored in the memory 504, averages a fixed number of them, and calculates the average reception interval 524.
The reception interval register 509 temporarily holds the average reception interval 524 calculated from a fixed number of received packets, is updated as time passes, and outputs the value as the average packet reception interval 525.
The difference circuit 506 takes the difference between the timestamps 516 and 517 of two packets read from the memory 502 to obtain the packet transmission interval 518.
The transmission interval register 507 stores the packet transmission interval 518 calculated by the difference circuit 506.
The arithmetic circuit 508 calculates the frequency difference from the average packet reception interval 525 and the packet transmission interval 518 and outputs the correction amount information 526.

In the frequency difference detection unit 5, the counter circuit 501 runs on the audio playback timing clock. Each time the reception completion timing signal output from the packet reception control unit 2 occurs, the value of the counter circuit 501 at that moment is recorded together with the sequence number and timestamp of the corresponding packet.
As shown in FIG. 6, the transmission interval TI between two packets on the sending side is obtained from the difference between the timestamps of two packets whose sequence numbers (k, k+n) differ by a fixed amount n.
The time interval RIk over which the two packets were actually received is obtained from the difference between their counter values (Tk, Tk+n). In the same way, for 2n consecutive packets, n such differences are computed, one for each pair of packets whose sequence numbers differ by n, and averaged to give the average time interval RI.
The difference between TI and RI is monitored continuously, the correction time interval at which one sample of audio data should be discarded or inserted is calculated from it, and this interval is reported to the correction control unit as the correction amount information.
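The TI and RI calculation above can be sketched in a few lines. Several points are assumptions rather than statements from the document: the function name, that sender timestamps and the receiver's counter both advance once per nominal sample so they can be compared directly, and that the correction interval is derived as RI divided by the absolute difference between RI and TI.

```python
def correction_info(counts, timestamps, n):
    """Estimate the clock difference from 2n consecutive packets.

    counts      counter value latched when each packet was received (receiver ticks)
    timestamps  sender timestamp of each packet (sender ticks)
    n           sequence-number distance between the two packets of each pair

    Returns (ti, ri, interval). `interval` is the number of receiver ticks
    after which the drift reaches one tick, or None if no drift is seen.
    ri > ti suggests the receiver's playback clock runs faster than the sender's.
    """
    if n < 1 or len(counts) < 2 * n or len(timestamps) < 2 * n:
        raise ValueError("need at least 2n packets with n >= 1")
    ti = timestamps[n] - timestamps[0]                    # transmission interval TI
    ri_k = [counts[k + n] - counts[k] for k in range(n)]  # n reception intervals RIk
    ri = sum(ri_k) / n                                    # average reception interval RI
    drift = ri - ti                                       # ticks of drift per RI ticks
    interval = None if drift == 0 else ri / abs(drift)
    return ti, ri, interval
```

As a usage example, with `n = 2`, timestamps `[0, 160, 320, 480]` and counter values `[0, 161, 322, 483]`, the function returns `(320, 322.0, 161.0)`: the receiver counts two extra ticks per 322, so a one-sample correction would be due about every 161 ticks. Averaging over n pairs is what keeps network jitter on individual packets from dominating the estimate.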

The present invention is widely applicable to the playback of voice packets in systems that carry voice communication as packets over a network.

FIG. 1 is a block diagram showing an embodiment of the present invention.
FIG. 2 is a time chart for explaining the operation of the audio playback unit used in the present invention.
FIG. 3 is a block diagram showing a configuration example of the packet reception control unit used in the present invention.
FIG. 4 is a block diagram showing a configuration example of the frequency difference detection unit used in the present invention.
FIG. 5 is a block diagram showing a configuration example of the correction control unit used in the present invention.
FIG. 6 is a diagram for explaining the operation of the frequency difference detection unit used in the present invention.

Explanation of symbols

1 LAN interface
2 Packet reception control unit
2-1 Packet data reception unit
2-2 Reception buffer control unit
2-3 Reception timing information generation unit
3 Audio buffer
4 Timing generation unit
5 Frequency difference detection unit
6 Correction control unit
7 Audio decoding unit
8 Audio playback unit
9 Buffer monitoring unit
211 Transfer completion timing signal
212 Sequence number
213 Timestamp
501 Counter circuit
502 Memory
503 Difference circuit
504 Memory
505 Arithmetic circuit
506 Difference circuit
507 Transmission interval register
508 Arithmetic circuit
509 Reception interval register
515 Count value
516 Timestamp
517 Timestamp
518 Packet transmission interval
519 Packet transmission interval
520, 521 Count values
522 Packet reception interval
523 Packet reception interval
524 Average reception interval
525 Average packet reception interval
a Encoded audio frame data
b Encoded audio frame data
c Decoded audio data
d Playback audio data
e Playback request signal
f Audio playback timing signal

Claims

A voice packet playback system comprising:
a timing generation unit that outputs an audio playback timing signal serving as the reference for reproducing voice from received voice packets;
a frequency difference detection unit that detects the clock frequency difference between the reception timing of the received voice packets and the audio playback timing signal, and outputs correction amount information corresponding to that difference;
a correction control unit that controls the frequency division ratio of the audio playback timing signal according to the correction amount information corresponding to the detected frequency difference, and generates a playback request signal for reproducing a voice signal from the received voice packets;
an audio buffer that temporarily stores the received voice packets and whose read-out interval for the stored packets is controlled by the playback request signal, so that the playback timing of the voice packets can be changed in units of the playback clock;
an audio decoding unit that converts the frame data of the voice packets into audio sample data under the control of the playback request signal; and
an audio playback unit that, based on the playback request signal, generates playback data by playing back the audio sample data as is, or by discarding or inserting it, in units of the playback clock.