JP2000059391A

JP2000059391A - Audio packet transmitting/receiving method, audio packet transmission terminal and audio packet reception terminal

Info

Publication number: JP2000059391A
Application number: JP22712198A
Authority: JP
Inventors: Satoshi Hishitani; 聡菱谷; Masami Mineo; 正美峯尾
Original assignee: Hitachi Communication Systems Inc
Current assignee: Hitachi Information Technology Co Ltd
Priority date: 1998-08-11
Filing date: 1998-08-11
Publication date: 2000-02-25

Abstract

PROBLEM TO BE SOLVED: To reduce voice interruptions and quality degradation by assembling audio packets in sequence, transmitting them onto an asynchronous digital communication network with transmission audio packet numbers added, and, when a lost packet is detected from those numbers, reproducing its contents by interpolation from the preceding and following stored temporary packets. SOLUTION: Two or more temporary packets extracted in a prescribed combination are assembled into one audio packet. Audio packets are assembled in sequence by a packet assembly part 104 and then transmitted at high speed from a signal transmission part 105 onto an asynchronous digital communication network 2. The audio packets from the asynchronous digital communication network 2 are received in sequence by a signal reception part 106. When the packet reception state monitored by a signal receiving state monitoring part 109 shows that the number of received audio packets has not reached a prescribed number, continuity is maintained by reproducing the missing data through interpolation from the preceding and following audio data values temporarily stored in an encoded data assembly part 108.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a voice packet transmitting/receiving method used when voice packets from a voice packet transmitting terminal are received and reproduced by a voice packet receiving terminal via a data communication network, and further to a voice packet transmitting terminal and a voice packet receiving terminal whose configurations are suitable for carrying out that method.

[0002]

2. Description of the Related Art

With the development of communication technology and the spread of communication equipment in the information and communication field, video, audio, and the like are transmitted and received via asynchronous digital communication networks (ATM networks). Of these, audio is currently transmitted and received in packetized form. A voice signal that has been digitally encoded in advance is sequentially divided into fixed-length voice packets and transmitted from a voice packet transmitting terminal onto the asynchronous digital communication network, while voice packets from the asynchronous digital communication network are sequentially received by a voice packet receiving terminal and reproduced as audio.

[0003] The conventional voice packet transmitting/receiving method is described concretely below. FIG. 4 shows an outline of the conventional method. In the voice packet transmitting terminal, an analog voice signal from a voice input unit (a microphone or the like) 301 is encoded into digital voice data by a digital signal encoding unit 302 and then sequentially assembled into voice packets by a packet assembling unit 303. The voice packets obtained in sequence from the packet assembling unit 303 are then transmitted from a signal transmitting unit 304 onto the asynchronous digital communication network. In the voice packet receiving terminal, voice packets from the asynchronous digital communication network are sequentially received by a signal receiving unit 305 and temporarily stored in a data storage unit 306. The voice packets temporarily stored in the data storage unit 306 are then read out periodically and in a predetermined order to a digital voice data decoding unit 307, decoded, and reproduced and output as audio from a voice output unit (a speaker or the like) 308.

[0004] Incidentally, the received voice packets from the signal receiving unit 305 are temporarily stored in the data storage unit 306 for the following reason. When voice packets are transmitted and received via the asynchronous digital communication network, not all voice packets are necessarily carried over the same route within the network. Depending on the congestion state of the network, they may be carried over various detour routes. Each voice packet may therefore arrive at the voice packet receiving terminal with a different transmission delay (time lag) depending on its route length. Consequently, if the received voice packets are temporarily stored in FIFO fashion and then periodically read out in a predetermined order, decoded, and reproduced, audio containing discontinuities (interruptions) can be prevented from being reproduced.

[0005] According to Japanese Patent Application Laid-Open No. 9-27827, when voice packets are received and reproduced and the next voice packet has not yet been received, either because the packet itself was lost in transit (due to a header error or the like) or because of a large transmission delay, interpolation for that next voice packet is performed based on the voice packet received immediately before, and the next voice packet is newly reproduced. Audio can thereby be reproduced without interruption while the transmission delay is kept small and the amount of transmitted data is reduced.

[0006]

[Problems to be Solved by the Invention]

However, in the conventional voice packet transmitting/receiving method, when the transmission delay before a voice packet reaches the voice packet receiving terminal is large because of congestion on the asynchronous digital communication network, loss of the packet itself, or the like, and that voice packet is discarded as a lost packet, interruption of the reproduced audio is unavoidable. FIGS. 5(A), 5(B), and 5(C) show, as an example, transmitted voice data, voice data at the time of packet loss, and reproduced voice data, respectively. No voice data exists in the interval corresponding to each lost packet, and as a result, as shown in FIG. 5(C), the audio reproduced and output from the voice output unit 308 contains interruptions. Moreover, with the method of the above publication, although audio can be reproduced without interruption, degradation of the reproduced audio quality is unavoidable.

[0007] A first object of the present invention is to provide a voice packet transmitting/receiving method by which audio can be reproduced and output without interruption, and with degradation of reproduction quality suppressed, even when voice packets are discarded as lost packets at the voice packet receiving terminal because of voice packet loss, congestion, or the like on the asynchronous digital communication network. A second object of the present invention is to provide a voice packet transmitting terminal having a configuration suitable for carrying out that voice packet transmitting/receiving method. A third object of the present invention is to provide a voice packet receiving terminal having a configuration likewise suitable for carrying out that method.

[0008]

[Means for Solving the Problems]

The first object is achieved as follows. In the voice packet transmitting terminal, an analog voice signal is encoded into digital voice data, which is divided at fixed time intervals. The digital voice data within each fixed time is further provisionally divided into fixed-length pieces, each of which is treated as a temporary packet. Two or more temporary packets separated from one another by a predetermined interval are extracted in a predetermined combination, and each such set of two or more temporary packets is sequentially assembled into a voice packet and transmitted onto the asynchronous digital communication network with a transmission voice packet number added. In the voice packet receiving terminal, each time a voice packet is received from the asynchronous digital communication network, two or more temporary packets are extracted from it and, based on the transmission voice packet number added to the voice packet, are arranged in order of temporary packet position within the fixed time and temporarily stored together with the other temporary packets already stored. In parallel, the reception state of voice packets from the asynchronous digital communication network is monitored. When the number of voice packets received within a predetermined time has not reached a predetermined number, and a particular voice packet is identified as a lost packet from a gap in the sequence of transmission voice packet numbers, the temporary packets that were contained in the lost packet are reproduced by interpolation from the preceding and following temporary packets that are temporarily stored. The temporarily stored temporary packets, with their continuity in time thus maintained, are then read out periodically in sequence, decoded, and reproduced and output as audio.

[0009] The second object is achieved by providing the voice packet transmitting terminal with at least the following components: a voice input unit that converts voice into an analog voice signal; a digital signal encoding unit that encodes the analog voice signal from the voice input unit into digital voice data; an encoded data dividing unit that divides the digital voice data from the digital signal encoding unit at fixed time intervals, further provisionally divides the digital voice data within each fixed time into fixed-length pieces, treats each piece as a temporary packet, and extracts in a predetermined combination two or more temporary packets separated from one another by a predetermined interval; and a signal transmitting unit that assembles each set of two or more temporary packets extracted by the encoded data dividing unit into a voice packet and transmits it onto the asynchronous digital communication network with a transmission voice packet number added.

[0010] Further, the third object is achieved by providing the voice packet receiving terminal with at least the following components: a signal receiving unit that sequentially receives voice packets from the asynchronous digital communication network; an encoded data extraction unit that, each time a voice packet is received by the signal receiving unit, extracts two or more temporary packets from the voice packet, with their temporary packet positions within the fixed time identified based on the transmission voice packet number added to the voice packet; an encoded data assembling unit that arranges the two or more temporary packets extracted by the encoded data extraction unit in order of temporary packet position within the fixed time and temporarily stores them together with the other temporary packets already stored; a signal reception state monitoring unit that constantly monitors the reception state of voice packets from the asynchronous digital communication network, at predetermined time intervals, as the number of voice packets received within that predetermined time; a digital voice data interpolation unit that, when the monitoring result from the signal reception state monitoring unit shows that the number of received voice packets has not reached a predetermined number and a particular voice packet is identified as a lost packet from a gap in the sequence of transmission voice packet numbers, reproduces the temporary packets that were contained in the lost packet by interpolation from the preceding and following temporary packets temporarily stored in the encoded data assembling unit; a digital voice data decoding unit that decodes each of the temporary packets read out periodically in sequence from the encoded data assembling unit with continuity in time maintained; and a voice output unit that sequentially reproduces and outputs the decoded temporary packets from the digital voice data decoding unit as audio.

[0011]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the present invention is described below with reference to FIGS. 1 to 3. First, the voice packet transmitting/receiving method according to the present invention is described. FIG. 1 shows an outline of the method. In the voice packet transmitting terminal 1, an analog voice signal from a voice input unit (a microphone or the like) 101 is encoded into digital voice data by a digital signal encoding unit 102. In an encoded data dividing unit 103, the digital voice data is then divided at fixed time intervals t, the digital voice data within each fixed time t is further provisionally divided into fixed-length pieces, each piece is treated as a temporary packet, and two or more temporary packets separated from one another by a predetermined interval are extracted in a predetermined combination. So that each extracted combination of two or more temporary packets forms one voice packet, a packet assembling unit 104 sequentially assembles them into voice packets, which are transmitted at high speed from a signal transmitting unit 105 onto the asynchronous digital communication network 2 with transmission voice packet numbers added.

[0012] As described above, in the encoded data dividing unit 103, each piece of digital voice data provisionally divided within the fixed time t is treated as a temporary packet, and the temporary packets within the fixed time t are extracted in a predetermined combination so that, for example, two temporary packets separated from each other in time by three temporary packets are assembled into one voice packet. Voice packets are then assembled in sequence. Before the configuration and operation of the voice receiving terminal side are described, the combination extraction of temporary packets is explained more concretely below.

[0013] That is, let T be the total number of digital voice data values within the fixed time t (freely selectable as a value common to the voice transmitting and receiving terminals), n the number of voice packets into which that digital voice data is divided (likewise freely selectable as a value common to the voice transmitting and receiving terminals), and s the transmission voice packet number (freely selectable on the voice transmitting terminal side). Then, from the digital voice data encoded by the digital signal encoding unit 102, the voice data values at the sampling times given by, for example, i*n+s are extracted as temporary packets, where i takes the values i = 0, 1, ..., (T/n)-1. More concretely, FIGS. 2(A) and 2(B) show an example in which T = 8 and n = 4, and hence s = 0, 1, 2, 3. From these values of T and n, the number of digital voice data values (temporary packets) contained in each voice packet is 2 (= T/n), and the possible values of i are 0 and 1. Thus, each time s is updated in sequence from 0 to 3, setting i to 0 and then to 1 and substituting into i*n+s yields the combination of voice data values contained in the packet with transmission packet number s, as shown in the figure.
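The extraction rule in paragraph [0013] can be illustrated with a minimal Python sketch. The function name `interleave`, the dictionary layout of a packet, and the sample values are illustrative assumptions and do not appear in the patent; only the index rule i*n+s and the values T = 8, n = 4 come from the text.

```python
def interleave(frame, n):
    """Split one frame of T values into n voice packets.

    Packet s carries the values at 0-based indices i*n + s
    for i = 0 .. T/n - 1 (the rule given in paragraph [0013]).
    """
    T = len(frame)
    if T % n != 0:
        raise ValueError("T must be a multiple of n")
    return [
        {"seq": s, "payload": [frame[i * n + s] for i in range(T // n)]}
        for s in range(n)
    ]


# Hypothetical frame of T = 8 voice data values (values are made up).
frame = [10, 11, 12, 13, 14, 15, 16, 17]
packets = interleave(frame, 4)
for p in packets:
    print(p["seq"], p["payload"])
```

With these inputs, packet 1 carries the values at 0-based indices 1 and 5. Paragraph [0017] refers to the same packet as holding sampling times 2 and 6, which suggests the figure counts sampling times from 1; that reading is an inference from the text, not something the patent states explicitly.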

[0014] As can be seen from the above, in this example two voice data values four sampling times apart are extracted in combination, so that four voice packets are obtained in sequence from the digital voice data within the fixed time t. The invention is not limited to this, however. In general, it suffices that the encoded data dividing unit 103 divides the digital voice data from the digital signal encoding unit 102 at fixed time intervals t, further provisionally divides the digital voice data within each fixed time t into fixed-length pieces, treats each piece as a temporary packet, and extracts in a predetermined combination two or more temporary packets separated from one another by a predetermined interval.

[0015] Based on the configuration and operation of the voice packet transmitting terminal 1 described above, the configuration and operation of the voice packet receiving terminal 3 are now described. In the voice packet receiving terminal 3, voice packets from the asynchronous digital communication network 2 are sequentially received by a signal receiving unit 106. Each time a voice packet is received, an encoded data extraction unit 107 extracts from it, based on the transmission voice packet number s added to the packet, two voice data values four sampling times apart, with their sampling time positions within the fixed time t identified. An encoded data assembling unit 108 then arranges the two voice data values extracted by the encoded data extraction unit 107 in order of sampling time position within the fixed time t and temporarily stores them in FIFO fashion together with the other voice data values already stored.
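The receiver-side placement described in paragraph [0015] might be sketched as follows. This is an illustration only: the name `place_packet`, the use of `None` to mark positions not yet received, and the sample values are assumptions, and the FIFO handling across successive frames is omitted.

```python
def place_packet(frame_buf, seq, payload, n):
    """Write a packet's values back to positions i*n + seq in the frame buffer."""
    for i, value in enumerate(payload):
        frame_buf[i * n + seq] = value


T, n = 8, 4
frame_buf = [None] * T  # None marks a position whose value has not arrived

# Packets may arrive in any order; the transmission packet number
# alone determines where each value belongs.
for seq, payload in [(2, [12, 16]), (0, [10, 14]), (3, [13, 17])]:
    place_packet(frame_buf, seq, payload, n)

print(frame_buf)  # [10, None, 12, 13, 14, None, 16, 17]
```

Because the packet with number 1 never arrived in this example, its two positions remain empty while every neighbouring position is filled. Spreading each packet's values across the frame is what keeps the gaps isolated.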

[0016] Meanwhile, in parallel with the receiving operation above, a signal reception state monitoring unit 109 constantly monitors the reception state of voice packets from the asynchronous digital communication network 2, at predetermined time intervals, as the number of voice packets received within that predetermined time. In this example, voice packets should normally be received cyclically, with transmission voice packet numbers s in the order 0 → 1 → 2 → 3 → 0 → 1 → 2 → 3 → 0 and so on. When the monitoring result from the signal reception state monitoring unit 109 shows that the number of received voice packets has not reached the predetermined number and, moreover, a particular voice packet is identified as a lost packet from a gap in the sequence of transmission voice packet numbers, a digital voice data interpolation unit 110 reproduces, albeit imperfectly, the voice data values that were contained in the lost packet by interpolation from the preceding and following voice data values already stored in the encoded data assembling unit 108. The reproduced values are inserted at the appropriate positions among the voice data values stored in the encoded data assembling unit 108 so that continuity in time is maintained, and are then read out periodically in sequence to a digital voice data decoding unit 111 and decoded. In this way, audio can be reproduced and output from a voice output unit (a speaker or the like) 112 without interruption despite the existence of lost packets.
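The two-part loss condition in paragraph [0016] can be sketched as below. The sketch assumes that the monitoring window equals one fixed time t and that the predetermined number equals n; the patent does not fix either value, and the function name `find_lost` is hypothetical.

```python
def find_lost(received_seqs, n):
    """Return the transmission packet numbers missing from one cycle 0..n-1.

    Both conditions from the text are checked: fewer packets than
    expected arrived in the window, and specific numbers are absent
    from the expected cyclic sequence.
    """
    seen = set(received_seqs)
    if len(seen) >= n:
        return []  # the expected count was reached, so nothing is lost
    return [s for s in range(n) if s not in seen]


print(find_lost([0, 2, 3], 4))     # [1]
print(find_lost([3, 0], 4))        # [1, 2]
print(find_lost([0, 1, 2, 3], 4))  # []
```

Arrival order does not matter here, since the packet numbers themselves identify which positions in the frame are missing.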

[0017] How the voice data values contained in a lost packet are reproduced is described more concretely below. Suppose, for example, that the voice packet with transmission voice packet number 1 shown in FIG. 2(B) (containing the voice data values for sampling times 2 and 6) is detected as a lost packet. The voice data values for sampling times 2 and 6 must then be reproduced. The value for sampling time 2 may be reproduced from the values for sampling times 1 and 3, and the value for sampling time 6 from the values for sampling times 5 and 7, for example as an intermediate value of those voice data values. Suppose next that, together with the packet with transmission voice packet number 1, the voice packet with transmission voice packet number 2 (containing the voice data values for sampling times 3 and 7) is also detected as a lost packet at the same time. The voice data values for sampling times 2, 6, 3, and 7 must then be reproduced. The values for sampling times 2 and 3 may be reproduced from the values for sampling times 1 and 4, and the values for sampling times 6 and 7 from the values for sampling times 5 and 8, for example as proportionally distributed values of those voice data values.
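Both cases in paragraph [0017] can be covered by one linear-interpolation routine, shown here as a sketch. Reading "intermediate value" as the midpoint and "proportionally distributed value" as linear interpolation is an interpretation. The function name `fill_gaps`, the numeric values, and the treatment of the voice data values as plain numbers are assumptions for illustration.

```python
def fill_gaps(frame_buf):
    """Fill None entries by linear interpolation between the nearest known values.

    Gaps before the first or after the last known value are left as None,
    because this sketch does not look into neighbouring frames.
    """
    out = list(frame_buf)
    known = [k for k, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for k in range(left + 1, right):
            out[k] = out[left] + (out[right] - out[left]) * (k - left) / span
    return out


# Packet 1 lost: sampling times 2 and 6 (0-based indices 1 and 5) are missing.
single = fill_gaps([10, None, 30, 40, 50, None, 70, 80])
# Packets 1 and 2 lost: sampling times 2, 3, 6, 7 are missing.
double = fill_gaps([10, None, None, 40, 50, None, None, 80])
print(single)  # [10, 20.0, 30, 40, 50, 60.0, 70, 80]
print(double)  # [10, 20.0, 30.0, 40, 50, 60.0, 70.0, 80]
```

With a single lost packet each gap is one value wide, so the interpolation reduces to the midpoint of the two neighbours. With two adjacent lost packets each gap is two values wide and the missing values fall one third and two thirds of the way between the known neighbours.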

[0018] FIGS. 3(A) to 3(C) show, as an example relating to the present invention, transmitted voice data, voice data at the time of packet loss, and reproduced voice data, respectively. As can be seen from these figures, even when voice packets are discarded as lost packets and voice data is present only sporadically, the lost voice data can be reproduced from the remaining voice data with degradation better suppressed, and then reproduced and output as audio.

[0019]

[Effects of the Invention]

As described above, claim 1 provides a voice packet transmitting/receiving method by which audio can be reproduced and output without interruption, and with degradation of reproduction quality suppressed, even when voice packets are discarded as lost packets at the voice packet receiving terminal because of voice packet loss, congestion, or the like on the asynchronous digital communication network. Claims 2 and 3 provide, respectively, a voice packet transmitting terminal and a voice packet receiving terminal having configurations suitable for carrying out that voice packet transmitting/receiving method.

[Brief description of the drawings]

FIG. 1 is a diagram showing an outline of the voice packet transmitting/receiving method according to the present invention.

FIGS. 2(A) and 2(B) are diagrams showing a combination extraction method for one example of provisionally divided digital voice data within a fixed time according to the present invention.

FIGS. 3(A) to 3(C) are diagrams for explaining that, even when voice packets with a large reception delay are discarded as lost packets at the voice packet receiving terminal, the voice data contained in those lost packets can be reproduced and then output as audio.

FIG. 4 is a diagram for explaining a voice packet transmitting/receiving method according to the prior art.

FIGS. 5(A) to 5(C) are diagrams for explaining the problem that arises when voice packets with a large reception delay are discarded as lost packets at the voice packet receiving terminal.

[Explanation of symbols]

1: voice packet transmitting terminal
2: asynchronous digital communication network
3: voice packet receiving terminal
101: voice input unit
102: digital signal encoding unit
103: encoded data dividing unit
104: packet assembling unit
105: signal transmitting unit
106: signal receiving unit
107: encoded data extraction unit
108: encoded data assembling unit
109: signal reception state monitoring unit
110: digital voice data interpolation unit
111: digital voice data decoding unit
112: voice output unit

Claims

[Claims]

1. A voice packet transmitting/receiving method, wherein: at a voice packet transmitting terminal, an analog voice signal is encoded and converted into digital voice data; the digital voice data is divided at fixed time intervals, and the digital voice data within each fixed time is further provisionally divided into fixed-length pieces, each provisionally divided piece of digital voice data being treated as a temporary packet; two or more temporary packets separated from one another by a predetermined interval are extracted in a predetermined combination, each such combination of two or more temporary packets is assembled in turn into a voice packet, and the voice packet is transmitted onto an asynchronous digital communication network with a transmission voice packet number added; at a voice packet receiving terminal, each time a voice packet is received from the asynchronous digital communication network, the two or more temporary packets are extracted from that voice packet and, based on the transmission voice packet number added to the voice packet, are arranged in order of temporary packet position within the fixed time and temporarily stored together with the other temporary packets already stored; when, as a result of monitoring the reception state of voice packets from the asynchronous digital communication network, the number of received voice packets has not reached a predetermined number and a voice packet missing from the sequence of transmission voice packet numbers is identified and detected as a lost packet, the temporary packets that were contained in the lost packet are reconstructed by interpolation from the temporarily stored temporary packets before and after them; and the temporarily stored temporary packets, with temporal continuity thus maintained, are read out periodically and sequentially, decoded, and reproduced and output as voice.

2. A voice packet transmitting terminal for carrying out the voice packet transmitting/receiving method according to claim 1, comprising: a voice input unit that converts voice into an analog voice signal; a digital signal encoding unit that encodes the analog voice signal from the voice input unit and converts it into digital voice data; an encoded data dividing unit that divides the digital voice data from the digital signal encoding unit at fixed time intervals, further provisionally divides the digital voice data within each fixed time into fixed-length pieces, treats each provisionally divided piece as a temporary packet, and extracts two or more temporary packets separated from one another by a predetermined interval in a predetermined combination; a packet assembling unit that assembles each such combination of two or more temporary packets in turn into a voice packet; and a signal transmitting unit that transmits the voice packet onto the asynchronous digital communication network with a transmission voice packet number added.
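
The dividing and assembling steps recited above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the claimed implementation: the function names, the choice of eight temporary packets per fixed time, two temporary packets per voice packet, and a spacing equal to the number of voice packets are all hypothetical values chosen for the example.

```python
from typing import List, Tuple


def split_into_temporary_packets(frame: bytes, piece_len: int) -> List[bytes]:
    """Provisionally divide one fixed-time block of encoded data into
    fixed-length pieces (the temporary packets)."""
    if piece_len <= 0 or len(frame) % piece_len != 0:
        raise ValueError("frame length must be a positive multiple of piece_len")
    return [frame[i:i + piece_len] for i in range(0, len(frame), piece_len)]


def assemble_voice_packets(
    pieces: List[bytes], per_packet: int
) -> List[Tuple[int, List[bytes]]]:
    """Combine temporary packets that lie `stride` positions apart into
    numbered voice packets.

    Assumption: the spacing (stride) equals the number of voice packets per
    fixed time, so adjacent temporary packets never share a voice packet.
    """
    if per_packet < 2 or len(pieces) % per_packet != 0:
        raise ValueError("piece count must be a multiple of per_packet (>= 2)")
    stride = len(pieces) // per_packet
    return [
        (number, [pieces[number + k * stride] for k in range(per_packet)])
        for number in range(stride)
    ]


# Example: 16 bytes of encoded data, 2 bytes per temporary packet.
frame = bytes(range(16))
pieces = split_into_temporary_packets(frame, piece_len=2)
packets = assemble_voice_packets(pieces, per_packet=2)
for number, payload in packets:
    print(number, [p.hex() for p in payload])
```

With this spacing, voice packet 1 carries temporary packets 1 and 5, so if it is lost, temporary packets 0, 2, 4 and 6 from other voice packets remain available on either side of each gap.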

3. A voice packet receiving terminal for carrying out the voice packet transmitting/receiving method according to claim 1, comprising: a signal receiving unit that sequentially receives voice packets from the asynchronous digital communication network; an encoded data extraction unit that, each time a voice packet is received by the signal receiving unit, extracts the two or more temporary packets from that voice packet with their temporary packet positions within the fixed time identified based on the transmission voice packet number added to the voice packet; an encoded data assembling unit that temporarily stores the temporary packets from the encoded data extraction unit, together with the other temporary packets already stored, in order of temporary packet position within the fixed time; a signal reception state monitoring unit that constantly monitors the reception state of voice packets from the asynchronous digital communication network as the number of voice packets received within each fixed time; a digital voice data interpolation unit that, when the monitoring by the signal reception state monitoring unit shows that the number of received voice packets has not reached the predetermined number and a voice packet missing from the sequence of transmission voice packet numbers is identified and detected as a lost packet, reconstructs the temporary packets that were contained in the lost packet by interpolation from the preceding and following temporary packets temporarily stored in the encoded data assembling unit; a digital voice data decoding unit that decodes the temporary packets read out periodically and sequentially from the encoded data assembling unit with temporal continuity maintained; and a voice output unit that sequentially reproduces and outputs the decoded temporary packets from the digital voice data decoding unit as voice.
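
The receiving-side steps (reordering by transmission voice packet number, detecting a lost packet, and interpolating from neighbouring temporary packets) can be sketched as follows. This is a minimal illustration under assumptions that do not come from the claims: temporary packets are modelled as lists of integer sample values, the spacing between temporary packets in one voice packet equals the number of voice packets per fixed time, and the interpolation is a plain sample-by-sample average of the two neighbours. All function names are hypothetical.

```python
from typing import Dict, List, Optional

Piece = List[int]  # one temporary packet, modelled as integer sample values


def find_lost_packets(received: Dict[int, List[Piece]], packet_count: int) -> List[int]:
    """Return the transmission packet numbers missing from this fixed time."""
    return [n for n in range(packet_count) if n not in received]


def reassemble(received: Dict[int, List[Piece]], packet_count: int,
               per_packet: int) -> List[Optional[Piece]]:
    """Place each temporary packet at its position within the fixed time.

    Assumption: piece k of voice packet n sits at position n + k * packet_count.
    """
    slots: List[Optional[Piece]] = [None] * (packet_count * per_packet)
    for number, pieces in received.items():
        for k, piece in enumerate(pieces):
            slots[number + k * packet_count] = piece
    return slots


def interpolate_missing(slots: List[Optional[Piece]]) -> List[Piece]:
    """Fill each empty slot from the stored temporary packets around it."""
    filled: List[Piece] = []
    for i, piece in enumerate(slots):
        if piece is not None:
            filled.append(piece)
            continue
        before = slots[i - 1] if i > 0 else None
        after = slots[i + 1] if i + 1 < len(slots) else None
        if before is not None and after is not None:
            # Assumed interpolation: average the neighbours sample by sample.
            filled.append([(a + b) // 2 for a, b in zip(before, after)])
        elif before is not None:
            filled.append(list(before))  # edge of the block: repeat neighbour
        elif after is not None:
            filled.append(list(after))
        else:
            raise ValueError("no neighbouring temporary packet to interpolate from")
    return filled


# Example: 4 voice packets per fixed time, 2 temporary packets each; packet 1 is lost.
received = {
    0: [[0, 0], [40, 40]],
    2: [[20, 20], [60, 60]],
    3: [[30, 30], [70, 70]],
}
lost = find_lost_packets(received, packet_count=4)
frame = interpolate_missing(reassemble(received, packet_count=4, per_packet=2))
print(lost, frame)
```

Because the two temporary packets of the lost voice packet sit at positions 1 and 5, each gap has a received neighbour on both sides. The sketch fails when two adjacent positions are both missing, which corresponds to consecutive voice packets being lost in this layout.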