JP2002374297A

JP2002374297A - Voice communication terminal and voice communication system

Info

Publication number: JP2002374297A
Application number: JP2001183061A
Authority: JP
Inventors: Masato Yamazaki; 真人山崎
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2001-06-18
Filing date: 2001-06-18
Publication date: 2002-12-26

Abstract

PROBLEM TO BE SOLVED: To provide a voice communication system that lets a user grasp the timing of a conversation with the other party even when a voice transfer delay occurs, and converse with the other party while taking that delay into account. SOLUTION: A voice communication terminal 1A comprises a vocoder 101A that encodes voice data XA; a data multiplexer 102A that multiplexes the data XA with feedback voice data FXB to generate multiplexed data MDA; a transmission line modulator 103A that modulates the data MDA and transmits the modulated data; a transmission line demodulator 109A that receives and demodulates multiplexed data MDB; a data distributor 108A that separates the data MDB into voice data XB and feedback voice data FXA; a voice decoder 107A that decodes the data XB; a feedback vocoder 104A that re-encodes the data XB as the data FXB; a feedback voice decoder 105A that decodes the data FXA; and a gain controller 106A that mixes the data XB and the data FXA in accordance with a set ratio. A voice communication terminal 1B, having the same configuration as the terminal 1A, is connected to the terminal 1A via a transmission line 2.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice communication system such as VOIP (Voice Over IP) and to a voice communication terminal device constituting such a system, and in particular to a voice communication system and a voice communication terminal device that incorporate countermeasures against voice transmission delay (also referred to as a voice delay countermeasure system and a voice delay countermeasure terminal device, respectively).

[0002]

[Prior Art] A VOIP system is a technique for transmitting voice data with the computer network protocol (IP) over the Internet, which is spreading as a computer network system, instead of over the general switched telephone network (GSTN) used for conventional telephones and the like. In a VOIP system, voice is digitally encoded, the compressed voice data is transmitted in packets, and techniques such as not transmitting during silence are used. This gives advantages such as more efficient use of the transmission line than an ordinary telephone.

[0003]

[Problems to Be Solved by the Invention] In the VOIP system described above, unlike the conventional general switched network, voice is digitally encoded, transmitted as packets to the other party over the Internet, and decoded back into voice. The Internet, however, originally developed as a network that does not require real-time performance, and the user cannot select the routing. A packet transmission delay therefore arises in addition to the encoding and decoding delay, and the result is a considerable voice transfer delay (the delay until spoken voice reaches the other party's ear). Depending on the transmission line, the packet transmission delay may also fluctuate, which causes the voice transfer delay to fluctuate. Because of this voice transfer delay, the user cannot hear the other party's voice instantly as in an ordinary conversation and cannot grasp when the other party hears the user's own voice. The user therefore cannot converse in a way that takes the delay into account, and conversation becomes difficult or breaks down. The VOIP system was thus a voice communication system with problems in practical use.

[0004] The present invention has been made to solve these conventional problems. Its object is to provide a voice communication terminal device and a voice communication system with which the user can grasp the timing of a conversation with the other party even when a voice transfer delay occurs, and can converse while taking the voice transfer delay into account.

[0005]

[Means for Solving the Problems] To achieve the above object, the voice communication terminal device according to claim 1 of the present invention is a communication terminal device that generates and transmits first multiplexed data containing input first voice data, and that receives second multiplexed data transmitted from the other party and outputs second voice data, the device comprising: encoding means for encoding the first voice data; multiplexing means for multiplexing the encoded first voice data with second feedback voice data, which is the second voice data re-encoded for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to the transmission specification and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into the encoded second voice data and encoded first feedback voice data; decoding means for decoding the separated second voice data; feedback encoding means for re-encoding the decoded second voice data as the second feedback voice data; feedback decoding means for decoding the separated first feedback voice data; and mixing means for mixing the decoded second voice data and the decoded first feedback voice data according to a set ratio.

[0006] The voice communication terminal device according to claim 2 is a communication terminal device that generates and transmits first multiplexed data containing input first voice data, and that receives second multiplexed data transmitted from the other party and outputs second voice data, the device comprising: encoding means for encoding the input first voice data; multiplexing means for multiplexing the encoded first voice data with second feedback voice data, which is the encoded second voice data delayed for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to the transmission specification and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into the encoded second voice data and encoded first feedback voice data; decoding means for decoding the separated second voice data; feedback decoding means for decoding the separated first feedback voice data; delay means for delaying the separated second voice data by the sum of the decoding delay of the second voice data in the decoding means and the encoding delay of the first voice data in the encoding means, and outputting it as the second feedback voice data; and mixing means for mixing the decoded second voice data and the decoded first feedback voice data according to a set ratio.
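The delay means of claim 2 can be pictured as a fixed-length delay line. The following is only a minimal sketch under stated assumptions: the source does not say how the delay is realized, so the frame-based granularity and the names `DelayLine`, `push`, and `feedback_delay_frames` are hypothetical and chosen for illustration.

```python
from collections import deque


def feedback_delay_frames(decode_delay_frames, encode_delay_frames):
    """Total delay named in claim 2: decoding delay plus encoding delay.

    Expressing both delays in whole frames is an assumption of this sketch.
    """
    return decode_delay_frames + encode_delay_frames


class DelayLine:
    """Hypothetical fixed delay line: each pushed frame comes out
    `delay_frames` pushes later; `fill` is emitted until it is primed."""

    def __init__(self, delay_frames, fill=None):
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        self._buf = deque([fill] * delay_frames)

    def push(self, frame):
        # Append the newest frame and release the oldest one.
        self._buf.append(frame)
        return self._buf.popleft()


# Usage: encoded frames of the second voice data are delayed, not re-encoded.
line = DelayLine(feedback_delay_frames(2, 1))
delayed = [line.push(f) for f in ["f0", "f1", "f2", "f3", "f4"]]
```

With a total delay of three frames, the first three outputs are the fill value and the original frames follow in order.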

[0007] The voice communication system according to claim 3 comprises a plurality of voice communication terminal devices according to claim 1 connected to one another via a transmission line, or a plurality of voice communication terminal devices according to claim 2 connected to one another via a transmission line, or one or more voice communication terminal devices according to claim 1 and one or more voice communication terminal devices according to claim 2 connected to one another via a transmission line, wherein the first feedback voice data is the first voice data that has been decoded and re-encoded at the other party and fed back from the other party.

[0008] The voice communication terminal device according to claim 4 is a communication terminal device that generates and transmits first multiplexed data containing input first voice data, and that receives second multiplexed data transmitted from the other party and outputs second voice data, the device comprising: encoding means for encoding the first voice data and assigning first index data to that first voice data; multiplexing means for multiplexing the encoded first voice data, the first index data, and second feedback index data delayed for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to the transmission specification and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into the encoded second voice data, second index data, and first feedback index data; decoding means for decoding the separated second voice data; delay means for delaying the separated second index data by the sum of the decoding delay of the second voice data in the decoding means and the encoding delay of the first voice data in the encoding means, and outputting it as the second feedback index data; feedback delay means for delaying the separated first feedback index data by the decoding delay of the second voice data in the decoding means; buffer means for holding the first voice data and the first index data assigned to that first voice data, and for outputting, at the timing when the delayed first feedback index data is output from the feedback delay means, the first voice data to which the first index data corresponding to that first feedback index data is assigned; and mixing means for mixing the decoded second voice data and the first voice data output from the buffer means according to a set ratio.
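The buffer means of claim 4 holds each voice frame under its index and releases it when the matching feedback index returns. The sketch below illustrates that lookup only. The source does not specify how indices are generated or stored, so the sequential integer indices and the names `IndexedVoiceBuffer`, `store`, and `release` are assumptions made for illustration.

```python
class IndexedVoiceBuffer:
    """Hypothetical buffer means: keeps local voice frames keyed by index."""

    def __init__(self):
        self._frames = {}
        self._next_index = 0

    def store(self, voice_frame):
        """Hold a frame and return the index assigned to it.

        Sequential integers are an assumption; the source only requires
        that each frame of first voice data gets first index data.
        """
        index = self._next_index
        self._frames[index] = voice_frame
        self._next_index += 1
        return index

    def release(self, feedback_index):
        """Return (and forget) the frame matching a returned feedback index.

        Returns None if no frame with that index is held.
        """
        return self._frames.pop(feedback_index, None)


# Usage: the index travels to the other party and back; the voice stays local.
buf = IndexedVoiceBuffer()
idx = buf.store("XA frame 0")
frame_for_mixing = buf.release(idx)
```

Because only the index makes the round trip, the frame handed to the mixing means is the local, never re-encoded copy of the speaker's own voice.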

[0009] The voice communication system according to claim 5 comprises a plurality of voice communication terminal devices according to claim 3 connected to one another via a transmission line, wherein the first feedback index data is the first index data that has been delayed at the other party by the sum of the decoding delay of the first voice data and the encoding delay of the second voice data and fed back from the other party, and the second index data is index data assigned to the second voice data at the other party.

[0010] The voice communication terminal device according to claim 6 is a communication terminal device that generates and transmits first multiplexed data containing input first voice data, and that receives second multiplexed data transmitted from the other party and outputs second voice data, the device comprising: encoding means for encoding the first voice data and assigning first index data to the encoded first voice data; multiplexing means for multiplexing the encoded first voice data, the first index data, and second feedback index data delayed for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to the transmission specification and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into the encoded second voice data, second index data, and first feedback index data; decoding means for decoding the separated second voice data; delay means for delaying the separated second index data by the sum of the decoding delay of the second voice data in the decoding means and the encoding delay of the first voice data in the encoding means, and outputting it as the second feedback index data; buffer means for holding the encoded first voice data and the first index data assigned to that encoded first voice data, and for outputting, at the timing when the first feedback index data is output from the distributing means, the encoded first voice data to which the first index data corresponding to that first feedback index data is assigned; feedback decoding means for decoding the first voice data output from the buffer means; and mixing means for mixing the decoded second voice data and the decoded first voice data according to a set ratio.

[0011] The voice communication system according to claim 7 comprises a plurality of voice communication terminal devices according to claim 5 connected to one another via a transmission line, wherein the first feedback index data is the first index data that has been delayed at the other party by the sum of the decoding delay of the first voice data and the encoding delay of the second voice data and fed back from the other party, and the second index data is index data assigned to the encoded second voice data at the other party.

[0012]

[Embodiments of the Invention] First Embodiment

FIG. 1 is a block diagram showing a voice delay countermeasure system according to a first embodiment of the present invention. In FIG. 1, terminal device 1A and terminal device 1B are connected via transmission line 2, and terminal device 1A, terminal device 1B, and transmission line 2 together constitute the voice delay countermeasure system.

[0013] Terminal device 1A comprises a voice encoder 101A, a data multiplexer 102A, a transmission line modulator 103A, a feedback voice encoder 104A, a feedback voice decoder 105A, a gain controller 106A, a voice decoder 107A, a data distributor 108A, a transmission line demodulator 109A, a voice data input terminal IN1A, a gain setting data input terminal IN2A, and a voice data output terminal OUTA.

[0014] Terminal device 1B comprises a voice encoder 101B, a data multiplexer 102B, a transmission line modulator 103B, a feedback voice encoder 104B, a feedback voice decoder 105B, a gain controller 106B, a voice decoder 107B, a data distributor 108B, a transmission line demodulator 109B, a voice data input terminal IN1B, a gain setting data input terminal IN2B, and a voice data output terminal OUTB.

[0015] The configuration and operation of terminal device 1A in FIG. 1 are described below. Terminal device 1A and terminal device 1B have the same configuration. Likewise, voice encoder 101A of terminal device 1A and voice encoder 101B of terminal device 1B have the same configuration, as do data multiplexer 102A of terminal device 1A and data multiplexer 102B of terminal device 1B, and so on. In the following description of the configuration and operation of terminal device 1A, replacing each reference sign with the sign in parentheses gives the description of the configuration and operation of terminal device 1B in FIG. 1.

[0016] Voice data XA (XB) is input to input terminal IN1A (IN1B), and this voice data XA (XB) is input to voice encoder 101A (101B).

[0017] Voice encoder 101A (101B) encodes the input voice data XA (XB) and outputs it to data multiplexer 102A (102B). This encoding includes processing such as voice compression for using transmission line 2 efficiently.

[0018] Data multiplexer 102A (102B) multiplexes the encoded voice data XA (XB) with the feedback voice data FXB (FXA) from feedback voice encoder 104A (104B) and outputs the result to transmission line modulator 103A (103B).

[0019] Transmission line modulator 103A (103B) converts the format of the multiplexed data MDA (MDB) and sends it out onto transmission line 2. This format conversion converts the data into the format used on transmission line 2, of which TCP/IP is a typical example, and includes packetization and the like.

[0020] Transmission line demodulator 109A (109B) receives the multiplexed data MDB (MDA) sent onto transmission line 2 from terminal device 1B (1A), applies the inverse of the format conversion performed by transmission line modulator 103B (103A) of terminal device 1B (1A), and outputs the result to data distributor 108A (108B).

[0021] Data distributor 108A (108B) separates the multiplexed data MDB (MDA) into the encoded voice data XB (XA) and the encoded feedback voice data FXA (FXB), outputs the voice data XB (XA) to voice decoder 107A (107B), and outputs the feedback voice data FXA (FXB) to feedback voice decoder 105A (105B).
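The pairing of the data multiplexer and the data distributor can be sketched as a round trip over a simple frame layout. The source does not define the multiplexed format, so the layout used here (two 16-bit big-endian length fields followed by the two payloads) and the function names `multiplex` and `distribute` are purely hypothetical.

```python
import struct

# Hypothetical header: two unsigned 16-bit big-endian length fields.
_HEADER = ">HH"
_HEADER_SIZE = struct.calcsize(_HEADER)


def multiplex(voice: bytes, feedback: bytes) -> bytes:
    """Combine encoded voice data and encoded feedback voice data
    into one multiplexed unit (role of data multiplexer 102A)."""
    return struct.pack(_HEADER, len(voice), len(feedback)) + voice + feedback


def distribute(multiplexed: bytes):
    """Split a multiplexed unit back into (voice, feedback)
    (role of data distributor 108B at the receiving terminal)."""
    n_voice, n_feedback = struct.unpack(_HEADER, multiplexed[:_HEADER_SIZE])
    start = _HEADER_SIZE
    voice = multiplexed[start:start + n_voice]
    feedback = multiplexed[start + n_voice:start + n_voice + n_feedback]
    return voice, feedback


# Usage: terminal 1A multiplexes XA with FXB; terminal 1B separates them.
mda = multiplex(b"XA-frame", b"FXB-frame")
xa, fxb = distribute(mda)
```

A real terminal would leave packetization to the transmission line modulator; this sketch only shows that the distributor undoes what the multiplexer did.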

[0022] Voice decoder 107A (107B) decodes the encoded voice data XB (XA) and outputs it to gain controller 106A (106B) and feedback voice encoder 104A (104B).

[0023] Feedback voice encoder 104A (104B) encodes the voice data XB (XA) again and outputs it to data multiplexer 102A (102B) as feedback voice data FXB (FXA).

[0024] Feedback voice decoder 105A (105B) decodes the encoded feedback voice data FXA (FXB) and outputs it to gain controller 106A (106B).

[0025] Gain setting data rA (rB) is input to input terminal IN2A (IN2B), and this gain setting data rA (rB) is input to gain controller 106A (106B).

[0026] Gain controller 106A mixes the voice data XB transmitted from terminal device 1B with the feedback voice data FXA, which is terminal device 1A's own voice data fed back from terminal device 1B, at a ratio determined by the gain setting data rA, and outputs, for example, the output voice data rA·XB + (1 − rA)·FXA from output terminal OUTA, where 0 ≤ rA ≤ 1. Similarly, gain controller 106B of terminal device 1B mixes the voice data XA transmitted from terminal device 1A with the feedback voice data FXB, which is terminal device 1B's own voice data fed back from terminal device 1A, at a ratio determined by the gain setting data rB, and outputs, for example, the output voice data rB·XA + (1 − rB)·FXB from output terminal OUTB, where 0 ≤ rB ≤ 1.

[0027] FIG. 2 shows an example of the configuration of gain controller 106A. Gain controller 106B of terminal device 1B has the same internal configuration as gain controller 106A of terminal device 1A. Replacing each reference sign in FIG. 2 and in the following description of FIG. 2 with the sign in parentheses gives an example of the configuration of gain controller 106B.

[0028] In FIG. 2, gain controller 106A (106B) comprises multipliers 201 and 202, an adder 203, input terminals in1, in2, and in3, and an output terminal out. Input terminal in1 is connected to the output of voice decoder 107A (107B) in FIG. 1, input terminal in2 is connected to the output of feedback voice decoder 105A (105B) in FIG. 1, input terminal in3 is connected to gain setting data input terminal IN2A (IN2B) in FIG. 1, and output terminal out is connected to voice data output terminal OUTA (OUTB) in FIG. 1.

[0029] Multiplier 201 multiplies the voice data XB from terminal device 1B, input to terminal in1, by the gain setting data rA, input to terminal in3, and outputs rA·XB to adder 203. Multiplier 201 of gain controller 106B multiplies the voice data XA from terminal device 1A, input to terminal in1, by the gain setting data rB, input to terminal in3, and outputs rB·XA to adder 203.

[0030] Multiplier 202 multiplies the feedback voice data FXA, input to terminal in2, by (1 − rA) and outputs (1 − rA)·FXA to adder 203. Multiplier 202 of gain controller 106B multiplies the feedback voice data FXB, input to terminal in2, by (1 − rB) and outputs (1 − rB)·FXB to adder 203.

[0031] Adder 203 adds rA·XB and (1 − rA)·FXA and outputs rA·XB + (1 − rA)·FXA from terminal out. This rA·XB + (1 − rA)·FXA is the voice data output from voice data output terminal OUTA of terminal device 1A. When rA = 1, the output voice data rA·XB + (1 − rA)·FXA consists only of the voice data XB transmitted from terminal device 1B. When rA = 0, it consists only of the feedback voice data FXA, which is terminal device 1A's own voice data fed back from terminal device 1B. When 0 < rA < 1, the voice data XB and the feedback voice data FXA are mixed at a ratio determined by the gain setting data rA. Adder 203 of gain controller 106B adds rB·XA and (1 − rB)·FXB and outputs rB·XA + (1 − rB)·FXB from terminal out.
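The two multipliers and the adder of FIG. 2 amount to a per-sample weighted sum. The following is a minimal sketch, assuming the decoded voice data and feedback voice data arrive as equal-length sequences of floating-point samples; the sample representation and the function name `mix` are assumptions, not something the source specifies.

```python
def mix(r, received, feedback):
    """Gain controller sketch: r * received + (1 - r) * feedback per sample.

    r        -- gain setting data (rA or rB), required to satisfy 0 <= r <= 1
    received -- decoded voice data from the other party (XB at terminal 1A)
    feedback -- decoded feedback voice data (FXA at terminal 1A)
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("gain setting must satisfy 0 <= r <= 1")
    if len(received) != len(feedback):
        raise ValueError("inputs must have the same number of samples")
    # Multiplier 201 forms r * x, multiplier 202 forms (1 - r) * fx,
    # and adder 203 sums the two products.
    return [r * x + (1.0 - r) * fx for x, fx in zip(received, feedback)]


# Usage: r = 1 passes only the other party's voice, r = 0 only the feedback.
only_partner = mix(1.0, [0.5, -0.5], [0.2, 0.2])
only_feedback = mix(0.0, [0.5, -0.5], [0.2, 0.2])
```

Intermediate values of r let the user hear both the other party and the delayed echo of their own voice, which is how the delay becomes audible.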

[0032] The flow of voice data XA and XB in the voice delay countermeasure system of FIG. 1 is described below.

[0033] Voice data XA input to terminal device 1A passes through voice encoder 101A, data multiplexer 102A, transmission line modulator 103A, transmission line 2, transmission line demodulator 109B, data distributor 108B, and voice decoder 107B, and is input to gain controller 106B and feedback voice encoder 104B. The voice data XA input to gain controller 106B is mixed with the feedback voice data FXB and output from output terminal OUTB.

[0034] The voice data XA input to feedback voice encoder 104B is fed back to terminal device 1A via data multiplexer 102B, transmission line modulator 103B, and transmission line 2. This feedback voice data FXA passes through transmission line demodulator 109A, data distributor 108A, and feedback voice decoder 105A, is input to gain controller 106A, is mixed with the voice data XB, and is output from output terminal OUTA.

[0035] In the same way, voice data XB input to terminal device 1B passes through voice encoder 101B, data multiplexer 102B, transmission line modulator 103B, transmission line 2, transmission line demodulator 109A, data distributor 108A, and voice decoder 107A, and is input to gain controller 106A and feedback voice encoder 104A. The voice data XB input to gain controller 106A is mixed with the feedback voice data FXA and output from output terminal OUTA.

[0036] The voice data XB input to feedback voice encoder 104A is fed back to terminal device 1B via data multiplexer 102A, transmission line modulator 103A, and transmission line 2. This feedback voice data FXB passes through transmission line demodulator 109B, data distributor 108B, and feedback voice decoder 105B, is input to gain controller 106B, is mixed with the voice data XA, and is output from output terminal OUTB.

[0037] Here, let Ta be the data input/output delay time of each of voice encoders 101A and 101B and feedback voice encoders 104A and 104B; let Tb be the total data delay time of data multiplexer 102A and transmission line modulator 103A, and likewise of data multiplexer 102B and transmission line modulator 103B; let Tc be the data transmission delay time of transmission line 2; let Td be the total data delay time of transmission line demodulator 109A and data distributor 108A, and likewise of transmission line demodulator 109B and data distributor 108B; and let Te be the data input/output delay time of each of voice decoders 107A and 107B and feedback voice decoders 105A and 105B. Further, let Ta + Tb + Tc + Td + Te = T.

[0038] The audio data XA[0] input to the terminal device 1A at time t = 0 is input to the gain controller 106B and the feedback speech encoder 104B of the terminal device 1B at time t = T, and is output from the terminal device 1B.

[0039] The audio data XA[0] input to the feedback speech encoder 104B is input to the data multiplexer 102B as feedback audio data FXA[0] at time t = T + Ta. The audio data XB[T] input to the terminal device 1B at time t = T is likewise input to the data multiplexer 102B at time t = T + Ta.

[0040] The feedback audio data FXA[0] and the audio data XB[T] are both sent from the terminal device 1B onto the transmission path 2 at time t = T + Ta + Tb, are input to the gain controller 106A of the terminal device 1A at time t = 2T, and are output from the terminal device 1A.
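As an illustration of the timing described for the first embodiment, the following Python sketch adds up the stage delays along the path of XA[0] and checks that its fed-back copy reaches the gain controller 106A at t = 2T. The class name, function name, and dictionary keys are assumptions made for this illustration; the specification defines only the symbols Ta to Te and T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delays:
    """Per-stage delays, named after the symbols in the text (arbitrary units)."""
    ta: float  # speech encoder / feedback speech encoder
    tb: float  # data multiplexer + transmission path modulator
    tc: float  # transmission path
    td: float  # transmission path demodulator + data distributor
    te: float  # speech decoder / feedback speech decoder

    @property
    def one_way(self) -> float:
        """T = Ta + Tb + Tc + Td + Te."""
        return self.ta + self.tb + self.tc + self.td + self.te


def first_embodiment_timeline(d: Delays) -> dict:
    """Trace XA[0], entered at terminal 1A at t = 0, until its echo returns to 1A."""
    reaches_106b = d.one_way                  # XA[0] reaches 106B and 104B
    enters_102b = reaches_106b + d.ta         # re-encoded as FXA[0] by 104B
    leaves_1b = enters_102b + d.tb            # multiplexed and modulated
    reaches_106a = leaves_1b + d.tc + d.td + d.te  # path, demodulate, decode (105A)
    return {
        "reaches_106B": reaches_106b,
        "enters_102B": enters_102b,
        "leaves_1B": leaves_1b,
        "reaches_106A": reaches_106a,
    }
```

With hypothetical values Ta = 20, Tb = 5, Tc = 100, Td = 5 and Te = 10 (chosen only for illustration), T is 140 and the fed-back data reaches 106A at 280, which equals 2T.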

[0041] Similarly, the audio data XB[0] input to the terminal device 1B at time t = 0 is input to the gain controller 106A and the feedback speech encoder 104A of the terminal device 1A at time t = T, and is output from the terminal device 1A.

[0042] The audio data XB[0] input to the feedback speech encoder 104A is input to the data multiplexer 102A as feedback audio data FXB[0] at time t = T + Ta. The audio data XA[T] input to the terminal device 1A at time t = T is likewise input to the data multiplexer 102A at time t = T + Ta.

[0043] The feedback audio data FXB[0] and the audio data XA[T] are both sent from the terminal device 1A onto the transmission path 2 at time t = T + Ta + Tb, are input to the gain controller 106B of the terminal device 1B at time t = 2T, and are output from the terminal device 1B.

[0044] In this way, the audio data XA[0] (XB[0]) transmitted from the terminal device 1A (1B) to the terminal device 1B (1A) is fed back from the terminal device 1B (1A) to the terminal device 1A (1B). The resulting feedback audio data FXA[0] (FXB[0]) is input to the gain controller 106A (106B), mixed with the audio data XB[T] (XA[T]) transmitted from the terminal device 1B (1A), and output from the terminal device 1A (1B).

[0045] The timing at which the feedback audio data FXA[0] (FXB[0]) is input to the gain controller 106A (106B) corresponds to the total of the following delays: the speech encoding/decoding delay in the terminal device 1A (1B) and the terminal device 1B (1A) and the transmission path delay experienced by the audio data XA[0] (XB[0]) input to the terminal device 1A (1B), and the speech encoding/decoding delay in the terminal device 1B (1A) and the terminal device 1A (1B) experienced by the audio data XB[0] (XA[0]) input to the terminal device 1B (1A). As a result, the audio data XA[0] (XB[0]) is mixed with the audio data XB[T] (XA[T]) that was input to the terminal device 1B (1A) at the moment the audio data XA[0] (XB[0]) was output from the terminal device 1B (1A), and the mixture is output from the terminal device 1A (1B).

[0046] As described above, according to the first embodiment, the input voice of the terminal device 1A (1B) can be heard at the terminal device 1A (1B) mixed with the voice transmitted from the partner terminal device 1B (1A), in a form that includes the encoding/decoding delay in the terminal device 1A (1B) and the terminal device 1B (1A) as well as the transmission path delay. Consequently, even when a voice transfer delay or a fluctuation in that delay occurs, the user of the terminal device 1A (1B) can perceive the transfer delay of his or her own voice, can therefore grasp the timing of the conversation with the other party, and can converse while gauging that timing.

[0047] In addition, the mixing ratio between the input voice of the terminal device 1A (1B) and the voice transmitted from the other party can be set at the terminal device 1A (1B), and the mixing ratio of the input voice can also be set to 0. The user's own voice can therefore be removed when it becomes annoying, for example when the delay is very small.

[0048] In the terminal device 1A (1B) of the first embodiment, the feedback speech encoder 104A (104B) may be replaced by delay means such as a data buffer or a delay unit. The delay means delays the audio data XB (XA) separated by the data distributor 108A (108B) by the sum of the decoding delay of the audio data XB (XA) in the speech decoder 107A (107B) and the encoding delay of the audio data XA (XB) in the speech encoder 101A (101B), and outputs it to the data multiplexer 102A (102B).
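The delay means mentioned in this variation can be pictured as a fixed-length FIFO. The sketch below is only an illustration under the assumption that the required delay (decoding delay plus encoding delay) is a whole number of frame periods; the class name FrameDelay and its interface are hypothetical and do not appear in the specification.

```python
from collections import deque


class FrameDelay:
    """Delay encoded frames by a fixed number of frame periods (assumed integer)."""

    def __init__(self, delay_frames: int):
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        # Pre-fill with None so the first outputs are empty until the line fills.
        self._queue = deque([None] * delay_frames)

    def push(self, frame):
        """Insert the newest frame and return the frame that is delay_frames old."""
        self._queue.append(frame)
        return self._queue.popleft()
```

Calling push once per frame period returns None for the first delay_frames calls and thereafter returns each frame unchanged, delay_frames periods after it was inserted.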

[0049] Second Embodiment
FIG. 3 is a block diagram showing an audio delay countermeasure system according to a second embodiment of the present invention. In FIG. 3, components identical to those in FIG. 1 are given the same reference numerals. In FIG. 3, the terminal device 3A and the terminal device 3B are connected via the transmission path 2, and the terminal device 3A, the terminal device 3B, and the transmission path 2 constitute the audio delay countermeasure system.

[0050] The terminal device 3A comprises a speech encoder 301A, a data multiplexer 302A, a transmission path modulator 303A, a frame index delay unit 304A, a feedback frame index delay unit 305A, a gain controller 306A, a speech decoder 307A, a data distributor 308A, a transmission path demodulator 309A, an audio data buffer (FIFO buffer) 310A, an audio data input terminal IN1A, a gain setting data input terminal IN2A, and an audio data output terminal OUTA.

[0051] The terminal device 3B comprises a speech encoder 301B, a data multiplexer 302B, a transmission path modulator 303B, a frame index delay unit 304B, a feedback frame index delay unit 305B, a gain controller 306B, a speech decoder 307B, a data distributor 308B, a transmission path demodulator 309B, an audio data buffer (FIFO buffer) 310B, an audio data input terminal IN1B, a gain setting data input terminal IN2B, and an audio data output terminal OUTB.

[0052] The configuration and operation of the terminal device 3A of FIG. 3 are described below. The terminal device 3A and the terminal device 3B have the same configuration. Likewise, the speech encoder 301A of the terminal device 3A and the speech encoder 301B of the terminal device 3B, the data multiplexer 302A of the terminal device 3A and the data multiplexer 302B of the terminal device 3B, and so on, have the same configuration. In the following description of the terminal device 3A, replacing each reference sign with the sign in parentheses yields the description of the configuration and operation of the terminal device 3B of FIG. 3.

[0053] Audio data XA (XB) is input to the input terminal IN1A (IN1B) and is supplied to the speech encoder 301A (301B) and the audio data buffer 310A (310B).

[0054] The speech encoder 301A (301B) encodes the input audio data XA (XB) and outputs it to the data multiplexer 302A (302B). This encoding includes processing, such as audio compression, for using the transmission path 2 efficiently. In addition, the speech encoder 301A (301B) assigns an audio frame index IXA (IXB) to the audio data XA (XB) and outputs the audio frame index data IXA (IXB) to the audio data buffer 310A (310B) and the data multiplexer 302A (302B).

[0055] The data multiplexer 302A (302B) multiplexes the encoded audio data XA (XB), the audio frame index data IXA (IXB) assigned to that audio data XA (XB), and the feedback audio frame index data FIXB (FIXA) from the frame index delay unit 304A (304B), and outputs the result to the transmission path modulator 303A (303B).

[0056] The transmission path modulator 303A (303B) converts the format of the multiplexed data MDIA (MDIB) and sends it onto the transmission path 2. This format conversion converts the data into the format used on the transmission path 2, of which TCP/IP is a typical example, and includes packetization and the like.

[0057] The transmission path demodulator 309A (309B) receives the multiplexed data MDIB (MDIA) sent onto the transmission path 2 from the terminal device 3B (3A), applies the inverse of the format conversion performed by the transmission path modulator 303B (303A) of the terminal device 3B (3A), and outputs the result to the data distributor 308A (308B).

[0058] The data distributor 308A (308B) separates the multiplexed data MDIB (MDIA) into the encoded audio data XB (XA), the audio frame index data IXB (IXA) assigned to that audio data XB (XA), and the feedback audio frame index data FIXA (FIXB). It outputs the audio data XB (XA) to the speech decoder 307A (307B), the audio frame index data IXB (IXA) to the frame index delay unit 304A (304B), and the feedback audio frame index data FIXA (FIXB) to the feedback frame index delay unit 305A (305B). The feedback audio frame index data FIXA (FIXB) is audio frame index data that the terminal device 3A (3B) transmitted to the partner terminal device 3B (3A) and that was returned from the terminal device 3B (3A); it is the index assigned to the audio data XA (XB) transmitted to the terminal device 3B (3A). The audio frame index data IXB (IXA) is audio frame index data transmitted from the partner terminal device 3B (3A); it is the index assigned to the audio data XB (XA) transmitted from the terminal device 3B (3A).
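The multiplexing and separation of the three items (encoded audio data, its audio frame index, and the feedback audio frame index) can be sketched as follows. The byte layout used here, an 8-byte header followed by the encoded payload with -1 meaning "no feedback index yet", is purely an assumption for illustration; the specification does not define a wire format, and the function names are hypothetical.

```python
import struct
from typing import Optional, Tuple

# Hypothetical header: unsigned 32-bit frame index, signed 32-bit feedback index.
_HEADER = struct.Struct("!Ii")


def multiplex(encoded_audio: bytes, frame_index: int,
              feedback_index: Optional[int] = None) -> bytes:
    """Combine encoded audio, its frame index, and the feedback frame index."""
    fb = -1 if feedback_index is None else feedback_index
    return _HEADER.pack(frame_index, fb) + encoded_audio


def distribute(packet: bytes) -> Tuple[bytes, int, Optional[int]]:
    """Split a multiplexed packet back into its three components."""
    frame_index, fb = _HEADER.unpack_from(packet)
    encoded_audio = packet[_HEADER.size:]
    return encoded_audio, frame_index, (None if fb < 0 else fb)
```

In terms of the description, multiplex plays the role of the data multiplexer 302A (302B) and distribute that of the data distributor 308A (308B); the format conversion performed by the transmission path modulator and demodulator is outside this sketch.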

[0059] The frame index delay unit 304A (304B) delays the audio frame index data IXB (IXA) by an amount equal to the sum of the data input/output delay of the speech decoder 307A (307B) and the data input/output delay of the speech encoder 301A (301B), and outputs it to the data multiplexer 302A (302B) as feedback audio frame index data FIXB (FIXA).

[0060] The speech decoder 307A (307B) decodes the encoded audio data XB (XA) and outputs it to the gain controller 306A (306B).

[0061] The feedback frame index delay unit 305A (305B) delays the fed-back audio frame index data FIXA (FIXB) by an amount equal to the delay of the audio data XB (XA) in the speech decoder 307A (307B), and outputs it to the audio data buffer 310A (310B).

[0062] The audio data buffer 310A (310B) stores the input audio data XA (XB) together with the audio frame index data IXA (IXB) assigned to that audio data. When feedback audio frame index data FIXA (FIXB) is input from the feedback frame index delay unit 305A (305B), the buffer outputs to the gain controller 306A (306B) the audio data XA (XB) whose audio frame index data IXA (IXB) matches that feedback audio frame index data FIXA (FIXB).

[0063] In other words, by outputting to the gain controller 306A (306B) the audio data XA (XB) to which the input feedback audio frame index data FIXA (FIXB) was assigned, the audio data buffer 310A (310B) causes the input audio data XA (XB) of the terminal device 3A (3B) to be output from the terminal device 3A (3B) at a timing that takes the encoding/decoding delay and the transmission path delay into account. The audio frame index data IXA (IXB) is stored in the audio data buffer 310A (310B) together with the input audio data XA (XB) so that the audio data XA (XB) whose encoded form was assigned the feedback audio frame index data FIXA (FIXB) can be identified.
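The behaviour of the audio data buffer, storing each input frame under its audio frame index and releasing it when the matching feedback index arrives, can be sketched as below. The capacity limit and the policy of discarding frames older than the matched one are assumptions added for this illustration (the specification only calls the buffer a FIFO); the class and method names are hypothetical.

```python
from collections import OrderedDict


class AudioDataBuffer:
    """Hold input frames keyed by frame index until their feedback index arrives."""

    def __init__(self, capacity: int):
        self._capacity = capacity      # assumed bound, not from the specification
        self._frames = OrderedDict()   # frame index -> samples, oldest first

    def store(self, frame_index: int, samples) -> None:
        """Store an input frame together with the index assigned by the encoder."""
        self._frames[frame_index] = samples
        while len(self._frames) > self._capacity:
            self._frames.popitem(last=False)  # drop the oldest frame

    def release(self, feedback_index: int):
        """Return the frame matching the feedback index, or None if it is not held."""
        if feedback_index not in self._frames:
            return None
        # Assumed policy: frames older than the match are stale and are discarded.
        while True:
            index, samples = self._frames.popitem(last=False)
            if index == feedback_index:
                return samples
```

In this picture, store would be called whenever the speech encoder assigns an index to an input frame, and release whenever the feedback frame index delay unit delivers a fed-back index; the returned samples would go to the gain controller.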

[0064] Gain setting data rA (rB) is input to the input terminal IN2A (IN2B) and is supplied to the gain controller 306A (306B).

[0065] The gain controller 306A (306B) has the same internal configuration as the gain controller 106A (106B) of the first embodiment. The gain controller 306A mixes the audio data XB transmitted from the terminal device 3B with the input audio data XA of the terminal device 3A delayed by the audio data buffer 310A, in a ratio determined by the gain setting data rA, and outputs the output audio data rA·XB + {(1 - rA)·XA} from the output terminal OUTA. Similarly, the gain controller 306B of the terminal device 3B mixes the audio data XA transmitted from the terminal device 3A with the input audio data XB of the terminal device 3B delayed by the audio data buffer 310B, in a ratio determined by the gain setting data rB, and outputs the output audio data rB·XA + {(1 - rB)·XB} from the output terminal OUTB.
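The mixing rule r·X_received + (1 - r)·X_own can be sketched in a few lines of Python. Representing the audio data as lists of float samples and restricting r to the range 0 to 1 are assumptions made for this illustration; the function name mix is hypothetical.

```python
from typing import List, Sequence


def mix(received: Sequence[float], delayed_own: Sequence[float],
        r: float) -> List[float]:
    """Return r * received + (1 - r) * delayed_own, sample by sample."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r is assumed to lie between 0 and 1")
    if len(received) != len(delayed_own):
        raise ValueError("both frames must have the same length")
    return [r * xb + (1.0 - r) * xa for xb, xa in zip(received, delayed_own)]
```

Setting r to 1 gives the user's own delayed voice a weight of 0, which corresponds to the case in which the user's own voice is removed from the output.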

[0066] The flow of the audio data XA and XB and of the audio frame index data IXA and IXB in the audio delay countermeasure system of FIG. 3 is described below.

[0067] The audio data XA input to the terminal device 3A is supplied to the encoder 301A and the audio data buffer 310A. The audio data XA input to the encoder 301A passes through the data multiplexer 302A, the transmission path modulator 303A, the transmission path 2, the transmission path demodulator 309B, the data distributor 308B, and the speech decoder 307B, is input to the gain controller 306B, is mixed with the delayed input audio data XB of the terminal device 3B, and is output from the output terminal OUTB.

[0068] The audio data XA input to the audio data buffer 310A is held temporarily in the audio data buffer 310A, is output to the gain controller 306A at a timing synchronized with the audio data XB transmitted from the terminal device 3B, is mixed with the audio data XB from the terminal device 3B, and is output from the output terminal OUTA.

[0069] The audio frame index IXA assigned to the input audio data XA of the terminal device 3A is supplied from the speech encoder 301A to the audio data buffer 310A and the data multiplexer 302A. The audio frame index IXA input to the audio data buffer 310A is stored there together with the input audio data XA to which it was assigned.

[0070] The audio frame index IXA input to the data multiplexer 302A is fed back to the terminal device 3A via the transmission path modulator 303A, the transmission path 2, the transmission path demodulator 309B, the data distributor 308B, the frame index delay unit 304B, the data multiplexer 302B, the transmission path modulator 303B, and the transmission path 2. This feedback audio frame index data FIXA is input to the audio data buffer 310A via the transmission path demodulator 309A, the data distributor 308A, and the feedback frame index delay unit 305A.

[0071] At the timing at which the feedback audio frame index data FIXA is input, the audio data buffer 310A outputs to the gain controller 306A the audio data XA to which that feedback audio frame index data FIXA was assigned.

[0072] Similarly, the audio data XB input to the terminal device 3B is supplied to the encoder 301B and the audio data buffer 310B. The audio data XB input to the encoder 301B passes through the data multiplexer 302B, the transmission path modulator 303B, the transmission path 2, the transmission path demodulator 309A, the data distributor 308A, and the speech decoder 307A, is input to the gain controller 306A, is mixed with the input audio data XA of the terminal device 3A, and is output from the output terminal OUTA.

[0073] The audio data XB input to the audio data buffer 310B is held temporarily in the audio data buffer 310B, is output to the gain controller 306B at a timing synchronized with the audio data XA transmitted from the terminal device 3A, is mixed with the audio data XA from the terminal device 3A, and is output from the output terminal OUTB.

[0074] The audio frame index IXB assigned to the input audio data XB of the terminal device 3B is supplied from the speech encoder 301B to the audio data buffer 310B and the data multiplexer 302B. The audio frame index IXB input to the audio data buffer 310B is stored in the audio data buffer 310B together with the input audio data XB to which it was assigned.

[0075] The audio frame index IXB input to the data multiplexer 302B is fed back to the terminal device 3B via the transmission path modulator 303B, the transmission path 2, the transmission path demodulator 309A, the data distributor 308A, the frame index delay unit 304A, the data multiplexer 302A, the transmission path modulator 303A, and the transmission path 2. This feedback audio frame index data FIXB is input to the audio data buffer 310B via the transmission path demodulator 309B, the data distributor 308B, and the feedback frame index delay unit 305B.

[0076] At the timing at which the feedback audio frame index data FIXB is input, the audio data buffer 310B outputs to the gain controller 306B the audio data XB to which that feedback audio frame index data FIXB was assigned.

[0077] Here, let Ta be the data input/output delay time of each of the speech encoders 301A and 301B. Let Tb be the total data delay time of the data multiplexer 302A and the transmission path modulator 303A, and likewise of the data multiplexer 302B and the transmission path modulator 303B. Let Tc be the data transmission delay time of the transmission path 2. Let Td be the total data delay time of the transmission path demodulator 309A and the data distributor 308A, and likewise of the transmission path demodulator 309B and the data distributor 308B. Let Te be the data input/output delay time of each of the speech decoders 307A and 307B and the feedback frame index delay units 305A and 305B. Let the data input/output delay time of each of the frame index delay units 304A and 304B be Te + Ta. Further, let Ta + Tb + Tc + Td + Te = T.

[0078] The audio data XA[0] input to the terminal device 3A at time t = 0 is input to the speech decoder 307B of the terminal device 3B at time t = Ta + Tb + Tc + Td, is input to the gain controller 306B at time t = T, and is output from the terminal device 3B.

[0079] The audio frame index data IXA[0] assigned to the audio data XA[0] is input to the frame index delay unit 304B of the terminal device 3B at time t = Ta + Tb + Tc + Td, and is input to the data multiplexer 302B as feedback audio frame index data FIXA[0] at time t = T + Ta. The audio data XB[T] input to the terminal device 3B at time t = T, and the audio frame index data IXB[T] assigned to the audio data XB[T], are likewise input to the data multiplexer 302B at time t = T + Ta.

[0080] The feedback audio frame index data FIXA[0] and the audio data XB[T] are both sent from the terminal device 3B onto the transmission path 2 at time t = T + Ta + Tb, and are output from the data distributor 308A of the terminal device 3A at time t = T + Ta + Tb + Tc + Td.

[0081] The audio data XB[T] transmitted from the terminal device 3B is input to the gain controller 306A at time t = 2T and is output from the terminal device 3A. The feedback audio frame index data FIXA[0] is input to the audio data buffer 310A at time t = 2T. As a result, the audio data XA[0] held in the audio data buffer 310A is input to the gain controller 306A at time t = 2T and is output from the terminal device 3A together with the audio data XB[T].
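The timing of the second embodiment can be checked in the same way by following the audio frame index IXA[0] around the loop. The sketch below uses the delay symbols defined in the text; the function name, the dictionary keys, and the numeric values in the note that follows are assumptions made for illustration.

```python
def second_embodiment_timeline(ta, tb, tc, td, te):
    """Trace IXA[0], assigned at terminal 3A at t = 0, back to buffer 310A."""
    t_one_way = ta + tb + tc + td + te        # T
    enters_304b = ta + tb + tc + td           # output of data distributor 308B
    enters_302b = enters_304b + (te + ta)     # frame index delay unit 304B
    leaves_3b = enters_302b + tb              # multiplexer + modulator of 3B
    leaves_308a = leaves_3b + tc + td         # path, demodulator, distributor
    enters_310a = leaves_308a + te            # feedback frame index delay 305A
    return {
        "T": t_one_way,
        "enters_304B": enters_304b,
        "enters_302B": enters_302b,
        "leaves_3B": leaves_3b,
        "leaves_308A": leaves_308a,
        "enters_310A": enters_310a,
    }
```

With hypothetical values Ta = 20, Tb = 5, Tc = 100, Td = 5 and Te = 10, the index enters the data multiplexer 302B at 160 (T + Ta) and reaches the audio data buffer 310A at 280 (2T), the same instant at which XB[T] leaves the speech decoder 307A, so the two are mixed in step.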

[0082] Similarly, the audio data XB[0] input to the terminal device 3B at time t = 0 is input to the speech decoder 307A of the terminal device 3A at time t = Ta + Tb + Tc + Td, is input to the gain controller 306A at time t = T, and is output from the terminal device 3A.

[0083] The audio frame index data IXB[0] assigned to the audio data XB[0] is input to the frame index delay unit 304A of the terminal device 3A at time t = Ta + Tb + Tc + Td, and is input to the data multiplexer 302A as feedback audio frame index data FIXB[0] at time t = T + Ta. The audio data XA[T] input to the terminal device 3A at time t = T, and the audio frame index data IXA[T] assigned to the audio data XA[T], are likewise input to the data multiplexer 302A at time t = T + Ta.

[0084] The feedback audio frame index data FIXB[0] and the audio data XA[T] are both sent from the terminal device 3A onto the transmission path 2 at time t = T + Ta + Tb, and are output from the data distributor 308B of the terminal device 3B at time t = T + Ta + Tb + Tc + Td.

[0085] The audio data XA[T] transmitted from the terminal device 3A is input to the gain controller 306B at time t = 2T and is output from the terminal device 3B. The feedback audio frame index data FIXB[0] is input to the audio data buffer 310B at time t = 2T. As a result, the audio data XB[0] held in the audio data buffer 310B is input to the gain controller 306B at time t = 2T and is output from the terminal device 3B together with the audio data XA[T].

[0086] In this way, the audio frame index data IXA[0] (IXB[0]), multiplexed with the audio data XA[0] (XB[0]) and transmitted from the terminal device 3A (3B) to the terminal device 3B (3A), is fed back from the terminal device 3B (3A) to the terminal device 3A (3B). At the timing at which the feedback audio frame index data FIXA[0] (FIXB[0]) is input to the audio data buffer 310A (310B), the audio data XA[0] (XB[0]) is supplied from the audio data buffer 310A (310B) to the gain controller 306A (306B), mixed with the audio data XB[T] (XA[T]) transmitted from the terminal device 3B (3A), and output from the terminal device 3A (3B).

[0087] This timing corresponds to the total of the following delays: the speech encoding/decoding delay in the terminal device 3A (3B) and the terminal device 3B (3A) and the transmission path delay experienced by the audio data XA[0] (XB[0]) input to the terminal device 3A (3B), and the speech encoding/decoding delay in the terminal device 3B (3A) and the terminal device 3A (3B) experienced by the audio data XB[0] (XA[0]) input to the terminal device 3B (3A). As a result, the audio data XA[0] (XB[0]) is mixed with the audio data XB[T] (XA[T]) that was input to the terminal device 3B (3A) at the moment the audio data XA[0] (XB[0]) was output from the terminal device 3B (3A), and the mixture is output from the terminal device 3A (3B).

[0088] As described above, according to the second embodiment, the input voice of the terminal device 3A (3B) can be heard at the terminal device 3A (3B) mixed with the voice transmitted from the partner terminal device 3B (3A), in a form that includes the encoding/decoding delay in the terminal device 3A (3B) and the terminal device 3B (3A) as well as the transmission path delay. Consequently, even when a voice transfer delay or a fluctuation in that delay occurs, the user of the terminal device 3A (3B) can perceive the amount by which his or her own voice is delayed, can therefore grasp the timing of the conversation with the other party, and can converse while gauging that timing.

[0089] In addition, the mixing ratio between the input voice of the terminal device 3A (3B) and the voice transmitted from the other party can be set at the terminal device 3A (3B), and the mixing ratio of the input voice can also be set to 0. When the user's own voice becomes annoying, for example when the delay is very small, the user's own voice can therefore be removed.

[0090] Furthermore, in the second embodiment, it is not the audio data but the audio frame index data assigned to that audio data that is transmitted to the other party and fed back from the other party. Since the amount of audio frame index data is generally smaller than the amount of encoded audio data, the amount of transmitted data can be made smaller than in the first embodiment.

[0091] Third Embodiment
FIG. 4 is a block diagram showing a voice delay countermeasure system according to a third embodiment of the present invention. In FIG. 4, components that are the same as those in FIG. 1 are given the same reference numerals. In FIG. 4, the terminal device 4A and the terminal device 4B are connected via the transmission line 2, and the terminal device 4A, the terminal device 4B, and the transmission line 2 constitute the voice delay countermeasure system.

[0092] The terminal device 4A includes a speech encoder 401A, a data multiplexer 402A, a transmission line modulator 403A, a frame index delay unit 404A, a feedback speech decoder 405A, a gain controller 406A, a speech decoder 407A, a data distributor 408A, a transmission line demodulator 409A, an audio data buffer (FIFO buffer) 410A, an audio data input terminal IN1A, a gain setting data input terminal IN2A, and an audio data output terminal OUTA.

[0093] Likewise, the terminal device 4B includes a speech encoder 401B, a data multiplexer 402B, a transmission line modulator 403B, a frame index delay unit 404B, a feedback speech decoder 405B, a gain controller 406B, a speech decoder 407B, a data distributor 408B, a transmission line demodulator 409B, an audio data buffer (FIFO buffer) 410B, an audio data input terminal IN1B, a gain setting data input terminal IN2B, and an audio data output terminal OUTB.

[0094] The terminal devices 4A and 4B of the third embodiment are the terminal devices 3A and 3B of the second embodiment (see FIG. 3) modified so that the encoded audio data is stored in the audio data buffer.

[0095] The configuration and operation of the terminal device 4A in FIG. 4 are described below. The terminal device 4A and the terminal device 4B have the same configuration. Likewise, the speech encoder 401A of the terminal device 4A and the speech encoder 401B of the terminal device 4B, the data multiplexer 402A of the terminal device 4A and the data multiplexer 402B of the terminal device 4B, and so on, have the same configuration. In the following description of the configuration and operation of the terminal device 4A, replacing each reference sign with the one in parentheses gives the description of the configuration and operation of the terminal device 4B in FIG. 4.

[0096] Audio data XA (XB) is input to the input terminal IN1A (IN1B), and this audio data XA (XB) is input to the speech encoder 401A (401B).

[0097] The speech encoder 401A (401B) encodes the input audio data XA (XB) and outputs it to the audio data buffer 410A (410B) and the data multiplexer 402A (402B). This encoding includes processing such as voice compression for using the transmission line 2 efficiently. The speech encoder 401A (401B) also assigns an audio frame index IXA (IXB) to the audio data XA (XB) and outputs the audio frame index data IXA (IXB) to the audio data buffer 410A (410B) and the data multiplexer 402A (402B).
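The index assignment performed by the speech encoder can be pictured with a minimal Python sketch. The specification does not say how index values are generated; the wrapping counter, the `modulus` parameter, and the function name `tag_frames` are assumptions made only for illustration.

```python
from typing import Iterable, Iterator, Tuple


def tag_frames(encoded_frames: Iterable[bytes],
               modulus: int = 2 ** 16) -> Iterator[Tuple[int, bytes]]:
    """Pair each encoded frame with a frame index (hypothetical wrapping counter)."""
    for n, frame in enumerate(encoded_frames):
        # The index wraps so that it fits a fixed-width field on the line.
        yield n % modulus, frame
```

Each (index, frame) pair would then go both to the audio data buffer and to the data multiplexer.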

[0098] The data multiplexer 402A (402B) multiplexes the encoded audio data XA (XB), the audio frame index data IXA (IXB) assigned to that audio data XA (XB), and the feedback audio frame index data FIXB (FIXA) from the frame index delay unit 404A (404B), and outputs the result to the transmission line modulator 403A (403B).

[0099] The transmission line modulator 403A (403B) converts the format of the multiplexed data MDIA (MDIB) and sends it out to the transmission line 2. This format conversion converts the data into the format used on the transmission line 2, of which TCP/IP is a typical example, and includes packetization and the like.

[0100] The transmission line demodulator 409A (409B) receives the multiplexed data MDIB (MDIA) sent out to the transmission line 2 from the terminal device 4B (4A), applies the inverse of the format conversion performed by the transmission line modulator 403B (403A) of the terminal device 4B (4A), and outputs the result to the data distributor 408A (408B).

[0101] The data distributor 408A (408B) separates the input data MDIB (MDIA) into the encoded audio data XB (XA), the audio frame index data IXB (IXA) assigned to that audio data XB (XA), and the feedback audio frame index data FIXA (FIXB). It outputs the audio data XB (XA) to the speech decoder 407A (407B), the audio frame index data IXB (IXA) to the frame index delay unit 404A (404B), and the feedback audio frame index data FIXA (FIXB) to the audio data buffer 410A (410B).
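The relationship between the data multiplexer and the data distributor can be sketched in Python as a pack/unpack pair. The specification does not define a wire format; the byte layout below (two 32-bit indices, a 16-bit payload length, then the payload) and the names `multiplex` and `distribute` are assumptions for illustration only.

```python
import struct

# Hypothetical layout: IX (uint32), FIX (uint32), payload length (uint16), payload.
_HEADER = struct.Struct("!IIH")


def multiplex(encoded_audio: bytes, ix: int, fix: int) -> bytes:
    """Combine encoded audio, its frame index IX, and the feedback index FIX."""
    return _HEADER.pack(ix, fix, len(encoded_audio)) + encoded_audio


def distribute(mdi: bytes):
    """Split multiplexed data back into (encoded_audio, ix, fix)."""
    ix, fix, length = _HEADER.unpack_from(mdi)
    payload = mdi[_HEADER.size:_HEADER.size + length]
    return payload, ix, fix
```

In this sketch the distributor is simply the inverse of the multiplexer, so the three fields arrive at the far end unchanged and can be routed to the speech decoder, the frame index delay unit, and the audio data buffer respectively.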

[0102] The frame index delay unit 404A (404B) delays the audio frame index data IXB (IXA) by an amount equal to the sum of the data input/output delay of the speech decoder 407A (407B) and the data input/output delay of the speech encoder 401A (401B), and outputs it to the data multiplexer 402A (402B) as the feedback audio frame index data FIXB (FIXA).
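A fixed delay of this kind can be sketched in Python as a simple delay line. Expressing the decoder-plus-encoder delay as a whole number of frame periods, and the class name `FrameIndexDelay`, are assumptions for illustration; the specification does not state how the delay unit is realized.

```python
from collections import deque


class FrameIndexDelay:
    """Delay line that holds each index for a fixed number of frame ticks."""

    def __init__(self, delay_frames: int):
        # Pre-fill with None so the first outputs are "nothing yet".
        self._line = deque([None] * delay_frames)

    def push(self, ix):
        """Insert the newest index and return the one received delay_frames ticks ago."""
        self._line.append(ix)
        return self._line.popleft()
```

Calling `push` once per frame period returns the index that entered the unit `delay_frames` periods earlier, which would then be multiplexed as the feedback index.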

[0103] The speech decoder 407A (407B) decodes the encoded audio data XB (XA) and outputs it to the gain controller 406A (406B).

[0104] The audio data buffer 410A (410B) stores the encoded input audio data XA (XB) together with the audio frame index data IXA (IXB) assigned to that audio data XA (XB). When the feedback audio frame index data FIXA (FIXB) is input from the data distributor 408A (408B), the buffer outputs to the feedback speech decoder 405A (405B) the encoded audio data XA (XB) whose audio frame index data IXA (IXB) matches that feedback audio frame index data FIXA (FIXB).

[0105] In other words, by outputting to the feedback speech decoder 405A (405B) the encoded audio data XA (XB) to which the input feedback audio frame index data FIXA (FIXB) was assigned, the audio data buffer 410A (410B) causes the input audio data XA (XB) of the terminal device 4A (4B) to be output from the terminal device 4A (4B) at a timing that takes the encoding/decoding delay and the transmission line delay into account.
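The index-matched release performed by the audio data buffer can be sketched in Python as follows. The capacity limit, the policy of discarding frames older than the matched one, and the names `AudioDataBuffer`, `store`, and `release` are assumptions for illustration; the specification only states that the buffer is a FIFO that outputs the frame whose index matches the fed-back index.

```python
from collections import OrderedDict


class AudioDataBuffer:
    """FIFO of encoded frames keyed by frame index (hypothetical sketch)."""

    def __init__(self, capacity: int = 64):
        self._frames = OrderedDict()
        self._capacity = capacity

    def store(self, ix: int, encoded_frame: bytes) -> None:
        """Hold an encoded frame under its frame index IX."""
        self._frames[ix] = encoded_frame
        while len(self._frames) > self._capacity:
            self._frames.popitem(last=False)  # drop the oldest frame

    def release(self, fix: int):
        """Return the frame whose index equals FIX, or None if it is not held."""
        if fix not in self._frames:
            return None
        while True:
            ix, frame = self._frames.popitem(last=False)
            if ix == fix:
                return frame  # frames older than the match are discarded
```

Because the release is triggered by the arrival of the fed-back index rather than by a fixed timer, the hold time follows the actual round-trip delay, including its fluctuation.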

[0106] The feedback speech decoder 405A (405B) decodes the encoded audio data XA (XB) input from the audio data buffer 410A (410B) and outputs it to the gain controller 406A (406B).

[0107] Gain setting data rA (rB) is input to the input terminal IN2A (IN2B), and this gain setting data rA (rB) is input to the gain controller 406A (406B).

[0108] The gain controller 406A (406B) has the same internal configuration as the gain controller 106A (106B) of the first embodiment. The gain controller 406A mixes the audio data XB transmitted from the terminal device 4B and the input audio data XA of the terminal device 4A delayed by the audio data buffer 410A at a ratio determined by the gain setting data rA, and outputs the output audio data rA·XB + (1 - rA)·XA from the output terminal OUTA. Likewise, the gain controller 406B of the terminal device 4B mixes the audio data XA transmitted from the terminal device 4A and the input audio data XB of the terminal device 4B delayed by the audio data buffer 410B at a ratio determined by the gain setting data rB, and outputs the output audio data rB·XA + (1 - rB)·XB from the output terminal OUTB.
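The mixing rule r·X_far + (1 - r)·X_own can be written as a short Python sketch. Representing the decoded audio as lists of floating-point samples, restricting r to the range 0 to 1, and the function name `mix` are assumptions for illustration.

```python
def mix(far_voice, own_voice_delayed, r):
    """Return r * far + (1 - r) * own for each pair of samples."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("gain setting r must lie in [0, 1]")
    return [r * far + (1.0 - r) * own
            for far, own in zip(far_voice, own_voice_delayed)]
```

With r = 1 the weight of the user's own delayed voice is zero, which corresponds to the case in which the user's own voice is removed from the output.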

[0109] The flow of the audio data XA and the audio frame index data IXA in the voice delay countermeasure system of FIG. 4 is described below. In the following description, replacing each reference sign with the one in parentheses gives the description of the flow of the audio data XB and the audio frame index data IXB.

[0110] The audio data XA (XB) input to the terminal device 4A (4B) is encoded by the speech encoder 401A (401B) and input to the audio data buffer 410A (410B) and the data multiplexer 402A (402B). The audio data XA (XB) input to the data multiplexer 402A (402B) passes through the transmission line modulator 403A (403B), the transmission line 2, the transmission line demodulator 409B (409A), the data distributor 408B (408A), and the speech decoder 407B (407A), is input to the gain controller 406B (406A), is mixed with the delayed input audio data XB (XA) of the terminal device 4B (4A), and is output from the output terminal OUTB (OUTA).

[0111] The audio data XA (XB) input to the audio data buffer 410A (410B) is temporarily stored there, is output to the feedback speech decoder 405A (405B) at a timing synchronized with the audio data XB (XA) transmitted from the terminal device 4B (4A), is decoded by the feedback speech decoder 405A (405B), and is input to the gain controller 406A (406B). There it is mixed with the audio data XB (XA) from the terminal device 4B (4A) decoded by the speech decoder 407A (407B), and is output from the output terminal OUTA (OUTB).

[0112] The audio frame index IXA (IXB) assigned to the input audio data XA (XB) of the terminal device 4A (4B) is input from the speech encoder 401A (401B) to the audio data buffer 410A (410B) and the data multiplexer 402A (402B). The audio frame index IXA (IXB) input to the audio data buffer 410A (410B) is stored there together with the encoded input audio data XA (XB) to which it is assigned.

[0113] The audio frame index IXA (IXB) input to the data multiplexer 402A (402B) is fed back to the terminal device 4A (4B) via the transmission line modulator 403A (403B), the transmission line 2, the transmission line demodulator 409B (409A), the data distributor 408B (408A), the frame index delay unit 404B (404A), the data multiplexer 402B (402A), the transmission line modulator 403B (403A), and the transmission line 2. This feedback audio frame index data FIXA (FIXB) is input to the audio data buffer 410A (410B) via the transmission line demodulator 409A (409B) and the data distributor 408A (408B).

[0114] At the timing when the feedback audio frame index data FIXA (FIXB) is input, the audio data buffer 410A (410B) outputs to the feedback speech decoder 405A (405B) the encoded audio data XA (XB) to which that feedback audio frame index data FIXA (FIXB) was assigned.

[0115] Here, let Ta be the data input/output delay time of each of the speech encoders 401A and 401B; let Tb be the total data delay time of the data multiplexer 402A and the transmission line modulator 403A, and likewise of the data multiplexer 402B and the transmission line modulator 403B; let Tc be the data transmission delay time of the transmission line 2; let Td be the total data delay time of the transmission line demodulator 409A and the data distributor 408A, and likewise of the transmission line demodulator 409B and the data distributor 408B; and let Te be the data input/output delay time of each of the speech decoders 407A and 407B and the feedback speech decoders 405A and 405B. The data input/output delay time of each of the frame index delay units 404A and 404B is Te + Ta. Finally, let Ta + Tb + Tc + Td + Te = T.

[0116] The audio data XA[0] (XB[0]) input to the terminal device 4A (4B) at time t = 0 is input to the gain controller 406B (406A) of the terminal device 4B (4A) at time t = T and is output from the terminal device 4B (4A).

[0117] The audio frame index data IXA[0] (IXB[0]) assigned to the audio data XA[0] (XB[0]) is input to the data multiplexer 402B (402A) at time t = T + Ta as the feedback audio frame index data FIXA[0] (FIXB[0]). The audio data XB[T] (XA[T]) input to the terminal device 4B (4A) at time t = T, and the audio frame index data IXB[T] (IXA[T]) assigned to it, are also input to the data multiplexer 402B (402A) at time t = T + Ta.

[0118] The feedback audio frame index data FIXA[0] (FIXB[0]) and the audio data XB[T] (XA[T]) are both output from the data distributor 408A (408B) of the terminal device 4A (4B) at time t = T + Ta + Tb + Tc + Td.

[0119] The audio data XB[T] (XA[T]) transmitted from the terminal device 4B (4A) is input to the gain controller 406A (406B) at time t = 2T and is output from the terminal device 4A (4B). The feedback audio frame index data FIXA[0] (FIXB[0]) output from the data distributor 408A (408B) is input to the audio data buffer 410A (410B), and the audio data XA[0] (XB[0]) read out of the buffer in response is decoded by the feedback speech decoder 405A (405B), which adds the delay Te. As a result, the audio data XA[0] (XB[0]) that was stored in the audio data buffer 410A (410B) is input to the gain controller 406A (406B) at time t = 2T and is output from the terminal device 4A (4B) together with the audio data XB[T] (XA[T]).
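The timing relations above can be checked with a small Python calculation. This is only a sketch under stated assumptions: the delays are treated as fixed integers in arbitrary time units, the buffer lookup itself is assumed to add no delay, and the function name `mixer_arrival_times` and the example values are hypothetical.

```python
def mixer_arrival_times(ta, tb, tc, td, te):
    """Follow XA[0] and XB[T] to the gain controller of terminal device 4A."""
    t_one_way = ta + tb + tc + td + te                 # T
    index_at_far_distributor = t_one_way - te          # IXA[0] leaves distributor 408B
    at_far_mux = index_at_far_distributor + (te + ta)  # after delay unit 404B: T + Ta
    at_near_distributor = at_far_mux + tb + tc + td    # FIXA[0] and XB[T] leave 408A
    far_voice_at_mixer = at_near_distributor + te      # XB[T] via speech decoder 407A
    own_voice_at_mixer = at_near_distributor + te      # XA[0] via 410A and decoder 405A
    return t_one_way, at_far_mux, far_voice_at_mixer, own_voice_at_mixer
```

For example, with Ta = 20, Tb = 5, Tc = 100, Td = 5, and Te = 10, the one-way delay T is 140, both items reach the far-end multiplexer at T + Ta = 160, and both voices reach the gain controller at 2T = 280.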

[0120] Thus, in the third embodiment as in the second embodiment, the audio data XA[0] (XB[0]) is output to the gain controller 406A (406B) at a timing that corresponds to the speech encoding/decoding delay in the terminal devices 4A (4B) and 4B (4A) and the transmission line delay experienced by the audio data XA[0] (XB[0]) input to the terminal device 4A (4B), together with the speech encoding/decoding delay in the terminal devices 4B (4A) and 4A (4B) experienced by the audio data XB[0] (XA[0]) input to the terminal device 4B (4A). It is then mixed with the audio data XB[T] (XA[T]) transmitted from the terminal device 4B (4A) and output from the terminal device 4A (4B). As a result, the audio data XA[0] (XB[0]) is mixed with the audio data XB[T] (XA[T]) that was input to the terminal device 4B (4A) at the moment XA[0] (XB[0]) was output from the terminal device 4B (4A), and the mixture is output from the terminal device 4A (4B).

[0121] As described above, according to the third embodiment, the user of the terminal device 4A (4B) can hear his or her own input voice at the terminal device 4A (4B), mixed with the voice transmitted from the partner terminal device 4B (4A), in a form that includes the encoding/decoding delays in the terminal devices 4A (4B) and 4B (4A) and the transmission line delay. The user of the terminal device 4A (4B) can therefore grasp the delay of his or her own voice even when a voice transfer delay and fluctuations in it occur, and can thus grasp the timing of the conversation with the other party and converse while gauging that timing.

[0122] In addition, the mixing ratio between the input voice of the terminal device 4A (4B) and the voice transmitted from the other party can be set at the terminal device 4A (4B), and the mixing ratio of the input voice can also be set to 0. When the user's own voice becomes annoying, for example when the delay is very small, the user's own voice can therefore be removed.

[0123] Furthermore, in the third embodiment, the encoded audio data XA (XB) is held in the audio data buffer 410A (410B), and this audio data XA (XB) is decoded by the feedback speech decoder 405A (405B) and input to the gain controller 406A (406B). Since encoded audio data XA (XB) is generally compressed to a smaller amount of data than before encoding, the data capacity of the audio data buffer 410A (410B) can be made smaller than in the second embodiment. If the audio data buffer 410A (410B) and the feedback speech decoder 405A (405B) can be made more compact than the feedback frame index delay unit 305A (305B) and the audio data buffer 310A (310B) of the second embodiment, the third embodiment is adopted; otherwise the second embodiment is adopted. In this way the device can be made compact.
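The buffer-capacity argument can be illustrated with a rough Python calculation. The bit rates (64 kbit/s for uncompressed audio, 8 kbit/s for encoded audio) and the 400 ms hold time are hypothetical figures chosen only for illustration and do not come from the specification; `buffer_bytes` is likewise a hypothetical helper.

```python
def buffer_bytes(bitrate_bps: int, hold_ms: int) -> int:
    """Bytes needed to hold hold_ms of audio at bitrate_bps (integer arithmetic)."""
    return bitrate_bps * hold_ms // 8000  # bits/s * ms / 1000 / 8


# Hypothetical figures: a 400 ms round trip, unencoded versus encoded audio.
unencoded = buffer_bytes(64_000, 400)   # buffer holding audio before encoding
encoded = buffer_bytes(8_000, 400)      # buffer holding audio after encoding
```

Under these assumed figures the encoded buffer is one eighth the size of the unencoded one, while the third embodiment additionally needs a feedback speech decoder; which arrangement is more compact overall depends on the actual codec and hardware.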

[0124] In the first to third embodiments above, the terminal device has been described as being implemented in hardware; however, the terminal device may be implemented in software, or in a combination of hardware and software.

[0125]

[Effects of the Invention] As described above, according to the present invention, the user can grasp the delay of his or her own voice even when a voice transfer delay and fluctuations in it occur, and can therefore grasp the timing of the conversation with the other party and converse while gauging that timing.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a voice delay countermeasure system according to a first embodiment of the present invention.

FIG. 2 is a diagram showing an example of the configuration of the gain controller in FIG. 1.

FIG. 3 is a block diagram showing a voice delay countermeasure system according to a second embodiment of the present invention.

FIG. 4 is a block diagram showing a voice delay countermeasure system according to a third embodiment of the present invention.

[Explanation of symbols]

1A, 1B, 3A, 3B, 4A, 4B: terminal device
2: transmission line
101A, 101B, 301A, 301B, 401A, 401B: speech encoder
102A, 102B, 302A, 302B, 402A, 402B: data multiplexer
103A, 103B, 303A, 303B, 403A, 403B: transmission line modulator
104A, 104B: feedback speech encoder
105A, 105B, 405A, 405B: feedback speech decoder
106A, 106B, 306A, 306B, 406A, 406B: gain controller
107A, 107B, 307A, 307B, 407A, 407B: speech decoder
108A, 108B, 308A, 308B, 408A, 408B: data distributor
109A, 109B, 309A, 309B, 409A, 409B: transmission line demodulator
304A, 304B, 404A, 404B: frame index delay unit
305A, 305B: feedback frame index delay unit
310A, 310B, 410A, 410B: audio data buffer
IN1A, IN1B: audio data input terminal
IN2A, IN2B: gain setting data input terminal
OUTA, OUTB: audio data output terminal

Claims

[Claims]

1. A voice communication terminal device that generates and transmits first multiplexed data including input first audio data, and that receives second multiplexed data including second audio data transmitted from the other party and outputs the second audio data, the device comprising: encoding means for encoding the first audio data; multiplexing means for multiplexing the encoded first audio data and second feedback audio data, which is the second audio data re-encoded for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to transmission specifications and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into encoded second audio data and encoded first feedback audio data; decoding means for decoding the separated second audio data; feedback encoding means for re-encoding the decoded second audio data as the second feedback audio data; feedback decoding means for decoding the separated first feedback audio data; and mixing means for mixing the decoded second audio data and the decoded first feedback audio data in accordance with a set ratio.

2. A voice communication terminal device that generates and transmits first multiplexed data including input first audio data, and that receives second multiplexed data including second audio data transmitted from the other party and outputs the second audio data, the device comprising: encoding means for encoding the first audio data; multiplexing means for multiplexing the encoded first audio data and second feedback audio data, which is the encoded second audio data delayed for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to transmission specifications and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into encoded second audio data and encoded first feedback audio data; decoding means for decoding the separated second audio data; delay means for delaying the separated second audio data by the total of the decoding delay of the second audio data in the decoding means and the encoding delay of the first audio data in the encoding means, and outputting it as the second feedback audio data; feedback decoding means for decoding the separated first feedback audio data; and mixing means for mixing the decoded second audio data and the decoded first feedback audio data in accordance with a set ratio.

3. A voice communication system comprising a plurality of voice communication terminal devices according to claim 1 connected to each other via a transmission line, a plurality of voice communication terminal devices according to claim 2 connected to each other via a transmission line, or one or more voice communication terminal devices according to claim 1 and one or more voice communication terminal devices according to claim 2 connected to each other via a transmission line, wherein the first feedback audio data is the first audio data that has been decoded and re-encoded at the other party and fed back from the other party.

4. A voice communication terminal device that generates and transmits first multiplexed data including input first audio data, and that receives second multiplexed data including second audio data transmitted from the other party and outputs the second audio data, the device comprising: encoding means for encoding the first audio data and assigning first index data to the first audio data; multiplexing means for multiplexing the encoded first audio data, the first index data, and second feedback index data, which is second index data delayed for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to transmission specifications and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into encoded second audio data, second index data, and first feedback index data; decoding means for decoding the separated second audio data; delay means for delaying the separated second index data by the total of the decoding delay of the second audio data in the decoding means and the encoding delay of the first audio data in the encoding means, and outputting it as the second feedback index data; feedback delay means for delaying the separated first feedback index data by the decoding delay of the second audio data in the decoding means; buffer means for holding the first audio data and the first index data assigned to the first audio data and, at the timing when the first feedback index data is output from the feedback delay means, outputting the first audio data to which the first index data corresponding to that first feedback index data is assigned; and mixing means for mixing the decoded second audio data and the first audio data output from the buffer means in accordance with a set ratio.

5. A voice communication system comprising a plurality of voice communication terminal devices according to claim 4 connected to each other via a transmission line, wherein the first feedback index data is the first index data that has been delayed at the other party by the total of the decoding delay of the first audio data and the encoding delay of the second audio data and fed back from the other party, and the second index data is index data assigned to the second audio data at the other party.

6. A voice communication terminal device that generates and transmits first multiplexed data including input first audio data, and that receives second multiplexed data including second audio data transmitted from the other party and outputs the second audio data, the device comprising: encoding means for encoding the first audio data and assigning first index data to the encoded first audio data; multiplexing means for multiplexing the encoded first audio data, the first index data, and second feedback index data, which is second index data delayed for feedback to the other party, to generate the first multiplexed data; modulating means for modulating the first multiplexed data according to transmission specifications and transmitting it; demodulating means for receiving and demodulating the modulated second multiplexed data; distributing means for separating the demodulated second multiplexed data into encoded second audio data, second index data, and first feedback index data; decoding means for decoding the separated second audio data; delay means for delaying the separated second index data by the total of the decoding delay of the second audio data in the decoding means and the encoding delay of the first audio data in the encoding means, and outputting it as the second feedback index data; buffer means for holding the encoded first audio data and the first index data assigned to the encoded first audio data and, at the timing when the first feedback index data is output from the distributing means, outputting the encoded first audio data to which the first index data corresponding to that first feedback index data is assigned; feedback decoding means for decoding the first audio data output from the buffer means; and mixing means for mixing the decoded second audio data and the decoded first audio data in accordance with a set ratio.

7. A voice communication system comprising a plurality of voice communication terminal devices according to claim 6 connected to each other via a transmission line, wherein the first feedback index data is the first index data that has been delayed at the other party by the total of the decoding delay of the first audio data and the encoding delay of the second audio data and fed back from the other party, and the second index data is index data assigned to the second audio data encoded at the other party.