JP2008042354A

JP2008042354A - Communication terminal, communication method, and communication program

Info

Publication number: JP2008042354A
Application number: JP2006211576A
Authority: JP
Inventors: Haruya Miyajima; 春弥宮島; Naoya Seta; 直也瀬田; Hideki Hayashi; 秀樹林; Teruya Fujii; 輝也藤井
Original assignee: SoftBank Mobile Corp
Current assignee: SoftBank Corp
Priority date: 2006-08-03
Filing date: 2006-08-03
Publication date: 2008-02-21

Abstract

PROBLEM TO BE SOLVED: To output a sound indicating that voice is about to be output at the timing at which voice generated from voice packets becomes available for output, and then to output the voice.
SOLUTION: The communication terminal for performing half-duplex communication of voice data comprises: a receiving part for receiving a plurality of voice packets indicating voice; a voice generation part for generating the voice from the plurality of voice packets received by the receiving part; a voice output possibility detection part for detecting that it has become possible to output the voice as a result of the voice generation part generating the voice; a standby control part for making the communication terminal stand by until the voice output possibility detection part detects that it has become possible to output the voice; a notification sound output part for outputting an output notification sound, indicating that the voice is to be output, at least at the timing at which the voice output possibility detection part detects that it has become possible to output the voice; and a voice output part for outputting the voice generated by the voice generation part after the notification sound output part outputs the output notification sound.
COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a communication terminal, a communication method, and a communication program. In particular, it relates to a communication terminal, a communication method, and a communication program that perform half-duplex communication of voice data by packet communication.

Patent Document 1 proposes a mobile terminal device that, in the Push-To-Talk (P2T) communication scheme, can detect a voice uttered by the user and start a conversation using the P2T function. Specifically, in the mobile terminal device described in Patent Document 1, when the control circuit unit recognizes that voice has been input from the microphone, the device transmits a conversation authority acquisition request to the P2T server and acquires the authority to converse. When the control circuit unit finds that no voice has been input from the microphone for a certain period of time, the device transmits a conversation authority release request to the P2T server and releases the authority to converse.
Patent Document 1: JP 2006-101048 A

Here, the P2T scheme is a communication scheme based on VoIP switching, that is, a scheme that exchanges packets. Voice quality therefore deteriorates because of transmission delay, loss, and the like of voice packets. To compensate for such transmission delay, voice packets are usually buffered. However, the invention disclosed in Patent Document 1 does not take correction processing for transmission delay and the like into account. Nor does it consider the effect on the user of the delay from when a voice packet is received until the received voice packet is depacketized and the digitized voice data is output.

Accordingly, an object of the present invention is to provide a communication terminal, a communication method, and a communication program that can solve the above problem. This object is achieved by the combination of features recited in the independent claims. The dependent claims define further advantageous embodiments of the present invention.

To solve the above problem, a first aspect of the present invention is a communication terminal that performs half-duplex communication of voice data by packet communication, comprising: a receiving unit that receives a plurality of voice packets indicating voice; a voice generation unit that generates the voice from the plurality of voice packets received by the receiving unit; a voice output enable detection unit that detects that the voice has become available for output as a result of the voice generation unit generating the voice; a standby control unit that makes the communication terminal wait until the voice output enable detection unit detects that the voice has become available for output; a notification sound output unit that outputs an output notification sound, indicating that the voice is about to be output, at least at the timing at which the voice output enable detection unit detects that the voice has become available for output; and a voice output unit that outputs the voice generated by the voice generation unit after the notification sound output unit has output the output notification sound.

The communication terminal may further comprise a voice packet reception detection unit that detects that a predetermined voice packet among the plurality of voice packets received by the receiving unit has been received, and the notification sound output unit may further output, at the timing at which the voice packet reception detection unit detects that the predetermined voice packet has been received, a reception notification sound indicating that the predetermined voice packet has been received. The voice packet reception detection unit may detect that any one of the plurality of voice packets received by the receiving unit has been received first. The notification sound output unit may output the reception notification sound and the output notification sound in succession, from the timing at which the voice packet reception detection unit detects that the predetermined voice packet has been received until the timing at which the voice output enable detection unit detects that the voice has become available for output. The reception notification sound and the output notification sound may be sounds of different frequencies.

The voice generation unit may include a buffer processing unit that temporarily stores the plurality of voice packets received by the receiving unit, a predetermined number of packets at a time, and a voice data conversion unit that converts the voice packets stored by the buffer processing unit into voice data, the predetermined number of packets at a time. The voice output enable detection unit may detect that the voice has become available for output when a predetermined time has elapsed after the voice data conversion unit has converted the predetermined number of packets among the plurality of voice packets received by the receiving unit.

The voice generation unit may include an error packet detection unit that detects a damaged voice packet among the plurality of voice packets received by the receiving unit, or a lost voice packet among the plurality of voice packets that the receiving unit should have received, and a retransmission request unit that requests the communication terminal that transmitted the plurality of voice packets to retransmit the voice packet detected by the error packet detection unit. The voice output enable detection unit may detect that the voice has become available for output as a result of the voice generation unit generating the voice after the receiving unit has received the voice packet whose retransmission was requested by the retransmission request unit.

A second aspect of the present invention is a communication method for performing half-duplex communication of voice data by packet communication, comprising: a receiving step of receiving a plurality of voice packets indicating voice; a voice generation step of generating the voice from the plurality of voice packets received in the receiving step; a voice output enable detection step of detecting that the voice has become available for output as a result of the voice being generated in the voice generation step; a standby control step of making the communication terminal wait until it is detected in the voice output enable detection step that the voice has become available for output; a notification sound output step of outputting an output notification sound, indicating that the voice is about to be output, at least at the timing at which it is detected in the voice output enable detection step that the voice has become available for output; and a voice output step of outputting the voice generated in the voice generation step after the output notification sound has been output in the notification sound output step.

A third aspect of the present invention is a communication program for a communication terminal that performs half-duplex communication of voice data by packet communication, the program causing the communication terminal to function as: a receiving unit that receives a plurality of voice packets indicating voice; a voice generation unit that generates the voice from the plurality of voice packets received by the receiving unit; a voice output enable detection unit that detects that the voice has become available for output as a result of the voice generation unit generating the voice; a standby control unit that makes the communication terminal wait until the voice output enable detection unit detects that the voice has become available for output; a notification sound output unit that outputs an output notification sound, indicating that the voice is about to be output, at least at the timing at which the voice output enable detection unit detects that the voice has become available for output; and a voice output unit that outputs the voice generated by the voice generation unit after the notification sound output unit has output the output notification sound.

Note that the above summary of the invention does not enumerate all of the necessary features of the present invention; sub-combinations of these feature groups may also constitute inventions.

According to the present invention, a notification sound indicating that voice is about to be output can be output at the timing at which the voice generated from the voice packets becomes available for output.

The present invention is described below through embodiments of the invention. The following embodiments do not limit the invention according to the claims, and not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention.

FIG. 1 shows an example of the functional configuration of a communication terminal 10 according to an embodiment of the present invention. The communication terminal 10 is, for example, a mobile phone having a function of performing half-duplex communication of voice data by packet communication. The communication terminal 10 may instead be a portable communication terminal such as a PDA having that function, or a personal computer that is connected to a network by wire and has that function. The purpose of the communication terminal 10 according to this embodiment is to output a notification sound indicating that voice is about to be output, at the timing at which it detects that voice has been generated from the received voice packets and has become available for output.

The communication terminal 10 according to this embodiment includes a receiving unit 100, a transmission unit 102, a voice generation unit 110, a voice packet reception detection unit 120, a standby control unit 130, a voice output enable detection unit 140, a voice output unit 150, a notification sound output unit 152, a speaker 154, and an antenna 160. The voice generation unit 110 includes a buffer processing unit 112, a voice data conversion unit 114, an error packet detection unit 116, and a retransmission request unit 118.

The receiving unit 100 receives a plurality of voice packets indicating voice that were transmitted by the communication partner's terminal. Specifically, the partner terminal first digitally encodes the voice data and generates a plurality of voice packets from the digitally encoded voice data. The partner terminal then transmits the voice packets to the communication terminal 10, and the receiving unit 100 receives them.

When the communication terminal 10 is, for example, a wireless communication terminal, the receiving unit 100 receives voice packets via the antenna 160 from the partner terminal with which it performs half-duplex communication of voice data by packet communication. When the communication terminal 10 is, for example, a personal computer, the receiving unit 100 may receive the voice packets from the partner terminal over a wired connection. The receiving unit 100 supplies the received voice packets to the voice generation unit 110 and the voice packet reception detection unit 120.

The transmission unit 102 transmits a plurality of voice packets to the partner terminal. Under the control of the retransmission request unit 118, the transmission unit 102 also transmits to the partner terminal a signal requesting retransmission of damaged and lost voice packets. When the communication terminal 10 is, for example, a wireless communication terminal, the transmission unit 102 transmits the voice packets to the partner terminal via the antenna 160. When the communication terminal 10 is, for example, a personal computer, the transmission unit 102 may transmit the voice packets to the partner terminal by wire, for example via a network such as the Internet.

The voice generation unit 110 generates voice from the plurality of voice packets received from the receiving unit 100. That is, the voice generation unit 110 depacketizes the voice packets to extract the digitally encoded voice data, and then generates voice from the extracted data. The voice generation unit 110 also analyzes the header of each voice packet received from the receiving unit 100 and performs buffering, retransmission processing for damaged voice packets, retransmission processing for voice packets lost in transit, and the like.

Specifically, the buffer processing unit 112 temporarily stores the voice packets received from the receiving unit 100, a predetermined number of packets at a time. The buffer processing unit 112 has a buffer that temporarily stores the predetermined number of voice packets. The buffer processing unit 112 then analyzes the header of each voice packet (for example, the RTP header) and obtains the intervals between the arrival times, an arrival time being the time at which a voice packet reached the receiving unit 100. The buffer processing unit 112 adjusts the arrival time intervals between the packets so that they are equal, and supplies the voice packets with adjusted intervals to the voice data conversion unit 114, the predetermined number of packets at a time.
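The interval adjustment described above can be illustrated with a minimal sketch. It is not taken from the patent: the `VoicePacket` type, the `equalize_intervals` function, and the 20 ms frame spacing are assumptions made for illustration, and the scheduling rule (start playout at the latest arrival time in the batch) is only one simple way to make the spacing uniform.

```python
from dataclasses import dataclass


@dataclass
class VoicePacket:
    # Hypothetical packet record; field names are assumptions for this sketch.
    seq: int           # sequence number taken from the packet header
    arrival_ms: float  # time at which the packet reached the receiver
    payload: bytes = b""


def equalize_intervals(packets, frame_ms=20.0):
    """Return (seq, playout_ms) pairs spaced exactly frame_ms apart.

    The batch starts at the latest arrival time, so every packet has
    already arrived by the time it is scheduled.
    """
    if not packets:
        return []
    ordered = sorted(packets, key=lambda p: p.seq)
    first_seq = ordered[0].seq
    start_ms = max(p.arrival_ms for p in ordered)
    return [(p.seq, start_ms + (p.seq - first_seq) * frame_ms) for p in ordered]


# Packets that arrived with uneven gaps (0 ms, 35 ms, 41 ms).
batch = [VoicePacket(1, 0.0), VoicePacket(2, 35.0), VoicePacket(3, 41.0)]
schedule = equalize_intervals(batch)
```

With this batch, the three packets are scheduled 20 ms apart starting at 41 ms, regardless of the uneven gaps with which they arrived.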

The voice data conversion unit 114 converts the voice packets received from the buffer processing unit 112, a predetermined number at a time, into voice data. Specifically, the voice data conversion unit 114 depacketizes the voice packets and converts them into digitally encoded voice data. The voice data conversion unit 114 supplies the voice data to the voice output unit 150.

The error packet detection unit 116 detects damaged voice packets among the voice packets received from the receiving unit 100. It also detects lost voice packets among the voice packets that the receiving unit 100 should have received. Specifically, the error packet detection unit 116 analyzes the RTP header of each voice packet received by the receiving unit 100 and detects information on lost voice packets and on damaged voice packets. Based on the detected information, the error packet detection unit 116 then supplies the retransmission request unit 118 with information on the voice packets whose retransmission should be requested from the partner terminal.
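As a rough illustration of this detection step, the sketch below finds lost packets as gaps in the header sequence numbers and collects damaged packets from a per-packet integrity flag. The function name `find_error_packets`, the dictionary input, and the integrity flag are assumptions for illustration; the patent does not specify how damage is detected, and sequence-number wraparound is ignored here.

```python
def find_error_packets(received, first_seq, last_seq):
    """Return (lost, damaged) sequence numbers in [first_seq, last_seq].

    received maps a sequence number to True when the packet passed an
    integrity check (an assumed flag) and to False when it is damaged.
    """
    expected = range(first_seq, last_seq + 1)
    lost = [s for s in expected if s not in received]
    damaged = [s for s in expected if s in received and not received[s]]
    return lost, damaged


# Packets 1..5 were expected; 3 and 5 never arrived and 2 arrived damaged.
lost, damaged = find_error_packets({1: True, 2: False, 4: True}, 1, 5)
to_retransmit = sorted(lost + damaged)
```

The combined list is the kind of information the error packet detection unit would hand to the retransmission request unit.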

The retransmission request unit 118 requests the communication terminal that transmitted the voice packets to retransmit the voice packets detected by the error packet detection unit 116. It sends a signal requesting retransmission, together with information identifying the voice packets to be retransmitted, to the partner terminal via the transmission unit 102. For example, the retransmission request unit 118 uses RTCP to feed back to the partner terminal information indicating which of the voice packets received by the communication terminal 10 were lost or damaged, thereby requesting retransmission of the voice packets detected by the error packet detection unit 116.

The voice output enable detection unit 140 detects that the voice has become available for output as a result of the voice generation unit 110 generating the voice. Specifically, the voice output enable detection unit 140 detects that the voice has become available for output when a predetermined time has elapsed after the voice data conversion unit 114 has converted a predetermined number of packets among the voice packets received by the receiving unit 100.
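The detection rule above (a predetermined packet count converted, then a predetermined time elapsed) can be sketched as follows. The class name `OutputReadyDetector`, its method names, and the example values (3 packets, 50 ms) are hypothetical and chosen only to make the rule concrete.

```python
class OutputReadyDetector:
    """Reports readiness once enough packets have been converted and a
    fixed time has passed since that count was first reached."""

    def __init__(self, required_packets, settle_ms):
        self.required_packets = required_packets  # predetermined packet count
        self.settle_ms = settle_ms                # predetermined time
        self.converted = 0
        self.threshold_reached_at = None

    def on_packet_converted(self, now_ms):
        # Called each time one voice packet has been converted to voice data.
        self.converted += 1
        if self.converted >= self.required_packets and self.threshold_reached_at is None:
            self.threshold_reached_at = now_ms

    def is_ready(self, now_ms):
        return (self.threshold_reached_at is not None
                and now_ms - self.threshold_reached_at >= self.settle_ms)


detector = OutputReadyDetector(required_packets=3, settle_ms=50)
for t in (0, 10, 20):
    detector.on_packet_converted(t)
ready_early = detector.is_ready(60)  # only 40 ms after the third conversion
ready_late = detector.is_ready(70)   # 50 ms after the third conversion
```

A standby controller could poll `is_ready` and keep voice output suspended until it returns true.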

The voice output enable detection unit 140 also detects that the voice has become available for output as a result of the voice data conversion unit 114 of the voice generation unit 110 performing the conversion after the receiving unit 100 has received the voice packets whose retransmission was requested by the retransmission request unit 118. The voice output enable detection unit 140 supplies information indicating that the voice has become available for output to the standby control unit 130 and the notification sound output unit 152. The standby control unit 130 makes the communication terminal 10 wait until the voice output enable detection unit 140 detects that the voice has become available for output. Specifically, the standby control unit 130 acts on the voice output unit 150 so that the voice output unit 150 suspends the output of voice, thereby making the communication terminal 10 wait.

The voice packet reception detection unit 120 detects that a predetermined voice packet among the voice packets received by the receiving unit 100 has been received. Specifically, it detects that any one of the voice packets received by the receiving unit 100 has been received first. The voice packet reception detection unit 120 may instead detect that a predetermined number of voice packets have been received. It supplies information indicating that the predetermined voice packet has been received to the notification sound output unit 152.

The notification sound output unit 152 outputs a reception notification sound from the timing at which the voice packet reception detection unit 120 detects that the predetermined voice packet has been received until immediately before the voice output unit 150 outputs the voice data. Specifically, the notification sound output unit 152 varies the length of time for which it outputs the reception notification sound. For example, it may extend the output of the reception notification sound so that it lasts from the timing at which the voice packet reception detection unit 120 detects that the predetermined voice packet has been received until immediately before the voice output unit 150 outputs the voice data. When it extends the output time in this way, the notification sound output unit 152 may output the reception notification sound while changing its frequency.

The notification sound output unit 152 also outputs an output notification sound, indicating that the voice is about to be output, at least at the timing at which the voice output enable detection unit 140 detects that the voice has become available for output. Specifically, the notification sound output unit 152 outputs the output notification sound at the timing at which it receives, from the voice output enable detection unit 140, the information indicating that the voice has become available for output. Here, the reception notification sound and the output notification sound are sounds of different frequencies. The notification sound output unit 152 may output the reception notification sound and the output notification sound in succession. It may also output the reception notification sound intermittently until the timing at which it outputs the output notification sound. The notification sound output unit 152 outputs the reception notification sound and the output notification sound from the speaker 154.
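A minimal sketch of two notification sounds with different frequencies, output in succession, might look like this. The specific frequencies (440 Hz and 880 Hz), the 8 kHz sample rate, the tone durations, and the function names are all assumptions for illustration; the patent only requires that the two sounds differ in frequency.

```python
import math

RECEPTION_HZ = 440.0  # assumed frequency of the reception notification sound
OUTPUT_HZ = 880.0     # assumed frequency of the output notification sound


def tone(freq_hz, duration_ms, sample_rate=8000, amplitude=0.5):
    """Return sine-wave samples in the range [-amplitude, amplitude]."""
    n = int(sample_rate * duration_ms / 1000)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def notification_sequence(wait_ms, output_tone_ms=100):
    """Reception tone for the whole waiting period, then the output tone."""
    return tone(RECEPTION_HZ, wait_ms) + tone(OUTPUT_HZ, output_tone_ms)


# 200 ms of waiting followed by a 100 ms output notification sound.
samples = notification_sequence(200)
```

The returned list holds raw samples only; writing them to a speaker or a file is outside the scope of this sketch.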

The voice output unit 150 outputs the voice generated by the voice generation unit 110 after the notification sound output unit 152 has output the output notification sound. It does so after the standby control unit 130 has released the communication terminal 10 from standby. Specifically, the voice output unit 150 outputs the voice data received from the voice data conversion unit 114 from the speaker 154 after the standby control unit 130 has released the standby of the communication terminal 10.

The communication terminal 10 according to this embodiment can output a reception notification sound, indicating that a voice packet has been received, from the time the voice packet is received until immediately before the voice data converted from the received voice packet is output. Because the reception notification sound fills the interval between reception of the voice packet and output of the voice data, the psychological stress that the user experiences from transmission delay and from the delay of predetermined processing such as buffering can be reduced.

The communication terminal 10 according to this embodiment can also output the reception notification sound indicating that a voice packet has been received and, immediately before outputting the voice data, an output notification sound indicating that the voice is about to be output. As a result, even when output of the voice data is delayed by the time required for buffering the voice packets, converting them into voice data, and retransmitting damaged or lost packets, the user can carry out half-duplex communication without a sense of incongruity.

FIG. 2 shows an example of an overview of the flow of processing in the communication terminal 10 according to this embodiment. First, the communication terminal 10 (A) transmits a plurality of voice packets of voice data (for example, voice packets 202 through 204) to the communication terminal 10 (B) as original data 200. Specifically, the communication terminal 10 (A) acquires the voice uttered by its user and the sound of its surrounding environment, and digitally encodes the acquired voice data. The communication terminal 10 (A) then packetizes the digitally encoded voice data into a plurality of voice packets and transfers them to the communication terminal 10 (B).

The communication terminal 10 (B) receives the voice packets transferred by the communication terminal 10 (A). Before reaching the communication terminal 10 (B), each voice packet is delayed according to, among other things, the transmission speed of the transmission path it travels (transmission delay 250). Because the voice packets do not necessarily travel the same transmission path, the time each packet takes to reach the communication terminal 10 (B) may differ. Consequently, the time intervals between the voice packets as transferred by the communication terminal 10 (A) may not match the time intervals between the voice packets as received by the communication terminal 10 (B).

For example, as shown in the transmission data 220, the intervals between the voice packets (for example, voice packets 202 through 204) are no longer equal. In addition, at least some of the voice packets may be damaged during transfer, resulting in a damaged packet 206. Furthermore, at least some of the voice packets transmitted by the communication terminal 10(A) may be lost during transfer (for example, lost packet 208).

The communication terminal 10(B) therefore first stores the received voice packets temporarily in the buffer 212, a predetermined number of packets at a time. The buffer processing unit 112 then adjusts the arrival time intervals between the voice packets stored in the buffer 212 so that they are equal. The communication terminal 10(B) also analyzes the headers of the received voice packets and requests the communication terminal 10(A) to retransmit any damaged or lost voice packets. After receiving the retransmitted voice packets from the communication terminal 10(A), the communication terminal 10(B) converts the voice packets into voice data (processing 210). The communication terminal 10(B) then outputs the voice data.
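The buffering and retransmission steps can be sketched roughly as follows. This is not the patented implementation: the `VoicePacket` fields (a sequence number and a damage flag derived from the header) and both function names are hypothetical, since the specification does not define the packet format.

```python
from dataclasses import dataclass


@dataclass
class VoicePacket:
    seq: int               # hypothetical sequence number carried in the header
    payload: bytes         # encoded voice carried by the packet
    damaged: bool = False  # hypothetical result of a header or checksum check


def find_packets_to_retransmit(buffered, expected_seqs):
    """Return sequence numbers that are lost or damaged in the buffer."""
    received_ok = {p.seq for p in buffered if not p.damaged}
    return [seq for seq in expected_seqs if seq not in received_ok]


def to_voice_data(buffered):
    """Concatenate intact payloads in sequence order (stands in for processing 210)."""
    ordered = sorted(buffered, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered if not p.damaged)


# Packet 1 arrived damaged and packet 2 never arrived.
buffer_212 = [
    VoicePacket(0, b"a"),
    VoicePacket(1, b"b", damaged=True),
    VoicePacket(3, b"d"),
]
missing = find_packets_to_retransmit(buffer_212, range(4))
print(missing)  # [1, 2]

# After the sender retransmits the requested packets, conversion can proceed.
buffer_212 += [VoicePacket(1, b"b"), VoicePacket(2, b"c")]
print(to_voice_data(buffer_212))  # b'abcd'
```

Sorting by sequence number before conversion also removes the dependence on arrival order, so playback can proceed at a fixed interval regardless of how unevenly the packets arrived.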

When the voice packet reception detection unit 120 of the communication terminal 10(B) detects that a predetermined voice packet has been received, for example that the first of the plurality of voice packets has been received, the notification sound output unit 152 outputs a reception notification sound (arrow 300). The notification sound output unit 152 also outputs an output notification sound at the timing at which the voice output enable detection unit 140 detects that the voice data conversion unit 114 has converted the predetermined number of voice packets into voice data and that the converted voice data can be output from the voice output unit 150, that is, immediately before the voice output unit 150 outputs the voice data (arrow 310).

The notification sound output unit 152 outputs the reception notification sound and the output notification sound continuously, for example from the timing indicated by arrow 300 to the timing indicated by arrow 310. The notification sound output unit 152 may extend the output of the reception notification sound until the timing of arrow 310. That is, the notification sound output unit 152 may output the reception notification sound continuously throughout the period 315. Alternatively, the notification sound output unit 152 may output the reception notification sound intermittently during the period 315. Furthermore, the notification sound output unit 152 uses different frequencies for the reception notification sound and the output notification sound. This allows the user to easily recognize when voice data will be output from the communication terminal 10(B). After the notification sound output unit 152 outputs the output notification sound, the voice output unit 150 outputs the voice generated by the voice generation unit 110, for example from the speaker 154.
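The ordering of the two notification sounds and the voice output can be sketched as a simple timeline. The function name, the tone frequencies, and the 200 ms duration of the output notification sound are assumptions made for illustration; the specification only requires that the two sounds differ in frequency and that the voice follows the output notification sound.

```python
RECEPTION_TONE_HZ = 440  # assumed frequency of the reception notification sound
OUTPUT_TONE_HZ = 880     # assumed, different frequency of the output notification sound


def notification_schedule(first_packet_ms, output_ready_ms, output_tone_ms=200):
    """Build (name, start_ms, end_ms, frequency_hz) events for one received utterance.

    first_packet_ms: time the predetermined (e.g. first) packet is detected (arrow 300)
    output_ready_ms: time the voice is detected as ready for output (arrow 310)
    output_tone_ms:  assumed duration of the output notification sound
    """
    if output_ready_ms < first_packet_ms:
        raise ValueError("voice cannot become ready before the first packet arrives")
    voice_start_ms = output_ready_ms + output_tone_ms
    return [
        # The reception sound fills period 315, however long buffering takes.
        ("reception_notification", first_packet_ms, output_ready_ms, RECEPTION_TONE_HZ),
        ("output_notification", output_ready_ms, voice_start_ms, OUTPUT_TONE_HZ),
        # The voice itself starts only after the output notification sound ends.
        ("voice", voice_start_ms, None, None),
    ]


schedule = notification_schedule(first_packet_ms=100, output_ready_ms=600)
for event in schedule:
    print(event)
```

Because the reception notification sound ends exactly when the voice becomes ready, its length automatically tracks the buffering and retransmission time rather than being fixed in advance.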

With the communication terminal 10 according to the present embodiment, not only can a reception notification sound indicating that a predetermined voice packet has been received be output, but the length of time for which the reception notification sound is output can also be varied by the time required for processing such as buffer processing and retransmission of damaged or lost packets. The output notification sound can then be output at the timing at which the voice is output. Because the user can easily recognize when the voice will be output, the psychological stress felt by the user can be reduced even when the voice output is delayed by, for example, buffer processing of the voice packets.

FIG. 3 shows an example of the hardware configuration of the communication terminal 10 according to the present embodiment. The communication terminal 10 includes a CPU peripheral unit having a CPU 1000, a RAM 1020, a graphic controller 1075, and a display device 1080 that are connected to one another by a host controller 1082, and an input/output unit having a communication interface 1030 and a ROM 1010 that are connected to the host controller 1082 by an input/output controller 1084. The communication terminal 10 is, for example, a mobile phone. The communication terminal 10 may instead be a PDA or a digital camera with a communication function, or a personal computer. When the communication terminal 10 is a personal computer, the input/output unit further includes a hard disk drive and a CD-ROM drive. A legacy input/output unit having a flexible disk drive and an input/output chip may also be connected to the input/output controller 1084.

The host controller 1082 connects the RAM 1020 to the CPU 1000 and the graphic controller 1075, which access the RAM 1020 at a high transfer rate. The CPU 1000 operates according to programs stored in the ROM 1010 and the RAM 1020 and controls each unit. The graphic controller 1075 acquires image data that the CPU 1000 or the like generates in a frame buffer provided in the RAM 1020, and displays it on the display device 1080. Alternatively, the graphic controller 1075 may internally include a frame buffer that stores the image data generated by the CPU 1000 or the like.

The input/output controller 1084 connects the host controller 1082 to the communication interface 1030, the hard disk drive, and the CD-ROM drive, which are relatively high-speed input/output devices. The communication interface 1030 communicates with other devices via a network. The hard disk drive stores programs and data used by the CPU 1000 in the communication terminal 10. The CD-ROM drive reads a program or data from a CD-ROM and provides it to the hard disk drive via the RAM 1020.

The ROM 1010 and relatively low-speed input/output devices, namely the flexible disk drive and the input/output chip, are also connected to the input/output controller 1084. The ROM 1010 stores a boot program that the communication terminal 10 executes at startup, programs that depend on the hardware of the communication terminal 10, and the like. The flexible disk drive reads a program or data from a flexible disk and provides it to the hard disk drive via the RAM 1020. The input/output chip connects the flexible disk drive and various input/output devices via, for example, a parallel port, a serial port, a keyboard port, or a mouse port.

The communication program provided to the hard disk drive via the RAM 1020 is stored on a recording medium such as a flexible disk, a CD-ROM, or an IC card and is provided by the user. The communication program is read from the recording medium, installed on the hard disk drive in the communication terminal 10 via the RAM 1020, and executed by the CPU 1000. The communication program installed and executed on the communication terminal 10 acts on the CPU 1000 and the like to cause the communication terminal 10 to function as the reception unit 100, the transmission unit 102, the voice generation unit 110, the buffer processing unit 112, the voice data conversion unit 114, the error packet detection unit 116, the retransmission request unit 118, the voice packet reception detection unit 120, the standby control unit 130, the voice output enable detection unit 140, the voice output unit 150, the notification sound output unit 152, and the speaker 154 described with reference to FIGS. 1 and 2.

Although the present invention has been described above using an embodiment, the technical scope of the present invention is not limited to the scope described in that embodiment. It is apparent to those skilled in the art that various modifications or improvements can be made to the above embodiment. It is apparent from the description of the claims that embodiments with such modifications or improvements can also be included in the technical scope of the present invention.

FIG. 1 is a block diagram showing the functional configuration of the communication terminal 10.
FIG. 2 is a diagram showing an overview of the processing flow in the communication terminal 10.
FIG. 3 is a block diagram showing the hardware configuration of the communication terminal 10.

Explanation of symbols

10 Communication terminal
100 Reception unit
102 Transmission unit
110 Voice generation unit
112 Buffer processing unit
114 Voice data conversion unit
116 Error packet detection unit
118 Retransmission request unit
120 Voice packet reception detection unit
130 Standby control unit
140 Voice output enable detection unit
150 Voice output unit
152 Notification sound output unit
154 Speaker
160 Antenna
200 Original data
202, 204 Voice packet
206 Damaged packet
208 Lost packet
210 Processing
212 Buffer
220 Transmission data
250 Transmission delay
300, 310 Arrow
315 Period
1000 CPU
1010 ROM
1020 RAM
1030 Communication interface
1075 Graphic controller
1080 Display device
1082 Host controller
1084 Input/output controller

Claims

A communication terminal that performs half-duplex communication of voice data by packet communication, comprising:
a reception unit that receives a plurality of voice packets representing voice;
a voice generation unit that generates the voice from the plurality of voice packets received by the reception unit;
a voice output enable detection unit that detects that the voice has become able to be output as a result of the voice generation unit generating the voice;
a standby control unit that causes the communication terminal to wait until the voice output enable detection unit detects that the voice can be output;
a notification sound output unit that outputs an output notification sound, indicating that the voice is to be output, at least at the timing at which the voice output enable detection unit detects that the voice can be output; and
a voice output unit that outputs the voice generated by the voice generation unit after the notification sound output unit outputs the output notification sound.

The communication terminal according to claim 1, further comprising a voice packet reception detection unit that detects that a predetermined voice packet among the plurality of voice packets received by the reception unit has been received,
wherein the notification sound output unit further outputs a reception notification sound, indicating that the predetermined voice packet has been received, at the timing at which the voice packet reception detection unit detects that the predetermined voice packet has been received.

The communication terminal according to claim 2, wherein the voice packet reception detection unit detects that one of the plurality of voice packets received by the reception unit is received first.

The communication terminal according to claim 2, wherein the notification sound output unit continuously outputs the reception notification sound and the output notification sound from the timing at which the voice packet reception detection unit detects that the predetermined voice packet has been received to the timing at which the voice output enable detection unit detects that the voice can be output.

The communication terminal according to claim 4, wherein the reception notification sound and the output notification sound are sounds having different frequencies.

The communication terminal according to claim 1, wherein the voice generation unit includes:
a buffer processing unit that temporarily stores the plurality of voice packets received by the reception unit, a predetermined number of packets at a time; and
a voice data conversion unit that converts the plurality of voice packets into voice data, the predetermined number of packets stored by the buffer processing unit at a time,
and wherein the voice output enable detection unit detects that the voice can be output when the predetermined number of packets among the plurality of voice packets received by the reception unit has been converted by the voice data conversion unit, when a predetermined time has elapsed.

The communication terminal according to claim 1, wherein the voice generation unit includes:
an error packet detection unit that detects a damaged voice packet among the plurality of voice packets received by the reception unit, or a lost voice packet among the plurality of voice packets that the reception unit should have received; and
a retransmission request unit that requests the communication terminal that transmitted the plurality of voice packets to retransmit the voice packet detected by the error packet detection unit,
and wherein the voice output enable detection unit detects that the voice has become able to be output as a result of the voice generation unit generating the voice after the reception unit receives the voice packet whose retransmission was requested by the retransmission request unit.

A communication method for performing half-duplex communication of voice data by packet communication, comprising:
a reception step of receiving a plurality of voice packets representing voice;
a voice generation step of generating the voice from the plurality of voice packets received in the reception step;
a voice output enable detection step of detecting that the voice has become able to be output as a result of the voice being generated in the voice generation step;
a standby control step of causing a communication terminal to wait until it is detected in the voice output enable detection step that the voice can be output;
a notification sound output step of outputting an output notification sound, indicating that the voice is to be output, at least at the timing at which it is detected in the voice output enable detection step that the voice can be output; and
a voice output step of outputting the voice generated in the voice generation step after the output notification sound is output in the notification sound output step.

A communication program for a communication terminal that performs half-duplex communication of voice data by packet communication, the communication program causing the communication terminal to function as:
a reception unit that receives a plurality of voice packets representing voice;
a voice generation unit that generates the voice from the plurality of voice packets received by the reception unit;
a voice output enable detection unit that detects that the voice has become able to be output as a result of the voice generation unit generating the voice;
a standby control unit that causes the communication terminal to wait until the voice output enable detection unit detects that the voice can be output;
a notification sound output unit that outputs an output notification sound, indicating that the voice is to be output, at least at the timing at which the voice output enable detection unit detects that the voice can be output; and a voice output unit that outputs the voice generated by the voice generation unit after the notification sound output unit outputs the output notification sound.