JP4069207B2

JP4069207B2 - Communication device

Info

Publication number: JP4069207B2
Application number: JP2005052041A
Authority: JP
Inventors: 一志西本; 加奈代小倉
Original assignee: Japan Advanced Institute of Science and Technology
Current assignee: Japan Advanced Institute of Science and Technology
Priority date: 2005-02-25
Filing date: 2005-02-25
Publication date: 2008-04-02
Anticipated expiration: 2025-02-25
Also published as: JP2006236149A

Description

The present invention relates to a communication device that supports voice communication by providing received voice data to the client terminals of a plurality of users.

Now that Internet access is widespread, non-face-to-face, text-based communication such as e-mail, electronic bulletin boards, and chat is used routinely. Text chat in particular resembles face-to-face conversation in that messages are exchanged turn by turn. It also has an advantage that face-to-face conversation lacks: "multi-threaded conversation" is easy. Here, a multi-threaded conversation is one in which conversations on several topics proceed in parallel in the same conversation space and a single participant takes part in more than one topic at the same time. In face-to-face settings, several topics may also develop at once when many people gather, but this is merely a division of the group by topic, so it is not regarded as a multi-threaded conversation.

In everyday face-to-face conversation, all participants must share a single topic for a given period and speak about that topic, and only one person may speak at a time (the synchrony constraint). Face-to-face conversation therefore proceeds inefficiently. As a result, a participant who thinks of something unrelated to the current topic cannot say it right away and often forgets the idea itself.

Text chat and face-to-face conversation also differ greatly in whether the participants' positions matter. In face-to-face conversation, especially when many people share one place as in a meeting or a dinner, the initial seating determines who is easy to talk to and who is not (the proximity constraint). To talk to someone seated far away, a participant must either change seats or call out loudly. The former interrupts the conversation itself and the latter interrupts the conversation of the person being called; either way, the progress of the conversation is seriously hindered.

Text chat has no such constraints, so a participant can post an idea, or a remark related to any past utterance, at any time without regard to speaking order. Multi-threaded conversation thus becomes possible, which makes the conversation more efficient. In text chat, the utterance history is the conversation space, and each participant's login name shown in the history acts as that participant's stand-in. Only presence information (who is participating) has meaning, and the notion of a positional relationship among participants does not exist. Unlike in face-to-face conversation, participants can therefore speak without regard to where the others are. This is very effective, for example, when collecting every idea at a planning meeting.

In text conversation, however, the burden of typing makes it hard to say everything one wants to say. There is therefore a need for a device that supports communication by voice, and in particular one that enables multi-threaded conversation by voice.

As a device that supports communication using voice data, Non-Patent Document 1 below discloses "The Mad Hatter's Cocktail Party", a simultaneous conversation environment that uses speech recognition to apply automatic acoustic effects, for example making the voices of speakers in the same thread more clearly audible regardless of their spatial positions. Because this system determines thread membership automatically by speech recognition, recognition accuracy becomes a problem and reliability is low.

P. M. Aoki, M. Romaine, M. H. Szymanski, J. D. Thornton, D. Wilson, and A. Woodruff: The Mad Hatter's Cocktail Party: A Social Mobile Audio Space Supporting Multiple Conversations, Proc. ACM SIGCHI Conf. on Human Factors in Computing Systems, pp. 425-432, 2003.

In voice-based conversation (face-to-face conversation, telephone, voice chat), however, multi-threaded conversation is at present impossible or very difficult. The main reason is that people generally cannot listen to several utterances at once, remember them, and understand them.

To hold a multi-threaded conversation face to face, we would have to distinguish, remember, and understand several utterances at once. This is difficult within the limits of human cognition, so multi-threaded conversation rarely occurs. In text chat, by contrast, data about who said what and when is stored and displayed as an utterance history that participants can browse freely. Because this history supplements human short-term memory, the synchrony constraint is removed, and because the history serves as the place of conversation, the proximity constraint is removed as well. This appears to be what makes multi-threaded conversation possible in text chat. In other words, an utterance history is considered indispensable to multi-threaded conversation.

In a voice chat system, each utterance is recorded on a computer as voice data, and a log list showing who spoke and when is displayed, so utterances can be replayed as often as needed. Because storing the voice data and logs assists human short-term memory, voice chat is a voice communication medium in which multi-threaded conversation is easier to achieve than in face-to-face conversation or on the telephone.

On the log, however, the relationships among individual utterances (voice data) are not visible; one cannot tell which utterance is connected to which without playing the audio. For this reason, multi-threaded conversation remains difficult in existing voice chat systems.

The present invention has been made in view of these circumstances. Its object is to provide a communication device that supports communication by voice and enables multi-threaded conversation by making the relationships among voice data visible.

The inventors investigated what means users of a conventional text chat system employ to sustain a multi-threaded situation. The results are shown below.

Preliminary analysis of data collected from recorded chat conversations showed that, given how hard the utterance history is to refer to, three types of expression help participants follow the flow of talk and keep the conversation moving as smoothly as possible.
1) An expression stating to whom the remark is addressed, including proper nouns (">person" in FIG. 6)
Example) This year has only just begun. >Mr. B
2) An expression stating which topic the remark relates to (">word" in FIG. 6)
Example) Opinions seem to be divided on that >marron cream
3) An expression stating which utterance the remark responds to, where the text is presumed to have been copied and pasted (">copy-paste" in FIG. 6)
Example) A: Don't you think chestnuts and fresh cream are a bad match? >ALL > I think they go really well together. Marron cream is delicious!
(The part "Don't you think chestnuts and fresh cream are a bad match? >ALL" was copied and pasted.)

FIG. 6 is a graph of the proportion of utterances in which these three expressions appeared. FIG. 7 is a graph of the frequency of the three expressions classified by the interval between utterances (inter-utterance distance), obtained after judging the semantic connections between utterances. The data analyzed was the utterance history of 870 utterances in total: 311 utterances from three two-person conversations and 559 utterances from five three-person conversations. All subjects had prior chat experience.

FIG. 6 shows that, taken together, the three expressions are used in about one quarter of the analyzed utterances to keep the talk moving smoothly. FIG. 7 shows that every expression appears more frequently when adjacent utterances are on different topics (inter-utterance distance of 2 or more). From this it can be inferred that the more complex the situation becomes, the more participants state the relationships between utterances (who a remark is addressed to, which topic or which utterance it concerns) to make the flow of talk easier to follow.

Based on these results, the inventors developed the communication device of the present invention. The communication device of the present invention enables conversation among a plurality of users using voice data and comprises: means for receiving voice data from the client terminals of the plurality of users; means for accumulating and storing the received voice data; means for providing a list of the stored voice data to the client terminals of the plurality of users; and means for providing requested voice data to a client terminal in response to a request from that client terminal, which has received the list.
The device further comprises means for adding, to the list, information indicating the relationships among the voice data in the conversation. When a client terminal generates voice data, preceding voice data is selected from the list of voice data, and the generated voice data and the identifier of the selected preceding voice data are uploaded,
the means for accumulating and storing voice data stores the generated voice data in association with the identifier specifying the selected preceding voice data, and
the means for adding the relationship information to the list adds information about the stored voice data to the list and attaches to that information the identifier specifying the preceding voice data.
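The four means and the reply association described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the patented implementation: the names `VoiceStore`, `Entry`, `upload`, `listing`, and `fetch` are hypothetical, and sequential integer identifiers are an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    entry_id: int            # identifier of this voice data
    speaker: str             # handle name of the speaker
    audio: bytes             # recorded voice data
    reply_to: Optional[int]  # identifier of the preceding voice data, if any


class VoiceStore:
    """Toy in-memory stand-in for the communication device (hypothetical)."""

    def __init__(self) -> None:
        self._entries: dict[int, Entry] = {}

    def upload(self, speaker: str, audio: bytes,
               reply_to: Optional[int] = None) -> int:
        # Reject a reply that points at voice data the store does not hold.
        if reply_to is not None and reply_to not in self._entries:
            raise KeyError(f"unknown preceding voice data: {reply_to}")
        entry_id = len(self._entries) + 1  # assumed numbering scheme
        self._entries[entry_id] = Entry(entry_id, speaker, audio, reply_to)
        return entry_id

    def listing(self) -> list[tuple[int, str, Optional[int]]]:
        # The list carries metadata only; audio is fetched on request.
        return [(e.entry_id, e.speaker, e.reply_to)
                for e in self._entries.values()]

    def fetch(self, entry_id: int) -> bytes:
        return self._entries[entry_id].audio


store = VoiceStore()
first = store.upload("Alice", b"chestnut-and-cream")
second = store.upload("Bob", b"new-topic")
third = store.upload("Carol", b"i-agree", reply_to=first)
```

Keeping the audio out of `listing()` mirrors the separation in the text between providing the list and providing requested voice data on demand.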

According to this invention, by referring to the list, users can tell which voice data a given piece of voice data responds to.

Alternatively, in the present invention, when a client terminal generates voice data, the identifier of an addressee is selected from a list of users, and the generated voice data and the selected addressee identifier are uploaded, the means for accumulating and storing voice data stores the generated voice data in association with the selected addressee identifier, and the means for adding relationship information to the list of voice data adds information about the stored voice data to the list and attaches the addressee identifier to that information.

According to this invention, by referring to the list, users can see the addressee of each piece of voice data and thereby grasp how the voice data relate to one another.

Alternatively, in the present invention, when a client terminal generates voice data, preceding voice data is selected from the list of voice data, and the generated voice data and the identifier of the speaker of the selected preceding voice data are uploaded, the means for accumulating and storing voice data stores the identifier of the speaker of the selected preceding voice data, as the addressee identifier, in association with the generated voice data, and the means for adding relationship information to the list of voice data adds information about the stored voice data to the list and attaches the addressee identifier to that information. According to this invention, by referring to the list, users can see the addressee of each piece of voice data and thereby grasp how the voice data relate to one another.
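In this variant the addressee is not chosen from the user list but derived from the preceding voice data that the user selects. A minimal sketch of that derivation on the client side follows; the `speakers` mapping, the function names, and the payload keys are all hypothetical.

```python
# Hypothetical view of the list: voice data identifier -> handle name of its speaker.
speakers = {1: "Alice", 2: "Bob"}


def addressee_for_reply(speakers: dict[int, str], selected_id: int) -> str:
    """Return the speaker of the selected preceding voice data,
    to be uploaded as the addressee of the new voice data."""
    return speakers[selected_id]


def build_upload(speaker: str, audio: bytes, selected_id: int) -> dict:
    # Assumed payload shape; the text does not specify a wire format.
    return {
        "speaker": speaker,
        "audio": audio,
        "addressee": addressee_for_reply(speakers, selected_id),
    }


payload = build_upload("Carol", b"reply", selected_id=2)
```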

The communication device of the present invention may also comprise the client terminals of a plurality of users and a server, and enables conversation among the users using voice data. The server comprises means for receiving voice data from the plurality of client terminals, means for accumulating and storing the received voice data, means for providing a list of the stored voice data to the client terminals of the plurality of users, and means for providing requested voice data to a client terminal in response to a request from that client terminal, which has received the list.
The server further comprises means for adding, to the list, information indicating the relationships among the voice data in the conversation. Each client terminal comprises transmitting means for emitting a signal and receiving means for receiving a signal from transmitting means, and a signal can be sent and received by pointing the transmitting means of the speaker's client terminal at the receiving means of the addressee's client terminal.
When voice data is sent to the server from the speaker's client terminal, and the addressee's identifier is sent to the server from the client terminal whose receiving means received the signal from the speaker's transmitting means,
the server stores the received voice data in association with the received addressee identifier, adds information about the voice data to the list of voice data, and attaches the addressee identifier to that information.

According to this invention, when a speaker sends voice data from a client terminal to the communication device and uses his or her own transmitting means to send a signal to the addressee's receiving means, the addressee's client terminal, whose receiving means received the signal, sends the addressee's identifier to the communication device. The communication device stores the received voice data in association with the received addressee identifier, adds information about the voice data to the list of voice data, and attaches the addressee identifier to that information. The speaker can thus add addressee information to the list simply by sending a signal to the addressee's receiving means with the transmitting means, for example by pointing at the addressee.

The communication device of the present invention also accepts text data as input, and the input text data is added to the list. According to this invention, communication takes place through text data as well as voice data. For example, a user can give the specific, detailed content by voice and enter only its title or keywords as text, which makes the semantic relationships among individual pieces of voice data explicit. Items that cannot be entered by voice, such as emoticons, can also be attached to the voice data.

According to the communication device of the present invention, users taking part in the communication can grasp how the voice data relate to one another by referring to the list, which makes multi-threaded conversation by voice possible. In particular, when the relationship information includes information identifying the speaker and the addressee, users can tell, for each piece of voice data, who spoke and to whom. When the relationship information includes information identifying preceding voice data, users can tell which preceding voice data each piece of voice data responds to. When both are included, the relationships among the voice data can be grasped in still more detail.

A user can designate the addressee of voice data simply by selecting the addressee, and can attach information on which voice data an utterance responds to simply by selecting the preceding voice data in the conversation. Voice data can therefore be associated with one another easily, and multi-threaded conversation proceeds smoothly without stalling.

Furthermore, by using transmitting means that emits a signal and receiving means that receives it, a speaker can add information identifying the addressee to the list with the feel of pointing at the addressee. Introducing this system into face-to-face conversation not only enables multi-threaded conversation but also makes it possible to establish an entirely new and efficient conversation style, which is expected to be very effective in meetings and similar settings. In addition, directing the transmitting means toward the addressee makes the speaker look at the addressee, which avoids a situation in which participants keep staring at the log data and lets the conversation proceed more naturally.

Furthermore, when text data is added to the list, the semantic connections among voice data can be grasped by referring to the list, so multi-threaded conversation can be carried out more efficiently. In conventional mixed text and voice chat systems, participants without audio playback means could not understand the content of voice data at all. With this system, the attached text data, such as titles and keywords, lets them understand the content of voice data at least roughly.

(First embodiment)
The communication device S according to the present invention is described below with reference to the drawings. FIG. 1 is an explanatory diagram of the communication device S of this embodiment. The communication device S is implemented by a server, which is a computer system, and can be connected via a network such as the Internet to a plurality of client terminals C1, ..., Cx, which are likewise computer systems.

The communication device S is a system, such as a voice chat system, that supports communication by voice by receiving voice data from the users' client terminals C1, ..., Cx and providing the received voice data.

The client terminals C1, ..., Cx can access the communication device S via the network to take part in voice communication. Each client terminal has a recording function for recording speech, and the recorded speech can be uploaded as voice data to the communication device S via the network.

In response to a request from a client terminal C1, ..., Cx, the communication device S displays a user interface screen on that terminal. FIG. 2 shows the user interface screen that the communication device S provides to the client terminals C1, ..., Cx.

At a client terminal C1, ..., Cx, when a handle name, which is the participant's identifier, is entered in the handle name field 1 and the login button 2 is pressed, the communication device S receives the corresponding signal. It then displays the identifiers (here, handle names) of the logged-in users in the user list field 3 of the user interface screen and the stored log data in the history list field 4.

By means of a program provided together with the user interface screen, the communication device S causes the client terminals C1, ..., Cx to implement the following functions.

(Recording function)
The communication device S causes the client terminals C1, ..., Cx to implement a recording function that records speech, generates voice data, and uploads the voice data to the communication device S. Three kinds of recording function are provided: (1) a normal recording function, (2) a preceding-voice-data designation recording function, and (3) an addressee designation recording function.

(Normal recording function)
When a client terminal C1, ..., Cx detects that the "Record" button 6 in the recording-related button group 5 on the user interface screen has been pressed, it displays an "Utterance complete" button such as that shown in FIG. 3, covering the entire user interface screen, and starts recording. When the client terminal detects that the "Utterance complete" button has been pressed, it stops recording and uploads the generated voice data and the identifier (here, the handle name) of the speaker of that voice data to the communication device S. Here, the speaker is the person who made the utterance from which the voice data originates, that is, the person who generates and uploads the voice data.

FIG. 4 shows log data extracted from the history list field 4 of the user interface screen. The communication device S treats the identifier a1 specifying the voice data, the identifier (handle name) a2 specifying the speaker, and the upload time a3, which serves as the utterance time, as the basic log data for the received voice data, and stores the received voice data in association with this basic log data. The communication device S then provides the basic log data to each client terminal C1, ..., Cx, so that the newly generated basic log data is appended to the history list field 4 of the redisplayed user interface screen.
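The basic log data (a1, a2, a3) can be pictured as a small record. The sketch below is only an illustration: the class name `BasicLog`, the field names, and the displayed layout are assumptions, since FIG. 4 is not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BasicLog:
    voice_id: int          # a1: identifier of the voice data
    handle: str            # a2: handle name of the speaker
    uploaded_at: datetime  # a3: upload time, used as the utterance time


def format_log(log: BasicLog) -> str:
    # The layout is an assumption, not the layout of FIG. 4.
    return f"[{log.voice_id}] {log.handle} {log.uploaded_at:%H:%M:%S}"


line = format_log(BasicLog(1, "Alice", datetime(2005, 2, 25, 10, 15, 30)))
```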

(Preceding-voice-data designation recording function 1)
The client terminals C1, ..., Cx also have a function for uploading generated voice data in association with preceding voice data. After recording has been completed with the normal recording function described above, when preceding voice data is selected from the history list field 4 of the user interface screen and the "Designate preceding utterance" button 7 is pressed, the client terminal uploads the identifier a1 specifying the selected preceding voice data to the communication device S together with the voice data and the speaker's identifier (handle name). The history list shows the history of voice data and at the same time serves as the list of preceding voice data.

(Preceding-voice-data designation recording function 2)
As another preceding-voice-data designation recording function, when the "Record reply to utterance" button 8 is pressed after preceding voice data has been selected from the history list field 4, the client terminal displays the "Utterance complete" button and starts recording the utterance. When the "Utterance complete" button is pressed, it stops recording and uploads the generated voice data, the speaker's identifier (handle name), and the identifier a1 specifying the selected voice data to the communication device S.

When the communication device S receives voice data generated by either of the two preceding voice data designation recording functions, together with the speaker's identifier (handle name) and the identifier a1 of the selected preceding voice data, it attaches that identifier a1 to the generated voice data and its basic log data and stores them in association with one another. It then provides the basic log data and the identifier of the selected preceding voice data to each of the client terminals C1, ..., Cx. The history list on each client terminal displays the basic log data, with an identifier a4 that identifies the preceding voice data, for example ">>[2]", appended to the end of the entry.

(Addressee designation recording function 1)
The client terminals C1, ..., Cx also have an addressee designation recording function for recording an utterance directed at a designated addressee. This function has three patterns. In the first pattern, an utterance is recorded with the normal recording function; when an addressee is then selected from the user list field 3 and a press of the "Designate addressee" button 9 is detected, the label of button 9 is changed to the identifier (handle name) of the selected addressee for confirmation. When a press of the relabeled button 9 is detected, the generated voice data and the identifier (handle name) of the addressee are uploaded to the communication device S. For example, suppose "Bob" is selected from the user list field 3 after a normal recording has been completed. The "Designate addressee" button 9 is then relabeled ">Bob", and pressing it completes the utterance addressed to "Bob".

(Addressee designation recording function 2)
In the second pattern, when a user is selected as the addressee from the user list field 3 and a press of the "Record to designated addressee" button 10 is then detected, the client terminal displays the "Utterance complete" button on the screen and starts recording. When a press of the "Utterance complete" button is detected, recording ends, and the generated voice data, the speaker's identifier (handle name), and the selected addressee's identifier (handle name) are uploaded to the communication device S.

(Addressee designation recording function 3)
In the third pattern, the speaker of voice data chosen from the history list is designated as the addressee. When voice data is selected from the history list field 4 and a press of the "Record reply to speaker" button 11 is detected, the client terminal displays the "Utterance complete" button on the screen and starts recording. When a press of the "Utterance complete" button is detected, recording ends, and the generated voice data, the speaker's identifier (handle name), and the identifier (handle name) of the speaker of the selected voice data are uploaded to the communication device S. The identifier of the speaker of the selected voice data is treated as the addressee's identifier.

The communication device S stores the voice data received through these functions, the basic log data, and the identifier of the selected addressee (in addressee designation recording function 3, the identifier of the speaker of the selected voice data), and provides this information to each of the client terminals C1, ..., Cx. The history list field 4 on each client terminal displays the basic log data with the identifier (handle name) a5 of the user selected as the addressee added. For example, the addressee's identifier is appended to the end of the basic log data in the form ">Bob". When a user has been designated as the addressee by another user, ">>>You:" is displayed at the beginning of the line on that user's terminal.

The functions described above can also be used in combination. For example, both an addressee designation and a preceding voice data designation can be attached to the basic log data, as in ">Bob>>[4]". The addressee designation recording function can also designate multiple addressees, as in ">Susie>Andy".
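The annotations described above (">Bob", ">>[4]", ">Bob>>[4]", ">Susie>Andy", and the ">>>You:" prefix) can be illustrated with a small rendering routine. This is a minimal sketch: the function name `render_history_line` and the layout of the basic log part (`[id] handle time`) are assumptions, since the description does not specify how the basic log data itself is laid out.

```python
def render_history_line(voice_id, handle, time_text,
                        addressees=(), preceding_id=None, viewer=None):
    """Build one history list entry as a client terminal might display it."""
    # Assumed layout for the basic log data (a1, a2, a3).
    line = f"[{voice_id}] {handle} {time_text}"
    # Addressee identifiers (a5) are appended as ">name", one per addressee.
    line += "".join(f">{name}" for name in addressees)
    # The preceding voice data identifier (a4) is appended as ">>[id]".
    if preceding_id is not None:
        line += f">>[{preceding_id}]"
    # A user who has been designated as an addressee sees a ">>>You:" prefix.
    if viewer is not None and viewer in addressees:
        line = ">>>You:" + line
    return line


print(render_history_line(5, "Alice", "10:02:11", ["Bob"], preceding_id=4))
print(render_history_line(5, "Alice", "10:02:11", ["Bob"], preceding_id=4, viewer="Bob"))
```

The first call renders the entry as other users would see it, and the second as the designated addressee "Bob" would see it.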

(Playback function)
The communication device S enables the client terminals C1, ..., Cx to play back received voice data. When log data is selected from the history list field 4 and the "Listen to this→" button 12 is pressed, the client terminal requests the voice data corresponding to the selected log data from the communication device S. On receiving the request, the communication device S provides the requested voice data to the client terminal, which plays it back.

The client terminals C1, ..., Cx also request the corresponding voice data from the communication device S when a double-click on log data in the history list field 4 is detected. The communication device S provides the requested voice data, and the client terminal that receives it plays it back. Because voice data can be played with a double-click alone, operability is good.

As an additional playback function, when a press of the "Listen to next" button 13 is detected, the client terminal requests from the communication device S the voice data that follows the most recently played voice data in the order of the log data. The communication device S provides the requested voice data, and the client terminal that receives it plays it back. With this function, voice data can be heard in the order in which the log data is arranged in the history list (chronological order).

As another additional playback function, when a press of the "Listen to mine" button 14 is detected, the client terminal requests from the communication device S only the voice data whose log data has the identifier (handle name) of the operating user attached as the addressee. The communication device S provides only that voice data, and the client terminal plays it back. If several utterances are addressed to the user, the one played is the voice data uploaded after the voice data currently selected in the history list field 4 and closest to it in time. Alternatively, the utterances may be played back one after another in chronological order. With this function, a user can listen to only the voice data addressed to them in the history list.
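The selection rule for "Listen to mine" when several utterances are addressed to the user can be sketched as below. The function name `next_addressed_to` and the representation of the history as `(voice_id, addressees)` pairs in upload order are assumptions made for illustration.

```python
def next_addressed_to(history, me, selected_id):
    """Return the ID of the first entry after the selected one that is
    addressed to `me`, or None if there is none.

    history: list of (voice_id, addressees) tuples in upload order.
    """
    seen_selected = False
    for voice_id, addressees in history:
        # Only entries uploaded after the selected one are candidates;
        # the first match is the one closest in time to the selection.
        if seen_selected and me in addressees:
            return voice_id
        if voice_id == selected_id:
            seen_selected = True
    return None
```

Calling the function repeatedly, each time passing the ID it last returned, would give the sequential chronological playback mentioned as an alternative.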

As a further additional playback function, when a press of the "Listen to preceding utterance" button 15 is detected, the client terminal requests from the communication device S the preceding voice data attached to the log data selected in the history list field 4. The communication device S provides only that voice data, and the client terminal plays it back. With this function, a user can trace back through related preceding voice data.
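Tracing back through preceding voice data amounts to following the stored preceding-data identifiers one step at a time. The sketch below assumes a hypothetical mapping `preceding_of` from each voice data identifier to the identifier of its designated preceding voice data (or `None`); the description itself only states that one preceding entry is fetched per button press.

```python
def preceding_chain(preceding_of, voice_id):
    """List the voice data IDs reached by repeatedly following the
    preceding-data link, nearest first."""
    chain = []
    seen = {voice_id}  # guard against malformed, cyclic links
    current = preceding_of.get(voice_id)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = preceding_of.get(current)
    return chain
```

Each element of the returned list corresponds to what one further press of the "Listen to preceding utterance" button would request.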

(Second Embodiment)
FIGS. 5(a) and 5(b) are explanatory diagrams of the communication device S of this embodiment. The communication device S of this embodiment is intended mainly for face-to-face settings such as meetings and realizes multi-threaded dialogue while a face-to-face conversation is in progress. It adds to the communication device S of the preceding embodiment a function that lets a speaker designate an addressee by a gesture resembling pointing at that person.

The communication device S includes transmitting means b1 that emits a signal and receiving means b2 that receives the signal emitted by the transmitting means. The transmitting means b1 has an infrared light-emitting diode (LED) built into its tip and emits an identifier as infrared light when its switch is pressed. Every user holds one transmitting means b1, and each unit is assigned a unique identifier so that the emitted identifier differs from unit to unit. One receiving means b2 is connected to each of the client terminals C1, ..., Cx via an RS-232C interface.

The following description uses the speaker's client terminal C1 and the addressee's client terminal C2 as an example (FIG. 5(b)). The speaker uploads voice data and the speaker's identifier from client terminal C1 to the communication device S using the voice recording function and, as shown in FIG. 5(b), points the transmitting means b1 at the addressee's receiving means b2 and presses the switch. When the transmitting means b1 emits the speaker's identifier X as infrared light, the infrared receiver of the receiving means b2 receives identifier X and passes it to the addressee's client terminal C2, to which the receiving means b2 is connected. Client terminal C2 notifies the communication device S of the received identifier X and of the identifier Y of the addressee who uses client terminal C2.

The communication device S receives the voice data and the speaker's identifier X from the speaker's client terminal C1, and receives the speaker's identifier X and the addressee's identifier Y from the addressee's client terminal C2. If the identifier X received from C1 matches the identifier X received from C2, the communication device S attaches the addressee's identifier Y to the log data of the received voice data. The speaker thus attaches addressee information to the log data simply by sending a signal from the transmitting means b1 to the receiving means b2, a gesture as simple as pointing at the addressee. The received speaker identifier X may also be attached to the log data of the voice data, so that the same simple gesture attaches speaker information as well. Attaching both the addressee's identifier Y and the speaker's identifier X is more effective.
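The matching step performed by the communication device S might be sketched as follows. The class name `PointingMatcher` and its method names are hypothetical, and the rule that a pointing notice is matched against the most recent upload from the same speaker is an assumption; the description only states that the identifier X from both terminals must agree.

```python
class PointingMatcher:
    """Hypothetical server-side matching of uploads with pointing notices."""

    def __init__(self):
        self.logs = {}               # voice_id -> log data for that upload
        self.latest_by_speaker = {}  # speaker identifier X -> latest voice_id

    def on_upload(self, voice_id, speaker_id):
        """Called when terminal C1 uploads voice data with speaker identifier X."""
        self.logs[voice_id] = {"speaker": speaker_id, "addressees": []}
        self.latest_by_speaker[speaker_id] = voice_id

    def on_pointing_notice(self, received_speaker_id, addressee_id):
        """Called when terminal C2 reports the received identifier X and its own
        user's identifier Y. Returns the matched voice_id, or None if no upload
        carries the same speaker identifier."""
        voice_id = self.latest_by_speaker.get(received_speaker_id)
        if voice_id is None:
            return None
        # The identifiers X match, so attach addressee identifier Y to the log.
        self.logs[voice_id]["addressees"].append(addressee_id)
        return voice_id
```

A real implementation would presumably also need a time window or similar rule to decide which upload a notice belongs to; that detail is not given in the description.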

(Third Embodiment)
The communication device S of this embodiment accepts text data in addition to voice data. When the communication device S receives text data and an identifier a1 that identifies voice data from one of the client terminals C1, ..., Cx, it stores the text data in association with the log data of that voice data. The text data is provided to the client terminals together with the log data of the voice data and is displayed with the log data in the history list field 4 of each client terminal. This makes it possible to state the semantic relationship between individual pieces of voice data explicitly. Items that cannot be entered by voice, such as emoticons, can also be attached to an utterance.

(Experiments on effectiveness)
Experiments were conducted to confirm the effectiveness of the present invention, using the first embodiment as an example. Two groups of seven graduate students each (14 participants in total) took part in experiments under the following three conditions. All participants had used text chat in some form but had no experience with voice chat. To limit the effect of growing familiar with the system, the two groups used the conditions in different orders.

・Base: a voice chat system with a conventional interface, used in a non-face-to-face setting. In practice, this was the communication device of the present invention (hereinafter "ChaTEL") restricted to the utterance history and the "Listen to this" and "Record" buttons.
・Non-face-to-face ChaTEL: ChaTEL used as is in a non-face-to-face setting.
・Face-to-face ChaTEL: ChaTEL used as is in a face-to-face setting.

In each experiment, the group of seven participants was first divided into two subgroups of three and four. Each subgroup was given a different topic, and the two subgroups were instructed to talk about their topics concurrently for about 20 minutes. The topics were relatively open-ended, such as "places I would like to visit" and "games I used to play". Each subgroup was asked to continue until its assigned topic had been covered, but participants were not prohibited from talking about other things or joining the other subgroup's topic. In the non-face-to-face conditions, all participants were in completely separate locations; in the face-to-face condition, they were seated in a circle.

Because the addressee designation and preceding utterance designation functions added in this system are expected to make multi-threaded situations easier to handle, the dialogue structure itself is expected to be affected. We therefore first compared, in the non-face-to-face setting, the dialogue structure obtained with the Base system, which lacks both functions, against that obtained with ChaTEL. To compare dialogue structures, we identified for each utterance in the experimental data which earlier utterance it was semantically connected to and, treating each thread as a tree, computed the number of starts (thread starting points), the number of paths (routes connecting individual utterances within a thread), and the number of ends (thread end points). The thread length is defined as the difference between the ID of a thread's starting utterance and the ID of its last ending utterance. It serves as one indicator of whether a thread develops broadly or deeply. The results are shown in Table 1.
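The tree-based measures can be illustrated with a short sketch. It assumes a hypothetical mapping `parent_of` from each utterance ID to the ID of the utterance it responds to (`None` for a thread start), with IDs increasing in time order. Counting "paths" as the number of parent-to-child links is one possible reading of the definition above and is an assumption of this sketch, not something the description confirms.

```python
def thread_metrics(parent_of):
    """Compute starts, paths, ends, and per-thread length for a forest of
    utterances described by utterance ID -> parent ID (None for a start)."""
    has_child = {p for p in parent_of.values() if p is not None}

    def root_of(uid):
        # Walk up the links until the starting utterance of the thread.
        while parent_of[uid] is not None:
            uid = parent_of[uid]
        return uid

    starts = [u for u, p in parent_of.items() if p is None]
    ends = [u for u in parent_of if u not in has_child]
    # Assumption: one "path" per link between two connected utterances.
    paths = sum(1 for p in parent_of.values() if p is not None)

    # Thread length: ID of the last ending utterance minus ID of the start.
    last_end = {}
    for end in ends:
        root = root_of(end)
        last_end[root] = max(last_end.get(root, end), end)
    lengths = {root: last_end[root] - root for root in starts}

    return {"starts": len(starts), "paths": paths,
            "ends": len(ends), "thread_lengths": lengths}
```

For example, with `parent_of = {1: None, 2: 1, 3: None, 4: 2, 5: 1, 6: 3}` there are two threads starting at utterances 1 and 3, and the thread starting at 1 branches into two ends (4 and 5).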

Table 1 shows that, for both participant groups A and B, there is no large difference between Base and this system in any item other than thread length. Thread length tends to be larger with this system than with Base in both groups. In other words, even when utterances branch, a single thread continues longer with this system; if the number of threads is roughly the same, the longer threads continue, the more likely multi-threaded situations are to arise. Multi-threaded dialogue can therefore be sustained longer with this system.

We also calculated how the number of threads in which each participant took part concurrently differed between the Base system, which lacks the addressee designation and preceding utterance designation functions, and this system. The number of concurrent threads for each participant was obtained as follows. First, for each thread, the first and last utterances of each participant were identified, and the participant was regarded as taking part in that thread between those two utterances. Then, at the time of each utterance, we counted how many of the threads existing at that moment each participant was taking part in. The results are shown in Table 2.
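The counting procedure can be sketched as follows. The input format, a time-ordered list of `(utterance_id, speaker, thread_id)` tuples with increasing IDs, and the function name `concurrent_participation` are assumptions, as is averaging the counts over all utterances to obtain one figure per participant.

```python
def concurrent_participation(utterances):
    """Average number of threads each speaker participates in, sampled at
    every utterance. utterances: (utterance_id, speaker, thread_id) tuples
    in time order with increasing utterance IDs."""
    # First and last utterance of each speaker within each thread.
    span = {}
    for uid, speaker, thread in utterances:
        first, _ = span.get((speaker, thread), (uid, uid))
        span[(speaker, thread)] = (first, uid)

    speakers = sorted({speaker for _, speaker, _ in utterances})
    counts = {speaker: [] for speaker in speakers}
    for uid, _, _ in utterances:
        for speaker in speakers:
            # A speaker participates in a thread between their first and
            # last utterance in that thread (inclusive).
            active = sum(1 for (who, _), (first, last) in span.items()
                         if who == speaker and first <= uid <= last)
            counts[speaker].append(active)
    return {speaker: sum(c) / len(c) for speaker, c in counts.items()}
```

In this sketch a speaker who is between their first and last utterance in two different threads at the same moment is counted as participating in two threads at that moment.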

Table 2 shows that, in both groups A and B, the average number of concurrent threads per participant is higher with this system than with Base. An average of 1.0 or less for Base suggests that participants concentrated on listening until they were sure of an opportunity to speak, while an average of 1.0 or more for this system suggests that participants were always speaking in one thread while also trying to speak in another.

The results above show that this system, with its addressee designation and preceding utterance designation functions, works effectively to bring about multi-threaded situations. They do not, however, establish whether it is effective for multi-threaded dialogue in a face-to-face setting. We therefore ran a comparison between using this system in a non-face-to-face setting and in a face-to-face setting. As before, we first compared the dialogue structures in the two settings using the same method. The results are shown in Table 3.

Table 3 shows no large difference in dialogue structure between the non-face-to-face and face-to-face settings for either participant group.

To look further for differences between the two settings, we examined the video recorded during the face-to-face experiments. Both the voice data and the video confirmed that participants spoke quietly in the face-to-face setting, although in the non-face-to-face setting they had spoken at about the same volume as in ordinary face-to-face conversation. We also observed that, even though the other participants were right in front of them, most participants spent long periods looking at the monitor showing the utterance history. These observations suggest that, when this system is used face to face, participants share the same physical place but converse while psychologically keeping to their own separate spaces. On the other hand, there were also scenes in which participants reacted to others' laughter and shared the atmosphere of the room.

To enable face-to-face multi-threaded dialogue free of the constraints of proximity and synchrony, the communication device of the present invention allows efficient input of utterances by voice and, by providing an utterance history together with addressee designation and preceding utterance designation functions, makes voice-based multi-threaded dialogue possible even in face-to-face settings. We further verified that the developed system is effective in sustaining multi-threaded dialogue, mainly by analyzing the dialogue data obtained in the experiments.

In the non-face-to-face comparison between a voice chat system with a conventional interface and the system developed here, thread length was confirmed to be longer with the latter. It was also confirmed that each participant is involved in more threads concurrently with the developed system. These results show that designating an addressee or a preceding utterance allows participants to take part in multiple topics at once while sustaining the same topic for longer, and suggest that multi-threaded situations arise more readily with this system.

The video recordings of the face-to-face setting suggest that, although there were moments when participants shared the atmosphere of the room, for example by laughing together, they tended to fix their attention on the utterance history, so that advantages of face-to-face dialogue such as checking one another's facial expressions were not being exploited. What is needed is a way of grounding the dialogue space, in which participants remain psychologically independent of one another, in real space so that the face-to-face dialogue space is easier to share. We therefore added a function by which the speaker designates the addressee by "pointing" at them, so that shifting one's gaze far enough to see the other person's expression becomes a natural part of the interaction.
This makes it possible to designate the target of an utterance through the natural act of physically pointing at the person in the real world, and because the speaker looks toward the addressee when doing so, the atmosphere of the room itself also became more natural.

It goes without saying that the present invention may be modified as appropriate without departing from its spirit.

An explanatory diagram of the communication device of this embodiment.
A diagram showing the user interface screen that the communication device provides to the client terminals.
A diagram showing the "Utterance complete" button.
A diagram showing log data extracted from the history list field of the user interface screen.
An explanatory diagram of the communication device of this embodiment.
A graph of the proportions in which three expressions appeared during dialogue.
A graph showing the frequency of the three expressions classified by the interval between utterances (inter-utterance distance).

Explanation of symbols

S: communication device
C1, ..., Cx: client terminals
1: field for entering a handle name
2: login button
3: user list field
4: history list field
5: recording-related button group
6: "Record" button
7: "Designate preceding utterance" button
8: "Record reply to utterance" button
9: "Designate addressee" button
10: "Record to designated addressee" button
11: "Record reply to speaker" button
12: "Listen to this→" button
13: "Listen to next" button
14: "Listen to mine" button
15: "Listen to preceding utterance" button
a1: identifier that identifies voice data
a2: identifier that identifies the speaker
a3: upload time serving as the utterance time
a4: identifier that identifies preceding voice data
b1: transmitting means
b2: receiving means

Claims

A communication device that enables dialogue using voice data among a plurality of users, comprising:
means for receiving voice data from client terminals of the plurality of users;
means for accumulating and storing the received voice data;
means for providing a list of the stored voice data to the client terminals of the plurality of users;
means for providing requested voice data to a client terminal in response to a request from the client terminal that has received the list; and
means for adding to the list information indicating the association between pieces of voice data in the dialogue,
wherein, when voice data is generated at the client terminal, preceding voice data is selected from the list of voice data, and the generated voice data and an identifier of the selected preceding voice data are uploaded,
the means for accumulating and storing the voice data stores the generated voice data in association with the identifier that identifies the selected preceding voice data, and
the means for adding information indicating the association between pieces of voice data to the list adds information about the stored voice data to the list of voice data and attaches the identifier that identifies the preceding voice data to that information.

The communication device according to claim 1, wherein the user interface screen of the client terminal comprises: recording means for recording voice to generate the voice data; a list of voice data for selecting the preceding voice data; and preceding utterance designation means for uploading the voice data generated by the recording means together with the identifier of the preceding voice data selected from the list of voice data.

The communication device according to claim 1, wherein the user interface screen of the client terminal is provided with: a list of voice data for selecting the preceding voice data; reply recording means for recording a reply to the selected voice data to generate voice data; and speech completion means for uploading, when the recording is completed, the voice data generated by the reply recording means together with the identifier of the preceding voice data selected from the list of voice data.

A communication device that enables dialogue using voice data among a plurality of users, comprising:
means for receiving voice data from client terminals of the plurality of users;
means for accumulating and storing the received voice data;
means for providing a list of the stored voice data and a list of users to the client terminals of the plurality of users; and
means for providing requested voice data to a client terminal in response to a request from the client terminal that has received the list,
wherein the communication device further comprises means for adding, to the list of voice data, information indicating the association between voice data in the dialogue,
wherein, when voice data is generated at a client terminal, an identifier of a speaking partner is selected from the list of users, and the generated voice data and the identifier of the selected speaking partner are uploaded,
the means for accumulating and storing the voice data stores the generated voice data in association with the identifier of the selected speaking partner, and
the means for adding the information indicating the association adds information about the stored voice data to the list of voice data and assigns the identifier of the speaking partner to that information.

The communication device according to claim 4, wherein the user interface screen of the client terminal is provided with: recording means for recording voice to generate the voice data; the list of users for selecting the identifier of the speaking partner; and speaking partner designation means for uploading the voice data generated by the recording means together with the identifier of the speaking partner selected from the list of users.

The communication device according to claim 4, wherein the user interface screen of the client terminal is provided with: the list of users for selecting the identifier of the speaking partner; partner-designated recording means for recording speech addressed to the selected speaking partner to generate voice data; and speech completion means for uploading the voice data generated by the partner-designated recording means together with the identifier of the speaking partner selected from the list of users.

A communication device that enables dialogue using voice data among a plurality of users, comprising:
means for receiving voice data from client terminals of the plurality of users;
means for accumulating and storing the received voice data;
means for providing a list of the stored voice data to the client terminals of the plurality of users; and
means for providing requested voice data to a client terminal in response to a request from the client terminal that has received the list,
wherein the communication device further comprises means for adding, to the list of voice data, information indicating the association between voice data in the dialogue,
wherein, when voice data is generated at a client terminal, preceding voice data is selected from the list of voice data, and the generated voice data and an identifier of the speaker of the selected preceding voice data are uploaded,
the means for accumulating and storing the voice data stores the identifier of the speaker of the selected preceding voice data, as the identifier of a speaking partner, in association with the generated voice data, and
the means for adding the information indicating the association adds information about the stored voice data to the list of voice data and assigns the identifier of the speaking partner to that information.
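
The reply-to-speaker flow in this claim differs from explicit partner selection in one step: the speaking partner is not chosen from a user list but is taken from the speaker of the selected preceding voice data. The sketch below illustrates that step under stated assumptions. The names `Entry`, `build_reply_upload` and `store_upload`, and the dictionary used as the upload payload, are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    """One row of the voice data list (hypothetical layout)."""
    voice_id: int
    speaker_id: str
    partner_id: Optional[str] = None  # identifier of the speaking partner


def build_reply_upload(entries: List[Entry], selected_voice_id: int,
                       replier_id: str, audio: bytes) -> Dict[str, object]:
    """Client side: look up the speaker of the selected preceding voice data
    and attach that speaker's identifier to the reply being uploaded."""
    for entry in entries:
        if entry.voice_id == selected_voice_id:
            return {"speaker_id": replier_id,
                    "partner_id": entry.speaker_id,
                    "audio": audio}
    raise KeyError(f"voice data {selected_voice_id} is not in the list")


def store_upload(entries: List[Entry], upload: Dict[str, object]) -> Entry:
    """Server side: add a list entry that carries the speaking-partner identifier.
    Storage of the audio itself is omitted from this sketch."""
    entry = Entry(voice_id=len(entries) + 1,
                  speaker_id=str(upload["speaker_id"]),
                  partner_id=str(upload["partner_id"]))
    entries.append(entry)
    return entry
```

With the partner identifier stored in the list, a client can pick out the utterances addressed to a particular user by comparing `partner_id` with that user's identifier.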

The communication device according to claim 7, wherein the user interface screen of the client terminal displays: a list of voice data for selecting the preceding voice data; reply-to-speaker recording means for recording a reply to the speaker of the selected preceding voice data to generate voice data; and speech completion means for uploading, when the recording is completed, the voice data generated by the reply-to-speaker recording means together with the identifier of the speaker of the preceding voice data selected from the list of voice data.

A communication device that comprises client terminals of a plurality of users and a server, and that enables dialogue using voice data among the plurality of users,
wherein the server comprises:
means for receiving voice data from the plurality of client terminals;
means for accumulating and storing the received voice data;
means for providing a list of the stored voice data to the client terminals of the plurality of users;
means for providing requested voice data to a client terminal in response to a request from the client terminal that has received the list; and
means for adding, to the list, information indicating the association between voice data in the dialogue,
wherein each of the client terminals comprises transmitting means for transmitting a signal and receiving means for receiving the signal from the transmitting means,
the transmitting means and the receiving means are capable of transmitting and receiving the signal in such a manner that the transmitting means of the speaker's client terminal designates the receiving means of the speaking partner's client terminal, and
when the voice data is transmitted from the speaker's client terminal to the server, and the identifier of the speaking partner is transmitted to the server from the client terminal whose receiving means has received the signal from the speaker's transmitting means,
the server accumulates and stores the received voice data and the received identifier of the speaking partner, adds information about the voice data to the list of voice data, and assigns the identifier of the speaking partner to that information.
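
The signal-based partner designation in this claim can be simulated in a few lines. The claim does not state how the server matches the partner identifier reported by the receiving terminal with the voice data uploaded by the speaker. This sketch therefore assumes, purely for illustration, that the signal carries the speaker's identifier and that the server applies the reported partner to the speaker's next upload. All class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    """One row of the voice data list (hypothetical layout)."""
    voice_id: int
    speaker_id: str
    partner_id: Optional[str]


class Server:
    def __init__(self) -> None:
        self.entries: List[Entry] = []
        self.audio: Dict[int, bytes] = {}
        # Assumption: remember the last reported partner per speaker
        # until that speaker's next upload arrives.
        self._pending_partner: Dict[str, str] = {}

    def report_signal(self, speaker_id: str, partner_id: str) -> None:
        """Called by the terminal whose receiving means received the signal."""
        self._pending_partner[speaker_id] = partner_id

    def upload_voice(self, speaker_id: str, audio: bytes) -> Entry:
        """Store the voice data and attach the reported partner identifier, if any."""
        partner_id = self._pending_partner.pop(speaker_id, None)
        entry = Entry(len(self.entries) + 1, speaker_id, partner_id)
        self.entries.append(entry)
        self.audio[entry.voice_id] = audio
        return entry


class ClientTerminal:
    def __init__(self, user_id: str, server: Server) -> None:
        self.user_id = user_id
        self.server = server

    def point_at(self, partner: "ClientTerminal") -> None:
        """Transmitting means (b1): send a signal to the designated terminal."""
        partner.receive_signal(self.user_id)

    def receive_signal(self, speaker_id: str) -> None:
        """Receiving means (b2): forward this terminal's own identifier
        to the server as the speaking partner of the signaling speaker."""
        self.server.report_signal(speaker_id, self.user_id)

    def speak(self, audio: bytes) -> Entry:
        """Upload voice data recorded at this terminal."""
        return self.server.upload_voice(self.user_id, audio)
```

In this arrangement the speaker never types or selects the partner's identifier. It reaches the server from the partner's own terminal, which is the point of the transmitting and receiving means in the claim.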