JP6729193B2

JP6729193B2 - Information processing server, information processing system, terminal device, and program

Info

Publication number: JP6729193B2
Application number: JP2016169691A
Authority: JP
Inventors: 伸一深澤
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2016-08-31
Filing date: 2016-08-31
Publication date: 2020-07-22
Anticipated expiration: 2036-08-31
Also published as: JP2018036871A

Description

The present invention relates to an information processing server, an information processing system, a terminal device, and a program.

In recent years, softphones implemented as application software have become widespread, replacing conventional telephones implemented in hardware. Because a softphone is implemented in software, adding functions to it or linking it with other application software is relatively easy compared with a hardware telephone. Various applied technologies for softphones have therefore been proposed. For example, Patent Document 1 below discloses a technique in which, when a user touches a person shown in a captured image that is obtained from a camera and displayed on a screen, the person is identified by face recognition, the person's telephone number is acquired, and a telephone call is placed to that person.

Patent Document 2 below discloses an information processing method that can handle position information in real space on a captured image of that real space, for example in a softphone having a video call function.

Softphones such as those described above are also used in unified communication (registered trademark) technologies and systems, which integrate fax (facsimile), e-mail, instant messaging, VoIP (Voice over Internet Protocol), and the like. Such unified communication technologies and systems are used in remote distributed collaborative offices (distributed environments), in which multiple people collaborate from multiple different locations.

Patent Document 1: JP 2007-208863 A
Patent Document 2: Japanese Patent No. 5692204

The act of "conversation" is not limited to two people; it often takes place in groups of three or more. Conversations in groups of three or more follow, for example, one of two patterns. In the first, all members of the group are present when the conversation starts. In the second, people nearby notice that a conversation has started (awareness), and those who notice join it afterward, so that the number of members in the conversing group grows. Conversations based on informal communication (communication that is not planned in advance and arises by chance, such as small talk) are considered to occur mostly in the second pattern. In that pattern, the people nearby, before they join, are usually in a position to learn that the conversation is taking place and roughly what it is about (fragmentary conversation information with just enough granularity to judge whether they want to join the conversation later).

However, conventional softphone products assume use in intentional, formal communication, such as a discussion held on a predetermined agenda. That is, because conventional softphone products assume the pattern in which all members are present from the start of the conversation, they offer only two functions: a video conference (multi-party call) function in which all members start the conversation (communication) at once, and a presence list on the display screen that conveys only binary information about whether each existing member is in a call. Consequently, when a conventional softphone product is used in a remote distributed collaborative office (distributed environment), it is difficult to achieve the natural sequence that occurs when members share a room: overhearing a nearby conversation and so noticing that it exists, getting a rough grasp of its content and becoming interested, joining it, and eventually arriving at a multi-party conversation. In short, conventional softphone products do not support informal communication as described above.

The present invention has been made in view of the above circumstances. Its object is to provide a new and improved information processing server, information processing system, terminal device, and program that allow a user in a distributed environment to grasp that a conversation is taking place at a remote location and roughly what it is about, and to recognize intuitively who is in a call.

In order to solve the above problems, according to one aspect of the present invention, there is provided an information processing server including a control unit that generates a conversation event object relating to a call, the conversation event object linking the communication identification information of a plurality of speakers involved in the call; generates a spoken phrase object relating to a phrase extracted from the voice data of the call; and controls distribution of data relating to the conversation event object and data relating to the spoken phrase object.

The control unit may acquire a user's input on the conversation event object and associate the communication identification information of the user with the communication identification information of the plurality of speakers linked by the conversation event object.
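
As a non-limiting illustration, the conversation event object, the phrase object extracted from the call audio, and the later association of a joining user could be modeled as follows. This is only a sketch: the class names, field names, the `join` method, and the payload layout are hypothetical and do not appear in this specification.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationEvent:
    """Links the communication IDs of the speakers in one call (hypothetical model)."""
    call_id: str
    speaker_ids: set[str] = field(default_factory=set)

    def join(self, user_id: str) -> None:
        """Associate a user who acted on this object with the existing speakers."""
        self.speaker_ids.add(user_id)


@dataclass
class PhraseObject:
    """A phrase extracted from the call audio, tied to the call it came from."""
    call_id: str
    speaker_id: str
    text: str


def build_payload(event: ConversationEvent, phrases: list[PhraseObject]) -> dict:
    """Bundle the data that would be distributed to terminal devices (assumed layout)."""
    return {
        "call_id": event.call_id,
        "speakers": sorted(event.speaker_ids),
        "phrases": [p.text for p in phrases if p.call_id == event.call_id],
    }
```

In this sketch, a terminal that receives the payload learns both who is in the call and a few fragments of what is being said, without receiving the call audio itself.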

The information processing server may further include a weighting processing unit that performs weighting processing on the spoken phrase object, and the control unit may compare the result of the weighting processing with a predetermined value and control distribution of the data relating to the spoken phrase object based on the comparison result.

The weighting processing unit may perform the weighting processing based on the frequency with which the phrase appears in the call.

The weighting processing unit may perform the weighting processing based on the degree of abstraction of the phrase.

The weighting processing unit may perform the weighting processing based on the part-of-speech category of the phrase.

The weighting processing unit may perform the weighting processing based on data relating to the sound pressure of the utterance of the phrase contained in the voice data of the call.
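
As a non-limiting illustration, weighting by appearance frequency and part-of-speech category, followed by comparison with a predetermined value, might look like the sketch below. The multipliers in `POS_FACTOR`, the default factor of 0.5, and the rule that a phrase is distributed when its weight is at or above the threshold are assumptions made for the example; the specification states only that these factors may be used and that the result is compared with a predetermined value.

```python
from collections import Counter

# Hypothetical per-category multipliers; the specification only says that the
# part-of-speech category may influence the weight.
POS_FACTOR = {"noun": 1.0, "verb": 0.6, "particle": 0.0}


def weight_phrases(tokens: list[tuple[str, str]]) -> dict[str, float]:
    """Weight each phrase by its frequency in the call times a POS factor.

    `tokens` is a list of (phrase, part_of_speech) pairs, assumed to come
    from speech recognition of the call audio.
    """
    counts = Counter(phrase for phrase, _ in tokens)
    pos_of = {phrase: pos for phrase, pos in tokens}
    return {
        phrase: count * POS_FACTOR.get(pos_of[phrase], 0.5)
        for phrase, count in counts.items()
    }


def select_for_distribution(weights: dict[str, float], threshold: float) -> list[str]:
    """Keep only phrases whose weight reaches the predetermined value (assumed rule)."""
    return sorted(p for p, w in weights.items() if w >= threshold)
```

Sound pressure or degree of abstraction could enter the same way, as additional multiplicative or additive terms in `weight_phrases`.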

The control unit may control distribution of data relating to the result of the weighting processing in association with the spoken phrase object.

The information processing server may further include an utterance status calculation unit that determines a display position for the spoken phrase object based on the position of the speaker who uttered the phrase relating to the spoken phrase object, and distributes the determined display position.

The control unit may control distribution of the spoken phrase object based on the positional relationship, in real space, between one of the plurality of speakers involved in the call and a user who is not participating in the call.

The control unit may control distribution of the conversation event object based on the positional relationship, in real space, between one of the plurality of speakers involved in the call and a user who is not participating in the call.
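
As a non-limiting illustration, one way to use the positional relationship is to distribute an object only to users within a certain real-space distance of a speaker. The function name, the coordinate tuples, and the radius criterion are assumptions for the example; the specification does not fix how the positional relationship is evaluated.

```python
import math


def should_distribute(
    speaker_pos: tuple[float, ...],
    user_pos: tuple[float, ...],
    radius: float,
) -> bool:
    """Return True when the non-participating user is within `radius` of the speaker.

    Positions are real-space coordinates in the same units as `radius`
    (an assumed representation).
    """
    return math.dist(speaker_pos, user_pos) <= radius
```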

The information processing server may further include an interest level control unit that acquires an input of a user's level of interest in the call and links the acquired interest level and the communication identification information of the user to the conversation event object.

The information processing server may further include a display body control unit that generates a display body relating to the user and determines, based on the acquired interest level, the virtual positional relationship between the conversation event object and the display body.

The information processing server may further include a weighting processing unit that performs weighting processing on the spoken phrase object; the control unit may compare the result of the weighting processing with a predetermined value and control distribution of the data relating to the spoken phrase object based on the comparison result; and the predetermined value may be changed based on the acquired interest level.
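
As a non-limiting illustration, changing the predetermined value according to the interest level could be sketched as follows. The normalization of the interest level to the range 0 to 1, the linear rule, and the `min_factor` parameter are assumptions for the example; the specification states only that the predetermined value may be changed based on the interest level.

```python
def adjusted_threshold(base: float, interest: float, min_factor: float = 0.2) -> float:
    """Lower the distribution threshold as the user's interest level rises.

    `interest` is assumed to be normalized to [0, 1] and is clamped to that
    range. At interest 0 the base threshold is used unchanged; at interest 1
    it shrinks to base * min_factor, so more phrases pass the comparison.
    """
    interest = max(0.0, min(1.0, interest))
    return base * (1.0 - (1.0 - min_factor) * interest)
```

Under this rule, a user who signals more interest in a call receives more of its phrases, which matches the idea of moving closer to a conversation to hear more of it.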

The interest level control unit may acquire an input of the ratio of the user's interest levels among the plurality of speakers involved in the call and distribute data relating to the acquired interest level ratio in association with the conversation event object.

The information processing server may further include a weighting processing unit that performs weighting processing on the spoken phrase object; the control unit may compare the result of the weighting processing with a predetermined value and control distribution of the data relating to the spoken phrase object based on the comparison result; and the predetermined value may be changed based on the acquired interest level and the acquired interest level ratio.

The interest level control unit may associate the communication identification information of the user who has input the interest level for the call with the communication identification information of the plurality of speakers linked by the conversation event object.

In order to solve the above problems, according to another aspect of the present invention, there is provided an information processing server including a control unit that generates a three-dimensional virtual space corresponding to the real space in which a plurality of speakers involved in a call are present; generates a plurality of objects, each corresponding to the communication identification information of one of the speakers, and a conversation event object relating to the call that links the plurality of objects to one another; generates a spoken phrase object relating to a phrase extracted from the voice data of the call; and arranges data relating to the conversation event object and data relating to the spoken phrase object in the three-dimensional virtual space.

The information processing server may further include a display body control unit that acquires an input of a user's interest level in the call, arranges a user object corresponding to the user in the three-dimensional virtual space, and determines, based on the acquired interest level, the virtual distance between the conversation event object and the user object in the three-dimensional virtual space.
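
As a non-limiting illustration, the virtual distance could be derived from the interest level by linear interpolation. The direction of the mapping (higher interest places the user object closer), the normalization of the interest level to the range 0 to 1, and the `near` and `far` bounds are assumptions for the example.

```python
def virtual_distance(interest: float, near: float = 1.0, far: float = 10.0) -> float:
    """Map an interest level in [0, 1] to a distance in the virtual space.

    Assumed rule: interest 0 gives the `far` distance, interest 1 gives the
    `near` distance, and values in between are interpolated linearly.
    """
    interest = max(0.0, min(1.0, interest))
    return far - (far - near) * interest
```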

The information processing server may further include an interest level control unit that acquires an input of the ratio of a user's interest levels among the plurality of speakers involved in the call and, based on the acquired ratio, places a reference point indicating the ratio on the generated conversation event object.
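
As a non-limiting illustration, if the conversation event object is drawn as a link between the speakers' positions in the virtual space, the reference point could be computed as the ratio-weighted average of those positions. This weighted-average rule and the function name are assumptions for the example; the specification states only that the reference point indicates the interest level ratio.

```python
def reference_point(
    positions: list[tuple[float, ...]],
    ratios: list[float],
) -> tuple[float, ...]:
    """Return the ratio-weighted average of the speakers' positions.

    `positions[i]` is the virtual-space position of speaker i and `ratios[i]`
    is the user's relative interest in that speaker. With this rule the point
    lies closer to the speaker who draws more interest.
    """
    total = sum(ratios)
    if len(positions) != len(ratios) or total <= 0:
        raise ValueError("need one positive-sum ratio per speaker position")
    dims = len(positions[0])
    return tuple(
        sum(r * p[i] for p, r in zip(positions, ratios)) / total
        for i in range(dims)
    )
```

For two speakers at (0, 0, 0) and (10, 0, 0) with ratios 1 and 3, the point falls at x = 7.5, three quarters of the way toward the second speaker.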

In order to solve the above problems, according to yet another aspect of the present invention, there is provided an information processing system including an information processing server and a plurality of terminal devices, wherein the information processing server generates a conversation event object relating to a call, the conversation event object linking the communication identification information of a plurality of speakers involved in the call; generates a spoken phrase object relating to a phrase extracted from the voice data of the call; and distributes data relating to the conversation event object and data relating to the spoken phrase object to the plurality of terminal devices.

In order to solve the above problems, according to yet another aspect of the present invention, there is provided a terminal device including a display unit that displays a conversation event object relating to a call, the conversation event object linking the communication identification information of a plurality of speakers involved in the call, and a spoken phrase object relating to a phrase extracted from the voice data of the call.

The terminal device may further include a spoken phrase object control unit that acquires the result of weighting processing performed on the spoken phrase object, compares the result of the weighting processing with a predetermined value, and controls the display unit based on the comparison result.

The terminal device may further include a spoken phrase object control unit that acquires the result of weighting processing performed on the spoken phrase object and, based on the result of the weighting processing, controls any one of the size, color, contrast, and display position of the spoken phrase object.
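
As a non-limiting illustration, controlling the displayed size of a phrase from its weight could be sketched as a linear mapping to a font size. The point-size bounds and the normalization by the largest weight on screen are assumptions for the example.

```python
def font_size_for_weight(
    weight: float,
    max_weight: float,
    min_pt: int = 10,
    max_pt: int = 32,
) -> int:
    """Scale a phrase's font size linearly with its weight.

    `max_weight` is assumed to be the largest weight among the phrases being
    displayed. Weights outside [0, max_weight] are clamped.
    """
    if max_weight <= 0:
        return min_pt
    ratio = max(0.0, min(1.0, weight / max_weight))
    return round(min_pt + (max_pt - min_pt) * ratio)
```

Color, contrast, or display position could be driven by the same normalized ratio.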

The terminal device may further include a voice output control unit that acquires a user's interest level in the call and controls output of audio relating to the call based on the interest level.

The terminal device may further include a conversation interest level setting unit that acquires the user's interest level in the call based on an operation by the user on a user object that represents the user and is displayed on the display unit.

The terminal device may further include an imaging unit that acquires a face image of the user for display of the user object.

The terminal device may further include a conversation interest level notification unit that issues a notification indicating the presence of the user upon acquisition of the user's interest level in the call.

The conversation interest level notification unit may cause the display unit to display a notification indicating the presence of the user and, based on the acquired interest level, control any one of the size, color, movement, contrast, and display position of the notification display.

The conversation interest level notification unit may cause an audio output unit to produce audio output indicating the presence of the user and control the volume of the audio output based on the acquired interest level.

In order to solve the above problems, according to yet another aspect of the present invention, there is provided a program for causing a computer to function as a control unit that generates a conversation event object relating to a call, the conversation event object linking the communication identification information of a plurality of speakers involved in the call; generates a spoken phrase object relating to a phrase extracted from the voice data of the call; and controls distribution of data relating to the conversation event object and data relating to the spoken phrase object.

As described above, the present invention allows a user in a distributed environment to grasp that a conversation is taking place at a remote location and roughly what it is about, and to recognize intuitively who is in a call.

An explanatory diagram showing an example of the schematic configuration of the information processing system according to the first embodiment.
A block diagram showing an example of the hardware configuration of the terminal device 100 according to the first embodiment.
A block diagram showing an example of the functional configuration of the terminal device 100 according to the first embodiment.
An explanatory diagram of an example of a display screen displayed in the bird's-eye view mode according to the first embodiment.
An explanatory diagram of an example of a display screen displayed in the proximity mode according to the first embodiment.
An explanatory diagram of a first example of a position in the close-up captured image 71 designated by the user.
An explanatory diagram of a second example of a position in the close-up captured image 71 designated by the user.
An explanatory diagram of a first example of the three-dimensional virtual space 90 corresponding to the center office 10.
An explanatory diagram of an example of selection of an object 91 arranged in the three-dimensional virtual space 90 shown in FIG. 8.
An explanatory diagram of a second example of the three-dimensional virtual space 90 corresponding to the center office 10.
An explanatory diagram of an example of selection of an object 91 arranged in the three-dimensional virtual space 90 shown in FIG. 10.
An explanatory diagram of an example of the display screen 80 displayed in the conversation mode.
A transition diagram of an example of transitions between display modes.
An explanatory diagram of an example of the COMM link 86 in the three-dimensional virtual space 90 corresponding to the center office 10.
An explanatory diagram of an example of the COMM link 86 and the COMM word 87 displayed on the display screen 50 of the terminal device 100.
An explanatory diagram of another example of the COMM link 86 and the COMM word 87 displayed on the display screen 55 of the terminal device 100.
A block diagram showing an example of the software configuration of the terminal device 100 according to the first embodiment.
A block diagram showing an example of the hardware configuration of the information management server 200 according to the first embodiment.
A block diagram showing an example of the functional configuration of the information management server 200 according to the first embodiment.
A block diagram showing an example of the hardware configuration of the voice recognition server 201 according to the first embodiment.
A block diagram showing an example of the functional configuration of the voice recognition server 201 according to the first embodiment.
A flowchart showing an example of the schematic flow of information processing according to the first embodiment.
A flowchart showing an example of the schematic flow of the startup process according to the first embodiment.
A sequence diagram showing an example of the schematic flow of the communication control process according to the first embodiment.
A block diagram showing an example of the functional configuration of the information management server 200A according to the second embodiment.
A flowchart showing an example of the schematic flow of information processing according to the second embodiment.
A block diagram showing an example of the functional configuration of the terminal device 100A according to the third embodiment.
An explanatory diagram of an example of the third-party object 94 and the virtual distance 97 in the three-dimensional virtual space 90 corresponding to a distributed office.
An explanatory diagram of an example of the third-party object and the virtual distance displayed on the display screen of the terminal device 100A according to the third embodiment.
An explanatory diagram of another example of the third-party object 94 and the virtual distance 97 displayed on the display screen 57 of the terminal device 100A according to the third embodiment.
An explanatory diagram of an example of setting the speaker interest ratio reflection position 96 in the three-dimensional virtual space 90 corresponding to a distributed office.
An explanatory diagram of an example of setting the speaker interest ratio reflection position 96 displayed on the display screen 58 of the terminal device 100A according to the third embodiment.
An explanatory diagram of an example of the display screen 62 when the speaker interest ratio reflection position 96 is set using the third-party object 94.
An explanatory diagram of an example of the display of a notification, containing information on the conversation interest level, shown on the display screen of the terminal device 100A according to the third embodiment.
A block diagram showing an example of the functional configuration of the information management server 200B according to the third embodiment.
A sequence diagram showing an example of the schematic flow of conversation interest level setting, change of the communication information amount of the COMM word 87, and notification processing according to the third embodiment.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In this specification and the drawings, constituent elements having substantially the same functional configuration are given the same reference numerals, and duplicate description is omitted.

In this specification and the drawings, multiple constituent elements having substantially the same functional configuration may also be distinguished by appending different letters to the same reference numeral. For example, multiple configurations having substantially the same functional configuration or logical significance are distinguished as needed, such as button image 63A and button image 63B. However, when there is no particular need to distinguish such constituent elements from one another, only the common reference numeral is used. For example, when there is no particular need to distinguish button image 63A from button image 63B, they are simply referred to as button image 63.

The description will be given in the following order.
1. First embodiment
1.1 Schematic configuration of information processing system
1.2 Configuration of terminal device
1.2.1 Hardware configuration
1.2.2 Functional configuration
1.2.3 Software configuration
1.3 Configuration of information management server
1.3.1 Hardware configuration
1.3.2 Functional configuration
1.4 Configuration of voice recognition server
1.4.1 Hardware configuration
1.4.2 Functional configuration
1.5 Process flow
2. Second embodiment
2.1 Configuration of information management server
2.1.1 Functional configuration
2.2 Process flow
3. Third embodiment
3.1 Configuration of terminal device
3.1.1 Functional configuration
3.2 Configuration of information management server
3.2.1 Functional configuration
3.3 Process flow
4. Supplement

<1. First Embodiment>
<1.1 Schematic configuration of information processing system>
The first embodiment of the present invention relates to an information processing system used, in a distributed environment, for conversation among multiple people at multiple remote locations. First, the schematic configuration of the information processing system according to the first embodiment is described with reference to FIG. 1. FIG. 1 is an explanatory diagram showing an example of the schematic configuration of the information processing system according to the present embodiment.

As shown in FIG. 1, the information processing system according to the present embodiment is used, for example, across multiple sites. In the example of FIG. 1, the information processing system is used across a center office 10 and a satellite office 20 (or home office 20). For example, the center office 10 is a relatively large office, and the satellite office 20 (or home office 20) is a relatively small office.

In the center office 10, the information processing system includes a camera 11, a microphone 13, a sensor 15, a media distribution server 17, an information management server (information processing server) 200, a voice recognition server 201, and a LAN (Local Area Network) 19. In the satellite office 20 (or home office 20), the information processing system includes a terminal device 100, a display 21, and a LAN 23. The information processing system further includes an external network 30 and a PBX (Private Branch eXchange) 40.

(Camera 11)
The camera 11 captures an image of the area in the direction in which the camera 11 faces (that is, the imaging direction). One or more cameras 11 are installed in the center office 10, and each installed camera 11 can capture part or all of the center office 10 from its installation position. As can be seen from FIG. 1, in the information processing system according to the present embodiment, the plurality of cameras 11 installed in the center office 10 can image the center office 10 from various positions. In the present embodiment, the captured image generated through the camera 11 may be a still image or a moving image (that is, video), and is not particularly limited. The camera 11 can, for example, change its imaging direction automatically. The camera 11 also has, for example, a zoom function, which may be an optical zoom function or a digital zoom function and is not particularly limited.

The camera 11 may also be able to change its position. For example, the camera 11 may be configured to be movable by a dolly (not shown). In other words, the camera 11 may be configured to be movable along a rail (not shown). In this case, the camera 11 may move along the rail under the control of a motor (not shown) that drives the camera 11 along the rail. As a result, even when only one camera 11 is installed in the center office 10, the camera 11 can acquire captured images taken from different positions.

When the camera 11 can change its position, the zoom function may be realized by changing the position of the camera 11. For example, the zoom function may be a dolly-based zoom function. Specifically, zooming in may be performed by moving the camera 11 toward the subject, and zooming out may be performed by moving the camera 11 away from the subject. Note that the dolly-based zoom need not be precisely adjustable in the way that optical zoom or digital zoom is. In this case, it suffices that zooming in yields a captured image in which the subject appears larger and zooming out yields a captured image in which the subject appears smaller.

(Microphone 13)
The microphone 13 collects sound around the microphone 13. In the center office 10, for example, one or more microphones 13 are installed. For example, each installed microphone 13 collects sound around its installation position in the center office 10. In this way, in the information processing system according to the present embodiment, the plurality of microphones 13 installed in the center office 10 collect sound at various positions in the center office 10.

(Sensor 15)
The sensor 15 may include various types of sensors that detect various kinds of targets. In the center office 10, for example, one or more sensors 15 are installed. The sensor 15 may be, for example, a seat sensor that determines whether a person is at a seat. The seat sensor is installed at each seat and determines whether a person is sitting in that seat by detecting pressure. The sensor 15 may also be, for example, a vibration sensor installed in a seat or the like, which determines whether a person is sitting in the corresponding seat by detecting vibration caused by a seated person. The sensor 15 may also be, for example, a human presence sensor installed under a desk or the like, which determines whether a person is sitting in the corresponding seat by detecting changes in infrared rays, ultrasonic waves, visible light, or the like caused by a seated person.
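The seat-sensor determination described above can be illustrated with a minimal sketch. The description only states that pressure is detected; the `SeatReading` type, the pressure unit, and the threshold value below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SeatReading:
    """One reading from a seat sensor (hypothetical structure)."""
    seat_id: str
    pressure: float  # assumed unit: newtons; the description gives no unit


def is_occupied(reading: SeatReading, threshold: float = 50.0) -> bool:
    """Treat the seat as occupied when the pressure reaches the threshold.

    The threshold value is an assumption, not taken from the description.
    """
    return reading.pressure >= threshold


readings = [SeatReading("A1", 420.0), SeatReading("A2", 3.5)]
occupancy = {r.seat_id: is_occupied(r) for r in readings}
```

A vibration sensor or human presence sensor could feed the same kind of per-seat boolean, with only the measured quantity and the threshold changing.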

(Media distribution server 17)
The media distribution server 17 distributes media (for example, audio and video) to the terminal device 100 and other devices in response to requests.

(Information management server 200)
The information management server 200 manages various kinds of information used in the information processing system according to the present embodiment. That is, the information management server 200 stores this information and updates it as needed. For example, the information management server 200 manages parameters relating to the camera 11, the microphone 13, and the sensor 15 described above. Specifically, for example, the information management server 200 stores and updates, as parameters of the camera 11, information such as the installation position of the camera 11, its imaging direction (for example, the direction perpendicular to the lens of the camera 11), and its zoom ratio.
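As a rough illustration of the camera parameters that are stored and updated, the following sketch keeps one record per camera. The class names, field names, and coordinate conventions are assumptions made for this example; the description does not specify a data format.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple

Vector3 = Tuple[float, float, float]


@dataclass(frozen=True)
class CameraParams:
    """Parameters held per camera (field names are hypothetical)."""
    camera_id: int
    position: Vector3    # installation position in the office
    direction: Vector3   # imaging direction as a vector
    zoom: float = 1.0    # zoom ratio


class CameraParamStore:
    """In-memory store that records and updates camera parameters."""

    def __init__(self) -> None:
        self._params: Dict[int, CameraParams] = {}

    def store(self, params: CameraParams) -> None:
        self._params[params.camera_id] = params

    def update(self, camera_id: int, **changes) -> CameraParams:
        # Build a new record with only the changed fields replaced.
        updated = replace(self._params[camera_id], **changes)
        self._params[camera_id] = updated
        return updated

    def get(self, camera_id: int) -> CameraParams:
        return self._params[camera_id]


store = CameraParamStore()
store.store(CameraParams(camera_id=11, position=(0.0, 0.0, 2.5),
                         direction=(1.0, 0.0, 0.0)))
store.update(11, zoom=2.0)
```

Immutable records with a `replace`-style update make it easy to see which parameter changed on each update, for instance after a zoom request.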

Further, for example, the information management server 200 generates and manages data of a three-dimensional virtual space corresponding to the real space. The three-dimensional virtual space is, for example, a three-dimensional virtual space modeled on the center office 10. Objects are arranged in the three-dimensional virtual space. For example, each object corresponds to a person and is arranged at a three-dimensional virtual position in the three-dimensional virtual space that corresponds to the position of a seat in the center office 10. That is, the object is arranged at the three-dimensional virtual position corresponding to where the person would be if the person were sitting in the seat. As an example, the object has a cylindrical shape. The three-dimensional virtual space and the objects will be described later.
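The arrangement of cylindrical objects at seat positions might be sketched as follows. The names, the default radius and height, and the `contains` helper are hypothetical; the description states only that each object corresponds to a person, is placed at the virtual position of a seat, and may be cylindrical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CylinderObject:
    """Cylindrical object standing on the floor of the virtual space."""
    person_id: str
    center_x: float
    center_y: float
    base_z: float
    radius: float
    height: float

    def contains(self, x: float, y: float, z: float) -> bool:
        """Return True when the point lies inside the cylinder."""
        within_radius = math.hypot(x - self.center_x, y - self.center_y) <= self.radius
        within_height = self.base_z <= z <= self.base_z + self.height
        return within_radius and within_height


def place_objects(seats: Dict[str, Tuple[float, float, float]],
                  radius: float = 0.3,
                  height: float = 1.2) -> List[CylinderObject]:
    """Create one object per person at the virtual position of the seat.

    The default radius and height are illustrative values only.
    """
    return [CylinderObject(pid, x, y, z, radius, height)
            for pid, (x, y, z) in seats.items()]


objects = place_objects({"alice": (1.0, 2.0, 0.0)})
```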

Furthermore, for example, the information management server 200 attaches the corresponding communication ID (identification) data to voice data acquired from the microphone 13 or from the sound collection unit 140 of the terminal device 100, and transmits the voice data to the voice recognition server 201. The information management server 200 also receives from the voice recognition server 201, and manages, phrase data obtained as a recognition result (for example, data about phrases extracted from the voice data). The voice data, the communication ID data, and the phrase data will be described later. All or part of the processing performed by the information management server 200, such as processing relating to the acquisition and storage of voice data, may be performed by another device such as the media distribution server 17 or the terminal device 100 instead of the information management server 200.

(Voice recognition server 201)
The voice recognition server 201 holds a large-scale phrase list, receives voice data acquired by the terminal device 100 or the microphone 13, for example via the information management server 200, and performs voice recognition processing. The voice recognition processing extracts the phrases contained in the received voice data by using the phrase list. The voice recognition server 201 then transmits the data resulting from the voice recognition processing to the information management server 200. The voice recognition server 201 may be realized as a virtual server or as application software on another server device such as the information management server 200, the media distribution server 17, or the PBX 40 described later.
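The exchange between the information management server 200 and the voice recognition server 201 can be sketched as below. This is not a speech recognizer: a text transcript stands in for the voice data, and all names, the matching rule (case-insensitive substring), and the result format are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class TaggedUtterance:
    """Voice data tagged with a communication ID (hypothetical structure).

    A transcript stands in for the audio; real processing would operate
    on audio samples.
    """
    comm_id: str
    transcript: str


def extract_phrases(utterance: TaggedUtterance,
                    phrase_list: List[str]) -> Dict[str, Union[str, List[str]]]:
    """Return the phrases from the list that occur in the utterance,
    keeping the communication ID so the result can be matched to its source."""
    text = utterance.transcript.lower()
    found = [p for p in phrase_list if p.lower() in text]
    return {"comm_id": utterance.comm_id, "phrases": found}


result = extract_phrases(
    TaggedUtterance("1001", "The budget review is on Friday"),
    ["budget", "schedule", "Friday"],
)
```

Carrying the communication ID through the round trip is what lets the returned phrase data be associated with the person or device that produced the speech.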

(LAN 19, external network 30)
The LAN 19 is a network that connects the devices in the center office 10. The LAN 19 also connects the devices inside the center office 10 to devices outside the center office 10 via the external network 30. The LAN 19 and the external network 30 may be wired or wireless and are composed of any communication network such as the Internet, an IP-VPN (Internet Protocol-Virtual Private Network), a leased line, a WAN (Wide Area Network), or infrared communication.

(Terminal device 100)
The terminal device 100 is used by a user. For example, the terminal device 100 provides the user with functions for communication such as telephone calls and e-mail. The terminal device 100 is, for example, a tablet terminal. Instead of a tablet terminal, the terminal device 100 may be another device having a display function and a communication function, such as a smartphone, a PC (Personal Computer), or a telephone with a display.

(Display 21)
The display 21 displays various screens. For example, the display 21 displays a screen including a captured image acquired through the camera 11. This allows many people, including the user of the terminal device 100, to see the state of the center office 10 via the display 21. The display 21 may also output sound. Specifically, the display 21 may output the sound collected by the microphone 13. This allows many people, including the user of the terminal device 100, to hear the sound in the center office 10 via the display 21.

(LAN 23)
The LAN 23 is a network that connects the devices in the satellite office 20 (or home office 20). The LAN 23 also connects the devices inside the satellite office 20 to devices outside the satellite office 20 via the external network 30. Like the LAN 19, the LAN 23 may be wired or wireless and is composed of any communication network such as a leased line or infrared communication.

(PBX 40)
The PBX 40 enables communication between devices via the external network 30. The PBX 40 can operate in accordance with, for example, H.323 or SIP (Session Initiation Protocol). Specifically, for example, the PBX 40 stores identification information for communication (for example, a telephone number) and an IP (Internet Protocol) address in association with each other. In response to a request, the PBX 40 converts the identification information for communication into the IP address and provides the IP address to the requester. The PBX 40 may be connected to the LAN 19 or the LAN 23 described above.
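The conversion from identification information to an IP address can be pictured as a simple lookup table. The sketch below does not implement H.323 or SIP signaling; the `AddressDirectory` class and its methods are hypothetical, and the addresses come from the documentation range 192.0.2.0/24.

```python
import ipaddress
from typing import Dict, Optional


class AddressDirectory:
    """Maps communication identifiers (e.g. telephone numbers) to IP addresses."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def register(self, comm_id: str, ip: str) -> None:
        # ipaddress.ip_address raises ValueError for a malformed address.
        ipaddress.ip_address(ip)
        self._table[comm_id] = ip

    def resolve(self, comm_id: str) -> Optional[str]:
        """Return the IP address for the identifier, or None if unknown."""
        return self._table.get(comm_id)


directory = AddressDirectory()
directory.register("1001", "192.0.2.10")
directory.register("1002", "192.0.2.11")
```

A caller that wants to reach extension "1001" would ask for `directory.resolve("1001")` and then open a session with the returned address.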

<1.2 Configuration of terminal device>
Next, an example of the configuration of the terminal device 100 according to the present embodiment will be described with reference to FIGS. 2 to 17. As described above, the terminal device 100 is a device that provides a user with functions for communication.

<1.2.1 Hardware configuration>
First, an example of the hardware configuration of the terminal device 100 according to the present embodiment will be described with reference to FIG. 2. FIG. 2 is a block diagram showing an example of the hardware configuration of the terminal device 100 according to the present embodiment. Referring to FIG. 2, the terminal device 100 includes a CPU (Central Processing Unit) 801, a ROM (Read Only Memory) 803, a RAM (Random Access Memory) 805, a bus 807, a storage device 809, a communication interface 811, a camera 813, a microphone 815, a speaker 817, and a touch panel 820.

(CPU 801, ROM 803, RAM 805)
The CPU 801 executes various processes in the terminal device 100. The ROM 803 stores programs and data for causing the CPU 801 to execute the processes of the terminal device 100. The RAM 805 temporarily stores programs and data while the CPU 801 executes processing.

(Bus 807)
The bus 807 connects the CPU 801, the ROM 803, and the RAM 805 to one another. The storage device 809, the communication interface 811, the camera 813, the microphone 815, the speaker 817, and the touch panel 820, which will be described later, are also connected to the bus 807. The bus 807 includes, for example, a plurality of types of buses. Specifically, the bus 807 may include a high-speed bus that connects the CPU 801, the ROM 803, and the RAM 805, and one or more other buses that are slower than the high-speed bus.

(Storage device 809)
The storage device 809 stores data to be saved temporarily or permanently in the terminal device 100. The storage device 809 may be, for example, a magnetic storage device such as a hard disk, or a nonvolatile memory such as an EEPROM (Electrically Erasable and Programmable Read Only Memory), a flash memory, an MRAM (Magnetoresistive Random Access Memory), an FeRAM (Ferroelectric Random Access Memory), or a PRAM (Phase change Random Access Memory).

(Communication interface 811)
The communication interface 811 is the communication means of the terminal device 100 and communicates with external devices via a network (or directly). The communication interface 811 may be an interface for wireless communication, in which case it may include, for example, a communication antenna, an RF (Radio Frequency) circuit, and other circuits for communication processing. The communication interface 811 may also be an interface for wired communication, in which case it may include, for example, a LAN terminal, a transmission circuit, and other circuits for communication processing.

(Camera 813)
The camera 813 images a subject. The camera 813 includes, for example, an optical system, an image sensor, and an image processing circuit.

(Microphone 815)
The microphone 815 collects ambient sound. The microphone 815 converts the ambient sound into an electric signal and converts the electric signal into digital data.

(Speaker 817)
The speaker 817 outputs sound. The speaker 817 converts digital data into an electric signal and converts the electric signal into sound.

(Touch panel 820)
The touch panel 820 includes a touch detection surface 821 and a display surface 823.

The touch detection surface 821 detects a touch position on the touch panel 820. More specifically, for example, when the user touches the touch panel 820, the touch detection surface 821 senses the touch, generates an electric signal corresponding to the position of the touch, and converts the electric signal into touch position information. The touch detection surface 821 may use any touch detection method, such as a capacitive, resistive film, or optical method.

The display surface 823 displays an output image (that is, a display screen) from the terminal device 100. The display surface 823 can be realized using, for example, a liquid crystal display, an organic EL (Organic Light Emitting Diode: OLED) display, or a CRT (Cathode Ray Tube).

<1.2.2 Functional configuration>
Next, an example of the functional configuration of the terminal device 100 according to the present embodiment will be described. FIG. 3 is a block diagram showing an example of the functional configuration of the terminal device 100 according to the present embodiment. Referring to FIG. 3, the terminal device 100 includes a communication unit 110, an input unit 120, an imaging unit 130, a sound collection unit 140, a display unit 150, an audio output unit 160, a storage unit 170, and a control unit 180.

(Communication unit 110)
The communication unit 110 communicates with other devices. For example, the communication unit 110 is connected to the LAN 23 described above and communicates with the devices in the satellite office 20. The communication unit 110 also communicates with the devices in the center office 10 via the external network 30 and the LAN 19 described above. Specifically, for example, the communication unit 110 communicates with the camera 11, the microphone 13, the sensor 15, the media distribution server 17, the information management server 200, and the voice recognition server 201. The communication unit 110 can be realized by, for example, the communication interface 811.

(Input unit 120)
The input unit 120 receives input from the user of the terminal device 100 and provides the input result to the control unit 180 described later. For example, the input unit 120 detects a position designated by the user on the display screen. More specifically, the input unit 120 is realized by the touch detection surface 821 and detects a touch position on the touch panel 820. The input unit 120 then provides the detected touch position to the control unit 180.

(Imaging unit 130)
The imaging unit 130 images a subject. For example, the imaging unit 130 images the area in front of the terminal device 100. In this case, the imaging unit 130 can image the user of the terminal device 100. The imaging unit 130 provides the imaging result (that is, the captured image) to the control unit 180. The imaging unit 130 can be realized by, for example, the camera 813.

(Sound collection unit 140)
The sound collection unit 140 collects sound around the terminal device 100. For example, the sound collection unit 140 can collect the voice of the user of the terminal device 100. The sound collection unit 140 provides the sound collection result (voice data) to the control unit 180. The sound collection unit 140 can be realized by, for example, the microphone 815.

(Display unit 150)
The display unit 150 displays an output image (display screen). The display unit 150 displays the display screen under the control of the control unit 180. The display unit 150 can be realized by, for example, the display surface 823.

(Audio output unit 160)
The audio output unit 160 outputs sound. The audio output unit 160 outputs sound under the control of the control unit 180. The audio output unit 160 can be realized by, for example, the speaker 817.

(Storage unit 170)
The storage unit 170 stores programs and data for the operation of the terminal device 100. For example, the storage unit 170 stores data of the three-dimensional virtual space corresponding to the real space. Specifically, for example, the information management server 200 stores the data of the three-dimensional virtual space corresponding to the center office 10, and the control unit 180 acquires that data via the communication unit 110. The storage unit 170 then stores the acquired data of the three-dimensional virtual space. The storage unit 170 can be realized by, for example, the storage device 809.

(Control unit 180)
The control unit 180 provides various functions of the terminal device 100. The control unit 180 includes a real space information providing unit 181, an audio output control unit 182, a position acquisition unit 183, an object selection unit 185, an ID acquisition unit 187, a telephone unit 189, a conversation object selection unit 191, a COMM link control unit 193, and a COMM word control unit (spoken phrase object control unit) 195. The control unit 180 can be realized by, for example, the CPU 801, the ROM 803, and the RAM 805. Each functional unit of the control unit 180 is described below.

(Real space information providing unit 181)
The real space information providing unit 181 provides information on the real space to the user of the terminal device 100. Specifically, the real space information providing unit 181 causes the display unit 150 to display a display screen of a captured image of the real space. More specifically, for example, the captured image is acquired through the camera 11 located in the real space (the center office 10). The captured image may be the image acquired by the camera 11 as is, or an image generated by processing the image acquired by the camera 11. The display screen is a screen that includes the captured image in part or all of its area. That is, the real space information providing unit 181 acquires the captured image of the camera 11 via the communication unit 110, generates a display screen including the captured image, and causes the display unit 150 to display the display screen.

Further, for example, the captured image may be a captured image acquired through one imaging device selected from a plurality of imaging devices in the real space. More specifically, for example, the captured image may be a captured image acquired through one camera 11 selected from the plurality of cameras 11 arranged in the center office 10. A specific method by which the user selects the camera 11 will be described later. Because the user can select the camera 11 in this way, the user can view a captured image taken from a desired position.

Further, for example, the display screen includes a captured image that depends on the display mode. More specifically, for example, in a first display mode the display screen includes a first captured image in which a first area of the real space (for example, inside the center office 10) is captured, and in a second display mode the display screen includes a second captured image in which a second area of the real space, narrower than the first area, is captured. That is, the real space information providing unit 181 causes the display unit 150 to display the first captured image in the first display mode and the second captured image in the second display mode.

More specifically, for example, the first captured image is a captured image corresponding to a first zoom ratio, and the second captured image is a captured image corresponding to a second zoom ratio that is larger than the first zoom ratio. For example, the real space information providing unit 181 acquires the captured image corresponding to the first zoom ratio or the captured image corresponding to the second zoom ratio by sending, via the communication unit 110, a zoom request to the camera 11 (for optical zoom, digital zoom, or zoom by changing the position of the imaging device, such as dolly-based zoom). Alternatively, the real space information providing unit 181 may acquire the captured image corresponding to the first zoom ratio or the captured image corresponding to the second zoom ratio by applying digital zoom to the captured image of the camera 11.
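When digital zoom is applied to the captured image on the terminal side, one common approach is to crop a region whose size is the original size divided by the zoom ratio and then scale it up for display. The following sketch computes only that crop region. The function name, the clamping behavior at the image edges, and the pixel sizes are assumptions for illustration and are not taken from the description.

```python
from typing import Optional, Tuple


def digital_zoom_region(width: float, height: float, zoom: float,
                        center: Optional[Tuple[float, float]] = None
                        ) -> Tuple[float, float, float, float]:
    """Return (left, top, crop_width, crop_height) for a digital zoom.

    The region is centered on `center` where possible and shifted so that
    it stays inside the original image.
    """
    if zoom < 1.0:
        raise ValueError("zoom ratio must be at least 1")
    cx, cy = center if center is not None else (width / 2, height / 2)
    crop_w, crop_h = width / zoom, height / zoom
    left = min(max(cx - crop_w / 2, 0), width - crop_w)
    top = min(max(cy - crop_h / 2, 0), height - crop_h)
    return (left, top, crop_w, crop_h)


first_region = digital_zoom_region(1920, 1080, zoom=1.0)   # wider area
second_region = digital_zoom_region(1920, 1080, zoom=2.0)  # narrower area
```

With a zoom ratio of 1 the region is the whole image, and with a larger ratio the region shrinks, which matches the relationship between the first and second captured images.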

Note that the zoom ratio here need not be a precise value such as 1.5 times or 2 times; it only has to indicate, directly or indirectly, how large the subject appears in the captured image. For example, when zoom by changing the position of the camera 11 is used (for example, zooming in and out by dolly), the zoom ratio may be, instead of a precise value such as 1.5 times or 2 times, something that directly indicates how large the subject appears (for example, a parameter indicating the approximate size of the subject) or something that indirectly indicates it (for example, the position of the camera 11 on the rail). It suffices that the captured image corresponding to the first zoom ratio shows the subject smaller and the captured image corresponding to the second zoom ratio, which is larger than the first zoom ratio, shows the subject larger.

As an example, in the bird's-eye view mode the display screen includes a bird's-eye view captured image generated by the camera 11 at a zoom ratio of X times (for example, X = 1), and in the close-up mode the display screen includes a close-up captured image generated by the camera 11 at a zoom ratio of Y times (Y > X). That is, the bird's-eye view captured image is a captured image of a wider area in the center office 10, and the close-up captured image is a captured image of a narrower area in the center office 10. Specific examples of these captured images are described below with reference to FIGS. 4 and 5.

−俯瞰モードで表示される表示画面−
図４は、本実施形態に係る俯瞰モードで表示される表示画面の一例を説明するための説明図である。詳細には、図４には、俯瞰モードの表示画面６０が示されている。当該表示画面６０は、俯瞰撮像画像６１、ボタン画像６３、プレゼンスアイコン６５、吹き出し画像６７及びマップ画像６９を含む。 -Display screen displayed in overhead view mode-
FIG. 4 is an explanatory diagram illustrating an example of a display screen displayed in the overhead view mode according to the present embodiment. More specifically, FIG. 4 shows a display screen 60 in the overhead view mode. The display screen 60 includes a bird's-eye view captured image 61, a button image 63, a presence icon 65, a balloon image 67, and a map image 69.

俯瞰撮像画像６１は、例えば、Ｘ倍のズーム率でのカメラ１１により取得された撮像画像である。一例として、Ｘ＝１である。即ち、俯瞰撮像画像６１は、ズームなしでのカメラ１１の撮像画像である。また、例えば、ユーザが、俯瞰撮像画像６１の位置を指定すると、実空間情報提供部１８１は、表示モードを俯瞰モードから近接モードに切り替える。より具体的には、例えば、ユーザが俯瞰撮像画像６１内の所望の位置をタッチし、俯瞰撮像画像６１に対応するタッチ位置が検出されると、実空間情報提供部１８１は、表示モードを俯瞰モードから近接モードに切り替える。 The bird's-eye view captured image 61 is, for example, a captured image acquired by the camera 11 at a zoom ratio of X times. As an example, X=1. That is, the bird's-eye view captured image 61 is a captured image of the camera 11 without zooming. Further, for example, when the user specifies a position in the bird's-eye view captured image 61, the real space information providing unit 181 switches the display mode from the bird's-eye view mode to the proximity mode. More specifically, for example, when the user touches a desired position in the bird's-eye view captured image 61 and a touch position corresponding to the bird's-eye view captured image 61 is detected, the real space information providing unit 181 switches the display mode from the bird's-eye view mode to the proximity mode.

また、ボタン画像６３は、別のカメラ１１を選択するための画像である。例えば、ユーザが、ボタン画像６３の位置を指定すると、実空間情報提供部１８１は、別のカメラ１１の俯瞰撮像画像を取得し、表示部１５０に当該俯瞰撮像画像を表示させる。より具体的には、例えば、ユーザがボタン画像６３の位置をタッチし、ボタン画像６３に対応するタッチ位置が検出されると、実空間情報提供部１８１は、別のカメラ１１の俯瞰撮像画像を取得し、表示部１５０に当該俯瞰撮像画像を表示させる。具体的には、図４の例においては、ボタン画像６３Ａの位置がユーザにより指定されると、現在のカメラ１１の左側に位置するカメラ１１が選択される。また、ボタン画像６３Ｂの位置がユーザにより指定されると、現在のカメラ１１の右側に位置するカメラ１１が選択される。そして、実空間情報提供部１８１は、選択されたカメラ１１の俯瞰撮像画像を取得し、表示部１５０に、当該俯瞰撮像画像を表示させる。 The button image 63 is an image for selecting another camera 11. For example, when the user specifies the position of the button image 63, the real space information providing unit 181 acquires the bird's-eye view captured image of another camera 11, and causes the display unit 150 to display the bird's-eye view captured image. More specifically, for example, when the user touches the position of the button image 63 and the touch position corresponding to the button image 63 is detected, the real space information providing unit 181 displays the bird's eye captured image of another camera 11. The acquired captured bird's-eye view image is displayed on the display unit 150. Specifically, in the example of FIG. 4, when the position of the button image 63A is designated by the user, the camera 11 located on the left side of the current camera 11 is selected. When the position of the button image 63B is designated by the user, the camera 11 located on the right side of the current camera 11 is selected. Then, the real space information providing unit 181 acquires the bird's-eye view captured image of the selected camera 11, and causes the display unit 150 to display the bird's-eye view captured image.

また、プレゼンスアイコン６５は、例えば、俯瞰撮像画像６１に写る人物の繁忙度を示すアイコンである。より具体的には、プレゼンスアイコン６５は、人物の繁忙度に応じて色が変わる。一例として、プレゼンスアイコンは、赤色の場合に繁忙度が高いことを示し、黄色の場合に繁忙度が普通であることを示し、青色の場合に繁忙度が低いことを示す。後述するように、俯瞰撮像画像６１のうちのどこに人物が写っているはずであるかが分かるので、このようなアイコンを表示することも可能である。なお、人物の繁忙度については、当該人物に対応するＰＣ等の端末装置１００の操作状況（一定時間あたりの高頻度打鍵や業務アプリケーションの長期継続使用等が行われていれば当該人物は忙しいと判断する等）によって、判断してもよい。また、本実施形態においては、プレゼンスアイコン６５は、上述のような形態に限定されるものではなく、例えば、白い色の円のアイコンである場合には、対応する人物が在席中であることを示し、黒色の円のアイコンである場合には、対応する人物が不在であることを示していてもよい。 Further, the presence icon 65 is, for example, an icon indicating the degree of busyness of a person shown in the bird's-eye view captured image 61. More specifically, the presence icon 65 changes its color according to the busyness of the person. As an example, the presence icon indicates that the degree of busyness is high when it is red, normal when it is yellow, and low when it is blue. As will be described later, it is known where in the bird's-eye view captured image 61 a person should appear, so such an icon can be displayed. The busyness of a person may be determined from the operating status of the terminal device 100, such as a PC, corresponding to the person (for example, the person is judged to be busy if high-frequency keystrokes per fixed period or long continuous use of a business application is observed). Further, in the present embodiment, the presence icon 65 is not limited to the form described above. For example, a white circle icon may indicate that the corresponding person is present, and a black circle icon may indicate that the corresponding person is absent.
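The mapping from terminal activity to icon color described above could be sketched as follows. This is only an illustrative sketch: the threshold values, the `TerminalActivity` type, and the function names are assumptions introduced here, since the description gives no numeric criteria.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the description does not specify any values.
HIGH_KEYSTROKES_PER_MIN = 120
NORMAL_KEYSTROKES_PER_MIN = 30
LONG_APP_USE_MIN = 60


@dataclass
class TerminalActivity:
    keystrokes_per_min: float    # keystroke frequency over a recent window
    business_app_minutes: float  # continuous use of a business application


def busyness(activity: TerminalActivity) -> str:
    """Classify busyness as 'high', 'normal', or 'low' from terminal activity."""
    if (activity.keystrokes_per_min >= HIGH_KEYSTROKES_PER_MIN
            or activity.business_app_minutes >= LONG_APP_USE_MIN):
        return "high"
    if activity.keystrokes_per_min >= NORMAL_KEYSTROKES_PER_MIN:
        return "normal"
    return "low"


# Colors as given in the description: red = high, yellow = normal, blue = low.
ICON_COLOR = {"high": "red", "normal": "yellow", "low": "blue"}


def presence_icon_color(activity: TerminalActivity) -> str:
    return ICON_COLOR[busyness(activity)]
```

Keeping the classification separate from the color table would let the alternative white/black present-or-absent icon be added without touching the busyness logic.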

上述のように、表示画面６０は、例えば、俯瞰撮像画像６１に写る人物に関連する情報（以下、「人物関連情報」と呼ぶ）を含む。そして、人物関連情報は、例えば、上記人物の状態を示す状態情報を含む。上述したように、当該状態情報は、その一例としてプレゼンスアイコン６５を挙げることができる。なお、人物関連情報は、２つ以上の時点における上記人物の状態を示す状態履歴を含んでもよい。また、一例として、当該状態履歴情報は、俯瞰撮像画像６１に写る人物の繁忙度の履歴を含んでもよい。即ち、表示画面６０に、人物の繁忙度の履歴が表示されてもよい。実空間情報提供部１８１は、例えば、通信部１１０を介して、情報管理サーバ２００から人物関連情報、又は人物関連情報の表示に必要な情報を取得する。 As described above, the display screen 60 includes, for example, information related to a person shown in the bird's-eye view captured image 61 (hereinafter, referred to as “person-related information”). Then, the person-related information includes, for example, state information indicating the state of the person. As described above, the presence icon 65 can be cited as an example of the state information. In addition, the person-related information may include a state history indicating the state of the person at two or more time points. Further, as an example, the state history information may include a history of the busyness of the person shown in the bird's-eye view captured image 61. That is, the history of the busyness of a person may be displayed on the display screen 60. The real space information providing unit 181 acquires the person-related information or information necessary for displaying the person-related information from the information management server 200 via the communication unit 110, for example.

本実施形態においては、このような人物関連情報により、ユーザは、人物が置かれている状況をより的確に把握することができる。また、上述の状態情報により、ユーザは、状態情報に対応する人物にコンタクトしてもよいかをより的確に判断することができる。また、状態履歴により、ユーザは、状態履歴に対応する人物の瞬時の状態だけではなく、ある期間での当該人物の状態を把握することができるので、ユーザは、当該人物にコンタクトしてもよいかをさらに的確に判断することができる。 In the present embodiment, such person-related information allows the user to grasp more accurately the situation in which the person is placed. In addition, the state information described above allows the user to judge more accurately whether the person corresponding to the state information may be contacted. Further, the state history allows the user to grasp not only the instantaneous state of the corresponding person but also the state of the person over a certain period, so the user can judge even more accurately whether the person may be contacted.

また、吹き出し画像６７は、俯瞰撮像画像６１に写る人物により提示される文字情報を含む画像である。吹き出し画像６７も、人物関連情報の一例である。 Further, the balloon image 67 is an image including character information presented by the person shown in the bird's-eye view captured image 61. The balloon image 67 is also an example of the person-related information.

また、マップ画像６９は、センタオフィス１０のマップを示す画像である。マップ画像６９は、さらに、使用しているカメラ１１をアイコン３１により示す。なお、センタオフィス１０内に１つ又は少数のカメラ１１しか設置されない場合には、マップ画像６９は省略されてもよい。 The map image 69 is an image showing a map of the center office 10. The map image 69 further shows the camera 11 being used by the icon 31. If only one or a small number of cameras 11 are installed in the center office 10, the map image 69 may be omitted.

−近接モードで表示される表示画面−
図５は、本実施形態に係る近接モードで表示される表示画面の一例を説明するための説明図である。詳細には、図５には、近接モードで表示される表示画面７０が示されている。当該表示画面７０は、近接撮像画像７１、ボタン画像７３及びマップ画像７５を含む。 -Display screen displayed in proximity mode-
FIG. 5 is an explanatory diagram for explaining an example of a display screen displayed in the proximity mode according to the present embodiment. Specifically, FIG. 5 shows a display screen 70 displayed in the proximity mode. The display screen 70 includes a close-up captured image 71, a button image 73, and a map image 75.

近接撮像画像７１は、例えば、Ｙ倍のズーム率（Ｙ＞Ｘ）でのカメラ１１の撮像画像である。一例として、Ｙ＝１．５である。即ち、近接撮像画像７１は、１．５倍ズームの撮像でのカメラ１１の撮像画像である。 The close-up captured image 71 is, for example, a captured image of the camera 11 at a zoom ratio of Y times (Y>X). As an example, Y=1.5. That is, the close-up captured image 71 is a captured image of the camera 11 taken at 1.5 times zoom.

また、ボタン画像７３は、表示モードを近接モードから俯瞰モードに切り替えるための画像である。例えば、ユーザが、ボタン画像７３の位置を指定すると、実空間情報提供部１８１は、表示モードを近接モードから俯瞰モードに切り替える。より具体的には、例えば、ユーザがボタン画像７３をタッチし、ボタン画像７３に対応するタッチ位置が検出されると、実空間情報提供部１８１は、表示モードを近接モードから俯瞰モードに切り替える。 The button image 73 is an image for switching the display mode from the close-up mode to the bird's eye view mode. For example, when the user specifies the position of the button image 73, the real space information providing unit 181 switches the display mode from the proximity mode to the bird's eye view mode. More specifically, for example, when the user touches the button image 73 and the touch position corresponding to the button image 73 is detected, the real space information providing unit 181 switches the display mode from the proximity mode to the bird's eye view mode.

また、マップ画像７５は、俯瞰モードにおけるマップ画像６９と同様に、センタオフィス１０のマップを示す画像である。マップ画像７５は、さらに、使用しているカメラ１１を示す。例えば、近接モードでは、ズームされたことを象徴的に示すために、マップ画像７５の中の使用しているカメラのアイコン３１が、撮影対象により近接した位置に表示される。なお、俯瞰モードにおけるマップ画像６９と同様に、センタオフィス１０内に１つ又は少数のカメラ１１しか設置されない場合には、マップ画像７５は省略されてもよい。 Further, the map image 75 is an image showing a map of the center office 10, like the map image 69 in the bird's-eye view mode. The map image 75 further shows the camera 11 being used. For example, in the proximity mode, the icon 31 of the camera being used is displayed in the map image 75 at a position closer to the shooting target, in order to indicate symbolically that the image has been zoomed. As with the map image 69 in the bird's-eye view mode, the map image 75 may be omitted when only one or a small number of cameras 11 are installed in the center office 10.

なお、近接モードで表示される表示画面７０にも、プレゼンスアイコン６５、吹き出し画像６７等の人物関連情報が含まれてもよい。 The display screen 70 displayed in the proximity mode may also include the person-related information such as the presence icon 65 and the balloon image 67.

以上のように表示モードを切り替えることにより、より広い領域が撮像された撮像画像が表示されることにより、ユーザは実空間の全体的な状況を見ることができ、また特定の人物を容易に見つけることができる。そして、より狭い領域が撮像された撮像画像が表示されることにより、ユーザは特定の人物の位置をより容易に指定することができる。また、本実施形態においては、ユーザは容易な操作により、表示モードを切り替えることができる。また、より広い領域が撮像された撮像画像とより狭い領域が撮像された撮像画像とは、互いにズーム率が異なる撮像画像であるため、ユーザは、これらの撮像画像間の関係を直感的に容易に把握することができる。よって、ユーザは、表示モードが切り替わったとしても、特定の人物を容易に見つけ、当該特定の人物の位置を指定することができる。 By switching the display mode as described above, a captured image in which a wider area is captured is displayed, so that the user can see the overall situation in the real space and can easily find a specific person. Then, by displaying a captured image in which a narrower area is captured, the user can more easily specify the position of the specific person. In addition, in the present embodiment, the user can switch the display mode by an easy operation. Further, since the captured image in which the wider area is captured and the captured image in which the narrower area is captured differ from each other only in zoom ratio, the user can intuitively and easily grasp the relationship between these captured images. Therefore, even when the display mode is switched, the user can easily find a specific person and specify the position of that person.
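The switching between the two display modes can be viewed as a two-state machine. The sketch below is an assumption-laden illustration: the element labels passed as `touched` and the `ZOOM` table are hypothetical names, and the zoom values are only the examples given in the text (X=1, Y=1.5).

```python
from enum import Enum


class DisplayMode(Enum):
    BIRDS_EYE = "bird's-eye view"
    PROXIMITY = "proximity"


# Example zoom ratios from the description (X = 1, Y = 1.5); any Y > X would do.
ZOOM = {DisplayMode.BIRDS_EYE: 1.0, DisplayMode.PROXIMITY: 1.5}


def next_mode(mode: DisplayMode, touched: str) -> DisplayMode:
    """Return the display mode after the user touches a screen element.

    `touched` is a hypothetical label for the element under the touch position.
    """
    if mode is DisplayMode.BIRDS_EYE and touched == "captured_image_61":
        return DisplayMode.PROXIMITY   # touching the bird's-eye image zooms in
    if mode is DisplayMode.PROXIMITY and touched == "button_image_73":
        return DisplayMode.BIRDS_EYE   # the button returns to the wide view
    return mode                        # any other touch leaves the mode as is
```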

−その他の実空間情報−
以上のように、実空間情報提供部１８１は、表示部１５０に実空間の撮像画像の表示画面を表示させることにより、実空間の視覚的な情報を提供する。さらに、実空間情報提供部１８１は、実空間の聴覚的な情報も提供してもよい。即ち、実空間情報提供部１８１は、音声出力部１６０に、実空間での集音により得られた音声データの音声を出力させてもよい。例えば、実空間情報提供部１８１は、ユーザにより選択されたカメラ１１に近いマイクロフォン１３を選択する。そして、実空間情報提供部１８１は、通信部１１０を介して、当該マイクロフォン１３から、センタオフィス１０での集音により得られた音声データを取得する。そして、実空間情報提供部１８１は、音声出力部１６０に、取得した音声データの音声を出力させてもよい。 -Other real space information-
As described above, the real space information providing unit 181 provides the visual information of the real space by displaying the display screen of the captured image of the real space on the display unit 150. Furthermore, the real space information providing unit 181 may also provide auditory information in the real space. That is, the real space information providing unit 181 may cause the sound output unit 160 to output the sound of the sound data obtained by the sound collection in the real space. For example, the real space information providing unit 181 selects the microphone 13 close to the camera 11 selected by the user. Then, the real space information providing unit 181 acquires the voice data obtained by the sound collection in the center office 10 from the microphone 13 via the communication unit 110. Then, the real space information providing unit 181 may cause the audio output unit 160 to output the audio of the acquired audio data.
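Selecting the microphone 13 close to the selected camera 11 could, for instance, be done by a nearest-neighbor lookup over installation positions. The function name, the identifiers, and the use of 2D floor coordinates are assumptions for illustration; the description does not say how closeness is measured.

```python
import math


def nearest_microphone(camera_pos, microphones):
    """Return the id of the microphone closest to the selected camera.

    camera_pos:  (x, y) installation position of the selected camera
    microphones: dict mapping a microphone id to its (x, y) position
    """
    return min(microphones, key=lambda mic: math.dist(camera_pos, microphones[mic]))
```

For example, `nearest_microphone((0, 0), {"mic_a": (5, 5), "mic_b": (1, 0)})` picks `"mic_b"`.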

（音声出力制御部１８２）
音声出力制御部１８２は、制御部１８０が通信部１１０を介して取得したデータに基づいて、音声出力部１６０による音声出力を制御する。具体的には、音声出力制御部１８２は、上記データに基づいて、音声出力部１６０から出力される音声の音量を制御する。 (Voice output control unit 182)
The audio output control unit 182 controls the audio output by the audio output unit 160 based on the data acquired by the control unit 180 via the communication unit 110. Specifically, the audio output control unit 182 controls the volume of the audio output from the audio output unit 160 based on the above data.

（位置取得部１８３）
位置取得部１８３は、実空間の撮像画像の表示画面においてユーザにより指定される上記撮像画像内の位置を取得する。より具体的には、例えば、入力部１２０が、実空間の撮像画像の表示画面においてユーザにより指定される上記撮像画像内の位置を検出すると、位置取得部１８３は、当該位置を取得する。そして、位置取得部１８３は、当該撮像画像内の位置をオブジェクト選択部１８５に提供する。 (Position acquisition unit 183)
The position acquisition unit 183 acquires the position in the captured image specified by the user on the display screen of the captured image in the real space. More specifically, for example, when the input unit 120 detects a position in the captured image specified by the user on the display screen of the captured image in the real space, the position acquisition unit 183 acquires the position. Then, the position acquisition unit 183 provides the position in the captured image to the object selection unit 185.

例えば、位置取得部１８３は、図５の近接撮像画像７１内のいずれかの位置を取得した場合に、当該位置をオブジェクト選択部１８５に提供する。以下、位置取得部１８３による位置の取得及び提供について図６及び図７を参照して具体的に説明する。 For example, when the position acquisition unit 183 acquires any position in the close-up captured image 71 of FIG. 5, the position acquisition unit 183 provides the position to the object selection unit 185. Hereinafter, the acquisition and provision of the position by the position acquisition unit 183 will be specifically described with reference to FIGS. 6 and 7.

図６は、ユーザにより指定される近接撮像画像７１内の位置の第１の例を説明するための説明図である。図６には、近接モードの表示画面に含まれる近接撮像画像７１、及びユーザの手３が、示されている。また、図６には、近接撮像画像７１に写る人物の人物画像７７が示されている。そして、ユーザは、手３で人物画像７７の位置をタッチすることにより、近接撮像画像７１の人物画像７７の位置を指定している。この場合に、入力部１２０は、近接撮像画像７１の人物画像７７の上記位置を検出し、位置取得部１８３は、当該位置を取得する。そして、位置取得部１８３は、取得した当該位置をオブジェクト選択部１８５に提供する。 FIG. 6 is an explanatory diagram for describing a first example of the position in the close-up captured image 71 designated by the user. FIG. 6 shows the close-up captured image 71 included in the display screen in the proximity mode, and the user's hand 3. FIG. 6 also shows a person image 77 of a person shown in the close-up captured image 71. The user designates the position of the person image 77 in the close-up captured image 71 by touching that position with the hand 3. In this case, the input unit 120 detects the position of the person image 77 in the close-up captured image 71, and the position acquisition unit 183 acquires the position. Then, the position acquisition unit 183 provides the acquired position to the object selection unit 185.

図７は、ユーザにより指定される近接撮像画像７１内の位置の第２の例を説明するための説明図である。図７には、近接撮像画像７１に写る人物画像７７Ａ及び７７Ｂが示されている。そして、ユーザは、手３で人物画像７７Ａの位置をタッチすることにより、近接撮像画像７１の人物画像７７Ａの位置を指定している。この場合に、入力部１２０は、近接撮像画像７１の人物画像７７Ａの上記位置を検出し、位置取得部１８３は、当該位置を取得する。そして、位置取得部１８３は、取得した当該位置をオブジェクト選択部１８５に提供する。 FIG. 7 is an explanatory diagram for explaining a second example of the position in the close-up captured image 71 designated by the user. FIG. 7 shows person images 77A and 77B shown in the close-up captured image 71. The user designates the position of the person image 77A in the close-up captured image 71 by touching that position with the hand 3. In this case, the input unit 120 detects the position of the person image 77A in the close-up captured image 71, and the position acquisition unit 183 acquires the position. Then, the position acquisition unit 183 provides the acquired position to the object selection unit 185.

（オブジェクト選択部１８５）
オブジェクト選択部１８５は、取得される撮像画像内の位置に基づいて、実空間に対応する３次元仮想空間に配置されたオブジェクトを選択する。例えば、位置取得部１８３が、実空間の撮像画像の表示画面においてユーザにより指定される上記撮像画像内の位置を取得すると、オブジェクト選択部１８５は、当該位置に基づいて、上記実空間に対応する３次元仮想空間に配置されたオブジェクトを選択する。詳細には、上記オブジェクトは、上記撮像画像内の上記位置に対応する上記３次元仮想空間内の３次元仮想位置に配置されたオブジェクトである。また、例えば、上記撮像画像は、表示画面に含まれる上記第２のモード（例えば、近接モード）の撮像画像である。例えば、位置取得部１８３が、カメラ１１により生成された近接撮像画像７１内の位置を取得すると、オブジェクト選択部１８５は、センタオフィス１０に対応する３次元仮想空間に配置されたオブジェクトのうちの、上記位置に対応するオブジェクトを選択する。なお、オブジェクト選択部１８５は、例えば、センタオフィス１０に対応する３次元仮想空間のデータを記憶部１７０から取得する。 (Object selection unit 185)
The object selection unit 185 selects an object arranged in the three-dimensional virtual space corresponding to the real space, based on the acquired position in the captured image. For example, when the position acquisition unit 183 acquires the position in the captured image specified by the user on the display screen of the captured image of the real space, the object selection unit 185 selects, based on that position, an object arranged in the three-dimensional virtual space corresponding to the real space. Specifically, the object is an object arranged at a three-dimensional virtual position in the three-dimensional virtual space corresponding to the position in the captured image. Further, for example, the captured image is the captured image of the second mode (for example, the proximity mode) included in the display screen. For example, when the position acquisition unit 183 acquires a position in the close-up captured image 71 generated by the camera 11, the object selection unit 185 selects, from among the objects arranged in the three-dimensional virtual space corresponding to the center office 10, the object corresponding to that position. The object selection unit 185 acquires, for example, the data of the three-dimensional virtual space corresponding to the center office 10 from the storage unit 170.

−１つのオブジェクトが配置されている場合の例−
以下、図８及び図９を参照して、３次元仮想空間９０に１つのオブジェクト９１が配置されている場合の３次元仮想空間９０の具体例を説明する。図８は、センタオフィス１０に対応する３次元仮想空間９０の第１の例を説明するための説明図である。図８においては、センタオフィス１０に対応する３次元仮想空間９０が示されている。また、当該３次元仮想空間９０には、オブジェクト９１が配置されている。当該オブジェクト９１は、人物（例えば、Ａ氏）に対応する。そして、当該オブジェクト９１は、センタオフィス１０の当該人物（例えば、Ａ氏）の座席の位置に対応する３次元仮想位置に配置される。即ち、上記人物が座席に座っている場合には上記人物が存在するであろう位置に対応する３次元仮想位置に、上記オブジェクト９１が配置される。この例では、オブジェクト９１は、円柱状のオブジェクトである。当該円柱状のオブジェクト９１は、例えば、半径Ｒ及び高さＨを伴う円柱のオブジェクトである。半径Ｒ及び高さＨは、例えば、予め定められている。なお、３次元仮想空間９０のデータには、各カメラ１１に関連する情報も含まれている。例えば、各カメラ１１に関連する情報は、各カメラ１１の設置位置に対応する３次元仮想位置、撮像方向（例えば、カメラのレンズと垂直な方向）、画角等を含む。 -Example when one object is placed-
Hereinafter, a specific example of the three-dimensional virtual space 90 in the case where one object 91 is arranged in the three-dimensional virtual space 90 will be described with reference to FIGS. 8 and 9. FIG. 8 is an explanatory diagram illustrating a first example of the three-dimensional virtual space 90 corresponding to the center office 10. In FIG. 8, a three-dimensional virtual space 90 corresponding to the center office 10 is shown. An object 91 is arranged in the three-dimensional virtual space 90. The object 91 corresponds to a person (for example, Mr. A). Then, the object 91 is arranged at a three-dimensional virtual position corresponding to the position of the seat of the person (for example, Mr. A) in the center office 10. That is, when the person is sitting on the seat, the object 91 is arranged at the three-dimensional virtual position corresponding to the position where the person is likely to exist. In this example, the object 91 is a columnar object. The cylindrical object 91 is, for example, a cylindrical object having a radius R and a height H. The radius R and the height H are predetermined, for example. The data of the three-dimensional virtual space 90 also includes information related to each camera 11. For example, the information related to each camera 11 includes a three-dimensional virtual position corresponding to the installation position of each camera 11, an imaging direction (for example, a direction perpendicular to the lens of the camera), an angle of view, and the like.

図９は、図８に示される３次元仮想空間９０に配置されたオブジェクト９１の選択の一例を説明するための説明図である。図９においては、理解を容易にするために、３次元仮想空間９０における水平面における位置関係が示されている。具体的には、図９では、３次元仮想空間９０に配置されたオブジェクト９１、撮像に用いられるカメラ１１の設置位置に対応する３次元仮想位置（以下、「仮想カメラ位置」と呼ぶ）Ｏ、カメラ１１の撮像方向（例えば、カメラのレンズと垂直な方向）に対応する軸ｙ、及び、軸ｙと直交する軸ｘが、示されている。図９の例では、理解を容易にするために、カメラ１１は、当該カメラ１１の撮像方向が水平面に平行になるように、設置されているものとする。 FIG. 9 is an explanatory diagram for explaining an example of selection of the object 91 arranged in the three-dimensional virtual space 90 shown in FIG. In FIG. 9, the positional relationship in the horizontal plane in the three-dimensional virtual space 90 is shown for easy understanding. Specifically, in FIG. 9, an object 91 arranged in a three-dimensional virtual space 90, a three-dimensional virtual position (hereinafter, referred to as “virtual camera position”) O corresponding to the installation position of the camera 11 used for imaging, An axis y corresponding to the imaging direction of the camera 11 (for example, a direction perpendicular to the camera lens) and an axis x orthogonal to the axis y are shown. In the example of FIG. 9, for easy understanding, the camera 11 is assumed to be installed such that the imaging direction of the camera 11 is parallel to the horizontal plane.

さらに、図９においては、カメラ１１の画角θも示されている。また、図９においては、カメラ１１の撮像方向に対応する軸ｙに垂直であり、且つ画角θに対応する幅を有する仮想面９３が示されている。また、仮想面９３は、仮想カメラ位置Ｏから距離Ｉだけ離れている。そして、仮想面９３は、四角形の面であり、撮像画像と同一の縦横比を有する。即ち、仮想面９３は、撮像画像に対応する面である。 Further, in FIG. 9, the angle of view θ of the camera 11 is also shown. FIG. 9 also shows a virtual surface 93 that is perpendicular to the axis y corresponding to the imaging direction of the camera 11 and has a width corresponding to the angle of view θ. The virtual surface 93 is separated from the virtual camera position O by the distance I. The virtual surface 93 is a quadrangular surface and has the same aspect ratio as the captured image. That is, the virtual surface 93 is a surface corresponding to the captured image.

オブジェクト選択部１８５は、例えば、図６に示されるようにユーザにより指定される上記撮像画像内の位置を、３次元仮想位置Ａに変換する。そして、オブジェクト選択部１８５は、仮想カメラ位置Ｏと３次元仮想位置Ａとを通る直線に交わるオブジェクトを特定する。すなわち、オブジェクト選択部１８５は、オブジェクト９１を特定する。そして、オブジェクト選択部１８５は、オブジェクト９１を選択する。図９の例では、例えば、仮想面９３のうちの３次元仮想位置Ｂと３次元仮想位置Ｄとの間にある３次元仮想位置に変換される撮像画像内の位置が、撮像画像内でユーザにより指定されると、オブジェクト９１が選択される。なお、このような位置は、概ね、撮像画像においてオブジェクト９１に対応する人物が写っている位置である。なお、距離Ｉは、仮想カメラ位置Ｏとオブジェクト９１との間に仮想面９３が位置するように決定される。一例として、距離Ｉは、カメラ１１の焦点距離であるが、当然ながら本実施形態においてはこれに限られない。 The object selection unit 185 converts the position in the captured image designated by the user, as shown in FIG. 6 for example, into a three-dimensional virtual position A. Then, the object selection unit 185 identifies an object that intersects the straight line passing through the virtual camera position O and the three-dimensional virtual position A. That is, the object selection unit 185 identifies the object 91. Then, the object selection unit 185 selects the object 91. In the example of FIG. 9, for example, the object 91 is selected when the user designates, in the captured image, a position that is converted into a three-dimensional virtual position lying between the three-dimensional virtual position B and the three-dimensional virtual position D on the virtual surface 93. Note that such a position is generally a position where the person corresponding to the object 91 appears in the captured image. The distance I is determined so that the virtual surface 93 is located between the virtual camera position O and the object 91. As an example, the distance I is the focal length of the camera 11, but the present embodiment is of course not limited to this.
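The horizontal-plane selection described with reference to FIG. 9 can be illustrated with a small ray test. The sketch below assumes a coordinate system with the virtual camera position O at the origin and the axis y along the imaging direction, maps a horizontal pixel coordinate linearly onto the virtual surface 93, and treats the cylindrical object 91 as a circle of radius R. The function names and the linear pixel mapping are assumptions, not taken from the description.

```python
import math


def pixel_to_virtual_position(u, image_width, view_angle_rad, plane_distance):
    """Map a horizontal pixel coordinate u to position A on the virtual surface.

    The surface lies at y = plane_distance (distance I) and spans the view angle.
    """
    half_width = plane_distance * math.tan(view_angle_rad / 2)
    x = (2 * u / image_width - 1) * half_width
    return (x, plane_distance)


def ray_hits_cylinder(a, center, radius):
    """True if the line from O = (0, 0) through A meets the circle (center, radius)."""
    ax, ay = a
    cx, cy = center
    norm = math.hypot(ax, ay)
    dx, dy = ax / norm, ay / norm      # unit direction from O toward A
    along = cx * dx + cy * dy          # distance of the center along the ray
    if along <= 0:
        return False                   # object lies behind the camera
    perp = abs(cx * dy - cy * dx)      # distance of the center from the ray
    return perp <= radius
```

With a 640-pixel-wide image, a 90-degree view angle and I = 1, a touch at u = 320 maps to A = (0, 1), and the ray through it hits a cylinder of radius 0.5 centered at (0, 5); a touch at u = 0 does not.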

また、上記においては、３次元仮想空間９０の水平面に着目してオブジェクト９１を選択する手法を説明したが、当該手法によれば、当然ながら、垂直方向（例えば、ｚ軸）での処理を加えたとしても、撮像画像内の位置を３次元仮想位置に変換して当該３次元仮想位置からオブジェクト９１を特定することが可能である。また、上述した例では、撮像画像内の位置を３次元仮想位置に変換することにより、撮像画像内の位置に対応するオブジェクト９１が特定されたが、本実施形態においては、撮像画像内の位置に対応するオブジェクト９１を特定する手法は、これに限られない。 The above description focused on the horizontal plane of the three-dimensional virtual space 90 to explain how the object 91 is selected. Naturally, even when processing in the vertical direction (for example, along the z axis) is added, the same method can convert the position in the captured image into a three-dimensional virtual position and identify the object 91 from that three-dimensional virtual position. Further, in the example described above, the object 91 corresponding to the position in the captured image is identified by converting the position in the captured image into a three-dimensional virtual position, but in the present embodiment the method of identifying the object 91 corresponding to the position in the captured image is not limited to this.

一例として、オブジェクト選択部１８５は、仮想カメラ位置Ｏを原点としてオブジェクト９１を仮想面９３に射影し、オブジェクト９１の射影範囲を、撮像画像内の範囲に変換してもよい。そして、ユーザにより指定される上記撮像画像内の位置が、上記範囲に含まれる場合に、オブジェクト選択部１８５は、オブジェクト９１を選択してもよい。 As an example, the object selection unit 185 may project the object 91 on the virtual surface 93 with the virtual camera position O as the origin, and convert the projection range of the object 91 into a range within the captured image. Then, when the position in the captured image designated by the user is included in the range, the object selection unit 185 may select the object 91.
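The projection-based alternative could be sketched in the horizontal plane as follows: the circle representing the object 91 is projected from the virtual camera position O onto the virtual surface 93, and the projected interval is converted into a pixel range. The coordinate conventions (O at the origin, axis y along the imaging direction), the linear pixel mapping, and all names are assumptions; the sketch also assumes the object lies in front of the camera, farther away than its radius, and does not clamp the range to the image bounds.

```python
import math


def projected_pixel_range(center, radius, image_width, view_angle_rad, plane_distance):
    """Return (u_min, u_max), the horizontal pixel range covered by the object."""
    cx, cy = center
    dist = math.hypot(cx, cy)
    phi = math.atan2(cx, cy)           # bearing of the center, measured from axis y
    alpha = math.asin(radius / dist)   # half of the angle subtended by the circle
    half_width = plane_distance * math.tan(view_angle_rad / 2)

    def to_pixel(angle):
        x = plane_distance * math.tan(angle)   # intersection with the virtual surface
        return (x / half_width + 1) / 2 * image_width

    return to_pixel(phi - alpha), to_pixel(phi + alpha)


def is_selected(u, pixel_range):
    """True if the designated pixel coordinate u falls inside the projected range."""
    u_min, u_max = pixel_range
    return u_min <= u <= u_max
```

Because the range depends only on the camera parameters and the object layout, it could be computed once per camera and zoom ratio rather than on every touch.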

また、別の例としては、仮想カメラ位置Ｏ、軸ｙ及び画角θを用いて３次元仮想空間９０のレンダリングを行うことにより、レンダリング画像を生成し、当該レンダリング画像のうちのオブジェクト９１が写っている範囲から、オブジェクト９１に対応する撮像画像内の範囲を得てもよい。そして、ユーザにより指定される上記撮像画像内の位置が、上記範囲に含まれる場合に、オブジェクト選択部１８５は、オブジェクト９１を選択してもよい。 As another example, a rendered image may be generated by rendering the three-dimensional virtual space 90 using the virtual camera position O, the axis y, and the angle of view θ, and the range in the captured image corresponding to the object 91 may be obtained from the range of the rendered image in which the object 91 appears. Then, when the position in the captured image designated by the user is included in that range, the object selection unit 185 may select the object 91.

−状態を考慮したオブジェクトの選択−
また、例えば、３次元仮想空間９０に配置されるオブジェクト９１は、複数の状態のうちのいずれかの状態を示す状態情報に対応する。例えば、オブジェクト９１は、人物に対応する。そして、オブジェクト９１は、当該人物が座席に座っているか否かを示す状態情報（以下、「存否情報」と呼ぶ）に対応する。当該存否情報は、人物が座席に座っている状態、又は人物が座席に座っていない状態を示す。オブジェクト選択部１８５は、例えば、オブジェクト９１に対応する人物が座席に座っているか否かの判定結果を、通信部１１０を介してセンサ１５から取得する。そして、当該判定結果が存否情報となる。 -Selection of object considering state-
Also, for example, the object 91 arranged in the three-dimensional virtual space 90 corresponds to state information indicating any one of a plurality of states. For example, the object 91 corresponds to a person. Then, the object 91 corresponds to state information (hereinafter, referred to as “presence/absence information”) indicating whether or not the person is sitting on the seat. The presence/absence information indicates a state where a person is sitting on the seat or a state where the person is not sitting on the seat. The object selection unit 185 acquires, for example, the determination result of whether or not the person corresponding to the object 91 is sitting on the seat from the sensor 15 via the communication unit 110. Then, the determination result becomes the presence/absence information.

そして、例えば、オブジェクト選択部１８５は、上記３次元仮想空間９０に配置されたオブジェクト９１であって、上記複数の状態のうちの所定の状態を示す状態情報に対応する上記オブジェクト９１を、選択する。より具体的には、オブジェクト選択部１８５は、上記３次元仮想空間９０に配置されたオブジェクト９１であって、人物が座席に座っている状態を示す状態情報に対応するオブジェクト９１を、選択する。即ち、オブジェクト選択部１８５は、センサ１５により人物が座席に座っていると判定される場合には、当該人物に対応するオブジェクト９１を選択し得るが、センサ１５により人物が座席に座っていないと判定される場合には、当該人物に対応するオブジェクト９１を選択しない。 Then, for example, the object selection unit 185 selects an object 91 that is arranged in the three-dimensional virtual space 90 and corresponds to state information indicating a predetermined state among the plurality of states. More specifically, the object selection unit 185 selects an object 91 that is arranged in the three-dimensional virtual space 90 and corresponds to state information indicating that the person is sitting on the seat. That is, when the sensor 15 determines that the person is sitting on the seat, the object selection unit 185 can select the object 91 corresponding to the person, but when the sensor 15 determines that the person is not sitting on the seat, the object selection unit 185 does not select the object 91 corresponding to the person.
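The state-dependent selection amounts to filtering the candidate objects by their presence/absence information before any of them is selected. A minimal sketch, in which the `VirtualObject` type and its field names are assumptions introduced for illustration:

```python
from dataclasses import dataclass


@dataclass
class VirtualObject:
    person: str
    seated: bool  # presence/absence information, e.g. as judged by sensor 15


def selectable(candidates):
    """Keep only objects whose person is currently judged to be seated."""
    return [obj for obj in candidates if obj.seated]
```

An object whose `seated` flag is false is simply never offered for selection, which is how an empty seat avoids being picked when the user touches it.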

このように、本実施形態においては、人物の状態に応じてオブジェクト９１が選択されるので、本来選択されるべきでないオブジェクト９１が選択されることを回避することができる。例えば、本実施形態においては、人物がいない場合に当該人物に対応するオブジェクト９１が選択されてしまうことを、回避することができる。 As described above, in the present embodiment, the object 91 is selected according to the state of the person, so that it is possible to avoid selecting the object 91 that should not be selected. For example, in the present embodiment, it can be avoided that the object 91 corresponding to the person is selected when there is no person.

−２つのオブジェクトが配置されている場合の例−
また、３次元仮想空間９０内に２つ以上のオブジェクト９１が配置され得る。以下、２つのオブジェクト９１が配置される場合の３次元仮想空間９０の具体例を、図１０を参照して説明する。図１０は、センタオフィス１０に対応する３次元仮想空間９０の第２の例を説明するための説明図である。図１０には、センタオフィス１０に対応する３次元仮想空間９０が示されている。また、当該３次元仮想空間９０には、オブジェクト９１Ａ及びオブジェクト９１Ｂが配置されている。オブジェクト９１Ａは、ある人物（例えば、Ａ氏）に対応し、センタオフィス１０の当該ある人物の座席の位置に対応する３次元仮想位置に配置される。また、オブジェクト９１Ｂは、ある人物（例えば、Ｂ氏）に対応し、センタオフィス１０の当該ある人物の座席の位置に対応する３次元仮想位置に配置される。図８の例と同様に、オブジェクト９１は、半径Ｒ及び高さＨを伴う円柱状のオブジェクトである。 -Example in which two objects are arranged-
Further, two or more objects 91 may be arranged in the three-dimensional virtual space 90. Hereinafter, a specific example of the three-dimensional virtual space 90 when two objects 91 are arranged will be described with reference to FIG. 10. FIG. 10 is an explanatory diagram illustrating a second example of the three-dimensional virtual space 90 corresponding to the center office 10. FIG. 10 shows a three-dimensional virtual space 90 corresponding to the center office 10. In addition, an object 91A and an object 91B are arranged in the three-dimensional virtual space 90. The object 91A corresponds to a certain person (for example, Mr. A), and is arranged at a three-dimensional virtual position corresponding to the position of the seat of the certain person in the center office 10. Further, the object 91B corresponds to a certain person (for example, Mr. B), and is arranged at a three-dimensional virtual position corresponding to the position of the seat of the certain person in the center office 10. Similar to the example of FIG. 8, the object 91 is a cylindrical object with a radius R and a height H.

このように３次元仮想空間９０内に２つ以上のオブジェクト９１が配置される場合には、複数のオブジェクト９１が、ユーザにより指定される撮像画像内の位置に対応し得る。例えば、図１０の例において、オブジェクト９１Ａとオブジェクト９１Ｂの両方が、ユーザにより指定される撮像画像内の位置に対応し得る。一例として、図７に示されるように撮像画像内に２人の人物が写り得るような場合に、ユーザが、撮像画像において当該２人の人物が重なり合う位置を指定すると、当該２人の人物に対応する２つのオブジェクト９１が、上記位置に対応し得る。 When two or more objects 91 are arranged in the three-dimensional virtual space 90 in this way, a plurality of objects 91 may correspond to the position in the captured image designated by the user. For example, in the example of FIG. 10, both the object 91A and the object 91B may correspond to the position in the captured image designated by the user. As an example, when two persons can appear in the captured image as shown in FIG. 7 and the user designates a position in the captured image where the two persons overlap, the two objects 91 corresponding to the two persons may both correspond to that position.

そこで、このような場合（即ち、複数のオブジェクト９１が、ユーザにより指定される撮像画像の位置に対応する場合）には、オブジェクト選択部１８５は、上記複数のオブジェクト９１のうちのいずれか１つのオブジェクト９１を選択する。即ち、オブジェクト選択部１８５は、ユーザにより指定される撮像画像内の位置に対応する３次元仮想空間９０内の３次元仮想位置にそれぞれ配置された複数のオブジェクト９１がある場合に、当該複数のオブジェクト９１のうちのいずれか１つのオブジェクト９１を選択する。 Therefore, in such a case (that is, when a plurality of objects 91 correspond to the position in the captured image designated by the user), the object selection unit 185 selects one of the plurality of objects 91. That is, when there are a plurality of objects 91 each arranged at a three-dimensional virtual position in the three-dimensional virtual space 90 corresponding to the position in the captured image designated by the user, the object selection unit 185 selects one object 91 from among them.

For example, the captured image is generated through an imaging device located in the real space. The object selection unit 185 then selects, from the plurality of objects 91, the object 91 that is closer to the three-dimensional virtual position in the three-dimensional virtual space 90 corresponding to the imaging device. More specifically, for example, the captured image is acquired through the camera 11 in the center office 10. When a plurality of objects 91 correspond to the position in the captured image designated by the user, the object selection unit 185 selects the object 91 closer to the three-dimensional virtual position corresponding to the installation position of the camera 11 (that is, the virtual camera position O). A specific example is described below with reference to FIGS. 10 and 11.

FIG. 11 is an explanatory diagram illustrating an example of selecting an object 91 arranged in the three-dimensional virtual space 90 shown in FIG. 10. For ease of understanding, FIG. 11 shows the positional relationship in the horizontal plane of the three-dimensional virtual space 90. Specifically, FIG. 11 shows the object 91A and the object 91B arranged in the three-dimensional virtual space 90. As in FIG. 9, FIG. 11 also shows the virtual camera position O, the axis y, the axis x, the angle of view θ, and the virtual surface 93. In the example of FIG. 11, as in FIG. 9, the camera 11 is assumed, for ease of understanding, to be installed so that its imaging direction is parallel to the horizontal plane.

For example, the user designates a position in the captured image as shown in FIG. 7. In this case, according to the method described with reference to FIG. 9, if the position in the captured image is converted into a three-dimensional virtual position between the three-dimensional virtual position B′ and the three-dimensional virtual position D, both the object 91A and the object 91B are identified as objects corresponding to that position. The object selection unit 185 then selects, of the two, the object 91A, which is closer to the virtual camera position O.

Note that when the position in the captured image designated by the user is converted into a three-dimensional virtual position between the three-dimensional virtual position B and the three-dimensional virtual position B′, the object 91A is identified and selected as the object corresponding to that position. Likewise, when the designated position is converted into a three-dimensional virtual position between the three-dimensional virtual position D and the three-dimensional virtual position D′, the object 91B is identified and selected as the object corresponding to that position.
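
The selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method of the embodiment: it works in the horizontal plane of FIG. 11 with the virtual camera position O at the origin, represents the designated position as a viewing angle measured from the axis y, and treats each cylindrical object 91 as a circle of radius R. The names `VirtualObject`, `ray_hits` and `select_object` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualObject:
    """Hypothetical stand-in for an object 91, seen from above as a circle."""
    object_id: str
    x: float       # position along axis x (virtual camera position O is the origin)
    y: float       # position along axis y (assumed imaging direction)
    radius: float  # radius R of the cylinder


def ray_hits(obj: VirtualObject, angle: float) -> bool:
    """Return True if the viewing ray from O at `angle` crosses the object's circle."""
    dx, dy = math.sin(angle), math.cos(angle)  # unit direction of the ray
    along = obj.x * dx + obj.y * dy            # centre projected onto the ray
    if along <= 0:
        return False                           # the object is behind the camera
    across = abs(obj.x * dy - obj.y * dx)      # perpendicular distance to the ray
    return across <= obj.radius


def select_object(objects, angle):
    """Among the objects hit by the ray, pick the one closest to O (or None)."""
    hits = [o for o in objects if ray_hits(o, angle)]
    if not hits:
        return None
    return min(hits, key=lambda o: math.hypot(o.x, o.y))


obj_a = VirtualObject("91A", 0.0, 2.0, 0.5)
obj_b = VirtualObject("91B", 0.8, 4.0, 0.5)
chosen = select_object([obj_a, obj_b], 0.15)  # ray crossing both circles
```

With these sample coordinates, a ray that crosses both circles resolves to the nearer object, while a ray that crosses only one circle returns that object unchanged, which mirrors the three cases distinguished in the text.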

Thus, in the present embodiment, selecting one object 91 from the plurality of objects 91 avoids an error in subsequent processing that would otherwise result from selecting a plurality of objects 91 (an error caused by acquiring a plurality of communication IDs). Moreover, because the object 91 closer to the virtual camera position corresponding to the camera 11 is selected, the object 91 corresponding to the person appearing in the foreground is selected even when, for example, persons overlap in the captured image. The user can therefore select the object 91 corresponding to the intended person.

As described above, the object selection unit 185 selects an object 91. The object selection unit 185 then provides identification information of the selected object 91 (hereinafter referred to as the "object ID") to the ID acquisition unit 187. The object ID may be identification information of the person corresponding to the selected object 91, or may simply be a number assigned to the selected object 91 (composed of, for example, digits or characters).

(Conversation object selection unit 191)
The conversation object selection unit 191 selects a "COMM link" object, described later. When the position acquisition unit 183 acquires the position in the captured image designated by the user on the display screen of the captured image of the real space, the conversation object selection unit 191 selects, based on that position, a COMM link (that is, a conversation event object) arranged in the three-dimensional virtual space corresponding to the real space. A COMM link is, for example, an object corresponding to a call. It has a line-segment shape, and objects 91 or the like corresponding to the persons involved in the call are located at its two ends. Details of the COMM link are described later.

(ID acquisition unit 187)
The ID acquisition unit 187 acquires identification information corresponding to the selected object 91. For example, the identification information is identification information for communication (hereinafter referred to as a "communication ID") corresponding to the selected object 91. One example of a communication ID is a telephone number. Specifically, for example, when the object selection unit 185 selects an object 91, the ID acquisition unit 187 acquires the object ID of the selected object 91. The object selection unit 185 then transmits the object ID to the information management server 200 via the communication unit 110 and acquires the communication ID corresponding to the object ID. This communication ID is the communication ID of the person corresponding to the selected object 91, that is, the communication ID of the communication device of the person appearing at the position in the captured image designated by the user. The ID acquisition unit 187 then provides the acquired communication ID to the telephone unit 189, described later.

As described above, in the present embodiment, when the user designates a position in the captured image, the object 91 corresponding to that position is selected and the communication ID corresponding to that object 91 is acquired. This allows the user to contact the target person with an intuitive operation. In addition, because the object 91 corresponding to the person is selected and the communication ID is acquired regardless of how the person appears in the captured image, the person can be contacted more reliably.

The ID acquisition unit 187 can also acquire a plurality of pieces of identification information corresponding to a selected conversation object, specifically, the communication IDs of the two or more speakers corresponding to the objects 91 located at both ends of the line segment of a COMM link, described later. For example, when the object selection unit 185 selects a conversation object (COMM link), the ID acquisition unit 187 acquires the object IDs of the two or more speakers corresponding to the objects 91 located at both ends of the selected conversation object. The object selection unit 185 then transmits the two or more object IDs acquired by the ID acquisition unit 187 to the information management server 200 via the communication unit 110 and acquires the communication IDs corresponding to the object IDs. Each communication ID is the communication ID of the person corresponding to the respective object 91. The ID acquisition unit 187 further acquires the plurality of communication IDs acquired by the object selection unit 185 and provides them to the telephone unit 189.

As described above, when the user designates a position in the captured image, the conversation object corresponding to that position is selected and the plurality of communication IDs corresponding to that conversation object are acquired. This allows the user to designate a remote conversation, which is otherwise invisible, with an intuitive operation, and makes it easy, for example, to join an existing two-party call and thereby move to a three-party call.

(Telephone unit 189)
The telephone unit 189 provides a function for making telephone calls. For example, the telephone unit 189 provides a softphone function. When the telephone unit 189 acquires the communication ID provided by the ID acquisition unit 187, it places a call using that communication ID. More specifically, for example, upon acquiring the communication ID, the telephone unit 189 provides the communication ID to the PBX 40 via the communication unit 110 and acquires an IP address from the PBX 40. The telephone unit 189 then executes a series of sequences for establishing a session with the communication device having that IP address (that is, the communication device of the call destination). In this way, the telephone unit 189 places a call to the person appearing at the position in the captured image designated by the user on the display screen, that is, a call to that person's communication device.

When the communication unit 110 receives voice data from the communication device of the other party to the call, the telephone unit 189 causes the voice output unit 160 to output the voice of that voice data. The telephone unit 189 also causes the communication unit 110 to transmit the voice data provided by the sound collection unit 140 to the communication device of the other party. In addition, the telephone unit 189 causes the communication unit 110 to transmit, for example, the captured image provided by the imaging unit 130 (for example, a captured image showing the user of the terminal device 100) to the communication device of the other party.

The telephone unit 189 also causes the display unit 150 to display a display screen for use during a call. The display mode that displays this screen is called, for example, the conversation mode. In this case, when the telephone unit 189 acquires the communication ID provided by the ID acquisition unit 187, it switches the display mode from the proximity mode to the conversation mode. A specific example of the display screen in the conversation mode is described below with reference to FIG. 12.

FIG. 12 is an explanatory diagram illustrating an example of the display screen 80 displayed in the conversation mode. The display screen 80 includes a partner-side captured image 81, a button image 83, and an own-side captured image 85.

The partner-side captured image 81 is, for example, a captured image acquired from the communication device of the other party to the call. For example, when the communication unit 110 receives a captured image from the communication device of the other party, the telephone unit 189 uses that captured image as the partner-side captured image 81.

The button image 83 is an image for ending the call. For example, when the user designates the position of the button image 83, the telephone unit 189 ends the call. More specifically, for example, when the user touches the button image 83 and a touch position corresponding to the button image 83 is detected, the telephone unit 189 executes a call termination sequence that includes disconnecting the session. The telephone unit 189 also, for example, switches the display mode from the conversation mode to the overhead view mode.

The own-side captured image 85 is a captured image provided by the imaging unit 130.

- Transition of display mode -
A specific example of the transitions between the display modes of the display screen, that is, between the overhead view mode, the proximity mode, and the conversation mode, is described here with reference to FIG. 13. FIG. 13 is a transition diagram illustrating an example of display mode transitions. Referring to FIG. 13, when connection processing with the camera 11, the microphone 13, the sensor 15, the information management server 200, and the like is performed, for example at software startup, the display mode becomes the overhead view mode 301 (ENTRY).

In the overhead view mode 301, overhead view mode processing is executed (DO). When the user designates a position in the overhead captured image 61, mode change processing is performed (EXIT), and the display mode switches from the overhead view mode 301 to the proximity mode 303. This mode change processing includes zoom processing of the camera 11 (ENTRY).

In the proximity mode 303, proximity mode processing is executed (DO). When the user designates a position in the proximity captured image 71 at which a person appears, mode change processing is performed (EXIT), and the display mode switches from the proximity mode 303 to the conversation mode 305. The mode change processing in this case includes processing for the call (ENTRY). When the user instead designates the position of the button image 73, mode change processing is performed (EXIT), and the display mode switches from the proximity mode 303 to the overhead view mode 301. The mode change processing in this case includes the connection processing described above (ENTRY).

In the conversation mode 305, conversation mode processing is executed (DO). When the user touches the button image 83, mode change processing is performed (EXIT), and the display mode switches from the conversation mode 305 to the overhead view mode 301. The mode change processing in this case includes the connection processing described above (ENTRY).
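
The transitions of FIG. 13 can be summarized as a small table-driven state machine. The following Python sketch is only an illustration: the mode values reuse the reference numerals 301, 303 and 305 from the description, while the event names and the function `next_mode` are hypothetical labels introduced here, and the ENTRY, DO and EXIT processing is omitted.

```python
from enum import Enum


class Mode(Enum):
    """Display modes, numbered after the reference numerals in FIG. 13."""
    OVERHEAD = 301
    PROXIMITY = 303
    CONVERSATION = 305


# (current mode, user event) -> next mode. Event names are assumptions.
TRANSITIONS = {
    (Mode.OVERHEAD, "designate_overhead_position"): Mode.PROXIMITY,
    (Mode.PROXIMITY, "designate_person"): Mode.CONVERSATION,
    (Mode.PROXIMITY, "designate_button_73"): Mode.OVERHEAD,
    (Mode.CONVERSATION, "touch_button_83"): Mode.OVERHEAD,
}


def next_mode(mode: Mode, event: str) -> Mode:
    """Return the mode after `event`; events with no transition leave the mode unchanged."""
    return TRANSITIONS.get((mode, event), mode)


mode = Mode.OVERHEAD  # state reached after the startup connection processing
for event in ("designate_overhead_position", "designate_person", "touch_button_83"):
    mode = next_mode(mode, event)
```

Keeping the transitions in a single table makes it easy to check that every mode has a path back to the overhead view mode, as the description requires.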

- Captured image according to display mode -
As described above, in a first display mode (for example, the overhead view mode), the real space information providing unit 181 causes the display unit 150 to display a first captured image (for example, the overhead captured image 61) in which a first region of the real space is captured. In a second display mode (for example, the proximity mode), the real space information providing unit 181 causes the display unit 150 to display a second captured image (for example, the proximity captured image 71) in which a second region narrower than the first region is captured. In the example described above, the first captured image is a captured image corresponding to a first zoom ratio, and the second captured image is a captured image corresponding to a second zoom ratio larger than the first zoom ratio. In the present embodiment, however, the first captured image and the second captured image are not limited to this.

For example, in the overhead view mode, the real space information providing unit 181 selects a camera 11 that captures a wide region of the center office 10 and acquires the image captured by that camera 11 as the overhead captured image 61. In the proximity mode, the real space information providing unit 181 selects a camera 11 that captures a narrower region of the center office 10 (for example, a camera located further forward) and acquires the image captured by the selected camera 11 as the proximity captured image 71.

As a result, depending on the arrangement of the cameras 11, the captured image acquired by a camera 11 can make it easier to designate the position of a person. In such a case, it may be unnecessary to request the camera 11 to zoom. Consequently, unlike the case of requesting optical zoom or zoom by dolly, requests from a plurality of separate terminal devices 100 for the same camera 11 do not conflict, and no terminal device 100 is left waiting. In addition, unlike the case of using digital zoom, the processing load does not increase.

- Captured image generated by imaging under freer conditions -
The example above describes switching between display modes, but in the present embodiment the display screen is not limited to this. For example, instead of switching display modes, a captured image may be acquired by imaging under freer conditions, and a display screen including that captured image may be displayed. For example, the captured image of the real space may be a captured image corresponding to a zoom ratio selected from a plurality of zoom ratios. In this case, for example, the real space information providing unit 181 requests, from the camera 11 via the communication unit 110, the zoom ratio designated by the user via the input unit 120. The camera 11 then changes its zoom ratio as requested and provides the terminal device 100 with a captured image generated by imaging at the changed zoom ratio. The real space information providing unit 181 causes the display unit 150 to display a display screen including the provided captured image. When the user designates a position in the captured image, the position acquisition unit 183 acquires that position and provides it to the object selection unit 185. This allows the user to designate a fine-grained zoom ratio and display the desired captured image, which makes it easier to designate the position of a specific person using the captured image.

As described above, the zoom ratio here need not be a precise value such as 1.5x or 2x. It may be anything that directly or indirectly indicates how large the subject appears in the captured image. In particular, when zooming is performed by changing the position of the camera 11 (for example, zooming in and out by dolly), the zoom ratio may be, instead of a precise value such as 1.5x or 2x, something that directly indicates the size of the subject (for example, a parameter indicating the approximate size of the subject) or something that indirectly indicates the size of the subject (for example, the position of the camera 11 on the rail).

(COMM link control unit 193)
The COMM link control unit 193 receives information on the "COMM link" object from the information management server 200 and performs operations such as controlling the display of the COMM link.

In this specification, the "COMM link" object may be a display element or an input element of the user interface presented on the screen of the display unit 150 of the terminal device 100. More specifically, for persons who are on a call via the sound collection unit 140 of the terminal device 100, for example a person C and a person D, the COMM link is a "line segment" or a curved "string" connecting the position in the captured image corresponding to the object 91 of the person C and the position in the captured image corresponding to the object 91 of the person D. The COMM link can exist as a three-dimensional line segment or curved string in the three-dimensional virtual space 90 corresponding to the center office 10 (described later with reference to FIG. 14).

The person C and the person D may be located at different sites in the real space. In that case, the COMM link may be displayed across a plurality of captured images, each capturing a different site. Two or more COMM links may also exist simultaneously for one person. For example, in a three-party call, three COMM links may exist and be displayed that connect the three positions corresponding to a person L, a person M, and a person N, for example as the three sides of a triangle. Furthermore, by designating part of the interior region enclosed by COMM links that form a triangle or a polygon with more vertices, the user may be able to designate all of the objects 91 located at the ends of the COMM links forming that polygon (FIG. 16, described later, shows an example of such a display). In such polygonal COMM links, the sides forming the same polygon are preferably arranged so that they do not intersect one another.
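
Designating the interior of a polygon formed by COMM links amounts to a point-in-polygon test on the displayed image. The sketch below uses the standard ray-casting test in two-dimensional image coordinates. It is an assumption-laden illustration, not the procedure of the embodiment: the function names are hypothetical, and the test is only meaningful for polygons whose sides do not intersect, which matches the preference stated above.

```python
def point_in_polygon(point, vertices):
    """Ray-casting test: True if `point` lies inside the simple polygon `vertices`."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def objects_in_designated_polygon(point, participants):
    """Return every object ID of the polygon if `point` falls inside it, else [].

    `participants` is an ordered list of (object_id, (x, y)) pairs giving the
    image position of each participant; the layout is hypothetical.
    """
    vertices = [pos for _, pos in participants]
    if point_in_polygon(point, vertices):
        return [object_id for object_id, _ in participants]
    return []


three_party_call = [("91L", (0, 0)), ("91M", (4, 0)), ("91N", (2, 4))]
selected = objects_in_designated_polygon((2, 1), three_party_call)
```

A single designation inside the triangle thus yields all three participants at once, whereas a designation outside it selects nothing.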

The conversation behavior (call operation) underlying a COMM link need not take place via the sound collection unit 140 of the terminal device 100. For example, the information management server 200, described later, may record and update information on the installation position and sound collection direction of each microphone 13 as parameters of the plurality of microphones 13 included in the information processing system according to the present embodiment. In this case, the information management server 200 may use the voice data collected by the microphone 13 installed closest to the position of the object 91 of the person who is the speaker in the conversation behavior (call operation) to perform processing related to generating and managing the COMM link described above and the COMM word (uttered-phrase object) described later.
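
Choosing the microphone 13 nearest to a speaker's object 91 is a simple nearest-neighbor lookup. The Python sketch below assumes that installation positions are stored as three-dimensional coordinates keyed by a microphone identifier. The identifiers, the coordinates and the function name `nearest_microphone` are hypothetical, and the sound collection direction is ignored.

```python
import math


def nearest_microphone(speaker_position, microphone_positions):
    """Return the ID of the microphone installed closest to the speaker.

    `microphone_positions` maps a microphone ID to its (x, y, z) installation
    position in the same coordinate system as `speaker_position`.
    """
    return min(
        microphone_positions,
        key=lambda mic_id: math.dist(speaker_position, microphone_positions[mic_id]),
    )


microphones = {
    "mic-1": (0.0, 0.0, 2.5),
    "mic-2": (5.0, 0.0, 2.5),
    "mic-3": (5.0, 6.0, 2.5),
}
chosen_microphone = nearest_microphone((4.5, 1.0, 1.2), microphones)
```

A fuller version might also weigh the recorded sound collection direction, which the description mentions as a stored parameter but does not tie to a specific selection rule.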

Under control of the display unit 150 by the COMM link control unit 193, the COMM link is projected onto and superimposed on the screen of the display unit 150 of the terminal device 100, for example on the overhead captured image 61 or the map image 69 of FIG. 4 (described later with reference to FIG. 15). By viewing the COMM link displayed on the mapped image of the real space, a user looking at the display screen of the terminal device 100 can intuitively grasp who is on a call with whom among the multiple remote sites of a distributed environment. Furthermore, because a conversation (call), which is an act based on voice information, is visualized by the COMM link, the user can learn that a conversation is taking place at a remote site by looking at the display screen showing the COMM link, even when media sharing is limited to image-only remote communication, for example because voice data from the remote site cannot be acquired or output.

An example of the COMM link 86 is described below with reference to FIG. 14. FIG. 14 is an explanatory diagram illustrating an example of the COMM link 86 in the three-dimensional virtual space 90 corresponding to the center office 10. As described earlier, the data on this three-dimensional virtual space 90 is managed by the information management server 200. An object 91C and an object 91D are arranged in the three-dimensional virtual space 90. The object 91C corresponds to, for example, the person C and is arranged at the three-dimensional virtual position corresponding to the position of the person C's seat in the center office 10. The object 91D corresponds to, for example, the person D and is arranged at the three-dimensional virtual position corresponding to the position of the person D's seat in the center office 10. As in the examples of FIGS. 8 and 10, these objects 91 are, for example, cylindrical objects with a radius R and a height H. FIG. 14 also shows the three-dimensional centroid positions 92C and 92D of the object 91C and the object 91D, respectively. In the present embodiment, the three-dimensional centroid positions 92C and 92D need not be the geometric three-dimensional centroids. Each may be any position (point) contained within the three-dimensional shape of the corresponding object 91, for example the center of the top surface of the object 91. FIG. 14 further shows a line-segment-shaped COMM link 86 whose two ends are the three-dimensional centroid positions 92C and 92D. In this way, the COMM link 86 can hold information such as position and shape in the three-dimensional virtual space 90 as a kind of object. Generation of the COMM link 86 is described later.

As described above, the COMM link 86 is an object that connects the objects 91 of the persons who are users of terminal devices 100 in the in-call state. The COMM link 86 is therefore associated with at least two objects 91 located at the ends of its line segment and with the communication ID data corresponding to those objects. For example, when a user designates the COMM link 86 in FIG. 14 by an input method described later, that user's terminal device 100 can acquire the communication IDs of the objects 91C and 91D located at both ends of the COMM link 86. As a result, based on a single operation in which the user designates one object, the COMM link 86, the terminal device 100 can simultaneously select two objects 91 at different positions, the object 91C and the object 91D. Furthermore, the terminal device 100 can use the communication IDs of the person C and the person D corresponding to these two objects 91 to access the person C and the person D (that is, the terminal devices 100 corresponding to them). In other words, by performing a single operation that designates the COMM link 86 connecting the person C and the person D, who are in a two-party call, the user can join that call and easily hold a three-party call with the person C and the person D.
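The single-operation selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the class names `PersonObject` and `CommLink`, the function `join_call`, and the SIP-style communication IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PersonObject:
    """An object 91 in the virtual space (hypothetical structure)."""
    object_id: str   # e.g. "91C"
    comm_id: str     # communication ID; the SIP-style format is an assumption


@dataclass(frozen=True)
class CommLink:
    """A COMM link tied to the two or more objects at its ends."""
    link_id: str
    endpoints: Tuple[PersonObject, ...]


def comm_ids_for_link(link: CommLink) -> List[str]:
    """Designating one link yields the communication IDs of all its endpoints."""
    return [obj.comm_id for obj in link.endpoints]


def join_call(link: CommLink, joiner_comm_id: str) -> List[str]:
    """Return the participant list after the joiner designates the link."""
    participants = comm_ids_for_link(link)
    if joiner_comm_id not in participants:
        participants.append(joiner_comm_id)
    return participants


# Two-party call between person C and person D, joined by a third user.
link_86 = CommLink(
    "86",
    (PersonObject("91C", "sip:c@example.com"),
     PersonObject("91D", "sip:d@example.com")),
)
three_party = join_call(link_86, "sip:u@example.com")
```

Keeping the endpoints on the link itself means the terminal needs only one hit on the link to recover every communication ID involved in the call.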

Suppose that two calls, a three-party call and a two-party call, exist in the real space, and that one speaker in the two-party call designates the triangular COMM link representing the three-party call. In that case, only the speaker who designated the COMM link of the three-party call may join the three-party call, or both speakers of the two-party call may join it. Alternatively, after only the designating speaker has joined the three-party call, the remaining speaker of the two-party call may join if a plurality of the speakers in the three-party call permit it.

(COMM word control unit 195)
The COMM word control unit 195 receives information on the “COMM word” object, described later, from the information management server 200 and controls the display of COMM words. The COMM word control unit 195 also causes the communication unit 110 to transmit the voice data provided by the sound collection unit 140 to the information management server 200. At this time, the COMM word control unit 195 may attach to the transmitted voice data the communication ID corresponding to the person who uttered the voice. The communication ID may be identified by the terminal device 100 selecting the communication ID of the person registered in advance as the user of that terminal device 100, or by the voice recognition server 201 performing speaker recognition to identify the speaker from the voice data.

Here, in this specification, a “COMM word” object may be a display element of the user interface presented on the screen of the display unit 150 of the terminal device 100. More specifically, a COMM word may be an object obtained by extracting the key parts of the content of a call made via the terminal device 100 through voice recognition, visualizing them, and displaying them near the COMM link 86 described above. For example, suppose that the person C and the person D are holding a two-party call via the terminal device 100C and the terminal device 100D, respectively. The voice data of the person C and the person D input from the terminal devices 100C and 100D is tagged with the communication IDs of the terminal devices 100C and 100D, respectively, and transmitted to the information management server 200. The information management server 200 forwards the received voice data to the voice recognition server 201, receives the word/phrase data that is the recognition result from the voice recognition server 201, and transmits that word/phrase data to the terminal device 100C and the terminal device 100D. Details of the processing in the information management server 200 and the voice recognition server 201 are described later. Under the control of the COMM word control unit 195 over the display unit 150, the COMM word is superimposed on the screen of the display unit 150 of the terminal device 100, for example on the bird's-eye view captured image 61 or the map image 69 in FIG. 4, at a position near the corresponding COMM link 86 (see FIG. 15). Like the “COMM link” object, the “COMM word” object may be data having a position and a shape (volume) in the three-dimensional virtual space 90, and that data may be managed by the information management server 200. In that case, the “COMM word” object may be set at a position spatially adjacent to or near the corresponding “COMM link” object in the three-dimensional virtual space 90, and this setting may be performed by the information management server 200.

An example of the COMM word 87 is described below with reference to FIG. 15. FIG. 15 is an explanatory diagram illustrating an example of the COMM link 86 and the COMM word 87 displayed on the display screen 50 of the terminal device 100. FIG. 15 shows the display screen 50 displayed in the overhead view mode or the proximity mode. The display screen 50 includes captured images 51A, 51B, and 51Z; map images 69A, 69B, and 69Z; person images 77C, 77D, 77E, and 77F of the persons C, D, E, and F; presence icons 79C, 79D, 79E, and 79F of the persons C, D, E, and F; the projected and superimposed COMM links 86G2, 86G3, 86H2, and 86H3; and COMM words 87I, 87J, 87K, and 87L. The captured images 51A, 51B, and 51Z are bird's-eye view images of the sites A, B, and Z in the distributed environment (for example, Tokyo, Osaka, and Okinawa), and the map images 69A, 69B, and 69Z are, for example, plan views that represent two-dimensionally the positional relationships of the various objects in the captured images 51A, 51B, and 51Z. Only one of the captured image 51 and the map image 69 may be displayed on the display screen 50.

For example, when the person C at the site A and the person D at the site A start a two-party call using the terminal devices 100C and 100D at their own seats, the COMM link 86G3 is displayed between the person images 77C and 77D on the captured image 51A, the COMM link 86G2 is displayed between the presence icons 79C and 79D on the map image 69A, and the COMM words 87I, 87J, and 87K are displayed near them. The person images 77C and 77D on the captured image 51A correspond to the person C and the person D, respectively, and the presence icons 79C and 79D on the map image 69A likewise correspond to the person C and the person D. Similarly, when the person E at the site A and the person F at the site B start a two-party call using the terminal devices 100E and 100F at their own seats, the COMM link 86H3 is displayed spanning the person image 77E on the captured image 51A and the person image 77F on the captured image 51B, and the COMM link 86H2 is displayed spanning the presence icon 79E on the map image 69A and the presence icon 79F on the map image 69B. The person images 77E and 77F correspond to the person E and the person F, respectively, and the presence icons 79E and 79F likewise correspond to the person E and the person F.

When the user of the terminal device 100U designates the COMM link 86G3 or the COMM link 86G2 on the display screen 50 by touch input, that user can join the two-party conversation between the person C and the person D. Likewise, when the user of the terminal device 100U designates the COMM link 86H3 or the COMM link 86H2 on the display screen 50 by touch input, that user can join the two-party conversation between the person E and the person F.

In FIG. 15, the COMM words 87I, 87J, and 87K are displayed near the COMM link 86G3 or the COMM link 86G2. As described above, the COMM word 87 is an object obtained by extracting the key parts of the content of a call made via the terminal device 100 through voice recognition and visualizing them for display. In FIG. 15, for example, the COMM words 87 are displayed so as to reflect the weights produced by the statistical weighting process described later (the weighting process is performed by the information management server 200, described later, and its results are distributed to the terminal devices 100 and other devices; details are given later). For example, in FIG. 15, the word “meeting” of the COMM word 87I is displayed in a larger size than the word “cancel” of the COMM word 87J or the word “hold” of the COMM word 87K. This display reflects the weights from the statistical weighting process: if, for example, the index underlying the weight is the number of times the speakers utter a word, the display indicates that the word “meeting” was uttered by the speakers more often than other words such as “cancel” and therefore appears frequently in the call. Because the COMM words 87 are displayed so as to reflect these weights, a user viewing the display screen 50 can also intuitively understand the key points of the conversation.
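As a rough illustration of the case where the weight is based on utterance counts, the following sketch counts recognized words and maps each count to a font size. The function names, the linear mapping, and the point-size range are assumptions made for this example; the statistical weighting process of the embodiment is described later and may differ.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def word_weights(utterances: Iterable[Tuple[str, List[str]]]) -> Counter:
    """Count how often each recognized word appears across all utterances.

    Each utterance is (speaker label, list of recognized words).
    """
    counts: Counter = Counter()
    for _speaker, words in utterances:
        counts.update(words)
    return counts


def font_size(weight: int, max_weight: int,
              min_pt: float = 10.0, max_pt: float = 32.0) -> float:
    """Map a weight linearly onto a font size (assumed range in points)."""
    if max_weight <= 0:
        return min_pt
    return min_pt + (max_pt - min_pt) * weight / max_weight


# Hypothetical recognition results for the call between person C and person D.
utterances = [
    ("C", ["meeting", "cancel"]),
    ("D", ["meeting", "hold"]),
    ("C", ["meeting"]),
]
weights = word_weights(utterances)
largest = max(weights.values())
sizes = {word: font_size(count, largest) for word, count in weights.items()}
```

With these inputs, “meeting” receives the largest size because it has the highest count, which mirrors the relative sizes described for FIG. 15.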

Turning to the positions of the COMM words 87, the word “meeting” of the COMM word 87I is located near the center of the corresponding COMM link 86G. In contrast, words such as “cancel” of the COMM word 87J are displayed near the person image 77C of the person C associated with the COMM link 86G, and words such as “hold” of the COMM word 87K are displayed near the person image 77D of the person D associated with the COMM link 86G. This is the result of converting the utterance status of the conversation, that is, which of the person C and the person D in the two-party call corresponding to the COMM link 86G utters which words more often, into a position parameter (equivalent to the distance from the object 91 at an end of the COMM link 86) and visualizing it. For example, because the word “meeting” of the COMM word 87I is located near the center of the COMM link 86, it indicates that the person C and the person D have uttered it about the same number of times. On the other hand, because the word “cancel” of the COMM word 87J is located near the person image 77C of the person C, it indicates a word uttered mostly by the person C, and because the word “hold” of the COMM word 87K is located near the person image 77D of the person D, it indicates a word uttered mostly by the person D. The positional relationship between each COMM word 87 and the object 91 (that is, the object of the speaker of the corresponding word) may be the same position or distance relationship not only in the display on the display unit 150 but also in the three-dimensional virtual space 90 (for example, the COMM word 87J may also be located near the object 91C, which corresponds to the person image 77C of the person C, in the three-dimensional virtual space 90).
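One simple way to turn the utterance status into a position is linear interpolation along the link, weighted by each speaker's share of the utterances of a word. The sketch below assumes this mapping; the function name `word_position` and the use of a plain ratio are illustrative choices, not something the description prescribes.

```python
from typing import Sequence, Tuple


def word_position(pos_c: Sequence[float], pos_d: Sequence[float],
                  count_c: int, count_d: int) -> Tuple[float, ...]:
    """Place a word on the segment between speaker C and speaker D.

    The word sits closer to whoever uttered it more often:
    t = 0 puts it at C's end, t = 1 at D's end, t = 0.5 at the midpoint.
    """
    total = count_c + count_d
    if total == 0:
        raise ValueError("the word was never uttered")
    t = count_d / total
    return tuple(c + t * (d - c) for c, d in zip(pos_c, pos_d))


end_c = (0.0, 0.0)    # screen position of person image 77C (assumed)
end_d = (10.0, 0.0)   # screen position of person image 77D (assumed)

meeting = word_position(end_c, end_d, 3, 3)   # uttered equally by both
cancel = word_position(end_c, end_d, 4, 1)    # uttered mostly by C
hold = word_position(end_c, end_d, 0, 5)      # uttered only by D
```

The same function works unchanged for three-dimensional coordinates, which fits the remark that the relationship may also hold in the three-dimensional virtual space 90.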

As described above, in the present embodiment the COMM word control unit 195 performs visualization that reflects the statistical weighting and utterance status described above, and displays the key parts of the call content as COMM words 87. Therefore, according to the present embodiment, even a user who is at a remote location and cannot hear the conversation can get a rough grasp of its content by looking at the screen on which the COMM words 87 are displayed. In the example of FIG. 15, a remote user looking at that screen can understand that the person C and the person D are discussing whether to hold tomorrow's meeting, and that the person C argues for canceling or postponing the meeting while the person D argues that it should be held. If the user becomes interested after roughly grasping the conversation, the user can smoothly join it through the simple designation operation on the COMM link 86 described above. That is, the user can smoothly join the two-party conversation just as if the user were in the same environment.

Note that the COMM word control unit 195 need not display all of the word/phrase data contained in the call when displaying the COMM words 87. For example, the COMM word control unit 195 may compare each weight value from the weighting described above with a predetermined threshold and perform control so that only words whose weight is equal to or greater than the threshold are displayed on the display screen of the terminal device 100. In this way, only words of high importance in the conversation are selected and displayed as COMM words 87. At this time, the display of the COMM link 86 corresponding to the COMM words 87 of the weighted word/phrase data may also be controlled. Furthermore, the COMM word control unit 195 may control the display of the COMM words 87 based on the weight values. Specifically, based on the weight values, the COMM word control unit 195 may control the display size and color of the COMM word 87, its contrast with the screen on which it is superimposed, its display position, and so on.
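The threshold comparison described above amounts to a simple filter. A minimal sketch, assuming the weights arrive as a word-to-weight mapping (the function name `select_words` and the sample values are hypothetical):

```python
from typing import Dict


def select_words(weights: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only words whose weight is equal to or greater than the threshold."""
    return {word: w for word, w in weights.items() if w >= threshold}


# Hypothetical weighted word/phrase data received from the server.
received = {"meeting": 3.0, "cancel": 2.0, "hold": 2.0, "well": 1.0}
shown = select_words(received, threshold=2.0)
```

Note the comparison is inclusive, matching "equal to or greater than the threshold" in the description.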

FIG. 16 is an explanatory diagram illustrating another example of the COMM link 86 and the COMM word 87 displayed on the display screen 55 of the terminal device 100. FIG. 16 shows the display screen 55 displayed in the overhead view mode or the proximity mode. First, when the person C at the site A and the person F at the site B start a two-party call using the terminal devices 100C and 100F at their own seats, a COMM link 86 and COMM words 87 are displayed between the person images of the person C and the person F. When the person D at the site A then designates that COMM link 86, joins the corresponding conversation, and the call becomes a three-party call, COMM links 86 and COMM words 87 are also displayed between the person images of the person C and the person D and between those of the person F and the person D. As a result, for example, the triangular COMM links 86M3 and 86M2 connecting the persons C, D, and F are displayed. For yet another person E to join this three-party call and turn it into a four-party call, the person E may, for example, perform an operation that designates a position on any one side of the triangular COMM link 86M2 or a position within part of the interior region of the triangle.
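Deciding whether a touch designates the triangular COMM link can be sketched as a two-dimensional hit test: the touch counts if it falls inside the triangle or within a small tolerance of one of its sides. The function names and the pixel tolerance below are assumptions for illustration only.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def inside_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """True if p lies inside or on the boundary of triangle abc."""
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from p to the segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def hits_triangular_link(p: Point, vertices: Sequence[Point],
                         tolerance: float = 8.0) -> bool:
    """True if a touch at p designates the link (interior or near a side)."""
    a, b, c = vertices
    if inside_triangle(p, a, b, c):
        return True
    edges = ((a, b), (b, c), (c, a))
    return any(distance_to_segment(p, u, v) <= tolerance for u, v in edges)


# Screen positions of persons C, D, and F (assumed pixel coordinates).
link_86m2 = ((0.0, 0.0), (100.0, 0.0), (0.0, 100.0))
```

The tolerance makes thin sides easier to hit on a touch screen; a value of zero reduces the test to the closed triangle itself.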

<1.2.3 Software configuration>
Next, an example of the software configuration of the terminal device 100 according to the present embodiment is described. FIG. 17 is a block diagram showing an example of the software configuration of the terminal device 100 according to the present embodiment. Referring to FIG. 17, the terminal device 100 has an OS (Operating System) 840 and a plurality of pieces of application software. As application software, the terminal device 100 includes a softphone 851, an ultra-realistic client 853, and a telephone call control function 855.

(OS 840)
The OS 840 is software that provides the basic functions for operating the terminal device 100. The OS 840 executes each piece of application software.

(Softphone 851)
The softphone 851 is application software for making telephone calls using the terminal device 100. The telephone unit 189 can be realized by, for example, the softphone 851.

(Ultra-realistic client 853)
The ultra-realistic client 853 is application software for providing information on the real space to the terminal device 100. The ultra-realistic client 853 may acquire state information indicating the state of a person in the real space (for example, the center office 10) and provide it to the softphone 851 via the OS. The softphone 851 may then control call origination based on that state information. The real space information providing unit 181 can be realized by, for example, the ultra-realistic client 853.

(Telephone call control function 855)
The telephone call control function 855 is application software that acquires the communication ID of the communication device of a person shown in the captured image on the display screen. On acquiring the communication ID, the telephone call control function 855 provides it to the softphone 851 via the OS 840. The softphone 851 then originates a call using that communication ID. The position acquisition unit 183, the object selection unit 185, and the ID acquisition unit 187 can be realized by the telephone call control function 855.

<1.3 Information management server configuration>
Next, an example of the configuration of the information management server 200 according to the present embodiment is described with reference to FIGS. 18 and 19. As described above, the information management server 200 manages various information used in the information processing system according to the present embodiment.

<1.3.1 Hardware configuration>
First, an example of the hardware configuration of the information management server 200 according to the present embodiment is described with reference to FIG. 18. FIG. 18 is a block diagram showing an example of the hardware configuration of the information management server 200 according to the present embodiment. Referring to FIG. 18, the information management server 200 has a CPU 901, a ROM 903, a RAM 905, a bus 907, a storage device 909, and a communication interface 911.

(CPU 901, ROM 903, RAM 905)
The CPU 901 executes various processes in the information management server 200. The ROM 903 stores programs and data for causing the CPU 901 to execute the processes in the information management server 200. The RAM 905 temporarily stores programs and data while the CPU 901 executes its processes.

(Bus 907)
The bus 907 interconnects the CPU 901, the ROM 903, and the RAM 905. The storage device 909 and the communication interface 911 are also connected to the bus 907. The bus 907 includes, for example, a plurality of types of buses. As one example, the bus 907 may include a high-speed bus that connects the CPU 901, the ROM 903, and the RAM 905, and one or more other buses that are slower than that high-speed bus.

(Storage device 909)
The storage device 909 stores data to be saved temporarily or permanently in the information management server 200. The storage device 909 may be, for example, a magnetic storage device such as a hard disk, or a non-volatile memory such as EEPROM, flash memory, MRAM, FeRAM, or PRAM.

(Communication interface 911)
The communication interface 911 is the communication means of the information management server 200 and communicates with external devices via a network (or directly). The communication interface 911 may be an interface for wireless communication, in which case it may include, for example, a communication antenna, an RF circuit, and other circuits for communication processing. The communication interface 911 may also be an interface for wired communication, in which case it may include, for example, a LAN terminal, a transmission circuit, and other circuits for communication processing.

<1.3.2 Functional configuration>
Next, an example of the functional configuration of the information management server 200 according to the present embodiment is described. FIG. 19 is a block diagram showing an example of the functional configuration of the information management server 200 according to the present embodiment. Referring to FIG. 19, the information management server 200 has a communication unit 210, a storage unit 220, and a control unit 230.

(Communication unit 210)
The communication unit 210 communicates with other devices. For example, the communication unit 210 is directly connected to the LAN 19 and communicates with each device in the center office 10. Specifically, the communication unit 210 communicates with, for example, the camera 11, the microphone 13, the sensor 15, the media distribution server 17, and the voice recognition server 201. The communication unit 210 also communicates with each device in the satellite office 20 via the external network 30 and the LAN 23. Specifically, the communication unit 210 communicates with, for example, the terminal device 100 and the display 21. The communication unit 210 can be realized by, for example, the communication interface 911.

(Storage unit 220)
The storage unit 220 stores programs and data for the operation of the information management server 200. In the present embodiment in particular, the storage unit 220 stores various information used in the information processing system.

As a first example, the storage unit 220 stores parameters relating to the camera 11, the microphone 13, and the sensor 15. The specific content of these parameters is as described above. As a second example, the storage unit 220 stores the data of the three-dimensional virtual space 90 corresponding to the real space. The three-dimensional virtual space 90 is, for example, a three-dimensional virtual space modeled on the center office 10. The specific content of the three-dimensional virtual space 90 is as described above. As a third example, the storage unit 220 stores person-related information, for example person-related information on the persons in the center office 10. The specific content of the person-related information is as described above. As a fourth example, the storage unit 220 stores the object ID of each object 91 placed in the three-dimensional virtual space and the corresponding communication ID in association with each other. The specific content of the object ID and the communication ID is as described above.

（制御部２３０）
制御部２３０は、情報管理サーバ２００の様々な機能を提供する。制御部２３０は、ＣＯＭＭリンク配信部（会話イベントオブジェクト配信部）２３１、抽出語句データ管理部２３２、重み付け演算部（重み付け処理部）２３３、ＣＯＭＭワード配信部（発言語句オブジェクト配信部）２３４、及び発言状況演算部２３５を含む。制御部２３０は、例えば、要求に応じて、情報処理システムにおいて用いられる様々な情報を提供する。具体的には、制御部２３０は、端末装置１００による要求に応じて、３次元仮想空間９０のデータ、人物関連情報、オブジェクトＩＤに対応する通信用ＩＤ、並びに、カメラ１１、マイクロフォン１３及びセンサ１５に関するパラメータ等を提供する。また、例えば、制御部２３０は、本実施形態に係る情報処理システムにおいて用いられる様々な情報を適宜更新してもよい。制御部２３０は、自動で、又は手動による指示に応じて、当該情報を更新する。 (Control unit 230)
The control unit 230 provides various functions of the information management server 200. The control unit 230 includes a COMM link distribution unit (conversation event object distribution unit) 231, an extracted word/phrase data management unit 232, a weighting calculation unit (weighting processing unit) 233, a COMM word distribution unit (spoken phrase object distribution unit) 234, and a statement status calculation unit 235. The control unit 230 provides, for example, various information used in the information processing system in response to a request. Specifically, in response to a request from the terminal device 100, the control unit 230 provides the data of the three-dimensional virtual space 90, the person-related information, the communication ID corresponding to an object ID, and the parameters regarding the camera 11, the microphone 13, and the sensor 15. In addition, for example, the control unit 230 may update the various information used in the information processing system according to the present embodiment as appropriate. The control unit 230 updates the information automatically or in response to a manual instruction.

（ＣＯＭＭリンク配信部２３１）
ＣＯＭＭリンク配信部２３１は、ＣＯＭＭリンク８６を生成するための情報を端末装置１００等からから受信し、当該ＣＯＭＭリンク８６の位置とサイズに関する演算、ＣＯＭＭリンク８６の配信、ＣＯＭＭリンク８６へのユーザ入力データの取得、を行う。詳細には、ＣＯＭＭリンク配信部２３１は、端末装置１００からＣＯＭＭリンク８６の基礎となる会話行動（通話動作）を通信部２１０を介して受信する。そして、ＣＯＭＭリンク配信部２３１は、当該端末装置１００からの情報に基づいて、先に説明した３次元仮想空間９０を用いて当該端末装置１００を利用する話者に係るオブジェクト９１を取得し、これらオブジェクト９１をつなぐＣＯＭＭリンク８６に係るデータを生成し、配信する。詳細には、ＣＯＭＭリンク配信部２３１は、先に説明したように、予め記憶部２２０に記憶された実空間に対応する３次元仮想空間９０上における複数の人物のオブジェクト９１を参照し、当該端末装置１００から得られた会話行動の情報に基づき、当該会話行動に係る話者のオブジェクト９１を選択する。そして、選択したオブジェクト９１を結びつけるＣＯＭＭリンクを３次元仮想空間９０上に生成する。 (COMM link distribution unit 231)
The COMM link distribution unit 231 receives information for generating the COMM link 86 from the terminal device 100 or the like, calculates the position and size of the COMM link 86, distributes the COMM link 86, and acquires user input data for the COMM link 86. Specifically, the COMM link distribution unit 231 receives, from the terminal device 100 via the communication unit 210, the conversation action (call operation) on which the COMM link 86 is based. Then, based on the information from the terminal device 100, the COMM link distribution unit 231 uses the three-dimensional virtual space 90 described above to acquire the objects 91 of the speakers who use the terminal devices 100, and generates and distributes data on the COMM link 86 connecting these objects 91. More specifically, as described above, the COMM link distribution unit 231 refers to the objects 91 of a plurality of persons in the three-dimensional virtual space 90 corresponding to the real space, which is stored in advance in the storage unit 220, and selects the objects 91 of the speakers involved in the conversation action based on the conversation action information obtained from the terminal device 100. It then generates, in the three-dimensional virtual space 90, a COMM link 86 that connects the selected objects 91.

さらに、ＣＯＭＭリンク配信部２３１は、生成したＣＯＭＭリンク８６に係るデータを記憶部２２０に記憶させる。この時、ＣＯＭＭリンク配信部２３１は、ＣＯＭＭリンク８６に識別用のＩＤを付与して記憶部２２０に記憶し情報管理を行ってもよい。さらに、ＣＯＭＭリンク配信部２３１は、当該ＣＯＭＭリンク８６に付与したＩＤと、ＣＯＭＭリンク８６に対応する２以上の話者のオブジェクトＩＤ（通信用ＩＤ（通信用識別情報））とを対応づけて記憶部２２０に記憶することにより、ＣＯＭＭリンク８６に２以上の話者のオブジェクトＩＤを紐づけて管理してもよい。なお、ＣＯＭＭリンク配信部２３１は、上記会話行動（通話動作）の話者である人物の位置情報については、３次元仮想空間９０上のオブジェクト９１のデータを用いず、情報管理サーバ２００が複数のマイクロフォン１３の集音データから音源推定処理を行うことにより、位置を推定して求めてもよい。 Further, the COMM link distribution unit 231 causes the storage unit 220 to store the data on the generated COMM link 86. At this time, the COMM link distribution unit 231 may assign an identification ID to the COMM link 86 and store it in the storage unit 220 for information management. Furthermore, the COMM link distribution unit 231 may store the ID assigned to the COMM link 86 in the storage unit 220 in association with the object IDs (communication IDs (communication identification information)) of the two or more speakers corresponding to the COMM link 86, thereby managing the COMM link 86 linked to the object IDs of the two or more speakers. Note that, for the position information of a person who is a speaker in the conversation action (call operation), the data of the object 91 in the three-dimensional virtual space 90 need not be used; instead, the information management server 200 may estimate the position by performing sound source estimation processing on the sound data collected by the plurality of microphones 13.

また、ＣＯＭＭリンク配信部は、生成したＣＯＭＭリンク８６に対するユーザからの入力操作を通信部２１０を介して受信した場合には、ＣＯＭＭリンク８６に、当該ユーザのオブジェクトＩＤを関連付ける。このようにすることで、ＣＯＭＭリンク８６に対応する２以上の話者のオブジェクトＩＤが、当該ユーザのオブジェクトＩＤと関連付けられる。そして、本実施形態においては、１つのＣＯＭＭリンク８６に関連付けられたオブジェクトＩＤを参照して制御を行うことにより、新たに当該ユーザが加わった通話を開始することができる。 Further, when the COMM link distribution unit 231 receives, via the communication unit 210, an input operation from a user on the generated COMM link 86, it associates the object ID of that user with the COMM link 86. As a result, the object IDs of the two or more speakers corresponding to the COMM link 86 are associated with the object ID of the user. In the present embodiment, a call that the user has newly joined can then be started by performing control with reference to the object IDs associated with the one COMM link 86.
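
The ID management described for the COMM link distribution unit 231 can be pictured as a small registry. The following is only a minimal sketch under stated assumptions: the class names `CommLink` and `CommLinkRegistry`, the string object IDs, and the in-memory dictionary standing in for the storage unit 220 are all hypothetical and do not appear in the embodiment.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class CommLink:
    """One COMM link 86: its ID plus the object IDs tied to it (hypothetical layout)."""
    link_id: int
    speaker_ids: set[str]  # object IDs of the two or more current speakers
    joined_ids: set[str] = field(default_factory=set)  # users who selected the link


class CommLinkRegistry:
    """Hypothetical in-memory stand-in for the records kept in the storage unit 220."""

    def __init__(self) -> None:
        self._links: dict[int, CommLink] = {}
        self._next_id = count(1)

    def create(self, speaker_ids: set[str]) -> CommLink:
        """Assign an identification ID and store the link with its speakers."""
        if len(speaker_ids) < 2:
            raise ValueError("a COMM link connects two or more speakers")
        link = CommLink(next(self._next_id), set(speaker_ids))
        self._links[link.link_id] = link
        return link

    def associate_user(self, link_id: int, user_id: str) -> set[str]:
        """Record a user's input on the link and return every object ID now tied to it."""
        link = self._links[link_id]
        link.joined_ids.add(user_id)
        return link.speaker_ids | link.joined_ids
```

The set returned by `associate_user` is what a call controller would consult to start the call that the new user has joined; how that controller works is outside this sketch.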

（抽出語句データ管理部２３２）
抽出語句データ管理部２３２は、端末装置１００の通信部１１０から受信した音声データを取得し、音声認識サーバ２０１へ送信する。そして、抽出語句データ管理部２３２は、音声認識サーバ２０１から認識結果の語句データを受信し、当該語句の発言者の識別情報と対応させて管理する。なお、例えば、抽出語句データ管理部２３２は、語句に関するデータや当該語句の発言者の識別情報は記憶部２２０に記憶させてもよい。 (Extracted word data management unit 232)
The extracted word/phrase data management unit 232 acquires the voice data received from the communication unit 110 of the terminal device 100 and transmits it to the voice recognition server 201. The extracted word/phrase data management unit 232 then receives the word/phrase data of the recognition result from the voice recognition server 201 and manages it in association with the identification information of the speaker of each phrase. Note that, for example, the extracted word/phrase data management unit 232 may store the data on a phrase and the identification information of its speaker in the storage unit 220.

（重み付け演算部２３３）
重み付け演算部２３３は、抽出語句データ管理部２３２が管理する語句データを分析し、統計的な重み付け処理を行う。当該重み付けのための指標としては、たとえば、語句の会話における出現頻度（回数）（例えば、出現頻度が高いほど重みを大きくする）、語句の抽象度（例えば、具体性が高いほど重みを大きくする。具体的には、「料理」よりも、「フランス料理」という語句の方が具体性が高いこととなり、「フランス料理」よりも「ブッフブルキニョン（ブルゴーニュ風牛肉の赤ワイン煮）」の方がより具体性が高いこととなる。なお、抽象度を示す値は、例えば、抽出された語句とともに、音声認識サーバ２０１から供給される。）、語句の品詞カテゴリ（例えば、動詞よりも名詞の重みを大きくする）等の指標を用いる。また、端末装置１００が、集音部１４０と電話部１８９とを用いて、音声データの取得時に当該音声データとともに通話音声（語句の発話）の音圧のデータも取得して情報管理サーバ２００へ送信し、重み付け演算部２３３は、当該音圧データを重み付けのための指標として用いてもよい（例えば、大きな音圧レベルで発せられた語句ほど重みを大きくする）。 (Weighting calculation unit 233)
The weighting calculation unit 233 analyzes the word/phrase data managed by the extracted word/phrase data management unit 232 and performs statistical weighting processing. Indices used for the weighting include, for example, the frequency (number of times) with which a phrase appears in the conversation (for example, the higher the frequency, the larger the weight), the degree of abstraction of the phrase (for example, the more specific the phrase, the larger the weight), and the part-of-speech category of the phrase (for example, nouns are weighted more heavily than verbs). As an illustration of specificity, the phrase "French cuisine" is more specific than "cooking", and "boeuf bourguignon (Burgundy-style beef braised in red wine)" is more specific still than "French cuisine". The value indicating the degree of abstraction is supplied, for example, from the voice recognition server 201 together with the extracted phrase. Further, the terminal device 100 may use the sound collection unit 140 and the telephone unit 189 to acquire sound pressure data of the call voice (the utterance of the phrase) together with the voice data and transmit it to the information management server 200, and the weighting calculation unit 233 may use the sound pressure data as an index for weighting (for example, the higher the sound pressure level at which a phrase is uttered, the larger its weight).
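
As an illustration of how the four indices could be combined into a single weight, the following is a minimal sketch. The embodiment does not specify a formula, so the multiplicative form, the function name `phrase_weight`, the part-of-speech factors, the 0-to-1 specificity scale, and the 60 dB reference level are all assumptions made for this example.

```python
import math

# Assumed part-of-speech factors: nouns weigh more than verbs.
POS_FACTOR = {"noun": 1.0, "verb": 0.5}


def phrase_weight(count, specificity, pos, sound_pressure_db=None):
    """Combine the weighting indices into one value (hypothetical formula).

    count: number of times the phrase appeared in the conversation
    specificity: 0.0 (abstract) to 1.0 (specific), assumed to be derived
        from the abstraction value supplied with the extracted phrase
    pos: part-of-speech category, e.g. "noun" or "verb"
    sound_pressure_db: optional sound pressure level of the utterance
    """
    weight = math.log1p(count)            # more frequent -> heavier
    weight *= 1.0 + specificity           # more specific -> heavier
    weight *= POS_FACTOR.get(pos, 0.75)   # nouns above verbs
    if sound_pressure_db is not None:
        # Louder utterances weigh more; 60 dB is an assumed reference level.
        weight *= 1.0 + max(0.0, sound_pressure_db - 60.0) / 60.0
    return weight
```

Each factor only moves the weight in the direction the text describes; the relative strength of the factors would need tuning against real conversations.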

（ＣＯＭＭワード配信部２３４）
ＣＯＭＭワード配信部２３４は、抽出語句データ管理部２３２、重み付け演算部２３３、後述する発言状況演算部２３５から取得した、語句を含む語句データを用いて、当該語句を含むＣＯＭＭワード８７を生成し、ＣＯＭＭワード８７に係るデータを配信する。この際、ＣＯＭＭワード配信部２３４は、生成したＣＯＭＭワード８７には、当該ＣＯＭＭワードに含まれる語句の発言者の識別情報が紐づける。また、ＣＯＭＭワード配信部２３４は、生成したＣＯＭＭワード８７を記憶部２２０に記憶させてもよい。さらに、ＣＯＭＭワード配信部２３４は、重み付け演算部２３３による重みづけ処理の結果に応じて、配信してもよい。より具体的には、ＣＯＭＭワード配信部２３４は、重みづけ処理の結果（重みの値）と所定の閾値（所定の値）とを比較し、比較結果に基づいて、生成したＣＯＭＭワード８７を配信してもよい。この場合、ＣＯＭＭワード配信部２３４は、ＣＯＭＭワード８７に紐づけて重みづけ処理の結果のデータを配信してもよい。さらに、ＣＯＭＭワード配信部２３４は、後述する発言状況演算部２３５に算出されたＣＯＭＭワード８７の位置情報をともに配信してもよい。 (COMM word delivery unit 234)
The COMM word distribution unit 234 generates a COMM word 87 containing a phrase by using the word/phrase data acquired from the extracted word/phrase data management unit 232, the weighting calculation unit 233, and the statement status calculation unit 235 described later, and distributes data on the COMM word 87. At this time, the COMM word distribution unit 234 links the generated COMM word 87 to the identification information of the speaker of the phrase contained in that COMM word. The COMM word distribution unit 234 may also store the generated COMM word 87 in the storage unit 220. Further, the COMM word distribution unit 234 may perform distribution according to the result of the weighting processing by the weighting calculation unit 233. More specifically, the COMM word distribution unit 234 may compare the result of the weighting processing (the weight value) with a predetermined threshold (predetermined value) and distribute the generated COMM word 87 based on the comparison result. In this case, the COMM word distribution unit 234 may distribute the data of the weighting result linked to the COMM word 87. Furthermore, the COMM word distribution unit 234 may also distribute the position information of the COMM word 87 calculated by the statement status calculation unit 235 described later.

（発言状況演算部２３５）
発言状況演算部２３５は、抽出語句データ管理部２３２が管理する語句データと、当該語句データに対応する識別情報データとを分析し、当該語句データが二者通話内の仮想的な場において、どのような位置（各発話者からの仮想的な距離）に存在するかを算出する。発言状況演算部２３５は、ＣＯＭＭリンク８６の３次元仮想空間９０内における位置と、ＣＯＭＭリンク８６に対応するＣＯＭＭワード８７の当該ＣＯＭＭリンク８６に対する位置とを算出する。例えば、図１５の例で説明すると、ＣＯＭＭワード８７Ｉの語句「会議」が、人物Ｃも人物Ｄも同じくらいの回数（例えば、人物Ｃが１０回、人物Ｄも１０回）で発言されている場合には、ＣＯＭＭワード８７Ｉ「会議」の位置は、例えば、ＣＯＭＭリンク８６Ｇ上の「中点」となる。一方で、ＣＯＭＭワード８７Ｊの語句「中止」は、人物Ｃが多く発言している（例えば、人物Ｃが５回、人物Ｄは０回）ことから、ＣＯＭＭワード８７Ｊ「中止」の位置は、人物Ｃの近傍となる。さらに、ＣＯＭＭワード８７Ｋの語句「開催」は、人物Ｄが多く発言している（たとえば、人物Ｃが１回、人物Ｄは６回）ことから、ＣＯＭＭワード８７Ｋ「開催」の位置は、人物Ｄの近傍となる。そして、発言状況演算部は、算出した位置をＣＯＭＭワードに紐づけて、ＣＯＭＭワード配信部２３４に供給する。ＣＯＭＭワード配信部２３４は、当該位置情報をＣＯＭＭワード８７とともに、端末装置１００へ配信することから、端末装置１００においては、当該位置情報に基づいて、ＣＯＭＭワード８７が表示されることとなる。 (Speaking status calculation unit 235)
The statement status calculation unit 235 analyzes the word/phrase data managed by the extracted word/phrase data management unit 232 and the identification information data corresponding to that word/phrase data, and calculates where the word/phrase data is located in the virtual field of the two-party call (its virtual distance from each speaker). The statement status calculation unit 235 calculates the position of the COMM link 86 in the three-dimensional virtual space 90 and the position, relative to that COMM link 86, of each COMM word 87 corresponding to the COMM link 86. For example, in the example of FIG. 15, when the phrase "meeting" of the COMM word 87I has been uttered about the same number of times by person C and person D (for example, 10 times by person C and 10 times by person D), the COMM word 87I "meeting" is positioned, for example, at the midpoint of the COMM link 86G. On the other hand, the phrase "stop" of the COMM word 87J has been uttered mostly by person C (for example, 5 times by person C and 0 times by person D), so the COMM word 87J "stop" is positioned near person C. Further, the phrase "holding" of the COMM word 87K has been uttered mostly by person D (for example, once by person C and 6 times by person D), so the COMM word 87K "holding" is positioned near person D. The statement status calculation unit 235 then links the calculated position to the COMM word and supplies it to the COMM word distribution unit 234. Because the COMM word distribution unit 234 distributes the position information to the terminal device 100 together with the COMM word 87, the terminal device 100 displays the COMM word 87 based on that position information.
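
One simple way to obtain positions that behave like the FIG. 15 example is to place each COMM word along the link in proportion to how often each speaker uttered the phrase. This is a minimal sketch under that assumption; the embodiment only states "midpoint" and "near", so the proportional rule and the function names `comm_word_position` and `point_on_link` are hypothetical.

```python
def comm_word_position(count_c: int, count_d: int) -> float:
    """Return a position along the link: 0.0 at person C, 1.0 at person D.

    Assumed rule: the word sits closer to whoever said it more often.
    """
    total = count_c + count_d
    if total == 0:
        return 0.5  # no utterances recorded: fall back to the midpoint
    return count_d / total


def point_on_link(pos_c, pos_d, t):
    """Linearly interpolate between the two speakers' coordinates in the virtual space."""
    return tuple(c + (d - c) * t for c, d in zip(pos_c, pos_d))
```

With the counts from the example, "meeting" (10 and 10) gives 0.5, the midpoint; "stop" (5 and 0) gives 0.0, at person C; and "holding" (1 and 6) gives 6/7, close to person D.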

すなわち、制御部２３０は、抽出語句データ管理部２３２、重み付け演算部２３３、発言状況演算部２３５等による情報処理によって、通話中状態の端末装置１００から受信した音声データを、位置情報や重み付けや発言状況を反映した重み付き語句データに変換し、関連する拠点の端末装置１００へと送信する。なお、制御部２３０（ＣＯＭＭワード配信部２３４）は、前記統計的重み付けにおける重みの値が所定の閾値以上であった重み付き語句データのみ、端末装置１００へ配信するようにしてもよい。 That is, through information processing by the extracted word/phrase data management unit 232, the weighting calculation unit 233, the statement status calculation unit 235, and the like, the control unit 230 converts the voice data received from a terminal device 100 in the call state into weighted word/phrase data that reflects position information, weighting, and statement status, and transmits it to the terminal devices 100 at the relevant sites. The control unit 230 (COMM word distribution unit 234) may distribute to the terminal device 100 only the weighted word/phrase data whose weight value in the statistical weighting is equal to or greater than a predetermined threshold.
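
The optional threshold check on weighted word/phrase data can be sketched as follows. The record layout `WeightedPhrase` and the function `select_for_delivery` are hypothetical names chosen for this example; the embodiment only states that a phrase may be delivered when its weight is at or above a predetermined threshold.

```python
from dataclasses import dataclass


@dataclass
class WeightedPhrase:
    """Hypothetical record for one piece of weighted word/phrase data."""
    text: str
    speaker_id: str   # identification information of the speaker
    weight: float     # result of the statistical weighting
    position: float   # position relative to the COMM link


def select_for_delivery(phrases, threshold):
    """Keep only the phrases whose weight is equal to or greater than the threshold."""
    return [p for p in phrases if p.weight >= threshold]
```

The comparison is inclusive, matching the "equal to or greater than" wording, so a phrase whose weight exactly equals the threshold is still delivered.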

＜１．４音声認識サーバの構成＞
続いて、図２０及び図２１を参照して、本実施形態に係る音声認識サーバ２０１の構成の一例を説明する。音声認識サーバ２０１は、先に説明したように、大規模な語句リストのデータを内蔵し、情報管理サーバ２００を介して、端末装置１００やマイクロフォン１３で取得された音声データを受信し、音声データに対して音声認識処理を行って、認識結果のデータを情報管理サーバ２００へと送信する。 <1.4 Structure of voice recognition server>
Next, an example of the configuration of the voice recognition server 201 according to the present embodiment will be described with reference to FIGS. 20 and 21. As described above, the voice recognition server 201 holds the data of a large-scale word/phrase list, receives, via the information management server 200, voice data acquired by the terminal device 100 or the microphone 13, performs voice recognition processing on the voice data, and transmits the recognition result data to the information management server 200.

＜１．４．１ハードウェア構成＞
図２０を参照して、本実施形態に係る音声認識サーバ２０１のハードウェア構成の一例を説明する。図２０は、本実施形態に係る音声認識サーバ２０１のハードウェア構成の一例を示すブロック図である。図２０を参照すると、音声認識サーバ２０１は、ＣＰＵ７０１、ＲＯＭ７０３、ＲＡＭ７０５、バス７０７、記憶装置７０９及び通信インターフェース７１１を備える。 <1.4.1 Hardware configuration>
An example of the hardware configuration of the voice recognition server 201 according to the present embodiment will be described with reference to FIG. 20. FIG. 20 is a block diagram showing an example of the hardware configuration of the voice recognition server 201 according to this embodiment. Referring to FIG. 20, the voice recognition server 201 includes a CPU 701, a ROM 703, a RAM 705, a bus 707, a storage device 709, and a communication interface 711.

（ＣＰＵ７０１、ＲＯＭ７０３、ＲＡＭ７０５）
ＣＰＵ７０１は、音声認識サーバ２０１における様々な処理を実行する。また、ＲＯＭ７０３は、音声認識サーバ２０１における処理をＣＰＵ７０１に実行させるためのプログラム及びデータを記憶する。また、ＲＡＭ７０５は、ＣＰＵ７０１の処理の実行時に、プログラム及びデータを一時的に記憶する。 (CPU 701, ROM 703, RAM 705)
The CPU 701 executes various processes in the voice recognition server 201. Further, the ROM 703 stores a program and data for causing the CPU 701 to execute the process in the voice recognition server 201. Further, the RAM 705 temporarily stores a program and data when the processing of the CPU 701 is executed.

（バス７０７）
バス７０７は、ＣＰＵ７０１、ＲＯＭ７０３及びＲＡＭ７０５を相互に接続する。バス７０７には、さらに、記憶装置７０９及び通信インターフェース７１１が接続される。バス７０７は、例えば、複数の種類のバスを含む。 (Bus 707)
The bus 707 connects the CPU 701, the ROM 703, and the RAM 705 to each other. A storage device 709 and a communication interface 711 are further connected to the bus 707. The bus 707 includes, for example, a plurality of types of buses.

（記憶装置７０９）
記憶装置７０９は、音声認識サーバ２０１内で一時的又は恒久的に保存すべきデータ、例えば、語句データを記憶する。記憶装置７０９は、例えば、ハードディスク等の磁気記憶装置であってもよく、又は、ＥＥＰＲＯＭ、フラッシュメモリ、ＭＲＡＭ、ＦｅＲＡＭ及びＰＲＡＭ等の不揮発性メモリであってもよい。 (Memory device 709)
The storage device 709 stores data to be temporarily or permanently stored in the voice recognition server 201, for example, word/phrase data. The storage device 709 may be, for example, a magnetic storage device such as a hard disk, or a non-volatile memory such as EEPROM, flash memory, MRAM, FeRAM, and PRAM.

（通信インターフェース７１１）
通信インターフェース７１１は、音声認識サーバ２０１が備える通信手段であり、ネットワークを介して（あるいは、直接的に）外部装置と通信する。通信インターフェース７１１は、無線通信用のインターフェースであってもよく、もしくは、有線通信用のインターフェースであってもよい。 (Communication interface 711)
The communication interface 711 is a communication unit included in the voice recognition server 201, and communicates with an external device via a network (or directly). The communication interface 711 may be a wireless communication interface or a wired communication interface.

＜１．４．２機能構成＞
次に、本実施形態に係る音声認識サーバ２０１の機能構成の一例を説明する。図２１は、本実施形態に係る音声認識サーバ２０１の機能構成の一例を示すブロック図である。図２１を参照すると、音声認識サーバ２０１は、通信部５１０、記憶部５２０及び制御部５３０を備える。 <1.4.2 Functional configuration>
Next, an example of the functional configuration of the voice recognition server 201 according to the present embodiment will be described. FIG. 21 is a block diagram showing an example of the functional configuration of the voice recognition server 201 according to this embodiment. Referring to FIG. 21, the voice recognition server 201 includes a communication unit 510, a storage unit 520, and a control unit 530.

（通信部５１０）
通信部５１０は、他の装置と通信する。例えば、通信部５１０は、ＬＡＮ１９に直接的に接続され、センタオフィス１０内の各装置と通信する。具体的には、例えば、通信部５１０は、マイクロフォン１３及び情報管理サーバ２００と通信する。また、通信部５１０は、外部ネットワーク３０及びＬＡＮ２３を介して、サテライトオフィス２０内の各装置と通信する。 (Communication unit 510)
The communication unit 510 communicates with other devices. For example, the communication unit 510 is directly connected to the LAN 19 and communicates with each device in the center office 10. Specifically, for example, the communication unit 510 communicates with the microphone 13 and the information management server 200. Further, the communication unit 510 communicates with each device in the satellite office 20 via the external network 30 and the LAN 23.

（記憶部５２０）
記憶部５２０は、音声認識サーバ２０１の動作のためのプログラム及びデータを記憶する。詳細には、本実施形態では、記憶部５２０は、大規模な語句リストのデータを記憶する。 (Storage unit 520)
The storage unit 520 stores programs and data for the operation of the voice recognition server 201. Specifically, in the present embodiment, the storage unit 520 stores data of a large-scale word list.

（制御部５３０）
制御部５３０は、音声認識サーバ２０１の様々な機能を提供する。制御部５３０は、語句抽出部５３１及び語句データ生成部５３３を含む。 (Control unit 530)
The control unit 530 provides various functions of the voice recognition server 201. The control unit 530 includes a phrase extraction unit 531 and a phrase data generation unit 533.

（語句抽出部５３１）
語句抽出部５３１は、記憶部５２０に記憶された語句リストを参照して、情報管理サーバ２００を介して、端末装置１００やマイクロフォン１３から取得された音声データから語句を抽出する。語句抽出部５３１は、音声データを受け取ったら逐次、当該音声データから語句を抽出してもよく、もしくは、受け取った音声データの量が所定の量になった場合に（例えば、５分間分の会話に係る音声データ）、受け取った音声データから語句を抽出してもよい。また、語句抽出部５３１は、記憶部５２０にあらかじめ記憶された、語句を発話した話者に対応付けられた語句リストを用いて、語句の抽出を行ってもよい。このようにすることで、当該話者の発言する語句の傾向についての情報を蓄積し、語句抽出部５３１は、蓄積した情報を用いて、当該話者の発言の頻度が高い語句を優先的に抽出することができる。また、特定の語句は抽出されることがないように（業務に関係のない語句（例えば「ゲーム」等の語句））が抽出されることがないように、語句抽出部５３１による語句の抽出の際には、フィルタリングを行ってもよい。 (Word extraction unit 531)
The word/phrase extraction unit 531 refers to the word/phrase list stored in the storage unit 520 and extracts words and phrases from the voice data acquired from the terminal device 100 or the microphone 13 via the information management server 200. The word/phrase extraction unit 531 may extract words and phrases from the voice data sequentially as it is received, or may extract them once the amount of received voice data reaches a predetermined amount (for example, voice data for five minutes of conversation). The word/phrase extraction unit 531 may also extract words and phrases by using a word/phrase list that is stored in advance in the storage unit 520 and associated with the speaker who uttered them. In this way, information on the tendencies of the words and phrases a speaker uses is accumulated, and the word/phrase extraction unit 531 can use the accumulated information to preferentially extract words and phrases that the speaker utters frequently. In addition, filtering may be performed during extraction by the word/phrase extraction unit 531 so that specific words and phrases (words unrelated to work, such as "game") are not extracted.
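
The list lookup, speaker-based prioritization, and filtering described for the word/phrase extraction unit 531 can be sketched on already-recognized text. This is a minimal sketch, assuming the voice data has been turned into tokens by a recognizer that is outside the example; the function name `extract_phrases`, its parameters, and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def extract_phrases(tokens, phrase_list, blocked=frozenset(), speaker_history=None):
    """Extract listed phrases from recognized tokens, skipping blocked ones.

    tokens: recognized words from one batch of voice data (assumed input)
    phrase_list: phrases the server is allowed to extract
    blocked: phrases filtered out, e.g. words unrelated to work
    speaker_history: optional mapping of phrase -> past utterance count
        for this speaker, used to prioritize phrases the speaker says often
    """
    history = speaker_history or {}
    found = Counter(t for t in tokens if t in phrase_list and t not in blocked)
    # Most frequent in this batch first; ties go to the phrase the speaker
    # has used more often in the past, then alphabetical order for stability.
    return sorted(found, key=lambda p: (-found[p], -history.get(p, 0), p))
```

For example, with the tokens `["meeting", "game", "meeting", "stop", "hello"]`, a phrase list containing "meeting", "stop", and "game", and "game" blocked, the result is `["meeting", "stop"]`.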

（語句データ生成部５３３）
語句データ生成部５３３は、語句抽出部５３１により抽出された語句の抽出（認識）結果のデータを生成し、通信部５１０を介して、情報管理サーバ２００へ送信する。 (Word data generation unit 533)
The word/phrase data generation unit 533 generates data of the result of extraction (recognition) of the word/phrase extracted by the word/phrase extraction unit 531 and transmits the data to the information management server 200 via the communication unit 510.

＜１．５処理の流れ＞
続いて、図２２を参照して、本実施形態に係る情報処理の例を説明する。図２２は、本実施形態に係る情報処理の概略的な流れの一例を示すフローチャートである。図２２には、ステップＳ４０１からステップＳ４１３までが含まれる。 <1.5 Process flow>
Next, an example of information processing according to the present embodiment will be described with reference to FIG. 22. FIG. 22 is a flowchart showing an example of a schematic flow of the information processing according to the present embodiment. FIG. 22 includes steps S401 to S413.

（ステップＳ４０１）
位置取得部１８３は、実空間の撮像画像の表示画面において当該撮像画像内の位置がユーザにより指定されたかを判定する。より具体的には、例えば、位置取得部１８３は、近接撮像画像７１内の位置をユーザにより指定されたかを判定する。上記位置が指定されていれば、処理はステップＳ４０３へ進む。一方、上記位置が指定されていない場合には、ステップＳ４０１を繰り返す。 (Step S401)
The position acquisition unit 183 determines whether the position in the captured image on the display screen of the captured image in the real space is designated by the user. More specifically, for example, the position acquisition unit 183 determines whether the position in the close-up captured image 71 is designated by the user. If the position is specified, the process proceeds to step S403. On the other hand, if the position is not designated, step S401 is repeated.

（ステップＳ４０３）
位置取得部１８３は、ユーザにより指定される上記撮像画像内の位置を取得する。 (Step S403)
The position acquisition unit 183 acquires the position in the captured image specified by the user.

（ステップＳ４０５）
オブジェクト選択部１８５は、実空間に対応する３次元仮想空間９０のデータを記憶部１７０から取得する。 (Step S405)
The object selection unit 185 acquires the data of the three-dimensional virtual space 90 corresponding to the real space from the storage unit 170.

（ステップＳ４０７）
取得された上記撮像画像内の上記位置に基づいて、上記３次元仮想空間９０に配置されたオブジェクト９１を選択する。 (Step S407)
The object 91 arranged in the three-dimensional virtual space 90 is selected based on the acquired position in the captured image.

（ステップＳ４０９）
ＩＤ取得部１８７は、選択された上記オブジェクト９１に対応する通信用ＩＤを着信側の通信用ＩＤとして取得する。 (Step S409)
The ID acquisition unit 187 acquires the communication ID corresponding to the selected object 91 as the communication ID on the receiving side.

(ステップＳ４１１)
ＩＤ取得部１８７は、発信側の通信用ＩＤ（即ち、端末装置１００の通信用ＩＤ）を取得する。 (Step S411)
The ID acquisition unit 187 acquires the communication ID of the calling side (that is, the communication ID of the terminal device 100).

（ステップＳ４１３）
電話部１８９は、着信側の通信用ＩＤを用いて電話発信を行う。その後、電話部１８９は、電話のための様々な処理を実行し、処理は終了する。 (Step S413)
The telephone unit 189 makes a telephone call using the communication ID on the receiving side. Thereafter, the telephone unit 189 executes various processes for making a call, and the process ends.

以上のように、本実施形態に係る情報処理が実行される。さらに、当該情報処理の開始前に行われる起動処理の一例を、図２３を参照して説明する。図２３は、本実施形態に係る起動処理の概略的な流れの一例を示すフローチャートである。図２３は、ステップＳ５０１からステップＳ５０７までを含む。 The information processing according to the present embodiment is executed as described above. Next, an example of the startup process performed before the information processing starts will be described with reference to FIG. 23. FIG. 23 is a flowchart showing an example of a schematic flow of the startup process according to the present embodiment. FIG. 23 includes steps S501 to S507.

（ステップＳ５０１）
ソフトフォン８５１の起動処理が実行される。これにより、ソフトフォン８５１が起動する。 (Step S501)
The activation process of the softphone 851 is executed. This activates the softphone 851.

（ステップＳ５０３）
ソフトフォン８５１に関する登録処理が実行される。例えば、ソフトフォン８５１の登録処理の１つとして、ＰＢＸ４０での登録（例えば、ＳＩＰＲＥＧＩＳＴＲＡＴＩＯＮ）が行われる。 (Step S503)
The registration process for the softphone 851 is executed. For example, as one of the registration processes of the softphone 851, registration with the PBX 40 (for example, SIP REGISTRATION) is performed.

（ステップＳ５０５）
超臨場感クライアント８５３の起動処理が実行される。例えば、超臨場感クライアント８５３において利用されるカメラ１１、マイクロフォン１３、センサ１５、メディア配信サーバ１７、情報管理サーバ２００等が特定される。 (Step S505)
The activation process of the ultra-realistic client 853 is executed. For example, the camera 11, the microphone 13, the sensor 15, the media distribution server 17, the information management server 200, and the like used in the ultra-realistic client 853 are specified.

（ステップＳ５０７）
超臨場感クライアント８５３の俯瞰モード処理が実行される。そして、一連の起動処理は終了する。 (Step S507)
The bird's-eye view mode process of the ultra-realism client 853 is executed. Then, the series of start-up processing ends.

次に、本実施形態に係る通信制御処理の一例を、図２４を参照して説明する。図２４は、本実施形態に係る通信制御処理の概略的な流れの一例を示すシーケンス図である。図２４の通信制御処理においては、情報管理サーバ２００が、先に二者通話を開始した人物Ｃの端末装置１００Ｃと人物Ｄの端末装置１００Ｄ間の会話に関する情報をＣＯＭＭリンク８６及びＣＯＭＭワード８７として第三者となるユーザの端末装置１００Ｕに配信するまでの処理を含む。さらに、図２４の通信制御処理においては、当該ユーザが端末装置１００Ｕから当該ＣＯＭＭリンク８６を指定入力することで、上記二者通話に参加するまでの処理を含む。詳細には、図２４には、ステップＳ６０１からステップＳ６１３までが含まれる。 Next, an example of the communication control process according to the present embodiment will be described with reference to FIG. 24. FIG. 24 is a sequence diagram showing an example of a schematic flow of the communication control process according to the present embodiment. The communication control process of FIG. 24 includes the processing up to the point where the information management server 200 distributes information on the conversation between the terminal device 100C of person C and the terminal device 100D of person D, who have already started a two-party call, as the COMM link 86 and the COMM words 87 to the terminal device 100U of a user who is a third party. It further includes the processing up to the point where the user joins the two-party call by designating the COMM link 86 from the terminal device 100U. Specifically, FIG. 24 includes steps S601 to S613.

（ステップＳ６０１）
情報管理サーバ２００を介して、端末装置１００Ｃと端末装置１００Ｄ間での二者通話が開始される。この二者通話の開始には、前述のステップＳ４０１からステップＳ４１３で説明した一連の情報処理が用いられていてもよい。 (Step S601)
A two-party call is started between the terminal device 100C and the terminal device 100D via the information management server 200. The series of information processing described in steps S401 to S413 may be used to start the two-party call.

（ステップＳ６０３）
情報管理サーバ２００は、端末装置１００Ｃと端末装置１００Ｄ間の二者通話における音声データから音声認識サーバ２０１が抽出した語句データを取得する。さらに、情報管理サーバ２００は、取得した語句データを用いて、位置情報や統計的重み付けや発言状況を反映した重み付き語句データを生成する。 (Step S603)
The information management server 200 acquires the phrase data extracted by the voice recognition server 201 from the voice data in the two-way call between the terminal device 100C and the terminal device 100D. Furthermore, the information management server 200 uses the acquired word/phrase data to generate weighted word/phrase data that reflects position information, statistical weighting, and the state of speech.

（ステップＳ６０５）
情報管理サーバ２００は、端末装置１００Ｃと端末装置１００Ｄ間の二者通話に関する重み付き語句データを、端末装置１００Ｕへ送信する。 (Step S605)
The information management server 200 transmits the weighted word/phrase data regarding the two-way call between the terminal device 100C and the terminal device 100D to the terminal device 100U.

（ステップＳ６０７）
端末装置１００Ｕは、重み付き語句データを受信し、表示部１５０の画面上にＣＯＭＭリンク８６とＣＯＭＭワード８７を表示する。 (Step S607)
The terminal device 100U receives the weighted word/phrase data and displays the COMM link 86 and the COMM word 87 on the screen of the display unit 150.

（ステップＳ６０９）
端末装置１００Ｕは、表示された端末装置１００Ｃと端末装置１００Ｄに対応するオブジェクト９１をつなぐＣＯＭＭリンク８６を指定するユーザ入力を取得する。 (Step S609)
The terminal device 100U acquires a user input that specifies the COMM link 86 that connects the objects 91 corresponding to the displayed terminal devices 100C and 100D.

（ステップＳ６１１）
端末装置１００Ｕは、上記ＣＯＭＭリンク８６を指定するユーザ入力に関するデータ（例えば、前述の会話オブジェクトに対応する２以上の話者のオブジェクトＩＤ）を情報管理サーバ２００へ送信する。 (Step S611)
The terminal device 100U transmits, to the information management server 200, data relating to a user input designating the COMM link 86 (for example, object IDs of two or more speakers corresponding to the conversation object described above).

（ステップＳ６１３）
情報管理サーバ２００は、上記ＣＯＭＭリンク８６を指定するユーザ入力に関するデータを受信し、上記ＣＯＭＭリンク８６に対応する端末装置１００Ｃと端末装置１００Ｄ間の二者通話のセッションに端末装置１００Ｕとの通話を追加する。そして、端末装置１００Ｃ、端末装置１００Ｄ、端末装置１００Ｕ間の三者通話のセッションが開始される。 (Step S613)
The information management server 200 receives the data on the user input designating the COMM link 86 and adds a call with the terminal device 100U to the two-party call session between the terminal device 100C and the terminal device 100D corresponding to the COMM link 86. A three-party call session among the terminal device 100C, the terminal device 100D, and the terminal device 100U is then started.

以上説明したように、本実施形態においては、通話の話者を結ぶＣＯＭＭリンク８６が表示されることから、ユーザは、ＣＯＭＭリンク８６を視認することで、分散環境にいる複数の遠隔地の誰と誰との間に会話が発生したかを直感的に把握することができる。さらに、本実施形態においては、通話内容の要部をＣＯＭＭワード８７として表示することから、遠隔地にいて音声が聞こえないようなユーザでも、ＣＯＭＭワード８７が表示された画面を見ることより、会話内容の大まかな把握を行うことが可能となる。そして、当該ユーザは、会話内容の大まかな把握を行った後に、当該会話内容に関心が生じたら、前述のＣＯＭＭリンク８６に対する簡便な指定操作によって、スムーズに当該会話に参加することができる。すなわち、本実施形態によれば、ユーザが、分散環境において、遠隔地での会話の発生や当該会話の大まかな内容を把握することができ、さらに、誰が通話状態にあるのかを直感的に認識することが可能である。 As described above, in the present embodiment, the COMM link 86 connecting the speakers of a call is displayed, so by looking at the COMM link 86 the user can intuitively grasp between which persons at the plurality of remote locations in the distributed environment a conversation has started. Further, in the present embodiment, the main points of the call content are displayed as COMM words 87, so even a user who is at a remote location and cannot hear the voices can roughly grasp the content of the conversation by looking at the screen on which the COMM words 87 are displayed. If the user becomes interested in the conversation after roughly grasping its content, the user can smoothly join the conversation through a simple designation operation on the COMM link 86 described above. That is, according to the present embodiment, a user in a distributed environment can grasp that a conversation has started at a remote location and roughly what it is about, and can also intuitively recognize who is in a call.

＜２．第２の実施形態＞
次に、以下に説明する、本発明の第２の実施形態は、例えば分散オフィスのような分散環境と同室環境とが混在する環境において、ユーザが、既に通話を行っている複数の話者のいずれかの話者と同室環境内の位置に存在することを前提とした処理である。 <2. Second Embodiment>
Next, the second embodiment of the present invention described below concerns processing premised on the user being located in the same room as one of a plurality of speakers who are already on a call, in an environment where a distributed environment and a same-room environment coexist, such as a distributed office.

詳細には、上述の第１の実施形態においては、分散環境では気づきにくくなってしまう会話というコミュニケーションイベントについて、それをＣＯＭＭリンク８６およびＣＯＭＭワード８７として可視化処理することにより、ユーザが遠隔地の会話にも気づけるようにしていた。しかしながら、例えば、分散オフィスのような分散環境と同室環境とが混在する環境においては、ユーザが、既に通話を行っている複数の話者の全ての話者と異なる環境内に位置するものとは限られず、当該通話の話者の１人と同室環境に位置する場合がある。この場合、当該ユーザは、同室環境に位置することから、周囲の会話として、当該通話の内容を自然に聴感できているため、当該会話をＣＯＭＭワード８７として、ユーザの端末装置１００Ｕ上にも提示すると、視聴した情報と表示された情報とによって、ユーザに対して同一の情報が二重に提供されることとなる。そして、このような情報の二重提供は、かえってユーザの思考の混乱を招くこととなる。そこで、本実施形態においては、同室環境の人物の発話に関するＣＯＭＭワード８７の生成や配信処理を回避することにより、同一情報の二重提供を防ぎ、ユーザの指向を混乱させることを避けるような処理を行う。さらに、本実施形態においては、ＣＯＭＭワード８７の生成や配信処理を回避することにより、本実施形態のシステムにおける処理の一部を軽減化し、処理の迅速化を図ることができる。 Specifically, in the first embodiment described above, conversations, which are communication events that tend to go unnoticed in a distributed environment, are visualized as COMM links 86 and COMM words 87 so that the user can notice conversations at remote locations as well. However, in an environment where a distributed environment and a same-room environment coexist, such as a distributed office, the user is not necessarily located in an environment different from that of every speaker on an existing call, and may be in the same room as one of the speakers. In this case, because the user is in the same room, the user can naturally hear the content of the call as surrounding conversation. If the conversation were also presented as COMM words 87 on the user's terminal device 100U, the same information would be provided to the user twice, once as heard and once as displayed. Such duplicate provision of information would instead confuse the user's thinking.
Therefore, in the present embodiment, by avoiding the generation and distribution processing of the COMM word 87 related to the utterance of the person in the same room environment, it is possible to prevent the double provision of the same information and avoid the confusion of the user's orientation. I do. Furthermore, in the present embodiment, by avoiding the generation and distribution processing of the COMM word 87, a part of the processing in the system of the present embodiment can be reduced and the processing can be speeded up.

<2.1 Configuration of Information Management Server>
<2.1.1 Functional Configuration>
An example of the functional configuration of the information management server 200A according to the present embodiment will be described with reference to FIG. 25. FIG. 25 is a block diagram showing an example of the functional configuration of the information management server 200A according to the present embodiment. Like the information management server 200 of the first embodiment described above, the information management server 200A includes a communication unit 210, a storage unit 220, and a control unit 230. Further, as in the first embodiment, the control unit 230 includes a COMM link distribution unit 231, an extracted word/phrase data management unit 232, a weighting calculation unit 233, a COMM word distribution unit 234, and a statement status calculation unit 235. In addition, the control unit 230 further includes a position-linked distribution control unit 236. Accordingly, the description of the functional units that are the same as in the first embodiment is omitted here, and only the position-linked distribution control unit 236 is described.

(Position-linked distribution control unit 236)
The position-linked distribution control unit 236 compares the position of a person (speaker) involved in a call already in progress with the position of a third-party user who is not participating in the call, and controls whether or not to distribute the COMM word 87 to the terminal device 100U of the third-party user. More specifically, the position-linked distribution control unit 236 compares the position information of the person (speaker) whose voice is carried by the call with the position information of the terminal device 100U that is not involved in the call. When the two positions are within a predetermined distance of each other, or are both in the same room environment at the same base, the position-linked distribution control unit 236 cancels the control processing relating to the generation or distribution of the COMM word 87.

<2.2 Process Flow>
Next, an example of information processing according to the present embodiment will be described with reference to FIG. 26. FIG. 26 is a flowchart showing an example of a schematic flow of information processing according to the present embodiment. FIG. 26 includes steps S701 to S707, and step S701 is started after step S601 of FIG. 24 of the first embodiment.

(Step S701)
When the information management server 200 detects that a two-party call between the terminal device 100C and the terminal device 100D has started, the process proceeds to the next step S703. When the information management server 200 has not detected the start of a two-party call, step S701 is repeated.

(Step S703)
The position-linked distribution control unit 236 calculates the distance between the position of the person who uttered the phrase represented by the phrase data extracted from the call and the position of the terminal device 100U that is not involved in the call (that is, the position of the user who is not participating in the call).

(Step S705)
The position-linked distribution control unit 236 compares the distance calculated in step S703 with a predetermined distance. When the calculated distance is shorter than the predetermined distance, the process ends. When the calculated distance is longer than the predetermined distance, the process proceeds to step S707. The predetermined distance is, for example, the distance expected between the speaker and the user when both are in the same room, or the distance between the speaker and a position in real space at which the user can hear an utterance made by that speaker in real space.

(Step S707)
The process proceeds to step S603 of FIG. 24. That is, phrases are extracted from the conversation, and the extracted phrases are weighted.
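The decision made in steps S703 to S707 can be illustrated by the following minimal Python sketch. It is not the implementation of the position-linked distribution control unit 236; the names `Position`, `room`, `distance`, and `should_generate_comm_words`, the shared planar coordinate system in metres, and the 5 m default threshold are all assumptions introduced for illustration. The specification does not state what happens when the calculated distance exactly equals the predetermined distance, so the sketch arbitrarily treats that case as "not shorter".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical position record: a room label plus planar coordinates in metres."""
    room: str
    x: float
    y: float


def distance(a: Position, b: Position) -> float:
    """Step S703: Euclidean distance, assuming both positions share one coordinate system."""
    return math.hypot(a.x - b.x, a.y - b.y)


def should_generate_comm_words(speaker: Position, listener: Position,
                               threshold_m: float = 5.0) -> bool:
    """Step S705: return True to proceed to S707 (and S603), False to end.

    Generation is skipped when the listener shares a room with the speaker
    or is closer than the (assumed) threshold distance.
    """
    same_room = speaker.room == listener.room
    within_range = distance(speaker, listener) < threshold_m
    return not (same_room or within_range)
```

For example, with a speaker at `Position("B", 0.0, 0.0)` and a third-party user at `Position("B", 3.0, 4.0)`, the function returns `False`, which corresponds to ending the process without generating the COMM word 87 for that speaker.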

As described above, in the present embodiment, when the user is located in the same room environment as one of the speakers of a call, or in the vicinity of that speaker, the position-linked distribution control unit 236 performs processing that avoids the generation and distribution of the COMM word 87 relating to the utterances of the person in the same room environment. Specifically, the position-linked distribution control unit 236 performs processing that skips step S603 of FIG. 24.

For example, consider the second embodiment in terms of the example of FIG. 15. When the third-party user is at base B (the Osaka office), the person F at base B is in the same room as the third-party user, and the third-party user naturally hears person F as a surrounding conversation. In such a case, in the present embodiment, the information management server 200A avoids generating the COMM word 87 for the utterances of the person F in the two-party call between the person E and the person F and transmitting it to the terminal device 100U of the third-party user. Specifically, the COMM word 87L in FIG. 15 is either not generated by the information management server 200A or not distributed to the terminal device 100U, and as a result is not presented on the display screen of the terminal device 100U.

Note that the present embodiment is not limited to processing that avoids the distribution of the COMM word 87. For example, processing that avoids the distribution of the COMM link 86 may be performed, or processing that reduces the number of COMM words 87 to be distributed may be performed.

That is, according to the present embodiment, avoiding the generation and distribution of the COMM word 87 relating to the utterances of a person in the same room environment prevents double provision of the same information and avoids confusing the thinking of the third-party user. Specifically, because the COMM word 87 and the COMM link 86 are no longer displayed on the display screen of the terminal device 100 of the third-party user, the visibility of the captured image 51, which serves as the background on which the COMM link 86 and the COMM word 87 are superimposed, is improved. Furthermore, avoiding the generation and distribution of the COMM word 87 reduces part of the processing load in the system of the present embodiment and speeds up processing.

<3. Third Embodiment>
The third embodiment of the present invention, described below, shows the speakers involved in a conversation that has already started which third-party users are interested in that conversation. In this way, as with a conversation held in the same environment, the speakers notice the presence of an interested third-party user and can draw that third-party user into the conversation.

Specifically, as described above, a "conversation" is not necessarily held between two people, and is often held in a group of three or more. In such cases, besides the pattern in which all members of the group are present when the conversation starts, there is a pattern in which a person nearby notices the existence of a conversation that has started (awareness) and later joins it, so that the number of members in the conversing group grows.

For example, in such a situation in a same-room environment, the two parties in a conversation will naturally notice a third party who is standing near them and listening with interest. The third party does not necessarily remain in the same place. If the third party's interest in the conversation grows, the third party moves closer to the two parties, and if the interest is lost, the third party moves away from them. In this way, the third party actively changes his or her position according to the degree of interest in the conversation. In other words, the third party's degree of interest in the conversation manifests as the spatial "distance" between the two parties in the conversation and the third party. Moreover, the shorter this distance, the more easily the two parties notice the third party, and the longer the distance, the less easily they notice. The two parties may notice an approaching third party and draw him or her into the conversation, or the third party may gradually approach while observing the conversation and then join it, and a natural three-party conversation begins.

However, as described above, conventional softphone products used in distributed environments assume the pattern in which all members are present from the start of the conversation. They therefore do not take into account a third party's degree of interest in a conversation, or the change in the mutual positional relationship between the speakers and the third party according to that degree of interest. Consequently, when a conventional softphone product is used, it is harder for a third party to join partway through than in a same-room conversation. In addition, the two parties are suddenly spoken to by the third party, so they may be startled, and the call between the third party and the two parties may not proceed smoothly.

Therefore, the present embodiment provides a function by which, in an environment spanning a plurality of remote locations, a third party can intuitively set and input his or her degree of interest in a conversation, for example as a virtual "distance", and by which the two parties holding the conversation are notified, according to that distance (degree of interest), that a third party interested in the conversation exists. The present embodiment further provides a function that allows the third party to acquire information on the content of the conversation according to the distance (degree of interest).

<3.1 Configuration of Terminal Device>
<3.1.1 Functional Configuration>
An example of the functional configuration of the terminal device 100A according to the present embodiment will be described with reference to FIG. 27. FIG. 27 is a block diagram showing an example of the functional configuration of the terminal device 100A according to the present embodiment. Referring to FIG. 27, like the terminal device 100 according to the first embodiment, the terminal device 100A includes a communication unit 110, an input unit 120, an imaging unit 130, a sound collection unit 140, a display unit 150, an audio output unit 160, a storage unit 170, and a control unit 180. Further, as in the first embodiment, the control unit 180 includes a real space information providing unit 181, a voice output control unit 182, a position acquisition unit 183, an object selection unit 185, an ID acquisition unit 187, a telephone unit 189, a conversation object selection unit 191, a COMM link control unit 193, and a COMM word control unit 195. In addition, the control unit 180 further includes a conversation interest level setting unit 196, a speaker interest ratio setting unit 197, and a conversation interest level notification unit 198. Accordingly, the description of the functional units that are the same as in the first embodiment is omitted here, and only the conversation interest level setting unit 196, the speaker interest ratio setting unit 197, and the conversation interest level notification unit 198 are described.

(Conversation interest level setting unit 196)
The conversation interest level setting unit 196 accepts, from the user's input to the terminal device 100A, a setting input of the "conversation interest level (degree of interest)" described below, causes the display unit 150 to display the result, and transmits data on the conversation interest level to the information management server 200. In this specification, the "conversation interest level" indicates, in multiple levels, the degree of interest the user feels in a given call (conversation) occurring on the information processing system according to the present embodiment. The multiple levels may be, for example, 10 levels in steps of 1 from 0 (not interested at all) to 9 (very interested), 100 levels from 0 to 99, or a stepless continuous scale such as a Visual Analogue Scale. One conversation interest level is associated with one call (conversation), that is, with one COMM link 86, and can further be associated with one or more COMM words 87 relating to that COMM link 86.

The user of the terminal device 100A may set the conversation interest level by, for example, entering a numerical value from the input unit 120. As a more intuitive input method, however, the user may place a virtual embodiment of himself or herself (hereinafter, a "third-party object (user object)") relative to the COMM link 86 and input the conversation interest level by means of the virtual distance between the position of the third-party object and the position of the COMM link 86. This operation method is described below with reference to FIG. 28. FIG. 28 is an explanatory diagram for explaining an example of the third-party object 94 and the virtual distance 97 in the three-dimensional virtual space 90 corresponding to the distributed office.

FIG. 28 shows a three-dimensional virtual space 90 corresponding to the distributed office. An object 91E corresponding to the person E and an object 91F corresponding to the person F are arranged in the three-dimensional virtual space 90. A COMM link 86 corresponding to the two-party call between the persons E and F is generated and arranged between the objects 91E and 91F. The objects 91 at the two ends of the COMM link 86 may exist at different bases in real space. In this case, the objects 91 at the two ends each have absolute position coordinates in the same three-dimensional virtual space 90, and the relative positional relationship between the objects 91 may be calculated from those absolute position coordinates. Alternatively, the positional relationship between the bases may be determined in advance, and the information management server 200 may use that positional relationship to calculate the relative positional relationship of the objects 91 corresponding to persons at different bases. When the user places a third-party object 94U relative to the COMM link 86, the third-party object 94U is newly arranged in the three-dimensional virtual space 90. The position of the third-party object 94U can be set freely merely by the user specifying a position in the horizontal direction.

When the third-party object 94U is arranged in the three-dimensional virtual space 90, a three-dimensional center-of-gravity position 92U of the third-party object 94U and a speaker interest ratio reflection position 96U on the COMM link 86 are generated (the initial position of the speaker interest ratio reflection position 96U is, for example, the midpoint of the COMM link 86). Further, the virtual distance 97U between the three-dimensional center-of-gravity position 92U and the speaker interest ratio reflection position 96U is calculated by the information management server 200B, described later, or by the terminal device 100A. Accordingly, when the user (third party) changes the placement of the third-party object 94U, the virtual distance 97U changes correspondingly.
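As a rough illustration of the calculation described above, the following Python sketch computes a point on the COMM link 86 and the virtual distance 97U to it. The function names `link_point` and `virtual_distance`, the parameter `t`, and the use of plain Euclidean distance are assumptions made for this sketch; the specification only states that the initial reflection position is, for example, the midpoint of the link.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def link_point(end_e: Vec3, end_f: Vec3, t: float = 0.5) -> Vec3:
    """Return a point on the segment between the two link endpoints.

    t = 0.5 gives the midpoint, which the text names as an example of the
    initial speaker interest ratio reflection position 96U.
    """
    return tuple(e + t * (f - e) for e, f in zip(end_e, end_f))


def virtual_distance(centroid: Vec3, reflection: Vec3) -> float:
    """Euclidean distance between the center-of-gravity position 92U
    and the reflection position 96U (assumed metric)."""
    return math.dist(centroid, reflection)


# Objects 91E and 91F at hypothetical coordinates in the virtual space 90.
reflection_96u = link_point((0.0, 0.0, 0.0), (4.0, 0.0, 0.0))
distance_97u = virtual_distance((2.0, 3.0, 0.0), reflection_96u)
```

Moving the third-party object changes only the `centroid` argument, so the virtual distance follows the placement directly, as the text describes.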

As described above, the notion that the degree of interest in a conversation is high or low fits well with the notion that the distance a third party keeps from the place where the conversation is occurring is short or long. Therefore, when the degree of interest in a conversation is displayed in terms of this distance, the degree of interest can be grasped intuitively from the display. In addition, as described above, the terminal device 100A can display the captured image 51 and the map image 69 on the display unit 150. Because the captured image 51 and the map image 69 are information obtained by projecting the structure of real space, they are well suited to having information such as positions and distances superimposed on them for graphical display or touch input. Therefore, in the information processing system according to the present embodiment, rather than having the user enter a numerical value with a keyboard or the like and displaying the resulting virtual distance as a number, it is preferable to project and display the above-described third-party object 94 on the captured image 51 or the map image 69 and to have the user input the conversation interest level by operating the displayed third-party object 94 (more precisely, its screen display body) to change its position. Adopting this method allows the user to input the conversation interest level intuitively, which improves user convenience.

In the present embodiment, a given user can set one third-party object 94 and one virtual distance 97 for one COMM link 86. However, the present embodiment is not limited to this. For example, one third-party object 94 may be set for a plurality of COMM links 86, and when the position of that third-party object 94 is changed, the plurality of virtual distances 97 from the third-party object 94 to the plurality of COMM links 86 may be changed together according to the positional relationship with those COMM links 86.

Next, an example of the third-party object and the virtual distance displayed on the display screen of the terminal device 100A according to the present embodiment will be described with reference to FIGS. 29 and 30. FIG. 29 is an explanatory diagram for explaining an example of the third-party object and the virtual distance displayed on the display screen of the terminal device 100A according to the present embodiment. FIG. 30 is an explanatory diagram for explaining another example of the third-party object 94 and the virtual distance 97 displayed on the display screen 57 of the terminal device 100A according to the present embodiment.

FIG. 29 shows a display screen 56. In real space, the persons E and F are engaged in a two-party call, and on the display screen 56 a COMM link 86H3 corresponding to that call is displayed on the captured images 51A and 51B, or a COMM link 86H2 is displayed on the map images 69A and 69B. For the COMM link 86H3, the user (third party) can specify the position of the third-party object 94U at a two-dimensional display position on the captured images 51A, 51B, and 51Z. In response to this input, a screen display body (icon) 98U3 of the third-party object 94U, a screen display body 101U3 of the speaker interest ratio reflection position 96U, and a screen display body 103U3 of the virtual distance 97U are displayed on the captured image 51.

Similarly, for the COMM link 86H2, the user can specify the position of the third-party object 94U at a two-dimensional display position on the map images 69A, 69B, and 69Z. In response to this input, a screen display body 98U2 of the third-party object 94U, a screen display body 101U2 of the speaker interest ratio reflection position 96U, and a screen display body 103U2 of the virtual distance 97U are displayed on the map image 69. The screen display body 98U3 or 98U2 of the third-party object 94U may be, for example, an icon showing the user's name, or the user's face image as illustrated in FIG. 29. The face image may be, for example, an image captured by the imaging unit 130 of the user's terminal device 100A and cropped to the face region by face detection processing.

The user can specify a position in the three-dimensional virtual space 90 described above by, for example, dragging the screen display body 98U3 or 98U2 of the third-party object 94U on the screen. In response to this position input, the two-dimensional on-screen position of the screen display body 98U3 or 98U2 is changed, and the length of the virtual distance 97U and the displayed length and end-point position of the screen display body 103U3 or 103U2 are changed correspondingly. The changed virtual distance 97U then corresponds to the user's conversation interest level for the conversation relating to the COMM link 86H2. In the present embodiment, the correspondence between the virtual distance 97 and the conversation interest level may be set arbitrarily, provided that the maximum and minimum values of the interest level correspond appropriately to the maximum and minimum values of the virtual distance 97. The conversation interest level setting unit 196 can acquire the user's conversation interest level based on the user's operation of the third-party object 94 as described above.
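One possible correspondence between the virtual distance 97 and the conversation interest level is sketched below in Python. The text leaves the mapping open, so the linear, clamped form used here, the function name `interest_from_distance`, and the default range of 0 to 10 distance units are assumptions chosen only to satisfy the stated condition that the extremes of the two quantities correspond (shortest distance to highest interest, longest distance to lowest interest).

```python
def interest_from_distance(d: float, d_min: float = 0.0, d_max: float = 10.0,
                           levels: int = 10) -> int:
    """Map a virtual distance to a discrete conversation interest level.

    d_min maps to the highest level (levels - 1) and d_max to level 0.
    Distances outside [d_min, d_max] are clamped. Linear mapping is an
    assumption; the text permits any appropriately anchored correspondence.
    """
    if d_max <= d_min:
        raise ValueError("d_max must be greater than d_min")
    if levels < 2:
        raise ValueError("at least two levels are required")
    clamped = min(max(d, d_min), d_max)
    closeness = (d_max - clamped) / (d_max - d_min)  # 1.0 when nearest, 0.0 when farthest
    return round(closeness * (levels - 1))
```

With the 10-level scale mentioned earlier in the text, a distance of 0 yields level 9 and any distance at or beyond `d_max` yields level 0; passing `levels=100` gives the 0 to 99 scale instead.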

FIG. 30 shows a display screen 57. As in FIG. 29, the COMM link 86H3 or 86H2, the screen display body 98U3 or 98U2 of the third-party object 94U, and the screen display body 103U3 or 103U2 of the virtual distance 97U are displayed on the display screen 57. In FIG. 30, however, the screen display body 98U3 or 98U2 of the third-party object 94U is set at a position closer to the COMM link 86H3 or 86H2 than in the example of FIG. 29. Correspondingly, the displayed length and end-point position of the screen display body 103U3 or 103U2 of the virtual distance 97U are changed.

Furthermore, in FIG. 30, more COMM words 87 are displayed near the COMM link 86H3 or 86H2 than in FIG. 29. This reproduces, in the system according to the present embodiment, the phenomenon that in a same-room environment a third party hears more of a two-party conversation the closer he or she comes to it. That is, when the user is interested in the conversation and sets the screen display body 98U3 or 98U2 of the third-party object 94U closer to the COMM link 86H3 or 86H2, the screen display body 103U3 or 103U2 of the virtual distance 97U becomes shorter, indicating that the conversation interest level has increased, and the number of COMM words 87 displayed increases accordingly. This change in the number of displayed COMM words 87 according to the size of the virtual distance 97 (that is, the conversation interest level) may be realized by having the COMM word control unit 195 vary the weight threshold used in the above-described display of weighted phrase data according to the size of the virtual distance 97.
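The threshold adjustment described above can be sketched as follows in Python. This is only an illustration under several assumptions: phrase weights are taken to lie between 0.0 and 1.0, a phrase is shown when its weight is at or above the threshold, and the threshold grows linearly with the virtual distance. The names `weight_threshold` and `visible_comm_words` and all numeric defaults are hypothetical.

```python
from typing import List, Tuple


def weight_threshold(d: float, d_max: float = 10.0,
                     t_near: float = 0.1, t_far: float = 0.9) -> float:
    """Return a display threshold that rises as the virtual distance grows.

    A short distance (high interest) gives a low threshold, so more words pass.
    """
    ratio = min(max(d / d_max, 0.0), 1.0)
    return t_near + ratio * (t_far - t_near)


def visible_comm_words(weighted_words: List[Tuple[str, float]], d: float) -> List[str]:
    """Keep only the phrases whose weight reaches the distance-dependent threshold."""
    threshold = weight_threshold(d)
    return [word for word, weight in weighted_words if weight >= threshold]


# Hypothetical weighted phrase data for one COMM link.
words = [("budget", 0.95), ("deadline", 0.6), ("lunch", 0.2)]
far_view = visible_comm_words(words, 10.0)   # only the heaviest phrase remains
near_view = visible_comm_words(words, 0.0)   # every phrase is shown
```

Bringing the third-party object closer lowers the threshold, so the list of displayed phrases grows in the same way that FIG. 30 shows more COMM words 87 than FIG. 29.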

Note that the present embodiment is not limited to changing the number of displayed COMM words 87 in correspondence with the virtual distance 97 on the basis of the acquired conversation interest level. For example, the voice output control unit 182 may change the volume of the conversation audio, which is acquired from the microphone 13 or the sound collection unit 140 and presented by the user's terminal device 100A via the information management server 200B, according to the conversation interest level, that is, according to the virtual distance 97.

(Speaker interest ratio setting unit 197)
The speaker interest ratio setting unit 197 sets the "speaker interest ratio" (the interest level expressed as a ratio) described below on the basis of the user's input to the terminal device 100A, causes the display unit 150 to display the result, and transmits data on the speaker interest ratio to the information management server 200B described later. The speaker interest ratio setting unit 197 also changes the per-speaker proportion of the COMM words 87 displayed by the COMM word control unit 195. In this specification, the "speaker interest ratio" is the ratio of the conversation interest levels that the user feels toward each speaker (and that speaker's utterances) in a call taking place on the information processing system according to the present embodiment. For example, the speaker interest ratio takes a value between 0.0 and 1.0 for each speaker in a conversation, and the speaker interest ratios of all speakers sum to 1.0. As a more specific example, consider a case where the user is interested in a conversation between a person E and a person F. Suppose that the user, looking at the COMM words 87 displayed for the conversation, is more interested in what person E says than in what person F says (specifically, about four times as interested in person E as in person F). In this case, taking the overall conversation interest level for the conversation as 1.0, the speaker interest ratio can be expressed as 0.8 for person E and 0.2 for person F. When the user sets the speaker interest ratio in this way, the COMM words 87 of each speaker (that is, the words and phrases spoken by that speaker) are displayed on the screen of the display unit 150 in accordance with the set ratio.

An example of setting the speaker interest ratio reflection position 96, which indicates the speaker interest ratio according to the present embodiment, is described below with reference to FIGS. 31 and 32. FIG. 31 is an explanatory diagram illustrating an example of setting the speaker interest ratio reflection position (reference point) 96 in the three-dimensional virtual space 90 corresponding to the distributed office. FIG. 32 is an explanatory diagram illustrating an example of setting the speaker interest ratio reflection position 96 displayed on the display screen 58 of the terminal device 100A according to the present embodiment.

FIG. 31 shows the three-dimensional virtual space 90 corresponding to the distributed office. It differs from FIG. 28 in where the speaker interest ratio reflection position 96U is set. How the user sets the speaker interest ratio reflection position 96U is described later. In FIG. 28, the speaker interest ratio reflection position 96U is at its initial position, the midpoint of the line segment of the COMM link 86; that is, the user's speaker interest ratio is 0.5 for person E and 0.5 for person F, equal for both. In FIG. 31, by contrast, the speaker interest ratio reflection position 96U is on the line segment of the COMM link 86 at a position whose distances from the object 91E and the object 91F are in the ratio 7:3. Since the ratio is inversely proportional to the distance, the user's speaker interest ratio is 0.3 for person E and 0.7 for person F.
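The inverse-proportion rule for two speakers can be sketched as below. This is a minimal illustration under assumptions: the function name `speaker_interest_ratios` and the use of 2D coordinate tuples are hypothetical, and the reflection position is assumed to lie on the segment between the two speaker objects.

```python
import math


def speaker_interest_ratios(pos_e, pos_f, reflection_pos):
    """Return (ratio_e, ratio_f) for a point on the segment between E and F.

    Each speaker's ratio is inversely proportional to that speaker's distance
    from the reflection position, and the two ratios sum to 1.0. For two
    speakers this reduces to swapping the normalized distances.
    """
    d_e = math.dist(pos_e, reflection_pos)
    d_f = math.dist(pos_f, reflection_pos)
    total = d_e + d_f
    if total == 0:
        raise ValueError("the two speaker objects coincide")
    return d_f / total, d_e / total


# Distances from E and F in the ratio 7:3, as in FIG. 31.
ratio_e, ratio_f = speaker_interest_ratios((0.0, 0.0), (10.0, 0.0), (7.0, 0.0))
```

With the midpoint as the reflection position, the same function yields 0.5 for both speakers, which corresponds to the initial state of FIG. 28.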

FIG. 32 shows the display screen 58. It differs from FIG. 30 in where the speaker interest ratio reflection position 96U is set, and as a result the displayed lengths and end point positions of the screen display bodies 103U3 and 103U2 of the virtual distance 97U in FIG. 32 differ from those in FIG. 30. In FIG. 30 the speaker interest ratio reflection position 96U was at the midpoint of the line segment of the COMM link 86, whereas in FIG. 32 it is set, in accordance with the position setting of FIG. 31, at a position whose distances from the object 91E and the object 91F are in the ratio 7:3. That is, the speaker interest ratio reflection position 96U in FIG. 32 indicates that, in inverse proportion to those distances, the user's speaker interest ratio is 0.3 for person E and 0.7 for person F.

In FIG. 32, the distribution of the COMM words 87 displayed near the COMM link 86H3 or 86H2 also differs from that in FIG. 30. The number of COMM words related to person E (the COMM words displayed near person E), whose speaker interest ratio has fallen (0.5 → 0.3), has decreased, and the number of COMM words related to person F (the COMM words displayed near person F), whose speaker interest ratio has risen (0.5 → 0.7), has increased. The speaker interest ratio reflection position 96U may be set when, for example, the user designates a point on the COMM link 86H3 or 86H2 by touch input.

In the present embodiment, the speaker interest ratio reflection position 96U may also be set by other methods. Another example is described below with reference to FIG. 33. FIG. 33 is an explanatory diagram illustrating an example of the display screen 62 when the speaker interest ratio reflection position 96 is set by means of the third party object 94. Specifically, FIG. 33 shows a partial area of the display screen 50 of the terminal device 100A, and its left and right views illustrate two examples that differ in the orientation of the third party object 94 and in the speaker interest ratio reflection position 96 set by that orientation.

Specifically, the screen display body 98U2 of the third party object 94U in FIG. 33 has a single pointed portion 105 on the periphery of the third-party user's face image. The third-party user who operates the third party object 94U can change the orientation of the pointed portion 105 by, for example, a touch rotation operation using two or more fingers. When the line segment of the COMM link 86H2 formed between the object of person E and the object of person F lies on the extension line 99 along the direction of the pointed portion 105, so that the extension line 99 intersects the COMM link 86H2, the speaker interest ratio setting unit 197 acquires the intersection as the speaker interest ratio reflection position 96U. Because the speaker interest ratio reflection position 96U is set in this way, the speaker interest ratio setting unit 197 can calculate the user's speaker interest ratio for each speaker in the conversation. In FIG. 33, for example, the screen display body 98U2 of the third party object 94U in the right view is rotated further to the right than in the left view, and the corresponding speaker interest ratio reflection position 96U (screen display body 101U2) is accordingly further to the right in the right view than in the left view.
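Finding the point where the extension line meets the COMM link is a standard ray and line segment intersection in 2D screen coordinates. The sketch below is one way to compute it and is not taken from the embodiment; the function name `reflection_position`, the tuple-based coordinates, and the tolerance `eps` are assumptions.

```python
def reflection_position(origin, direction, seg_a, seg_b, eps=1e-9):
    """Intersect a ray (origin + t * direction, t >= 0) with segment seg_a-seg_b.

    Returns the intersection point as an (x, y) tuple, or None when the ray
    is parallel to the segment, points away from it, or misses it.
    """
    ox, oy = origin
    dx, dy = direction
    ax, ay = seg_a
    sx, sy = seg_b[0] - ax, seg_b[1] - ay

    denom = dx * sy - dy * sx  # 2D cross product of ray direction and segment
    if abs(denom) < eps:
        return None  # parallel: no single intersection point

    wx, wy = ax - ox, ay - oy
    t = (wx * sy - wy * sx) / denom  # position along the ray
    u = (wx * dy - wy * dx) / denom  # position along the segment, 0..1
    if t < 0 or u < 0 or u > 1:
        return None
    return (ax + u * sx, ay + u * sy)


# Third party object at (5, 5); COMM link from person E at (0, 0) to person F at (10, 0).
straight = reflection_position((5.0, 5.0), (0.0, -1.0), (0.0, 0.0), (10.0, 0.0))
rotated = reflection_position((5.0, 5.0), (0.4, -1.0), (0.0, 0.0), (10.0, 0.0))
```

Rotating the assumed direction vector toward person F moves the intersection from the midpoint toward F's end of the link, in the same way that the right view of FIG. 33 differs from the left view.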

The present embodiment is also not limited to having the user operate the COMM link 86 and the third party object 94 to set the speaker interest ratio. For example, the user's gaze at each COMM word 87 may be recognized, and the speaker interest ratio may be calculated from the frequency of the recognized gazes. As an example, suppose that person E and person F are conversing and that the user looks at the displayed COMM words 87 for phrases spoken by person E and the COMM words 87 for phrases spoken by person F. In this case, a gaze detection device that detects the user's line of sight can detect and count the user's gazes directed at each COMM word 87. Accordingly, if the user looks seven times at COMM words 87 for phrases spoken by person E and three times at COMM words 87 for phrases spoken by person F, the speaker interest ratio can be set to 0.7 for person E and 0.3 for person F.
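The gaze-count example amounts to normalizing per-speaker counts so that they sum to 1.0. A minimal sketch follows; the function name `ratios_from_gaze_counts` and the fallback to equal ratios when no gaze has been counted yet are assumptions made here, not behavior stated in the description.

```python
def ratios_from_gaze_counts(counts):
    """Convert per-speaker gaze counts into speaker interest ratios.

    `counts` maps a speaker label to the number of times the user looked at
    that speaker's COMM words. The returned ratios sum to 1.0.
    """
    if not counts:
        return {}
    total = sum(counts.values())
    if total == 0:
        # Assumption: with no gaze data, treat all speakers equally.
        return {speaker: 1.0 / len(counts) for speaker in counts}
    return {speaker: n / total for speaker, n in counts.items()}


ratios = ratios_from_gaze_counts({"E": 7, "F": 3})
```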

(Conversation interest level notification unit 198)
The conversation interest level notification unit 198 receives information on the conversation interest level from the information management server 200 and displays it as a push notification on the screen of the display unit 150 of the terminal device 100 of each person taking part in the conversation. For example, when the user's terminal device 100U sets a conversation interest level for a two-party call that person C and person D are conducting between the terminal devices 100C and 100D, the conversation interest level notification unit 198 of each of the terminal devices 100C and 100D receives, from the information management server 200B described later, the conversation interest level data set on the terminal device 100U. Each conversation interest level notification unit 198 then displays, on the screen of the display unit 150 of its terminal device 100C or 100D, a notification of the presence of a user interested in the conversation, with an alerting intensity corresponding to the magnitude of the conversation interest level. The notification is, for example, a pop-up window. The alerting intensity is set according to the conversation interest level: the higher the conversation interest level, the higher the alerting intensity. The display form of the notification changes with the alerting intensity. Specifically, the higher the alerting intensity, the larger the pop-up window is displayed.
The present embodiment is not limited to changing the display size of the window according to the alerting intensity; the display position of the window may be changed (the closer to the center of the screen, the higher the alerting intensity), or the contrast between the window and the background, the window color, the type of window display animation (motion), and the like may be changed. Nor is the notification limited to an on-screen display; it may be, for example, a sign sound announcing the presence of a user interested in the conversation, in which case the volume, frequency distribution, rhythm, and the like of the sign sound may be changed according to the alerting intensity.
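One way to derive a pop-up layout from the conversation interest level is sketched below. Everything specific here is an assumption for illustration: the names `PopupLayout` and `popup_layout`, the interest level normalized to 0.0 to 1.0, the linear interpolation, and the pixel values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopupLayout:
    width: int               # pixels
    height: int              # pixels
    offset_from_center: int  # pixels; 0 means centered on the screen


def popup_layout(interest, min_size=(160, 60), max_size=(480, 180), max_offset=300):
    """Scale a notification window with the conversation interest level.

    A higher interest level gives a higher alerting intensity: the window
    becomes larger and moves closer to the center of the screen.
    """
    level = min(max(interest, 0.0), 1.0)
    width = round(min_size[0] + level * (max_size[0] - min_size[0]))
    height = round(min_size[1] + level * (max_size[1] - min_size[1]))
    offset = round((1.0 - level) * max_offset)
    return PopupLayout(width, height, offset)


strong = popup_layout(1.0)
weak = popup_layout(0.0)
```

The same normalized level could drive contrast, color, animation type, or the volume of a sign sound instead of the window geometry.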

An example of displaying a notification containing information on the conversation interest level on the display screen of the terminal device 100A according to the present embodiment is described below with reference to FIG. 34. FIG. 34 is an explanatory diagram illustrating such an example. The display screen 59 in the upper part of FIG. 34 is an example of a notification with a high alerting intensity, and the display screen 66 in the lower part of FIG. 34 is an example of a notification with a low alerting intensity. On the display screen 59, the notification window 107 is displayed larger and closer to the center of the screen than on the display screen 66, making it easier for the user to notice the notification. The notification window 107 may also display the magnitude of the virtual distance in the three-dimensional virtual space 90 (for example, "1.5 m" is displayed in the window 107 of the display screen 59).

FIG. 34 shows an example in which one third party is interested in the conversation. When several third parties are interested in the same conversation, that fact may be shown in a single window, or a separate window may be displayed for each third party. Also, although in FIG. 34 the window is superimposed on the display screen displayed in the overhead view mode, the present embodiment is not limited to this display form. For example, the window may be superimposed on the display screen displayed in the conversation mode.

<3.2 Configuration of information management server>
<3.2.1 Functional configuration>
Next, an example of the functional configuration of the information management server 200B according to the present embodiment is described with reference to FIG. 35. FIG. 35 is a block diagram showing an example of the functional configuration of the information management server 200B according to the present embodiment. Referring to FIG. 35, like the information management server 200 of the first and second embodiments, the information management server 200B includes a communication unit 210, a storage unit 220, and a control unit 230. As in the second embodiment, the control unit 230 includes a COMM link distribution unit 231, an extracted word/phrase data management unit 232, a weighting calculation unit 233, a COMM word distribution unit 234, a statement status calculation unit 235, and a position-linked distribution control unit 236. In addition, the control unit 230 further includes a conversation interest level control unit 241. Accordingly, description of the functional units that are the same as in the first and second embodiments is omitted here, and only the conversation interest level control unit 241 is described.

(Conversation interest level control unit 241)
The conversation interest level control unit 241 performs processing related to the amount of COMM word 87 information provided according to the conversation interest level and to the delivery of notifications corresponding to the conversation interest level. Specifically, the conversation interest level control unit 241 includes an interest level control unit 242, a display body control unit 243, an interest conversation information amount control unit 244, and a conversation interest level notification transmission unit 245.

(Interest level control unit 242)
The interest level control unit 242 acquires the user's input of a conversation interest level for a call and associates, with the COMM link 86, the acquired conversation interest level and the object ID (communication ID (communication identification information)) of that user. The associated conversation interest level is then distributed to the terminal device 100 together with the COMM link 86 via the COMM link distribution unit 231 and the communication unit 210. As a result, the object IDs of the two or more speakers corresponding to the COMM link 86 are associated with the object ID of the user. In the present embodiment, control that refers to the object IDs associated with a single COMM link 86 thus makes it possible to notify the speakers' terminal devices 100 of the presence of the user. The interest level control unit 242 may also acquire the user's speaker interest ratio and associate, with the COMM link 86, the acquired speaker interest ratio and the object ID (communication ID (communication identification information)) of the user. In this case as well, the associated speaker interest ratio is distributed to the terminal device 100 together with the COMM link 86 via the COMM link distribution unit 231 and the communication unit 210.

(Display body control unit 243)
When the user performs an operation to input a conversation interest level for a conversation, the display body control unit 243 generates the screen display body 98 for that user. The display body control unit 243 also determines the virtual positional relationship between the COMM link 86 corresponding to the conversation and the screen display body 98 on the basis of the acquired conversation interest level. Specifically, when the user performs an operation to input a conversation interest level for a conversation, the display body control unit 243 places the third party object 94 for that user in the three-dimensional virtual space 90 described above. At this time, the virtual distance between the COMM link 86 and the third party object 94 in the three-dimensional virtual space 90 is determined on the basis of the conversation interest level. Furthermore, when the user performs an operation to input a speaker interest ratio, the display body control unit 243 places the speaker interest ratio reflection position 96 at the position on the COMM link 86 that corresponds to the speaker interest ratio in the three-dimensional virtual space 90.

(Interest conversation information amount control unit 244)
The interest conversation information amount control unit 244 controls the COMM word distribution unit 234 on the basis of the conversation interest level and the speaker interest ratio. More specifically, it changes, on the basis of the conversation interest level and the speaker interest ratio, the predetermined threshold against which the weight value of each COMM word 87 is compared when the COMM word distribution unit 234 distributes the COMM words 87.

(Conversation interest level notification transmission unit 245)
The conversation interest level notification transmission unit 245 transmits notification data corresponding to the conversation interest level (or speaker interest ratio) set by the user to the terminal devices 100 corresponding to the communication IDs of the objects 91 located at both ends of the COMM link 86 for which the conversation interest level was set. The notification data includes the conversation interest level, the magnitude of the virtual distance 97, the communication ID of the third-party user who set the conversation interest level, and the like.
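The notification data and its two destinations can be pictured with a small data structure. The sketch below is hypothetical throughout: the names `InterestNotification` and `build_notifications`, the field names, and the use of strings such as "100C" as communication IDs are assumptions, and no actual transport is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterestNotification:
    conversation_interest: float  # conversation interest level set by the third party
    virtual_distance: float       # magnitude of the virtual distance 97
    third_party_comm_id: str      # communication ID of the third-party user


def build_notifications(endpoint_comm_ids, conversation_interest,
                        virtual_distance, third_party_comm_id):
    """Prepare one notification per speaker at the ends of a COMM link.

    Returns a mapping from each endpoint's communication ID to the payload
    that would be sent to the corresponding terminal device.
    """
    payload = InterestNotification(conversation_interest, virtual_distance,
                                   third_party_comm_id)
    return {comm_id: payload for comm_id in endpoint_comm_ids}


outgoing = build_notifications(("100C", "100D"), 0.8, 1.5, "100U")
```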

<3.3 Process flow>
Next, an example of information processing according to the present embodiment is described with reference to FIG. 36. FIG. 36 is a sequence diagram showing an example of the schematic flow of conversation interest level setting, change of the communicated information amount of the COMM words 87, and notification processing according to the present embodiment. FIG. 36 includes steps S801 to S843. Steps S801 to S807 in FIG. 36 are the same as steps S601 to S607 in FIG. 24, respectively, and their description is omitted here. Accordingly, step S821 below is performed after steps S801, S803, S805, and S807 have been performed in order.

(Step S821)
The terminal device 100U acquires user input that sets a conversation interest level or a speaker interest ratio for the displayed COMM link 86 connecting the objects corresponding to the terminal device 100C and the terminal device 100D.

(Step S823)
The terminal device 100U transmits data on the conversation interest level or speaker interest ratio set for the COMM link 86 to the information management server 200B.

(Step S825)
The information management server 200B receives the data on the conversation interest level or speaker interest ratio set for the COMM link 86 and manages the data in association with the identification information of the COMM link 86.

(Step S827)
The information management server 200B transmits to the terminal device 100U data on the COMM words 87 in an information amount corresponding to the magnitude of the conversation interest level or speaker interest ratio.

(Step S829)
Using the received data on the COMM words 87, whose information amount corresponds to the magnitude of the conversation interest level or speaker interest ratio, the terminal device 100U changes the state of the COMM words 87 displayed on the display unit 150.

(Step S831, Step S833)
The information management server 200B transmits notification data corresponding to the conversation interest level to the terminal devices 100C and 100D.

(Step S835, Step S837)
Using the received notification data corresponding to the conversation interest level, the terminal devices 100C and 100D each display a notification on the display unit 150 with an alerting intensity corresponding to the conversation interest level.

(Step S839)
The terminal device 100C (or 100D) acquires input, from person C (or person D) who has noticed the presence of the third-party user through the notification, for pulling that user into the call.

(Step S841)
The terminal device 100C transmits data on the pull-in input (which may include identification information of the third-party user) to the information management server 200B. The process then proceeds to step S809.

(Step S809, Step S811)
Step S809 and the subsequent step S811 are the same as steps S609 and S611 in FIG. 24, respectively, and their description is omitted here. Note that the process may proceed to step S843 once either step S811 or step S841 has been executed.

(Step S843)
The information management server 200B receives the data on the call pull-in input or the data on the user input designating the COMM link 86. The terminal device 100U is then added to the two-party call session between the terminal devices 100C and 100D corresponding to the COMM link 86, and a three-party call session among the terminal devices 100C, 100D, and 100U is started.
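The condition in step S843, where either trigger is enough to extend the call, can be sketched as follows. The class `CallSession`, the function `handle_join`, and the boolean flags standing in for the data received in steps S841 and S811 are all hypothetical simplifications; actual session signaling is outside the scope of this sketch.

```python
class CallSession:
    """A call session identified by the communication IDs of its participants."""

    def __init__(self, participants):
        self.participants = list(participants)

    def add(self, comm_id):
        """Add a participant once; repeated requests leave the session unchanged."""
        if comm_id not in self.participants:
            self.participants.append(comm_id)


def handle_join(session, third_party_id, pull_in_received, link_designated):
    """Add the third party when either trigger has arrived.

    pull_in_received: a speaker pulled the third party in (step S841).
    link_designated:  the third party designated the COMM link (step S811).
    Returns True when the session now includes the third party.
    """
    if pull_in_received or link_designated:
        session.add(third_party_id)
        return True
    return False


session = CallSession(["100C", "100D"])
joined = handle_join(session, "100U", pull_in_received=True, link_designated=False)
```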

As described above, the present embodiment provides a function whereby a third party can intuitively set and input the third party's conversation interest level in a conversation across multiple remote sites, for example as a virtual "distance", and whereby the two parties engaged in the conversation are notified, according to that distance (conversation interest level), that a third party interested in the conversation is present. The present embodiment further provides a function that allows the third party to obtain information on the content of the conversation according to the distance (conversation interest level). Therefore, according to the present embodiment, the two parties can notice the presence of a third party who is, as it were, standing near them and listening with interest, as in a conversation in a same-room environment, and can take that as a cue to invite the third party into the conversation. As a result, the conversation between the third party and the two parties can proceed smoothly, as in a same-room environment.

Furthermore, the present embodiment provides a function that enables the third party to acquire information on the content of the conversation in accordance with the distance (conversation interest level). This reproduces what happens in a shared room, where the closer a third party comes to the place of a two-party conversation, the more of its content he or she can hear. The third party can therefore acquire information about the conversation more naturally and, based on the acquired information, join the conversation smoothly.
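The relationship between the virtual distance and the amount of conversation content delivered can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the names `Phrase`, `threshold_for_distance`, and `visible_phrases`, the linear mapping from distance to threshold, and the assumption that weights lie between 0 and 1 are all hypothetical. It assumes that each extracted phrase carries a weighting result and that a phrase is delivered when its weight reaches a threshold that falls as the third party moves closer.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Phrase:
    """A phrase extracted from the call (hypothetical structure)."""
    text: str
    weight: float  # assumed weighting result in the range [0, 1]


def threshold_for_distance(distance: float, max_distance: float = 10.0) -> float:
    """Map a virtual distance to a weight threshold.

    Assumption: the mapping is linear, so a smaller distance
    (higher interest) gives a lower threshold.
    """
    clamped = min(max(distance, 0.0), max_distance)
    return clamped / max_distance


def visible_phrases(phrases: Sequence[Phrase], distance: float,
                    max_distance: float = 10.0) -> List[str]:
    """Return the phrases whose weight reaches the distance-dependent threshold."""
    threshold = threshold_for_distance(distance, max_distance)
    return [p.text for p in phrases if p.weight >= threshold]
```

With this mapping, a third party at the maximum distance receives almost nothing, while a third party at distance zero receives every extracted phrase.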

<Supplement>
The preferred embodiments of the present invention have been described above in detail with reference to the accompanying drawings, but the present invention is not limited to these examples. It is obvious that a person having ordinary knowledge in the technical field to which the present invention pertains can conceive of various changes or modifications within the scope of the technical idea described in the claims, and it is understood that these also naturally belong to the technical scope of the present invention.

For example, although the description above prepares the three-dimensional virtual space 90 of the center office 10 as the three-dimensional virtual space 90 corresponding to the real space, the embodiment of the present invention is not limited to this. A plurality of three-dimensional virtual spaces 90 may be prepared. As an example, a three-dimensional virtual space 90 corresponding to the real space may be prepared for each of a plurality of offices; three-dimensional virtual spaces 90 may also be prepared for the satellite office 20, the home office 20, other center offices 10, and so on. In this case, the three-dimensional virtual space 90 of each office may have a size corresponding to the size of that office. The object selection unit 185 of the terminal device 100 may acquire the data of a desired one of the plurality of three-dimensional virtual spaces 90. A three-dimensional virtual space 90 for a place other than an office may also be prepared.

Further, although an example has been described in which the object 91 corresponding to a person is selected only when that person is sitting in a seat, the embodiment of the present invention is not limited to this. For example, the object 91 may be selected even when the person is not sitting in the seat. As an example, when the person is sitting in the seat, the communication ID of the communication device installed at that seat may be acquired, and when the person is not sitting in the seat, the communication ID of the person's mobile terminal may be acquired.
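The seated/not-seated selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the `Person` record, its field names, and the function `select_communication_id` are hypothetical, and the sensor's seating determination is assumed to be available as a boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """Hypothetical record for a person corresponding to an object 91."""
    name: str
    seated: bool                   # assumed to come from the seat sensor
    seat_device_id: Optional[str]  # communication ID of the device at the seat
    mobile_id: Optional[str]       # communication ID of the person's mobile terminal


def select_communication_id(person: Person) -> Optional[str]:
    """Pick the seat device's ID when seated, otherwise the mobile terminal's ID."""
    if person.seated and person.seat_device_id is not None:
        return person.seat_device_id
    return person.mobile_id
```

Returning `None` when no usable ID exists leaves it to the caller to decide how to handle a person who cannot currently be reached.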

Further, although an example in which the communication ID is a telephone number has been described, the embodiment of the present invention is not limited to this. The communication ID may be an ID other than a telephone number. As an example, the communication ID may be a softphone ID other than a telephone number. As another example, the communication ID may be an ID for communication other than telephone calls; for example, it may be an e-mail address or a short-message ID. In this case, an e-mail or a short message may be sent using the communication ID.

Further, although an example has been described in which the communication ID corresponding to the object 91 is acquired when the object 91 is selected, the embodiment of the present invention is not limited to this. For example, when the object 91 is selected, some piece of identification information corresponding to the object 91 may be acquired. As an example, some piece of identification information of the person corresponding to the object 91 may be acquired. The communication ID may then be obtained from this identification information.

Further, although an example has been described in which the object 91 arranged in the three-dimensional virtual space 90 corresponding to the real space (the object selected by the object selection unit 185) corresponds to a person and is cylindrical, the embodiment of the present invention is not limited to this. For example, the object may have a shape other than a cylinder. The object 91 may also correspond to something other than a person. As an example, the object 91 may correspond to a region of the real space. Specifically, the object 91 may correspond to a seat and be arranged at the three-dimensional virtual position corresponding to the position of that seat. The communication ID of the communication device installed at the seat may be associated with the object 91, and the communication ID may be acquired when the object 91 is selected. Alternatively, the object 91 may correspond to a region wider than a seat and extend over the three-dimensional virtual range corresponding to the extent of that region. The communication ID of a communication device installed in the region may be associated with the object 91, and the communication ID may be acquired when the object 91 is selected.
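The region-type object described above can be sketched as a lookup from a three-dimensional virtual position to a communication ID. The sketch below is illustrative only and rests on assumptions not stated in the embodiment: each object is modeled as an axis-aligned box, the designated position is assumed to have already been converted into a three-dimensional virtual position, and the names `RegionObject` and `communication_id_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class RegionObject:
    """An object occupying an axis-aligned box in the virtual space (assumption)."""
    communication_id: str
    lower: Point  # minimum x, y, z of the box
    upper: Point  # maximum x, y, z of the box

    def contains(self, position: Point) -> bool:
        """True when the position lies inside the box, boundaries included."""
        return all(lo <= v <= hi
                   for lo, v, hi in zip(self.lower, position, self.upper))


def communication_id_at(objects: Sequence[RegionObject],
                        position: Point) -> Optional[str]:
    """Return the communication ID of the first object containing the position."""
    for obj in objects:
        if obj.contains(position):
            return obj.communication_id
    return None
```

A seat-sized object and a room-sized object can coexist in the same list; because the first match wins, listing smaller regions first gives them priority where regions overlap.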

Further, although an example has been described in which a position in the captured image is designated by the user's touch on the display screen, the embodiment of the present invention is not limited to this. For example, the position in the captured image may be designated by the user with an input means other than the touch panel 820, such as a mouse click, a button, or a keyboard.

Further, although an example has been described in which the captured image generated by the camera 11, the audio data generated by the microphone 13, and the determination result of the sensor 15 are provided directly to the terminal device 100 by the camera 11, the microphone 13, and the sensor 15, respectively, the embodiment of the present invention is not limited to this. For example, these data may be provided by another device. As an example, a server (for example, a media distribution server) may acquire these data and provide them to the terminal device 100.

Further, although an example has been described in which the terminal device 100 has functions such as the position acquisition unit 183, the object selection unit 185, and the ID acquisition unit 187, the embodiment of the present invention is not limited to this. For example, these functions may be provided by a device other than the terminal device 100, such as a server. Likewise, although an example has been described in which the information management server 200 has functions such as the COMM link distribution unit 231 and the phrase extraction data management unit 232, the embodiment of the present invention is not limited to this. For example, these functions may be provided by a device other than the information management server 200, such as the terminal device 100.

Further, although an example has been described in which the display screen of the captured image of the real space is displayed by the terminal device 100, the embodiment of the present invention is not limited to this. For example, the display screen may be displayed by another device. As an example, the display screen may be displayed on the display 21 installed in the satellite office 20, and the user may designate, on the display 21, a position in the captured image included in the displayed image.

Further, although the speakers and the user have been described as being seated, the embodiment of the present invention is not limited to this, and the speakers and the user may be moving in the real space. For example, the position of each moving speaker and user may be identified through communication by the terminal device 100, such as a tablet, carried by that speaker or user, or through imaging by the camera 11 that tracks the subject, and the position of the corresponding object in the three-dimensional virtual space 90 described above may be determined based on the identified position.

Further, the processing steps in the information processing of this specification do not necessarily have to be executed in time series in the order shown in the flowcharts. For example, the processing steps may be executed in an order different from the order shown in the flowcharts, or may be executed in parallel.

Further, it is also possible to create a computer program that causes hardware such as a CPU, a ROM, and a RAM built into an information processing device (for example, a terminal device) to exhibit functions equivalent to those of the components of the information processing device. A storage medium storing the computer program is also provided.

3 Hand
10 Center office
11, 813 Camera
13, 815 Microphone
15 Sensor
17 Media distribution server
19, 23 LAN
20 Satellite office
21 Display
30 External network
31 Icon
40 PBX
50, 55, 56, 57, 58, 59, 60, 62, 66, 70, 80 Display screen
51, 51A, 51B, 51Z Captured image
61 Bird's-eye captured image
63, 63A, 63B, 73, 83 Button image
65, 79C, 79D, 79E, 79F Presence icon
67 Balloon image
69, 69A, 69B, 69Z, 75 Map image
71 Close-up captured image
77, 77A, 77B, 77C, 77D, 77E, 77F Person image
81 Other-party captured image
85 Own-side captured image
86, 86G2, 86G3, 86H2, 86H3, 86M2, 86M3 COMM link
87, 87I, 87J, 87K, 87L COMM word
90 Three-dimensional virtual space
91, 91A, 91B, 91C, 91D, 91E, 91F Object
92, 92C, 92D, 92E, 92F, 92U Three-dimensional centroid position
93 Virtual plane
94, 94U Third-party object
96, 96U Speaker interest ratio reflection position
97, 97U Virtual distance
98, 98U2, 98U3, 101, 101U2, 101U3, 103, 103U2, 103U3 Screen display body
99 Extension line
100, 100A, 100C, 100D, 100U Terminal device
105 Tip portion
107 Notification window
110, 210, 510 Communication unit
120 Input unit
130 Imaging unit
140 Sound collection unit
150 Display unit
160 Voice output unit
170, 220, 520 Storage unit
180, 230, 530 Control unit
181 Real space information providing unit
182 Voice output control unit
183 Position acquisition unit
185 Object selection unit
187 ID acquisition unit
189 Telephone unit
191 Conversation object selection unit
193 COMM link control unit
195 COMM word control unit
196 Conversation interest level setting unit
197 Speaker interest ratio setting unit
198 Conversation interest level notification unit
200, 200A, 200B Information management server
201 Speech recognition server
231 COMM link distribution unit
232 Extracted phrase data management unit
233 Weighting calculation unit
234 COMM word distribution unit
235 Utterance status calculation unit
236 Position-linked distribution control unit
241 Conversation interest level control unit
242 Interest level control unit
243 Display body control unit
244 Interest conversation information amount control unit
245 Conversation interest level notification transmission unit
531 Phrase extraction unit
533 Phrase data generation unit
701, 801, 901 CPU
703, 803, 903 ROM
705, 805, 905 RAM
707, 807, 907 Bus
709, 809, 909 Storage device
711, 811, 911 Communication interface
817 Speaker
820 Touch panel
821 Touch detection surface
823 Display surface
840 OS
851 Softphone
853 Ultra-realistic client
855 Telephone call origination control function

Claims

An information processing server comprising a control unit that generates a conversation event object relating to a call, the conversation event object associating communication identification information of a plurality of speakers involved in the call, generates a language phrase object relating to a phrase extracted from voice data of the call, and controls distribution of data related to the conversation event object and data related to the language phrase object.

The control unit acquires a user's input to the conversation event object, and associates the communication identification information of the user with the communication identification information of the plurality of speakers linked to the conversation event object,
The information processing server according to claim 1.

The information processing server according to claim 1, further comprising a weighting processing unit that performs weighting processing on the language phrase object,
wherein the control unit compares a result of the weighting processing with a predetermined value and, based on the comparison result, controls distribution of the data related to the language phrase object.

The information processing server according to claim 3, wherein the weighting processing unit performs the weighting processing based on an appearance frequency of the phrase in the call.

The information processing server according to claim 3, wherein the weighting processing unit performs the weighting processing based on the degree of abstraction of the phrase.

The information processing server according to claim 3, wherein the weighting processing unit performs the weighting processing based on a part-of-speech category of the phrase.

The information processing server according to claim 3, wherein the weighting processing unit performs the weighting processing based on data regarding the sound pressure of the utterance of the phrase included in the voice data of the call.

The information processing server according to any one of claims 3 to 7, wherein the control unit controls distribution of data related to the result of the weighting processing in association with the language phrase object.

The information processing server according to any one of claims 1 to 8, further comprising an utterance status calculation unit that determines a display position of the language phrase object based on the position of the speaker who uttered the phrase related to the language phrase object, and distributes the determined display position.

The information processing server according to claim 1, wherein the control unit controls the distribution based on the positional relationship, in the real space, between one of the plurality of speakers involved in the call and a user who is not participating in the call.

The information processing server according to claim 1, wherein the control unit controls the distribution based on the positional relationship, in the real space, between one of the plurality of speakers involved in the call and a user who is not participating in the call.

An interest level control unit that acquires an input of the interest level of the user with respect to the call and associates the acquired interest level with the communication identification information of the user with respect to the conversation event object is further provided.
The information processing server according to claim 1.

The information processing server according to claim 12, further comprising a display body control unit that generates a display body related to the user and determines a virtual positional relationship between the position of the conversation event object and the display body based on the acquired degree of interest.

The information processing server according to claim 12 or 13, further comprising a weighting processing unit that performs weighting processing on the language phrase object,
wherein the control unit compares a result of the weighting processing with a predetermined value and, based on the comparison result, controls distribution of the data related to the language phrase object, and
the predetermined value is changed based on the acquired degree of interest.

The information processing server according to claim 12 or 13, wherein the interest level control unit acquires an input of the user's ratio of interest among the plurality of speakers involved in the call, associates the acquired ratio of interest with the conversation event object, and distributes data regarding the acquired ratio of interest.

The information processing server according to claim 15, further comprising a weighting processing unit that performs weighting processing on the language phrase object,
wherein the control unit compares a result of the weighting processing with a predetermined value and, based on the comparison result, controls distribution of the data related to the language phrase object, and
the predetermined value is changed based on the acquired degree of interest and the acquired ratio of interest.

The interest degree control unit associates the communication identification information of the plurality of speakers associated with the conversation event object with the communication identification information of the user who inputs the interest degree for the call.
The information processing server according to any one of claims 12 to 16 .

An information processing server comprising a control unit that:
generates a three-dimensional virtual space corresponding to a real space in which a plurality of speakers involved in a call are present;
generates a plurality of objects corresponding to the respective communication identification information of the plurality of speakers involved in the call, and a conversation event object related to the call that associates the plurality of objects with each other;
generates a language phrase object related to a phrase extracted from voice data of the call; and
arranges data related to the conversation event object and data related to the language phrase object in the three-dimensional virtual space.

The information processing server according to claim 18, further comprising a display body control unit that acquires an input of a user's degree of interest in the call, arranges a user object corresponding to the user in the three-dimensional virtual space, and determines a virtual distance between the conversation event object and the user object in the three-dimensional virtual space based on the acquired degree of interest.

The information processing server according to claim 18, further comprising an interest level control unit that acquires an input of a user's ratio of interest among the plurality of speakers involved in the call, and arranges, on the generated conversation event object, a reference point indicating the ratio of interest based on the acquired ratio of interest.

An information processing system including an information processing server and a plurality of terminal devices,
The information processing server associates communication identification information of a plurality of speakers involved in a call, generates a conversation event object related to the call, and generates a language phrase object related to a phrase extracted from the voice data of the call. Then, the data related to the conversation event object and the data related to the language phrase object are distributed to the plurality of terminal devices,
Information processing system.

A terminal device comprising a display unit that displays a conversation event object related to a call, the conversation event object associating communication identification information of a plurality of speakers involved in the call with each other, and a language phrase object related to a phrase extracted from voice data of the call.

The terminal device according to claim 22, further comprising an object control unit that acquires a result of weighting processing performed on the language phrase object, compares the result of the weighting processing with a predetermined value, and controls the display unit based on the comparison result.

The terminal device according to claim 22, further comprising a language phrase object control unit that acquires a result of weighting processing performed on the language phrase object and controls any one of the size, color, contrast, and display position of the language phrase object based on the result of the weighting processing.

The terminal device according to any one of claims 22 to 24, further comprising a voice output control unit that acquires a user's degree of interest in the call and controls output of voice related to the call based on the degree of interest.

The terminal device according to any one of claims 22 to 24, further comprising a conversation interest level setting unit that acquires a user's degree of interest in the call based on an operation performed by the user on a user object, related to the user, that is displayed on the display unit.

The terminal device according to claim 26, further comprising an imaging unit that acquires a face image of the user for displaying the user object.

The terminal device according to any one of claims 22 to 24 , further comprising: a conversation interest degree notification unit that notifies the presence of the user based on acquisition of the interest degree of the user for the call.

The conversation interest level notification unit,
Displaying a notification display indicating the presence of the user on the display unit,
Based on the acquired degree of interest, any one of the size, color, movement, contrast, and display position of the notification display is controlled.
The terminal device according to claim 28 .

The conversation interest level notification unit,
Causing the voice output unit to perform voice output indicating the presence of the user,
Controlling the volume of the audio output based on the acquired degree of interest;
The terminal device according to claim 28 .

A program for causing a computer to function as a control unit that generates a conversation event object relating to a call, the conversation event object associating communication identification information of a plurality of speakers involved in the call, generates a language phrase object relating to a phrase extracted from voice data of the call, and controls distribution of data related to the conversation event object and data related to the language phrase object.