JP2001228794A - Conversation information presenting method and immersed type virtual communication environment system - Google Patents

Conversation information presenting method and immersed type virtual communication environment system

Info

Publication number
JP2001228794A
Authority
JP
Japan
Prior art keywords
user
utterance
distance
speaking
receiving user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2000037462A
Other languages
Japanese (ja)
Other versions
JP3621861B2 (en)
Inventor
Shuhei Oda
Takashi Yagi
Satoshi Ishibashi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP2000037462A
Publication of JP2001228794A
Application granted
Publication of JP3621861B2
Anticipated expiration
Expired - Fee Related (current legal status)

Landscapes

  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

PROBLEM TO BE SOLVED: To provide a means by which users in an immersive virtual communication environment can easily converse with one another while walking freely around the virtual space. SOLUTION: In an immersive virtual communication environment in which a plurality of display devices are arranged so as to surround the users, the conversation information presentation method inputs the utterance content of a speaking user and generates character information from the input content; extracts the three-dimensional positions and gaze vectors of the speaking user and the receiving user in the environment; extracts the distance between the speaking user and the receiving user from their three-dimensional positions; extracts the receiving user's field of view from that user's gaze vector and a predetermined view angle; determines the presentation position of the speaking user's character information based on the distance between the users, the receiving user's field of view, and the speaking user's three-dimensional position and gaze vector; and outputs the generated character information based on the determined presentation position.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a conversation information presentation method and an immersive virtual communication environment system, and more particularly to technology that is effective when applied to conversation information presentation that displays a user's utterance content as a text image, for purposes such as conversation support for hearing-impaired users, in an immersive virtual environment in which a plurality of display devices are arranged so as to surround the user.

[0002]

2. Description of the Related Art

Immersive multi-screen display systems that let users experience an immersive virtual environment are known. Such systems were originally developed as visualization environments for simulations and the like. In recent years, research on connecting them over a network and using them as communication environments has been active. An immersive multi-screen display system arranges a plurality of screens (display devices) in front of, behind, to the left and right of, and above and below the user, surrounding the user with imagery and thereby providing a strong sense of presence.

[0003] In such an immersive virtual communication environment, a user becomes an avatar and can walk freely around a three-dimensional virtual world, look in any direction (forward, backward, up, down, left, or right), and hold a conversation upon encountering another avatar (another user). At such times, the speaking user's conversation information is presented as audio.

[0004] Such an immersive virtual communication environment is described in, for example, IEICE Technical Report MVE99-45, pp. 1-8, 1999 (Takashi Kono, Yuriko Suzuki, Norio Yamamoto, Shinichi Shiwa, and Satoshi Ishibashi, "Immersive Virtual Communication Environment").

[0005] On the other hand, in non-immersive display systems, where the virtual environment is experienced on a single display, conversation content can be presented as text, like television subtitles, as an alternative to audio presentation for purposes such as supporting hearing-impaired users. In that case the display lies in the user's viewing direction, so the user can obtain the conversation information from character information presented on part of the display.

[0006]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, a subtitle-style text presentation method of the kind used in non-immersive display systems is unsuited to an immersive multi-screen display system. In an immersive multi-screen display system, a plurality of displays are installed so as to surround the user, who can look in any direction, so the user's viewing direction is not fixed to any particular display. Consequently, presenting character information on part of one particular display, television-subtitle style, fails to convey the conversation information whenever the user is not looking at that display.

[0007] Moreover, even when the user is looking at that display, if the speaker is outside the user's field of view, the user cannot determine the speaker's position and smooth communication becomes impossible. Furthermore, if the user tries to keep that particular display in view in order to obtain conversation information, the user can no longer walk freely around the virtual space.

[0008] The present invention was made to solve the above problems, and its object is to provide conversation information presentation technology that lets a user in an immersive virtual communication environment easily hold conversations with other users while walking freely around the virtual space.

[0009] The above and other objects and novel features of the present invention will become apparent from the description in this specification and the accompanying drawings.

[0010]

MEANS FOR SOLVING THE PROBLEMS

Representative aspects of the invention disclosed in the present application are briefly summarized as follows.

[0011] (1) A conversation information presentation method for an immersive virtual communication environment in which a plurality of display devices (displays) are arranged so as to surround the user, comprising: an utterance input step of inputting the utterance content of a speaking user; a character generation step of generating character information from the input utterance content; an extraction step of extracting the three-dimensional positions and gaze vectors of the speaking user and the receiving user in the virtual environment; a distance extraction step of extracting the distance between the speaking user and the receiving user from their three-dimensional positions; a view extraction step of extracting the receiving user's field of view from the receiving user's gaze vector and a predetermined view angle; a presentation position determination step of determining the presentation position of the speaking user's character information based on the distance between the users, the receiving user's field of view, and the speaking user's three-dimensional position and gaze vector; and an output step of outputting the generated character information based on the determined presentation position.

[0012] (2) In the conversation information presentation method of means (1), the presentation position determination step is a step that, when the distance between the speaking user and the receiving user is within a preset distance and the speaking user's three-dimensional position is within the receiving user's field of view, sets the surroundings of the speaking user as the presentation position of the character display image for a fixed time after the utterance, and otherwise sets as the presentation position a position that moves from the speaking user's surroundings in the direction of the gaze vector as time elapses from the utterance.

[0013] (3) An immersive virtual communication environment system in which a plurality of display devices (displays) are arranged so as to surround the user, comprising: utterance input means for inputting the utterance content of a speaking user; character generation means for generating character information from the input utterance content; extraction means for extracting the three-dimensional positions and gaze vectors of the speaking user and the receiving user; distance extraction means for extracting the distance between the speaking user and the receiving user from their three-dimensional positions; view extraction means for extracting the receiving user's field of view from the receiving user's gaze vector and a predetermined view angle; presentation position determination means for determining the presentation position of the speaking user's character information based on the distance between the users, the receiving user's field of view, and the speaking user's three-dimensional position and gaze vector; and output means for outputting the generated character information based on the determined presentation position.

[0014] (4) In the immersive virtual communication environment system of means (3), the presentation position determination means sets the surroundings of the speaking user as the presentation position of the character display image for a fixed time after the utterance when the distance between the speaking user and the receiving user is within a preset distance and the speaking user's three-dimensional position is within the receiving user's field of view, and otherwise sets as the presentation position a position that moves from the speaking user's surroundings in the direction of the gaze vector as time elapses from the utterance.

[0015] According to the above means, a user (for example, a hearing-impaired user) in an immersive virtual communication environment can grasp both the position of the speaking user and the content of the utterance, and can therefore easily hold conversations with other users while walking freely around the virtual space.

[0016] The present invention will now be described in detail with reference to the drawings, together with embodiments (examples) of the invention.

[0017]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 is a block diagram showing the schematic configuration of an immersive virtual communication environment system according to one embodiment (example) of the present invention.

[0018] In FIG. 1, utterance input means 1 inputs the speaking user's utterance content and converts it into data. As the utterance input means 1, for example, a speech input device such as a microphone, or a motion capture input device that captures gesture utterances, is used.

[0019] Character generation means 2 recognizes the utterance content data input from the utterance input means 1 and converts it into character information. As the character generation means 2, for example, a speech recognition device is used when the utterance content data is speech information, and a motion recognition device is used when the data is motion information.

[0020] As speech recognition software, for example, the speech recognition engine REX (NTT) is known. A motion recognition method is described in, for example, IEICE Technical Report MVE99-36, 1999/7 (Tomoaki Yabe et al., "Gesture Recognition Based on Network-Structure Correspondence between Gesture Moving Images and Semantic Word Sequences").

[0021] Extraction means 3-1 and 3-2 extract the three-dimensional positions and gaze vectors, within the virtual environment, of the speaking user and of the receiving user, respectively. One extraction method attaches a position sensor to the user's body and extracts the user's position within the virtual environment. Specifically, a magnetic sensor attached to three-dimensional (3D) glasses, or one built into a 3D wand (a stick-shaped interface device equipped with a magnetic sensor and a switch button), can be read by a personal computer (PC) or the like that processes real-world position and orientation information to obtain the user's three-dimensional position in the virtual environment space. In particular, the sensor is preferably attached to the head so that the gaze vector faithfully reflects the direction in which the user is looking. The particular position sensor device is not essential; needless to say, one with good detection accuracy is preferable.

[0022] Let A denote the extracted three-dimensional position of the speaking user within the virtual environment and let vector a be the gaze vector at position A; likewise, let B denote the receiving user's three-dimensional position and let vector b be the gaze vector at position B. The gaze vector a is a direction vector originating at the user's three-dimensional position A, and the gaze vector b is similarly a direction vector originating at the three-dimensional position B.

[0023] Distance extraction means 4 takes as input the speaking user's three-dimensional position A and the receiving user's three-dimensional position B obtained by the extraction means 3-1 and 3-2, computes the distance between the three-dimensional positions A and B, and thereby extracts the distance d between the speaking user and the receiving user.
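As a concrete illustration (this code is not part of the patent; it is a minimal Python sketch of what means 4 computes), the distance extraction reduces to the Euclidean distance between the two position vectors:

    import math

    def extract_distance(A, B):
        """Distance d between the speaking user's 3-D position A and the
        receiving user's 3-D position B, each given as an (x, y, z) tuple."""
        return math.dist(A, B)  # Euclidean distance, Python 3.8+

    d = extract_distance((0.0, 1.6, 0.0), (3.0, 1.6, 4.0))  # d == 5.0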

[0024] View extraction means 5 takes as input the receiving user's gaze vector b obtained by the extraction means 3-2 and, as shown in FIG. 2, extracts an infinite cone-shaped field of view W whose central axis is the gaze vector b and whose angle is a preset view angle R. The view angle R is defined as the angle of the receiving user's visual field considered necessary for conversation within the virtual environment, and is the angle of the view W. The view angle R can be adjusted freely; the larger R is, the wider the view W becomes.
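Checking whether the speaker's position A falls inside the conical view W then amounts to comparing the angle between the gaze vector b and the vector from B to A against the view angle. A minimal Python sketch follows; note that treating R as the cone's half-angle is an assumption on our part, since the text says only that R is the angle of the view W:

    import numpy as np

    def in_view(A, B, b, R_deg):
        """True if position A lies inside the infinite cone W with its apex at
        the receiving user's position B, its axis along gaze vector b, and a
        half-angle of R_deg degrees."""
        to_A = np.asarray(A, dtype=float) - np.asarray(B, dtype=float)
        dist = np.linalg.norm(to_A)
        if dist == 0.0:
            return True  # co-located users: trivially in view
        axis = np.asarray(b, dtype=float)
        cos_angle = np.dot(to_A / dist, axis / np.linalg.norm(axis))
        angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
        return angle <= R_deg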

[0025] Presentation position determination means 6 takes as input the three-dimensional position A and gaze vector a extracted by the extraction means 3-1, the distance d extracted by the distance extraction means 4, and the view W extracted by the view extraction means 5, and determines the presentation position of the character information based on them.

[0026] Output means 7 takes as input the character information generated by the character generation means 2 and the presentation position determined by the presentation position determination means 6, and outputs the character information based on the presentation position.

[0027] FIG. 3 is a flowchart showing the processing procedure of the presentation position determination means 6. As shown in FIG. 3, the presentation position determination means 6 receives the distance d between the speaking user and the receiving user from the distance extraction means 4, the speaker's three-dimensional position A from the extraction means 3-1, and the view W from the view extraction means 5. A distance D is also set. The preset distance D is a distance considered suitable for conversation between users within the virtual environment; the longer D is, the wider the conversation range becomes, allowing clear conversation even with distant users. As shown in FIG. 3, the case in which the speaking user is, as in FIG. 4, both within the distance D from the receiving user and within the view W is separated from all other cases by the two judgments d ≤ D and A ⊂ W. When both d ≤ D and A ⊂ W hold, the presentation position of the character information is determined to be the surroundings of the speaking user for a fixed time T (a set time) after the utterance.

[0028] On the other hand, when d ≤ D or A ⊂ W fails to hold, the presentation position of the character information is set to the surroundings of the speaking user. With this as the initial presentation position, once the fixed time T (the set time) has elapsed, the presentation position of the character information is moved in the direction of the speaking user's gaze vector a.
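Taken together, the branch of FIG. 3 is a two-condition test. The following sketch reuses extract_distance and in_view from the earlier illustrations; the function and mode names are ours, not the patent's:

    def presentation_mode(A, B, b, D, R_deg):
        """FIG. 3 decision: 'anchored' keeps the character information around
        the speaker for the set time T; 'traveling' moves it along the
        speaker's gaze vector as time elapses."""
        d = extract_distance(A, B)
        if d <= D and in_view(A, B, b, R_deg):
            return "anchored"    # within conversation range and in view (FIG. 5)
        return "traveling"       # out of range or out of view (FIG. 6)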

[0029] FIGS. 5 and 6 are schematic diagrams showing output examples of the character information. FIG. 5 is an output example for the case where both d ≤ D and A ⊂ W hold: a speech-balloon image is output near the speaker's mouth, and the utterance content is displayed inside the balloon. When d ≤ D or A ⊂ W does not hold, as shown in FIG. 6, the balloon image is displayed at the mouth, which is the initial presentation position, with the utterance content displayed inside the balloon.

[0030] Then, when the fixed time T has elapsed since the utterance, the balloon is moved from the initial presentation position to a presentation position a fixed distance P ahead in the direction of the speaking user's vector a. After twice the time T it sits at twice the distance P from the initial presentation position, and in general, each time the elapsed time grows to n times T (n a natural number), the presentation position advances to n times the distance P from the initial presentation position. The time T and the distance P can be set freely, but they must be balanced so that the characters in the balloon remain easy to read. Besides a speech balloon, the character information could also be output in a sphere, a floating body such as a cloud, or the like.
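On this reading, the traveling presentation position is a step function of the elapsed time t since the utterance; a minimal sketch of the stepping described in paragraph [0030]:

    import numpy as np

    def balloon_position(A, a, t, T, P):
        """Presentation position at time t (seconds) after the utterance: the
        balloon stays at the initial position around the speaker at A until
        time T, then sits a further n * P along the unit gaze direction a,
        where n is the number of completed intervals of length T."""
        a_unit = np.asarray(a, dtype=float)
        a_unit = a_unit / np.linalg.norm(a_unit)
        n = int(t // T)  # completed intervals of length T
        return np.asarray(A, dtype=float) + n * P * a_unit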

[0031] That is, in the conversation information presentation method of this embodiment, as shown in FIG. 1, in an immersive virtual communication environment in which a plurality of display devices (displays) are arranged so as to surround the user (for example, a hearing-impaired user), the speaking user's utterance content is input to the utterance input means 1, and character information is generated from the input utterance content by the character generation means 2. The three-dimensional positions and gaze vectors of the speaking user and the receiving user in the immersive virtual communication environment are extracted by the extraction means 3-1 and 3-2. The distance between the speaking user and the receiving user is extracted by the distance extraction means 4 from their three-dimensional positions, and the receiving user's field of view is extracted by the view extraction means 5 from the receiving user's gaze vector and a predetermined view angle. The presentation position of the speaking user's character information is determined by the presentation position determination means 6 based on the distance between the users, the receiving user's field of view, and the speaking user's three-dimensional position and gaze vector, and the generated character information is output by the output means 7 based on the determined presentation position and displayed on the display devices.
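As a compact recap of the pipeline (illustrative only, reusing the sketches above for means 4-6; recognize_utterance is a hypothetical placeholder for the speech or motion recognition of means 2, and the caller is assumed to render the returned text at the returned position, the role of means 7):

    def present_conversation_info(utterance_data, A, a, B, b, t, D, R_deg, T, P):
        """End-to-end flow of FIG. 1 for one utterance, t seconds after it
        was spoken; returns the text and where to display it."""
        text = recognize_utterance(utterance_data)       # means 2 (placeholder)
        if presentation_mode(A, B, b, D, R_deg) == "anchored":
            pos = A                                      # balloon stays at the speaker (FIG. 5)
        else:
            pos = balloon_position(A, a, t, T, P)        # balloon travels along gaze a (FIG. 6)
        return text, pos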

[0032] As shown in FIG. 3, the presentation position determination process sets the surroundings of the speaking user as the presentation position of the character display image for the fixed time T after the utterance when the distance between the speaking user and the receiving user is within the preset distance D and the speaking user's three-dimensional position is within the receiving user's field of view; otherwise, it sets as the presentation position a position that moves from the speaking user's surroundings in the direction of the gaze vector as time elapses from the utterance.

[0033] While the invention made by the present inventors has been described concretely above based on the embodiment (example), the present invention is of course not limited to that embodiment and can be modified in various ways without departing from its gist.

[0034]

EFFECTS OF THE INVENTION

As described above, according to the present invention, a user (for example, a hearing-impaired person) in an immersive virtual communication environment can easily grasp the speaker's position and utterance content, and the speaker's act of speaking also becomes easier to notice, so that the user can, for example, hold conversations with other users while walking freely around the virtual space.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing the schematic configuration of an immersive virtual communication environment system according to one embodiment (example) of the present invention.

FIG. 2 is a diagram showing an example of the receiving user's field of view W in the embodiment.

FIG. 3 is a flowchart showing the processing procedure of the presentation position determination means in the embodiment.

FIG. 4 is a diagram showing an example of conversation range classification in the embodiment.

FIG. 5 is a diagram showing an output example of character information in the embodiment.

FIG. 6 is a diagram showing another output example of character information in the embodiment.

EXPLANATION OF REFERENCE NUMERALS

1 … utterance input means; 2 … character generation means; 3-1, 3-2 … extraction means; 4 … distance extraction means; 5 … view extraction means; 6 … presentation position determination means; 7 … output means.

Continuation of the front page: (72) Inventor: Satoshi Ishibashi, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo. F-terms (reference): 5B050 BA09 BA20 CA07 EA07 EA20 EA28 FA02 FA10

Claims (4)

[Claims]

Claim 1. A conversation information presentation method in an immersive virtual communication environment in which a plurality of display devices are arranged so as to surround the user, the method comprising: an utterance input step of inputting the utterance content of a speaking user; a character generation step of generating character information from the input utterance content; an extraction step of extracting the three-dimensional positions and gaze vectors of the speaking user and the receiving user in the virtual environment; a distance extraction step of extracting the distance between the speaking user and the receiving user from their three-dimensional positions; a view extraction step of extracting the receiving user's field of view from the receiving user's gaze vector and a predetermined view angle; a presentation position determination step of determining the presentation position of the speaking user's character information based on the distance between the speaking user and the receiving user, the receiving user's field of view, and the speaking user's three-dimensional position and gaze vector; and an output step of outputting the generated character information based on the determined presentation position.
Claim 2. The conversation information presentation method according to claim 1, wherein the presentation position determination step is a step that, when the distance between the speaking user and the receiving user is within a preset distance and the speaking user's three-dimensional position is within the receiving user's field of view, sets the surroundings of the speaking user as the presentation position of the character display image for a fixed time after the utterance, and otherwise sets as the presentation position a position that moves from the speaking user's surroundings in the direction of the gaze vector as time elapses from the utterance.
Claim 3. An immersive virtual communication environment system in which a plurality of display devices are arranged so as to surround the user, the system comprising: utterance input means for inputting the utterance content of a speaking user; character generation means for generating character information from the input utterance content; extraction means for extracting the three-dimensional positions and gaze vectors of the speaking user and the receiving user; distance extraction means for extracting the distance between the speaking user and the receiving user from their three-dimensional positions; view extraction means for extracting the receiving user's field of view from the receiving user's gaze vector and a predetermined view angle; presentation position determination means for determining the presentation position of the speaking user's character information based on the distance between the speaking user and the receiving user, the receiving user's field of view, and the speaking user's three-dimensional position and gaze vector; and output means for outputting the generated character information based on the determined presentation position.
Claim 4. The immersive virtual communication environment system according to claim 3, wherein the presentation position determination means is means that, when the distance between the speaking user and the receiving user is within a preset distance and the speaking user's three-dimensional position is within the receiving user's field of view, sets the surroundings of the speaking user as the presentation position of the character display image for a fixed time after the utterance, and otherwise sets as the presentation position a position that moves from the speaking user's surroundings in the direction of the gaze vector as time elapses from the utterance.
JP2000037462A 2000-02-16 2000-02-16 Conversation information presentation method and immersive virtual communication environment system Expired - Fee Related JP3621861B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2000037462A JP3621861B2 (en) 2000-02-16 2000-02-16 Conversation information presentation method and immersive virtual communication environment system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2000037462A JP3621861B2 (en) 2000-02-16 2000-02-16 Conversation information presentation method and immersive virtual communication environment system

Publications (2)

Publication Number Publication Date
JP2001228794A true JP2001228794A (en) 2001-08-24
JP3621861B2 JP3621861B2 (en) 2005-02-16

Family

ID=18561348

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2000037462A Expired - Fee Related JP3621861B2 (en) 2000-02-16 2000-02-16 Conversation information presentation method and immersive virtual communication environment system

Country Status (1)

Country Link
JP (1) JP3621861B2 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004028652A1 (en) * 2002-09-30 2004-04-08 Konami Corporation Communication device, communication method, program, and information recording medium
JP2004126786A (en) * 2002-09-30 2004-04-22 Konami Co Ltd Communication device, program and communication method
US7546536B2 (en) 2002-09-30 2009-06-09 Konami Digital Entertainment Co., Ltd. Communication device, communication method, and computer usable medium
US8156184B2 (en) 2008-02-20 2012-04-10 International Business Machines Corporation Dialog server for handling conversation in virtual space method and computer program for having conversation in virtual space
US8554841B2 (en) 2008-02-20 2013-10-08 Activision Publishing, Inc. Dialog server for handling conversation in virtual space method and computer program for having conversation in virtual space
US9583109B2 (en) 2008-02-20 2017-02-28 Activision Publishing, Inc. Dialog server for handling conversation in virtual space method and computer program for having conversation in virtual space
US10001970B2 (en) 2008-02-20 2018-06-19 Activision Publishing, Inc. Dialog server for handling conversation in virtual space method and computer program for having conversation in virtual space
JP2013073395A (en) * 2011-09-27 2013-04-22 Hitachi Systems Ltd Automatic character display system

Also Published As

Publication number Publication date
JP3621861B2 (en) 2005-02-16

Similar Documents

Publication Publication Date Title
US11747618B2 (en) Systems and methods for sign language recognition
KR102227392B1 (en) Word flow comment
EP2201761B1 (en) Enhanced interface for voice and video communications
WO2019216419A1 (en) Program, recording medium, augmented reality presentation device, and augmented reality presentation method
US20190005732A1 (en) Program for providing virtual space with head mount display, and method and information processing apparatus for executing the program
KR102330218B1 (en) Virtual reality education system and method for language training of disabled person
JPWO2018135304A1 (en) Information processing apparatus, information processing method, and program
JP2001228794A (en) Conversation information presenting method and immersed type virtual communication environment system
CN112764549B (en) Translation method, translation device, translation medium and near-to-eye display equipment
WO2023058393A1 (en) Information processing device, information processing method, and program
JP2003067781A (en) Conversation information presentation method and device
JP7090116B2 (en) Program, recording medium, augmented reality presentation device and augmented reality presentation method
JP7332823B1 (en) program
JP2019135655A (en) Computer-implemented program and method for providing virtual space, and information processing apparatus for executing the program
NZ792186A (en) Sensory eyewear

Legal Events

Date Code Title Description
A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20040629

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20040820

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20041116

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20041119

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20071126

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20081126

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20091126

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20101126

Year of fee payment: 6

LAPS Cancellation because of no payment of annual fees