JP2020113095A - Method of controlling character in virtual space

Info

Publication number: JP2020113095A
Application number: JP2019004083A
Authority: JP
Inventors: 昌史三上; Masashi Mikami; 拓也姫路; Takuya Himeji; 京介高山; Kyosuke Takayama
Original assignee: CS Reporters Inc; XR IPLab Co Ltd
Current assignee: CS Reporters Inc; XR IPLab Co Ltd
Priority date: 2019-01-15
Filing date: 2019-01-15
Publication date: 2020-07-27
Also published as: WO2020149271A1

Abstract

To provide a control method which allows a character in a virtual space to be naturally expressed. SOLUTION: A method of controlling a character performed by a user in a virtual space includes the steps of: disposing a character in a virtual space; detecting a user input; converting the input content into output content in another format; and outputting the output content into the virtual space. SELECTED DRAWING: Figure 8

Description

The present invention relates to a method of controlling a character in a virtual space.

Techniques have been disclosed for optimizing voice communication in systems in which characters such as avatars are placed in a virtual space to hold virtual conversations. For example, Patent Document 1 discloses producing an effect, according to the distance between users in the virtual space, as if the conversation were being held by natural voice or over a radio or similar device.

Patent Document 1: JP 2017-28390 A

However, now that conversation in a virtual space has been realized, people with varied backgrounds take on the role of characters and share the same virtual space, and greater convenience is expected in communication within the virtual space.

The present invention has been made in view of this background, and its object is to provide a more convenient method of communication for conversations between characters in a virtual space.

The invention is a method of controlling a character played by a performer user in a virtual space, the method comprising the steps of: placing a character in the virtual space; detecting a user's input; converting the input content into output content in another format; and outputting the output content in the virtual space.

Other problems disclosed in the present application, and the means of solving them, are made clear in the description of the embodiments and in the drawings.

According to the present invention, a more convenient method of communication can be provided for conversations between characters in a virtual space.

A diagram showing an example of the overall configuration of the HMD system 300 according to the first embodiment.
A schematic diagram of the external appearance of the HMD 110 according to the first embodiment.
A functional block diagram of the HMD 110 according to the first embodiment.
A schematic diagram of the external appearance of the controller 210 according to the first embodiment.
A functional block diagram of the controller 210 according to the first embodiment.
A functional block diagram of the image generation device 310 according to the first embodiment.
An example of a virtual space displayed on the HMD 110 according to the first embodiment.
A diagram showing the flow of processing in the HMD system according to the first embodiment.
A functional block diagram of the image generation device 310 according to the second embodiment.
A diagram showing the flow of processing in the HMD system according to the second embodiment.
A functional block diagram of the image generation device 310 according to the third embodiment.
A diagram showing the flow of processing in the HMD system according to the third embodiment.
A functional block diagram of the image generation device 310 according to the fourth embodiment.
A diagram showing the flow of processing in the HMD system according to the fourth embodiment.

The embodiments of the present invention are listed and described below. A method of controlling a character in a virtual space according to an embodiment of the present invention has the following configuration.
[Item 1]
A method of controlling a character played by a performer user in a virtual space, the method comprising the steps of:
placing a character in the virtual space;
detecting a user's input;
converting the input content into output content in another format; and
outputting the output content in the virtual space.
[Item 2]
The control method according to item 1, wherein the input content is the user's voice data, and the output content is text data corresponding to the voice data.
[Item 3]
The control method according to item 1, wherein the input content is text data, and the output content is voice data corresponding to the text data.
[Item 4]
The control method according to item 1, wherein the input content is voice data in a first language, and the output content is voice data in a second language corresponding to the first language.
[Item 5]
The control method according to item 1, wherein the input content is voice data, the output content is voice data relating to an answer to that voice data, and the output content is output as the voice of a second performer user placed in the virtual space.
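The four steps of Item 1, with the conversions of Items 2 to 5 treated as interchangeable functions, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in the application: the names `Content`, `VirtualSpace`, `control_character`, and `speech_to_text_stub` are hypothetical, and the stub stands in for a real recognition, synthesis, or translation engine.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Content:
    """A piece of user input or system output, tagged with its format."""
    fmt: str      # e.g. "voice" or "text" (labels chosen for this sketch)
    payload: str  # stand-in for raw audio or a text string


@dataclass
class VirtualSpace:
    """Hypothetical container for placed characters and emitted output."""
    characters: List[str] = field(default_factory=list)
    outputs: List[Content] = field(default_factory=list)


# A converter maps input content to output content in another format.
Converter = Callable[[Content], Content]


def speech_to_text_stub(content: Content) -> Content:
    """Placeholder for Item 2 (voice to text); no real recognition happens."""
    if content.fmt != "voice":
        raise ValueError("expected voice input")
    return Content(fmt="text", payload=f"transcript of {content.payload}")


def control_character(space: VirtualSpace, character: str,
                      user_input: Content, convert: Converter) -> Content:
    """Run the four steps of Item 1 once for a single user input."""
    if character not in space.characters:
        space.characters.append(character)  # step 1: place the character
    detected = user_input                   # step 2: detect the input
    converted = convert(detected)           # step 3: convert the format
    space.outputs.append(converted)         # step 4: output into the space
    return converted


space = VirtualSpace()
result = control_character(space, "avatar_A",
                           Content("voice", "utterance-001"),
                           speech_to_text_stub)
```

Passing the conversion in as a function keeps the four-step flow unchanged while swapping in text-to-voice, translation, or answer generation for Items 3 to 5.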

(First embodiment)
A specific example of the head mounted display (HMD) system 300 according to the first embodiment of the present invention is described below with reference to the drawings. The present invention is not limited to these examples; it is defined by the claims and is intended to include all modifications within the meaning and scope equivalent to the claims. In the following description, identical elements are given identical reference numerals in the drawings, and duplicate descriptions are omitted.

FIG. 1 is a diagram showing an example of the overall configuration of the HMD system 300 according to this embodiment. The HMD system 300 can be composed of, for example, an HMD 110 and a controller 210 worn by a user, and an image generation device 310 that functions as a host computer. As an example, FIG. 1 shows the HMDs 110 and controllers 210 worn by user A and user B. An infrared camera (not shown) or the like for detecting the position, orientation, tilt, and so on of the HMD 110 and the controller 210 can also be added to the HMD system 300. These devices can be connected to one another by wired or wireless means. For example, each device may have a USB port and establish communication over a cable, or communication may be established by other wired or wireless means such as HDMI, wired LAN, infrared, Bluetooth (registered trademark), or WiFi (registered trademark). The image generation device 310 may be any device with a computing function, such as a PC, a game console, or a mobile communication terminal. The image generation device 310 can transmit generated images to the HMD 110 by streaming or download, and the HMD 110 can play back the images transmitted from the image generation device 310. The image generation device 310 can send images directly to a plurality of HMDs 110, or send them via another content server. As a further example, individual image generation devices may each be connected to a user's HMD and controller directly or via a local network and transmit generated images to a server over a network such as the Internet, with the server then sending the images to other users' HMDs.

FIG. 2 is a schematic diagram of the external appearance of the head mounted display (hereinafter, HMD) 110 according to this embodiment. The HMD 110 is worn on the user's head and includes a display panel 120 positioned in front of the user's left and right eyes. The display panel may be optically transmissive or non-transmissive; this embodiment uses a non-transmissive display panel as an example because it can provide a greater sense of immersion. The display panel 120 displays a left-eye image and a right-eye image, and the parallax between the two eyes is used to present the user with a stereoscopic image. As long as the left-eye and right-eye images can be displayed, the HMD may have separate left-eye and right-eye displays or a single integrated display for both eyes.

The housing 130 of the HMD 110 further includes a sensor 140. To detect movements such as the orientation and tilt of the user's head, the sensor can include, for example, a magnetic sensor, an acceleration sensor, or a gyro sensor, or a combination of these (not shown). Let the vertical direction of the user's head be the Y axis; let the Z axis be the axis orthogonal to the Y axis that connects the center of the display panel 120 and the user, corresponding to the user's front-back direction; and let the X axis be the axis orthogonal to both the Y axis and the Z axis, corresponding to the user's left-right direction. The sensor 140 can then detect the rotation angle about the X axis (the pitch angle), the rotation angle about the Y axis (the yaw angle), and the rotation angle about the Z axis (the roll angle).
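As an illustration of how the three detected angles could be combined into a head orientation, the following Python sketch builds a rotation matrix from pitch (about X), yaw (about Y), and roll (about Z) and derives the direction the head is facing. The source defines only the axes; the right-handed rotation sense, the composition order (yaw, then pitch, then roll), the choice of +Z as the forward direction at rest, and all function names are assumptions made for this sketch.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = Tuple[float, float, float]


def rot_x(a: float) -> Matrix:
    """Rotation by angle a (radians) about the X axis (pitch)."""
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(a: float) -> Matrix:
    """Rotation by angle a (radians) about the Y axis (yaw)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(a: float) -> Matrix:
    """Rotation by angle a (radians) about the Z axis (roll)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m: Matrix, v: Vector) -> Vector:
    """Multiply a 3x3 matrix by a 3-vector."""
    x, y, z = (sum(m[i][k] * v[k] for k in range(3)) for i in range(3))
    return (x, y, z)


def head_orientation(pitch: float, yaw: float, roll: float) -> Matrix:
    """Assumed composition order: R = Ry(yaw) * Rx(pitch) * Rz(roll)."""
    return matmul(rot_y(yaw), matmul(rot_x(pitch), rot_z(roll)))


def forward_vector(pitch: float, yaw: float, roll: float) -> Vector:
    """Direction the head faces, assuming +Z is forward at zero rotation."""
    return apply(head_orientation(pitch, yaw, roll), (0.0, 0.0, 1.0))
```

With this ordering, roll alone leaves the forward direction unchanged, which matches the intuition that tilting the head sideways does not change where it points.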

In addition to or instead of the sensor 140, the housing 130 of the HMD 110 can include a plurality of light sources 150 (for example, infrared LEDs or visible-light LEDs). A camera (for example, an infrared camera or a visible-light camera) installed outside the HMD 110 (for example, in the room) detects these light sources, which makes it possible to detect the position, orientation, and tilt of the HMD 110 in a specific space. Alternatively, for the same purpose, the HMD 110 can be provided with a camera for detecting light sources installed in the housing 130 of the HMD 110.

It is also possible to detect the position, orientation, and tilt of the HMD 110 without providing the light sources 150 on it, by photographing the exterior of the HMD 110 with a camera installed outside the HMD 110 and analyzing the captured image. In this case, the HMD 110 may acquire the image from the external camera and perform the image analysis itself, or a computer other than the HMD 110 may perform the image analysis and transmit the result to the HMD 110.

As in this example, the HMD 110 can execute applications and send and receive data while cooperating over a network with external processing devices such as a server or an image processing device. It can also function as a standalone device that runs built-in programs on its own, without depending on an external processing device.

FIG. 3 is a functional block diagram of the HMD 110 according to this embodiment. As described with reference to FIG. 2, the HMD 110 can include the sensor 140. To detect movements such as the orientation and tilt of the user's head, the sensor 140 can include, for example, a magnetic sensor, an acceleration sensor, or a gyro sensor, or a combination of these (not shown). To detect the orientation and tilt of the user's head more accurately, or to detect the position of the user's head, the HMD 110 can also include LEDs 150 emitting infrared or ultraviolet light. It can include a camera 160 for capturing the scene outside the HMD, as well as a microphone 170 for picking up the user's speech and headphones 180 for outputting sound. The microphone and headphones may also be provided as devices separate from and independent of the HMD 110.

The HMD 110 can further include an input/output unit 190 for establishing wired connections with peripheral devices such as the controller 210 and the image generation device 310, and a communication unit 115 for establishing wireless connections such as infrared, Bluetooth (registered trademark), or WiFi (registered trademark). Information on movements such as the orientation and tilt of the user's head acquired by the sensor 140 is transmitted by the control unit 125 to the image generation device 310 via the input/output unit 190 and/or the communication unit 115. Images generated by the image generation device 310 based on the movement of the user's head are received via the input/output unit 190 and/or the communication unit 115 and output to the display unit 120 by the control unit 125.

FIG. 4 is a schematic diagram of the external appearance of the controller 210 according to this embodiment. The user holds the controller 210 and gives predetermined instructions. The controller 210 supports the user in making predetermined inputs in the virtual space. For example, the user can enter text by tilting the controller to move a cursor onto the desired key of an input keyboard displayed in the virtual space and then pressing a button. The controller 210 can be configured as a set consisting of a left-hand controller 220 and a right-hand controller 230. The left-hand controller 220 and the right-hand controller 230 can each have an operation trigger button 240, infrared LEDs 250, a sensor 260, a joystick 270, and a menu button 280.

The operation trigger buttons 240 are arranged as 240a and 240b at positions where, when the grip 235 of the controller 210 is held, the middle finger and index finger are expected to operate them as if pulling a trigger. A frame 245 formed in a ring shape extending downward from both side surfaces of the controller 210 carries a plurality of infrared LEDs 250. A camera (not shown) provided outside the controller 210 detects the positions of these infrared LEDs 250, which makes it possible to detect the position, orientation, and tilt of the controller 210 in a specific space.

The infrared LEDs 250 may also be omitted, and the position, orientation, and tilt of the controller 210 detected by photographing the exterior of the controller 210 with a camera placed outside it and analyzing the captured image. In this case, the image from the camera may be sent to the HMD 110 for image analysis there, or a computer other than the HMD 110 and the controller 210 may perform the image analysis and transmit the result to the HMD 110.

The controller 210 can also incorporate a sensor 260 to detect movements such as the orientation and tilt of the controller 210. The sensor 260 can include, for example, a magnetic sensor, an acceleration sensor, or a gyro sensor, or a combination of these (not shown). The upper surface of the controller 210 can carry a joystick 270 and a menu button 280. The joystick 270 can be moved in any direction through 360 degrees around a reference point and is expected to be operated with the thumb when the grip 235 of the controller 210 is held. The menu button 280 is likewise expected to be operated with the thumb. The controller 210 can also incorporate a vibrator (not shown) for applying vibration to the hand of the user operating it. The controller 210 has an input/output unit and a communication unit in order to output information such as the user's input via the buttons and joystick and the position, orientation, and tilt of the controller 210 obtained via the sensors, and to receive information from the host computer.

Based on whether the user is holding the controller 210 and operating its buttons and joystick, and on the information detected by the infrared LEDs and sensors, the system determines the movement and posture of the user's hand and can display and animate a simulated version of the user's hand in the virtual space.

FIG. 5 is a functional block diagram of the controller 210 according to this embodiment. As described with reference to FIG. 4, the controller 210 can be configured as a set consisting of a left-hand controller 220 and a right-hand controller 230. Either controller can include an operation unit 245 comprising the operation trigger button 240, the joystick 270, and the menu button 280.

The controller 210 can also incorporate a sensor 260 to detect movements such as the orientation and tilt of the controller 210. The sensor 260 can include, for example, a magnetic sensor, an acceleration sensor, or a gyro sensor, or a combination of these (not shown).

A plurality of infrared LEDs 250 may also be provided, and the position, orientation, and tilt of the controller 210 in a specific space detected by having a camera (not shown) outside the controller detect the positions of these infrared LEDs.

The controller 210 can include an input/output unit 255 for establishing wired connections with peripheral devices such as the HMD 110 and the image generation device 310, and a communication unit 265 for establishing wireless connections such as infrared, Bluetooth (registered trademark), or WiFi (registered trademark). Information entered by the user via the operation unit 245, such as text input, and information acquired by the sensor 260, such as the orientation and tilt of the controller 210, are transmitted to the image generation device 310 via the input/output unit 255 and/or the communication unit 265.

FIG. 6 is a functional block diagram of the image generation device 310 according to this embodiment. The image generation device 310 can be a device such as a PC, a game console, or a mobile communication terminal that has functions for storing the user input information transmitted from the HMD 110 and the controller 210 and the information on the user's head movement and the controller's movement and operation acquired by the sensors, performing predetermined calculations, and generating images. The image generation device 310 can include an input/output unit 320 for establishing wired connections with peripheral devices such as the HMD 110 and the controller 210, and a communication unit 330 for establishing wireless connections such as infrared, Bluetooth (registered trademark), or WiFi (registered trademark). Information on the user's head movement and the controller's movement and operation, received from the HMD 110 and/or the controller 210 via the input/output unit 320 and/or the communication unit 330, is detected in the control unit 340 as input content that includes the user's position, actions such as line of sight and posture, speech, and operations. The control unit 340 then executes the control program stored in the storage unit 350 according to the user's input content, thereby controlling the character and generating images. The control unit 340 can be a CPU, but adding a GPU specialized for image processing allows information processing and image processing to be distributed, making the overall processing more efficient. The image generation device 310 can also communicate with other computing devices and have them share the information processing and image processing.

The control unit 340 of the image generation device 310 further includes: an input detection unit 610 that detects the user's head movement and speech, text input made by operating the controller, and information on the controller's movement and operation, as received from the HMD 110 and/or the controller 210; a character control unit 620 that executes the control program stored in the control program storage unit on a character stored in advance in the character data storage unit 650 of the storage unit 350; and an image generation unit 630 that generates images based on the character control. Character movement is controlled by converting information detected via the HMD 110 and the controller 210, such as the orientation and tilt of the user's head and the movement of the hands, into the movement of each part of a bone structure built according to the movements and constraints of the joints of the human body, and by associating that bone structure with the character data stored in advance so that the movement of the bone structure is applied to the character. The control unit 340 also includes a voice recognition unit 640 that recognizes speech from the user's utterances and a text conversion unit 650 that converts the recognized speech into text.

In the character data storage unit 650 described above, the storage unit 350 stores character image data together with other character-related information such as character attributes. The control program storage unit 670 stores programs for controlling the movement and facial expressions of characters in the virtual space. The streaming data storage unit 660 stores the images generated by the image generation unit 630.

FIG. 7 is a diagram showing an example of the virtual space displayed on the HMD 110 according to this embodiment. This embodiment assumes that characters hold a voice conversation with one another in the virtual space. As an example, FIG. 7 shows a virtual space in which a character 720(A) representing user A and a character 720(B) representing user B, user A's conversation partner, are placed face to face. User A and user B hold a virtual voice conversation through the characters 720(A) and 720(B). The user's own voice may be played back as it is, or the image generation device 310 may convert the speech into text, create synthesized speech from the converted text, and play that back.

In the virtual space, the characters 720 are placed so that at least the portion from the chest up of each is visible to the other. The eyes 721 of a character 720 are movable, and moving the eyes 721 sets the line of sight 722 of the character 720.

In this embodiment, the line of sight 722 of the character 720 is set by operating the controller 210. As an example, tilting the controller 210 while holding down the trigger button 240 sets the line of sight 722 according to the tilt about each of the three axes.

The display panel 120 of the HMD 110 may display the user's virtual hand (not shown) in the virtual space. A ray (beam of light) may also be displayed from the virtual hand according to its tilt about the three axes. The ray can then be used as a pointer, and when the trigger button 240 is pressed, the line of sight 722 of the character 720 can be directed at the point the ray indicates. The point the ray indicates may also be made the target of an operation. For example, an object located in the virtual space where the ray points (in the example of FIG. 7, a printout placed on the desk) can be made the operation target and its contents displayed on the display panel 120. The user then converses while reading the printout, and because the topic concerns the object toward which the line of sight 722 of the character 720 is directed, the character 720 can be made to converse in a natural manner. In response to an instruction from the user, the image generation device 310 may also render the ray so that other users can see it.
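Using the ray as a pointer amounts to finding the nearest object the ray hits and, while the trigger is held, turning the character's gaze toward that point. The Python sketch below illustrates one way to do this under stated assumptions: objects are approximated by bounding spheres, and the names `SceneObject`, `pick`, and `update_gaze_target` are hypothetical. The source does not say how intersection is computed.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]


@dataclass
class SceneObject:
    """An object in the virtual space, approximated by a bounding sphere."""
    name: str
    center: Vec
    radius: float


def _normalize(v: Vec) -> Vec:
    n = math.sqrt(sum(c * c for c in v))
    if n == 0:
        raise ValueError("ray direction must be non-zero")
    return (v[0] / n, v[1] / n, v[2] / n)


def _hit_distance(origin: Vec, direction: Vec, obj: SceneObject) -> Optional[float]:
    """Distance along a unit-length ray to the sphere, or None if it misses."""
    oc = tuple(o - c for o, c in zip(origin, obj.center))
    b = sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - obj.radius ** 2
    disc = b * b - c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):  # nearer intersection first
        if t >= 0:
            return t
    return None


def pick(origin: Vec, direction: Vec,
         objects: List[SceneObject]) -> Optional[Tuple[SceneObject, float]]:
    """Return the nearest object hit by the ray and the distance to it."""
    d = _normalize(direction)
    hits = [(obj, t) for obj in objects
            if (t := _hit_distance(origin, d, obj)) is not None]
    return min(hits, key=lambda h: h[1]) if hits else None


def update_gaze_target(trigger_pressed: bool, origin: Vec, direction: Vec,
                       objects: List[SceneObject], current: Vec) -> Vec:
    """While the trigger is held, aim the gaze at the point the ray hits."""
    if not trigger_pressed:
        return current
    hit = pick(origin, direction, objects)
    if hit is None:
        return current
    d = _normalize(direction)
    return tuple(o + hit[1] * c for o, c in zip(origin, d))
```

The object returned by `pick` could equally serve as the operation target, for example to show the printout's contents on the display panel.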

Each part of the character 720 can also be linked to the operation of the HMD 110 and the controller 210. For example, the orientation and tilt of the head of the character 720 can be set according to the orientation and tilt of the HMD 110. When the line of sight 722 is moved as described above, the orientation and tilt of the head may instead be set according to the line of sight 722.

FIG. 8 is a diagram showing the flow of processing in the HMD system according to the present embodiment.

The input detection unit 610 of the control unit 340 of the image generation device 310 receives input data from the HMD 110, for example the user's utterance data (S101). The voice recognition unit 640 recognizes the speech based on the received utterance data. Possible voice recognition methods include, for example, recording utterance data in advance as learning data, comparing features extracted from the input utterance data with features extracted from the learning data, and outputting the closest word string as the voice recognition result; a hidden Markov model or the like may also be used.
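The first recognition method mentioned above, comparing input features against features from recorded learning data, amounts to a nearest-neighbour search. The sketch below assumes fixed-length feature vectors and Euclidean distance; the source specifies neither, and the name `recognize` is hypothetical.

```python
import math

def recognize(features, templates):
    """Return the word string whose stored features are closest to `features`.

    templates: dict mapping a word string to a feature vector of the same
    length as `features`. Euclidean distance is an assumption; the source
    only says that the two sets of features are compared.
    """
    if not templates:
        raise ValueError("no learning data registered")
    return min(templates, key=lambda text: math.dist(features, templates[text]))
```

A practical recognizer would work on sequences of acoustic features rather than a single vector, which is where a hidden Markov model, the other option named in the text, comes in.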

Next, the text conversion unit 650 of the control unit 340 of the image generation device 310 converts the speech into text based on the voice recognition result (S102).

Next, the image generation unit 630 of the image generation device 310 generates an image that includes the converted text and transmits the image to the user's HMD via the input/output unit 320 or the communication unit 330, and the text is output to and displayed on the display panel of the HMD (S103).
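Steps S101 to S103 can be read as a three-stage pipeline. The sketch below is only a schematic of that flow: the `Frame` type, the function name and the injected `recognize` callable are all hypothetical stand-ins, and the actual image composition is reduced to attaching a caption to a scene identifier.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Frame:
    """Stand-in for the image sent to the HMD: a scene plus its text overlay."""
    scene: str
    caption: str

def run_speech_to_text_flow(utterance: bytes,
                            recognize: Callable[[bytes], List[str]],
                            scene: str) -> Frame:
    words = recognize(utterance)               # S101: receive and recognize the utterance
    text = " ".join(words)                     # S102: convert the recognition result to text
    return Frame(scene=scene, caption=text)    # S103: build the image containing the text
```

Passing the recognizer in as a parameter keeps the flow independent of whichever recognition method is chosen.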

As described above, the HMD system 300 according to the present embodiment converts speech uttered by a user into text and displays it in the virtual space, which makes conversation between the characters played by users in the virtual space more convenient.

The present embodiment has been described above, but it is intended to facilitate understanding of the present invention and not to limit its interpretation. The present invention can be modified and improved without departing from its spirit, and the present invention includes its equivalents.

For example, in the present embodiment the HMD 110 and the controller 210 each communicate with the image generation device 310. The invention is not limited to this: signals from the controller 210 may be passed to the HMD 110, and the HMD 110 may transmit the input data received from the controller 210 to the image generation device 310. Input data from the HMD 110 and the controller 210 may also be transmitted via a user terminal such as a smartphone.

The HMD system 300 of the present embodiment assumes a conversation system in a virtual reality space using the HMD 110, but the same processing can be performed in an augmented reality space or a mixed reality space. For example, the view outside the HMD 110 can be captured by the camera 160, and the character 720 can be superimposed on the captured image as the background.

In the present embodiment, the user enters text via the controller 210 as an example, but the invention is not limited to this. Other conceivable methods include entering text directly with a smartphone touch panel or a keyboard, or operating a keyboard in the virtual space by means of hand tracking.

(Second Embodiment)
The second embodiment of the present invention is described below. Except where specifically noted, the configuration and method of this embodiment are the same as those of the first embodiment. FIG. 9 is a functional configuration diagram of the image generation device 310 according to the second embodiment. In this embodiment, the image generation device 310 of the first embodiment additionally has a translation execution unit 655 and a translation dictionary storage unit 690, which relate to the translation function.

FIG. 10 is a diagram showing the flow of processing in the HMD system according to the second embodiment.

The input detection unit 610 of the control unit 340 of the image generation device 310 receives input data from the HMD 110, for example the user's utterance data (S201). The voice recognition unit 640 recognizes the speech based on the received utterance data. Possible voice recognition methods include, for example, recording utterance data in advance as learning data, comparing features extracted from the input utterance data with features extracted from the learning data, and outputting the closest word string as the voice recognition result; a hidden Markov model or the like may also be used.

Next, the translation execution unit 655 of the control unit 340 of the image generation device 310 refers to the translation dictionary storage unit 690 and translates the voice recognition result (S202). For example, content uttered in Japanese can be translated into English, and, depending on the type of dictionary stored in the translation dictionary storage unit 690, into languages other than English.
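The role of the translation dictionary storage unit can be illustrated with a word-by-word lookup keyed by target language. This is a deliberately simplified sketch: the dictionary contents, the `DICTIONARIES` layout and the function name are made up for illustration, and real translation involves far more than substituting words.

```python
# Hypothetical dictionary store: target language code -> {source word: translated word}.
DICTIONARIES = {
    "en": {"こんにちは": "hello", "世界": "world"},
    "fr": {"こんにちは": "bonjour", "世界": "monde"},
}

def translate(words, target_language, dictionaries=DICTIONARIES):
    """Translate a list of recognized words by dictionary lookup.

    Words missing from the dictionary are passed through unchanged.
    Raises KeyError if no dictionary is stored for the target language.
    """
    dictionary = dictionaries[target_language]
    return [dictionary.get(word, word) for word in words]
```

Which target languages are available depends only on which dictionaries are stored, which matches the statement that languages other than English become possible with other dictionaries.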

Next, the image generation unit 630 of the image generation device 310 generates an image that includes the translated text and transmits the image to the user's HMD via the input/output unit 320 or the communication unit 330, and the text is output to and displayed on the display panel of the HMD (S203).

As described above, the HMD system 300 according to the present embodiment translates speech uttered by a user into another language and displays it in the virtual space, which makes conversation between the characters played by users in the virtual space more convenient.

(Third Embodiment)
The third embodiment of the present invention is described below. Except where specifically noted, the configuration and method of this embodiment are the same as those of the first embodiment. FIG. 11 is a functional configuration diagram of the image generation device 310 according to the third embodiment. In this embodiment, instead of the voice recognition unit 640 and the text conversion unit 650 of the image generation device 310 of the first embodiment, the device has a voice synthesis unit 645 for converting input text into speech.

FIG. 12 is a diagram showing the flow of processing in the HMD system according to the third embodiment.

The input detection unit 610 of the control unit 340 of the image generation device 310 receives input data from the HMD 110 or the controller 210, for example text data (S301).

Next, the voice synthesis unit 645 of the control unit 340 of the image generation device 310 generates speech corresponding to the input text data (S302). Voice synthesis is realized by using a predetermined algorithm to extract the speech corresponding to the text from a sound source (not shown).
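One simple reading of "extracting the speech corresponding to the text from a sound source" is concatenative synthesis: look up a stored waveform fragment for each unit of text and join the fragments. The sketch below assumes one fragment per character and represents audio as plain lists of sample values; the source does not state the unit size, the algorithm or the data format.

```python
def synthesize(text, sound_source, silence=(0, 0)):
    """Concatenate stored sample fragments for each character of `text`.

    sound_source: dict mapping a character to a list of audio samples
    (a hypothetical stand-in for the sound source in the text).
    Characters without a stored fragment are rendered as `silence`.
    """
    samples = []
    for unit in text:
        samples.extend(sound_source.get(unit, silence))
    return samples
```

The resulting sample list would then be encoded and sent to the HMD together with the image.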

Next, the voice synthesis unit 645 of the image generation device 310 transmits the generated speech, together with an image, to the user's HMD via the input/output unit 320 or the communication unit 330. The speech is output through the speaker of the HMD, and the image is output to and displayed on the display panel (S303).

As described above, the HMD system 300 according to the present embodiment converts text entered by a user into speech and outputs it in the virtual space, which makes conversation between the characters played by users in the virtual space more convenient.

(Fourth Embodiment)
The fourth embodiment of the present invention is described below. Except where specifically noted, the configuration and method of this embodiment are the same as those of the first embodiment. FIG. 13 is a functional configuration diagram of the image generation device 310 according to the fourth embodiment. In this embodiment, the image generation device 310 of the first embodiment additionally has an answer generation unit 685 for generating an answer to the user's input and a voice synthesis unit 645 for generating speech for the answer.

FIG. 14 is a diagram showing the flow of processing in the HMD system according to the fourth embodiment.

The input detection unit 610 of the control unit 340 of the image generation device 310 receives input data from the HMD 110, for example the user's utterance data (S401). The voice recognition unit 640 recognizes the speech based on the received utterance data. Possible voice recognition methods include, for example, recording utterance data in advance as learning data, comparing features extracted from the input utterance data with features extracted from the learning data, and outputting the closest word string as the voice recognition result; a hidden Markov model or the like may also be used.

Next, the text conversion unit 650 of the control unit 340 of the image generation device 310 converts the speech into text based on the voice recognition result (S402).

Next, the answer generation unit 685 of the control unit 340 of the image generation device 310 generates an answer to the user based on the converted text data (S403). Possible ways of generating the answer include: referring to a question/answer table (not shown) and extracting the answer text that corresponds to the text data of the question; referring to a conversation table (not shown) and extracting the response text for a given utterance; or storing the history of conversations between users in the virtual reality space in advance as text data, generating a learning model of answers/responses to given questions/utterances, and extracting the answer/response text by referring to the learning model.
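The first of these options, the question/answer table, can be sketched as a dictionary lookup on normalized text. The table contents, the normalization (trimming and lower-casing) and the fallback reply are assumptions added for illustration; the source only says that the answer text corresponding to the question is extracted.

```python
def generate_answer(question_text, qa_table, fallback="Could you say that again?"):
    """Look up the answer for a recognized question in a question/answer table.

    qa_table: dict mapping a normalized question to its answer text.
    The fallback reply for unknown questions is an assumption of this sketch.
    """
    key = question_text.strip().lower()
    return qa_table.get(key, fallback)

# Hypothetical table contents, for illustration only.
QA_TABLE = {
    "what time does the class start?": "The class starts at nine.",
    "where is the handout?": "It is on the desk.",
}
```

The conversation-table option would have the same shape, with utterances rather than questions as keys; the learning-model option replaces the exact-match lookup with a model built from stored conversation history.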

Next, the voice synthesis unit 645 of the image generation device 310 generates speech corresponding to the text of the generated answer (S404).

Next, the voice synthesis unit 645 of the image generation device 310 transmits the generated speech, together with an image, to the user's HMD via the input/output unit 320 or the communication unit 330. The speech is output through the speaker of the HMD, and the image is output to and displayed on the display panel (S405).

As described above, with the HMD system 300 according to the present embodiment, even when the two users do not both take part in the conversation in the virtual space, the other user's answer is generated from the content of one user's utterance, so that the user can feel as if they were enjoying a conversation. This makes conversation between the characters played by users in the virtual space more convenient.

110 HMD
120 Display panel
130 Housing
140 Sensor
150 Light source
160 Camera
170 Microphone
180 Headphones
210 Controller
220 Left-hand controller
230 Right-hand controller
235 Grip
240 Operation trigger button
250 Infrared LED
260 Sensor
270 Joystick
280 Menu button
300 HMD system
310 Image generation device

Claims

A method of controlling a character played by a performer user in a virtual space, the control method comprising:
placing the character in the virtual space;
detecting a user input;
converting the input content into output content in another format; and
outputting the output content in the virtual space.

The control method according to claim 1, wherein the input content is the user's voice data, and the output content is text data corresponding to the voice data.

The control method according to claim 1, wherein the input content is text data, and the output content is voice data corresponding to the text data.

The control method according to claim 1, wherein the input content is voice data in a first language, and the output content is voice data in a second language corresponding to the first language.

The control method according to claim 1, wherein the input content is voice data, the output content is voice data relating to an answer to the voice data, and the output content is output as the voice of a second performer user placed in the virtual space.