WO2021145243A1

WO2021145243A1 - Program, method executed by computer, and computer

Info

Publication number: WO2021145243A1
Application number: PCT/JP2021/000155
Authority: WO
Inventors: 一晃澤木
Original assignee: 株式会社コロプラ
Priority date: 2020-01-16
Filing date: 2021-01-06
Publication date: 2021-07-22
Also published as: JP7295045B2; JP2021114036A

Abstract

A program for causing a computer to execute: a step for defining a virtual space; a step for positioning a first avatar in the virtual space, the first avatar being associated with a first user; a step for detecting the expression on the face of the first user; a step for controlling the expression on the face of the first avatar in accordance with the expression on the face of the first user; and a step for executing calibration when a calibration execution condition has been satisfied, the calibration being for controlling the expression on the face of the first avatar so as to reach a standard state.

Description

Program, method executed by computer, and computer

The present invention relates to a program, a method executed by a computer, and a computer.

Conventionally, there is a known technique in which a plurality of users communicate with each other via one shared virtual space (see, for example, Patent Document 1). Each user's avatar is placed in the virtual space, and users can chat with each other via their avatars. By detecting the movement of the user's face with the head-mounted device worn by the user, the detected movement can be reflected on the face of the avatar. The user can recognize the facial expression of the chat partner from the movement of the avatar's face, and can experience the dialogue in the virtual space as if it were a dialogue in the real world.

Patent Document 1: Japanese Patent No. 6298561

Over time, the avatar's face may become distorted. This happens because displacement of the head-mounted device, and deformation of the user's face when the user touches it, are reflected on the face of the avatar. Such distortion of the avatar's face can be corrected by performing a calibration that returns the face to a standard state.

Usually, what is displayed on the user's head-mounted device during a chat is a field-of-view image from the user's point of view. Such a first-person image includes the avatar of the chat partner but does not include the user's own avatar. Because the user cannot determine from the field-of-view image whether the face of the user's own avatar is distorted, the user has had to perform an operation instructing execution of the calibration each time the chat partner pointed out the distortion.

An object of the present invention is to calibrate the face of an avatar at an appropriate timing without any user operation.

According to one embodiment, there is provided a program for causing a computer to execute: a step of defining a virtual space; a step of arranging, in the virtual space, a first avatar associated with a first user; a step of detecting a facial expression of the first user; a step of controlling the facial expression of the first avatar according to the facial expression of the first user; and a step of executing a calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.
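As a non-limiting illustration, the steps listed above might be sketched as follows. The names `Avatar`, `VirtualSpace`, `update_avatar`, and `STANDARD_EXPRESSION`, as well as the representation of an expression as a dictionary of weights, are hypothetical and do not appear in the disclosure. The execution condition is passed in as a predicate because this passage does not specify it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical "standard state": neutral weights for each facial feature.
STANDARD_EXPRESSION: Dict[str, float] = {"mouth_open": 0.0, "brow_raise": 0.0}


@dataclass
class Avatar:
    user_id: str
    expression: Dict[str, float] = field(
        default_factory=lambda: dict(STANDARD_EXPRESSION)
    )


@dataclass
class VirtualSpace:
    avatars: Dict[str, Avatar] = field(default_factory=dict)

    def place_avatar(self, user_id: str) -> Avatar:
        """Arrange an avatar associated with the given user in the space."""
        avatar = Avatar(user_id)
        self.avatars[user_id] = avatar
        return avatar


def update_avatar(
    avatar: Avatar,
    detected: Dict[str, float],
    should_calibrate: Callable[[Avatar], bool],
) -> bool:
    """Reflect the detected expression, then calibrate if the condition holds.

    Returns True when the calibration was executed.
    """
    avatar.expression = dict(detected)
    if should_calibrate(avatar):
        # Calibration: control the avatar's expression to the standard state.
        avatar.expression = dict(STANDARD_EXPRESSION)
        return True
    return False
```

In this sketch the caller decides when calibration runs, so no user operation is involved once the predicate is supplied.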

According to the present disclosure, the face of the avatar can be calibrated at an appropriate timing without any user operation.

A diagram showing an outline of the configuration of an HMD system according to an embodiment.
A block diagram showing an example of the hardware configuration of a computer according to an embodiment.
A diagram conceptually representing the uvw field-of-view coordinate system set in an HMD according to an embodiment.
A diagram conceptually representing one aspect of expressing a virtual space according to an embodiment.
A top view of the head of a user wearing an HMD according to an embodiment.
A diagram showing a YZ cross section of the field-of-view area viewed from the X direction in the virtual space.
A diagram showing an XZ cross section of the field-of-view area viewed from the Y direction in the virtual space.
A diagram showing a schematic configuration of a controller according to an embodiment.
A diagram showing an example of the yaw, roll, and pitch directions defined for the right hand of a user according to an embodiment.
A block diagram showing an example of the hardware configuration of a server according to an embodiment.
A block diagram showing a computer according to an embodiment as a module configuration.
A sequence chart showing part of the processing performed in an HMD set according to an embodiment.
A schematic diagram showing a situation in which each HMD provides a virtual space to a user in a network.
A diagram showing the field-of-view image of the user 5A in FIG. 12A.
A sequence diagram showing processing performed in the HMD system according to an embodiment.
A block diagram showing the detailed configuration of the modules of a computer according to an embodiment.
A diagram explaining the process of detecting a mouth from a face image of a user.
A diagram (part 1) explaining the process by which the face organ detection module detects the shape of a mouth.
A diagram (part 2) explaining the process by which the face organ detection module detects the shape of a mouth.
A diagram showing an example of the structure of face tracking data.
A flowchart showing processing performed by a computer according to an embodiment.
A diagram showing a virtual space shared with other computers.
A diagram showing an example of a field-of-view image including a menu.
A diagram showing an example of a field-of-view image including the avatar of a user of another computer.
A diagram showing an example of a field-of-view image including an avatar before and after calibration.

Hereinafter, embodiments of this technical idea will be described in detail with reference to the drawings. In the following description, the same parts are designated by the same reference numerals. Their names and functions are also the same, so detailed descriptions of them will not be repeated. In one or more embodiments set forth in the present disclosure, the elements included in each embodiment may be combined with each other, and the resulting combinations also form part of the embodiments set forth in the present disclosure.

[HMD system configuration]
The configuration of the HMD (Head-Mounted Device) system 100 will be described with reference to FIG. 1. FIG. 1 is a diagram showing an outline of the configuration of the HMD system 100 according to the present embodiment.
The HMD system 100 is provided as a home system or a business system.

The HMD system 100 includes a server 600, HMD sets 110A, 110B, 110C, 110D, an external device 700, and a network 2. Each of the HMD sets 110A, 110B, 110C, and 110D is configured to be able to communicate with the server 600 and the external device 700 via the network 2. Hereinafter, the HMD sets 110A, 110B, 110C, and 110D are collectively referred to as the HMD set 110. The number of HMD sets 110 constituting the HMD system 100 is not limited to four, and may be three or less or five or more. The HMD set 110 includes an HMD 120, a computer 200, an HMD sensor 410, a display 430, and a controller 300. The HMD 120 includes a monitor 130, a gaze sensor 140, a first camera 150, a second camera 160, a microphone 170, and a speaker 180. The controller 300 may include a motion sensor 420.

In some aspects, the computer 200 can connect to the Internet or other network 2 and communicate with the server 600 or other computer connected to the network 2. Examples of other computers include computers of other HMD sets 110 and external devices 700. In another aspect, the HMD 120 may include a sensor 190 instead of the HMD sensor 410.

The HMD 120 may be worn on the head of the user 5 and provide the user 5 with a virtual space during operation. More specifically, the HMD 120 displays an image for the right eye and an image for the left eye on the monitor 130. When each eye of the user 5 visually recognizes the respective image, the user 5 can recognize the images as a three-dimensional image based on the parallax of both eyes. The HMD 120 may be either a so-called head-mounted display having a monitor, or a head-mounted device to which a smartphone or other terminal having a monitor can be attached.

The monitor 130 is realized as, for example, a non-transparent display device. In one aspect, the monitor 130 is arranged in the body of the HMD 120 so that it is located in front of both eyes of the user 5. Therefore, the user 5 can become immersed in the virtual space when visually recognizing the three-dimensional image displayed on the monitor 130. In one aspect, the virtual space includes, for example, a background, an object that the user 5 can manipulate, and an image of a menu that the user 5 can select. In a certain aspect, the monitor 130 can be realized as a liquid crystal monitor or an organic EL (Electro Luminescence) monitor included in a so-called smartphone or other information display terminal.

In another aspect, the monitor 130 can be realized as a transmissive display device. In this case, the HMD 120 may be an open type such as a glasses type rather than a closed type that covers the eyes of the user 5 as shown in FIG. 1. The transmissive monitor 130 may be temporarily configured as a non-transparent display device by adjusting its transmittance. The monitor 130 may include a configuration that simultaneously displays the real space and a part of the image constituting the virtual space. For example, the monitor 130 may display an image of the real space taken by a camera mounted on the HMD 120, or may make the real space visible by setting the transmittance of a part of the monitor to be high.

In some aspects, the monitor 130 may include a sub-monitor for displaying an image for the right eye and a sub-monitor for displaying an image for the left eye. In another aspect, the monitor 130 may be configured to display the image for the right eye and the image for the left eye as a unit. In this case, the monitor 130 includes a high speed shutter. The high-speed shutter operates so that the image for the right eye and the image for the left eye can be alternately displayed so that the image is recognized by only one of the eyes.

In one aspect, the HMD 120 includes a plurality of light sources (not shown). Each light source is realized by, for example, an LED (Light Emitting Diode) that emits infrared rays. The HMD sensor 410 has a position tracking function for detecting the movement of the HMD 120. More specifically, the HMD sensor 410 reads a plurality of infrared rays emitted by the HMD 120 and detects the position and inclination of the HMD 120 in the real space.

In another aspect, the HMD sensor 410 may be implemented by a camera. In this case, the HMD sensor 410 can detect the position and tilt of the HMD 120 by executing the image analysis process using the image information of the HMD 120 output from the camera.

In another aspect, the HMD 120 may include a sensor 190 as a position detector in place of the HMD sensor 410 or in addition to the HMD sensor 410. The HMD 120 can use the sensor 190 to detect the position and tilt of the HMD 120 itself. For example, if the sensor 190 is an angular velocity sensor, a geomagnetic sensor, or an accelerometer, the HMD 120 may use any of these sensors instead of the HMD sensor 410 to detect its position and tilt. As an example, when the sensor 190 is an angular velocity sensor, the angular velocity sensor detects the angular velocity around the three axes of the HMD 120 in real space over time. The HMD 120 calculates the temporal change of the angle around the three axes of the HMD 120 based on each angular velocity, and further calculates the inclination of the HMD 120 based on the temporal change of the angle.
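As a non-limiting illustration, the calculation of the angle around each of the three axes from angular velocity samples might be sketched as follows. The function name `integrate_angular_velocity` and the sample format are hypothetical. Summing each axis independently is a simplification that assumes small rotations between samples; a practical implementation might instead accumulate the orientation as a quaternion.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def integrate_angular_velocity(
    samples: Iterable[Tuple[float, Vec3]],
    initial: Vec3 = (0.0, 0.0, 0.0),
) -> Vec3:
    """Accumulate the angle (rad) around each axis over time.

    Each sample is (dt, (wx, wy, wz)), where dt is the sample interval in
    seconds and wx, wy, wz are the angular velocities in rad/s around the
    three axes of the HMD.
    """
    ax, ay, az = initial
    for dt, (wx, wy, wz) in samples:
        # Temporal change of the angle = angular velocity x elapsed time.
        ax += wx * dt
        ay += wy * dt
        az += wz * dt
    return (ax, ay, az)


# Ten samples of 0.1 s each while turning at 1 rad/s around the second axis.
tilt = integrate_angular_velocity([(0.1, (0.0, 1.0, 0.0))] * 10)
```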

The gaze sensor 140 detects the directions in which the lines of sight of the right eye and the left eye of the user 5 are directed. That is, the gaze sensor 140 detects the line of sight of the user 5. The detection of the direction of the line of sight is realized by, for example, a known eye tracking function. The gaze sensor 140 is realized by a sensor having the eye tracking function. In certain aspects, the gaze sensor 140 preferably includes a sensor for the right eye and a sensor for the left eye. The gaze sensor 140 may be, for example, a sensor that irradiates the right eye and the left eye of the user 5 with infrared light and detects the angle of rotation of each eyeball by receiving the light reflected from the cornea and the iris in response to the irradiation light. The gaze sensor 140 can detect the line of sight of the user 5 based on each of the detected rotation angles.

The first camera 150 captures the lower part of the user 5's face. More specifically, the first camera 150 captures the nose, mouth, and the like of the user 5. The second camera 160 captures the eyes, eyebrows, and the like of the user 5. The housing on the user 5 side of the HMD 120 is defined as the inside of the HMD 120, and the housing on the side opposite to the user 5 of the HMD 120 is defined as the outside of the HMD 120. In some aspects, the first camera 150 may be located outside the HMD 120 and the second camera 160 may be located inside the HMD 120. The images generated by the first camera 150 and the second camera 160 are input to the computer 200. In another aspect, the first camera 150 and the second camera 160 may be realized as one camera, and the face of the user 5 may be photographed by this one camera.

The microphone 170 converts the utterance of the user 5 into an audio signal (electric signal) and outputs it to the computer 200. The speaker 180 converts the voice signal into voice and outputs it to the user 5. In another aspect, the HMD 120 may include earphones instead of the speaker 180.

The controller 300 is connected to the computer 200 by wire or wirelessly. The controller 300 receives an instruction input from the user 5 to the computer 200. In one aspect, the controller 300 is configured to be grippable by the user 5. In another aspect, the controller 300 is configured to be wearable on a part of the user 5's body or clothing. In yet another aspect, the controller 300 may be configured to output at least one of vibration, sound, and light based on a signal transmitted from the computer 200. In yet another aspect, the controller 300 receives from the user 5 an operation for controlling the position and movement of the object arranged in the virtual space.

In one aspect, the controller 300 includes a plurality of light sources. Each light source is realized by, for example, an LED that emits infrared rays. The HMD sensor 410 has a position tracking function. In this case, the HMD sensor 410 reads a plurality of infrared rays emitted by the controller 300 and detects the position and inclination of the controller 300 in the real space. In another aspect, the HMD sensor 410 may be implemented by a camera. In this case, the HMD sensor 410 can detect the position and tilt of the controller 300 by executing the image analysis process using the image information of the controller 300 output from the camera.

In a certain aspect, the motion sensor 420 is attached to the hand of the user 5 to detect the movement of the hand. For example, the motion sensor 420 detects the rotation speed, the number of rotations, and the like of the hand. The detected signal is sent to the computer 200. The motion sensor 420 is provided in the controller 300, for example. In a certain aspect, the motion sensor 420 is provided in, for example, a controller 300 configured to be grippable by the user 5. In another aspect, for safety in the real space, the controller 300 is of a type, such as a glove type, that is worn on the hand of the user 5 and therefore does not easily fly off. In yet another aspect, a sensor not attached to the user 5 may detect the movement of the hand of the user 5. For example, the signal of a camera that captures the user 5 may be input to the computer 200 as a signal representing the operation of the user 5. As an example, the motion sensor 420 and the computer 200 are wirelessly connected to each other. In the case of wireless communication, the communication mode is not particularly limited, and for example, Bluetooth (registered trademark) or another known communication method is used.

The display 430 displays an image similar to the image displayed on the monitor 130. As a result, users other than the user 5 wearing the HMD 120 can also view the same image as the user 5. The image displayed on the display 430 does not have to be a three-dimensional image, and may be an image for the right eye or an image for the left eye. Examples of the display 430 include a liquid crystal display and an organic EL monitor.

The server 600 may send the program to the computer 200. In another aspect, the server 600 may communicate with another computer 200 to provide virtual reality to the HMD 120 used by another user. For example, in an amusement facility, when a plurality of users play a participatory game, each computer 200 communicates a signal based on the operation of each user with another computer 200 via the server 600, allowing the plurality of users to enjoy a common game in the same virtual space. Each computer 200 may communicate a signal based on the operation of each user with another computer 200 without going through the server 600.

The external device 700 may be any device as long as it can communicate with the computer 200. The external device 700 may be, for example, a device capable of communicating with the computer 200 via the network 2, or a device capable of directly communicating with the computer 200 by short-range wireless communication or a wired connection. Examples of the external device 700 include, but are not limited to, smart devices, PCs (Personal Computers), and peripheral devices of the computer 200.

[Computer hardware configuration]
The computer 200 according to the present embodiment will be described with reference to FIG. 2. FIG. 2 is a block diagram showing an example of the hardware configuration of the computer 200 according to the present embodiment. The computer 200 includes a processor 210, a memory 220, a storage 230, an input / output interface 240, and a communication interface 250 as main components. Each component is connected to a bus 260.

The processor 210 executes a series of instructions included in the program stored in the memory 220 or the storage 230 based on the signal given to the computer 200 or when a predetermined condition is satisfied. In a certain aspect, the processor 210 is realized as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an MPU (Micro Processor Unit), an FPGA (Field-Programmable Gate Array), or other device.

The memory 220 temporarily stores programs and data. The program is loaded from storage 230, for example. The data includes data input to the computer 200 and data generated by the processor 210. In a certain aspect, the memory 220 is realized as a RAM (Random Access Memory) or other volatile memory.

Storage 230 holds programs and data permanently. The storage 230 is realized as, for example, a ROM (Read-Only Memory), a hard disk device, a flash memory, or other non-volatile storage device. The program stored in the storage 230 includes a program for providing a virtual space in the HMD system 100, a simulation program, a game program, a user authentication program, and a program for realizing communication with another computer 200. The data stored in the storage 230 includes data, objects, and the like for defining the virtual space.

In another aspect, the storage 230 may be realized as a removable storage device such as a memory card. In yet another aspect, a configuration that uses programs and data stored in an external storage device may be used instead of the storage 230 built into the computer 200. According to such a configuration, for example, in a scene where a plurality of HMD systems 100 are used such as an amusement facility, it is possible to update programs and data at once.

The input / output interface 240 communicates signals with the HMD 120, the HMD sensor 410, the motion sensor 420, and the display 430. The monitor 130, the gaze sensor 140, the first camera 150, the second camera 160, the microphone 170, and the speaker 180 included in the HMD 120 can communicate with the computer 200 via the input / output interface 240 of the HMD 120. In a certain aspect, the input / output interface 240 is realized by using USB (Universal Serial Bus), DVI (Digital Visual Interface), HDMI (registered trademark) (High-Definition Multimedia Interface) and other terminals. The input / output interface 240 is not limited to the above.

In some aspects, the input / output interface 240 may further communicate with the controller 300. For example, the input / output interface 240 receives input of signals output from the controller 300 and the motion sensor 420. In another aspect, the input / output interface 240 sends a command output from the processor 210 to the controller 300. The command instructs the controller 300 to vibrate, output voice, emit light, and the like. Upon receiving the command, the controller 300 executes any of vibration, voice output, and light emission in response to the command.

The communication interface 250 is connected to the network 2 and communicates with another computer (for example, the server 600) connected to the network 2. In a certain aspect, the communication interface 250 is realized as, for example, a LAN (Local Area Network) or other wired communication interface, or as a WiFi (Wireless Fidelity), Bluetooth (registered trademark), NFC (Near Field Communication), or other wireless communication interface. The communication interface 250 is not limited to the above.

In one aspect, the processor 210 accesses the storage 230, loads one or more programs stored in the storage 230 into the memory 220, and executes a series of instructions contained in the programs. The one or more programs may include the operating system of the computer 200, an application program for providing a virtual space, game software that can be executed in the virtual space, and the like. The processor 210 sends a signal for providing the virtual space to the HMD 120 via the input / output interface 240. The HMD 120 displays an image on the monitor 130 based on the signal.

In the example shown in FIG. 2, the computer 200 is configured to be provided outside the HMD 120, but in another aspect, the computer 200 may be built in the HMD 120. As an example, a portable information communication terminal (for example, a smartphone) including a monitor 130 may function as a computer 200.

The computer 200 may have a configuration commonly used for a plurality of HMD 120s. According to such a configuration, for example, the same virtual space can be provided to a plurality of users, so that each user can enjoy the same application as other users in the same virtual space.

In a certain embodiment, a real coordinate system, which is a coordinate system in the real space, is preset in the HMD system 100. The real coordinate system has three reference directions (axes) that are parallel to the vertical direction in the real space, the horizontal direction orthogonal to the vertical direction, and the front-back direction orthogonal to both the vertical direction and the horizontal direction. The horizontal direction, the vertical direction, and the front-back direction in the real coordinate system are defined as the x-axis, the y-axis, and the z-axis, respectively. More specifically, in the real coordinate system, the x-axis is parallel to the horizontal direction in the real space, the y-axis is parallel to the vertical direction in the real space, and the z-axis is parallel to the front-back direction in the real space.

In some aspects, the HMD sensor 410 includes an infrared sensor. When the infrared sensor detects infrared rays emitted from each light source of the HMD 120, the presence of the HMD 120 is detected. The HMD sensor 410 further detects the position and inclination (orientation) of the HMD 120 in the real space, which change according to the movement of the user 5 wearing the HMD 120, based on the value of each point (each coordinate value in the real coordinate system). More specifically, the HMD sensor 410 can detect a temporal change in the position and inclination of the HMD 120 by using each value detected over time.

Each inclination of the HMD 120 detected by the HMD sensor 410 corresponds to an inclination of the HMD 120 around one of the three axes in the real coordinate system. The HMD sensor 410 sets the uvw field-of-view coordinate system in the HMD 120 based on the inclination of the HMD 120 in the real coordinate system. The uvw field-of-view coordinate system set in the HMD 120 corresponds to the viewpoint coordinate system when the user 5 wearing the HMD 120 sees an object in the virtual space.

[uvw field-of-view coordinate system]
The uvw field-of-view coordinate system will be described with reference to FIG. 3. FIG. 3 is a diagram conceptually representing the uvw field-of-view coordinate system set in the HMD 120 according to an embodiment. The HMD sensor 410 detects the position and tilt of the HMD 120 in the real coordinate system when the HMD 120 is activated. The processor 210 sets the uvw field-of-view coordinate system in the HMD 120 based on the detected values.

As shown in FIG. 3, the HMD 120 sets a three-dimensional uvw field-of-view coordinate system whose center (origin) is the head of the user 5 wearing the HMD 120. More specifically, the HMD 120 tilts the horizontal, vertical, and front-back directions (x-axis, y-axis, z-axis) that define the real coordinate system around the respective axes by the inclination of the HMD 120 around each axis in the real coordinate system. The three directions newly obtained in this way are set as the pitch axis (u-axis), the yaw axis (v-axis), and the roll axis (w-axis) of the uvw field-of-view coordinate system in the HMD 120.

In a certain aspect, when the user 5 wearing the HMD 120 is upright and looking straight ahead, the processor 210 sets a uvw field-of-view coordinate system parallel to the real coordinate system in the HMD 120. In this case, the horizontal direction (x-axis), vertical direction (y-axis), and front-back direction (z-axis) in the real coordinate system coincide with the pitch axis (u-axis), yaw axis (v-axis), and roll axis (w-axis), respectively, of the uvw field-of-view coordinate system in the HMD 120.
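As a non-limiting illustration, the derivation of the u, v, and w axes from the inclination of the HMD 120 might be sketched as follows. The function name `uvw_axes`, the rotation order (yaw, then pitch, then roll), and the sign conventions of the rotation matrices are assumptions made for this sketch; the disclosure does not specify them.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


def rot_x(angle: float) -> Mat3:
    """Rotation around the x-axis (pitch)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(angle: float) -> Mat3:
    """Rotation around the y-axis (yaw)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(angle: float) -> Mat3:
    """Rotation around the z-axis (roll)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Mat3, b: Mat3) -> Mat3:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m: Mat3, v: Vec3) -> Vec3:
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def uvw_axes(pitch: float, yaw: float, roll: float) -> Tuple[Vec3, Vec3, Vec3]:
    """Tilt the real x, y, z axes by the HMD inclination to obtain u, v, w."""
    # Assumed composition order: yaw * pitch * roll.
    r = matmul(rot_y(yaw), matmul(rot_x(pitch), rot_z(roll)))
    u = apply(r, (1.0, 0.0, 0.0))  # pitch axis
    v = apply(r, (0.0, 1.0, 0.0))  # yaw axis
    w = apply(r, (0.0, 0.0, 1.0))  # roll axis
    return u, v, w
```

With zero inclination the function returns the real axes unchanged, which corresponds to the case of an upright user looking straight ahead.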

After the uvw field-of-view coordinate system is set in the HMD 120, the HMD sensor 410 can detect the tilt of the HMD 120 in the set uvw field-of-view coordinate system based on the movement of the HMD 120. In this case, the HMD sensor 410 detects the pitch angle (θu), yaw angle (θv), and roll angle (θw) of the HMD 120 in the uvw field-of-view coordinate system as the inclination of the HMD 120. The pitch angle (θu) represents the tilt angle of the HMD 120 around the pitch axis in the uvw field-of-view coordinate system. The yaw angle (θv) represents the tilt angle of the HMD 120 around the yaw axis in the uvw field-of-view coordinate system. The roll angle (θw) represents the tilt angle of the HMD 120 around the roll axis in the uvw field-of-view coordinate system.

Based on the detected inclination of the HMD 120, the HMD sensor 410 sets, in the HMD 120, the uvw field-of-view coordinate system of the HMD 120 after the HMD 120 has moved. The relationship between the HMD 120 and the uvw field-of-view coordinate system of the HMD 120 is always constant regardless of the position and inclination of the HMD 120. When the position and inclination of the HMD 120 change, the position and inclination of the uvw field-of-view coordinate system of the HMD 120 in the real coordinate system change accordingly.

In one aspect, the HMD sensor 410 may identify the position of the HMD 120 in the real space as a relative position with respect to the HMD sensor 410, based on the intensity of the infrared light obtained from the output of the infrared sensor and the relative positional relationship between a plurality of points (e.g., the distance between the points). The processor 210 may determine the origin of the uvw field-of-view coordinate system of the HMD 120 in the real space (real coordinate system) based on the identified relative position.

[Virtual space]
The virtual space will be further described with reference to FIG. 4. FIG. 4 is a diagram conceptually representing one aspect of expressing the virtual space 11 according to a certain embodiment. The virtual space 11 has an all-sky spherical structure that covers the entire 360-degree range around the center 12. In FIG. 4, only the upper half of the celestial sphere of the virtual space 11 is illustrated so as not to complicate the explanation. Meshes are defined in the virtual space 11. The position of each mesh is predetermined as a coordinate value in the XYZ coordinate system, which is a global coordinate system defined in the virtual space 11. The computer 200 associates each partial image constituting the panoramic image 13 (still image, moving image, etc.) that can be developed in the virtual space 11 with the corresponding mesh in the virtual space 11.
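As a non-limiting illustration, the association between meshes on the celestial sphere and partial images of the panoramic image 13 might be sketched as follows. The latitude/longitude grid, the equirectangular image layout, and the names `mesh_center` and `build_mesh_table` are assumptions made for this sketch; the disclosure does not specify how the meshes are laid out.

```python
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def mesh_center(row: int, col: int, rows: int, cols: int,
                radius: float = 1.0) -> Vec3:
    """XYZ coordinate of the centre of mesh (row, col) on the sphere.

    The sphere is centred on the origin (the center 12); row 0 is nearest
    the +Y pole and column 0 starts at the +Z direction (assumed layout).
    """
    polar = math.pi * (row + 0.5) / rows
    azimuth = 2.0 * math.pi * (col + 0.5) / cols
    x = radius * math.sin(polar) * math.sin(azimuth)
    y = radius * math.cos(polar)
    z = radius * math.sin(polar) * math.cos(azimuth)
    return (x, y, z)


def build_mesh_table(rows: int, cols: int, image_w: int,
                     image_h: int) -> Dict[Tuple[int, int], Tuple[Vec3, Rect]]:
    """Associate each mesh with the partial image (pixel rectangle) it shows."""
    table: Dict[Tuple[int, int], Tuple[Vec3, Rect]] = {}
    for row in range(rows):
        for col in range(cols):
            rect = (col * image_w // cols, row * image_h // rows,
                    (col + 1) * image_w // cols, (row + 1) * image_h // rows)
            table[(row, col)] = (mesh_center(row, col, rows, cols), rect)
    return table


# A coarse 4 x 8 grid over an 800 x 400 pixel panoramic image.
table = build_mesh_table(4, 8, 800, 400)
```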

In a certain aspect, the virtual space 11 defines an XYZ coordinate system with the center 12 as the origin. The XYZ coordinate system is, for example, parallel to the real coordinate system. The horizontal direction, vertical direction, and front-back direction in the XYZ coordinate system are defined as the X-axis, the Y-axis, and the Z-axis, respectively. Therefore, the X-axis (horizontal direction) of the XYZ coordinate system is parallel to the x-axis of the real coordinate system, the Y-axis (vertical direction) of the XYZ coordinate system is parallel to the y-axis of the real coordinate system, and the Z-axis (front-back direction) of the XYZ coordinate system is parallel to the z-axis of the real coordinate system.

At the time of starting the HMD 120, that is, in the initial state of the HMD 120, the virtual camera 14 is arranged at the center 12 of the virtual space 11. In one aspect, the processor 210 displays an image captured by the virtual camera 14 on the monitor 130 of the HMD 120. The virtual camera 14 moves in the virtual space 11 in the same manner in conjunction with the movement of the HMD 120 in the real space. As a result, changes in the position and inclination of the HMD 120 in the real space can be similarly reproduced in the virtual space 11.

As in the case of the HMD 120, the virtual camera 14 is defined with an uvw field-of-view coordinate system. The uvw field-of-view coordinate system of the virtual camera 14 in the virtual space 11 is defined to be linked to the uvw field-of-view coordinate system of the HMD 120 in the real space (real coordinate system). Therefore, when the inclination of the HMD 120 changes, the inclination of the virtual camera 14 also changes accordingly. The virtual camera 14 can also move in the virtual space 11 in conjunction with the movement of the user 5 wearing the HMD 120 in the real space.
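As a non-limiting illustration, the linkage between the HMD 120 and the virtual camera 14 might be sketched as follows. The names `Pose` and `VirtualCamera`, the representation of the tilt as a (pitch, yaw, roll) tuple, and the one-to-one scale between real and virtual movement are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Pose:
    position: Vec3 = (0.0, 0.0, 0.0)
    tilt: Vec3 = (0.0, 0.0, 0.0)  # (pitch, yaw, roll) in radians


class VirtualCamera:
    """Camera that starts at the center of the virtual space and follows the HMD."""

    def __init__(self, center: Vec3 = (0.0, 0.0, 0.0)) -> None:
        self.pose = Pose(position=center)
        self._last_hmd: Optional[Pose] = None

    def follow(self, hmd: Pose) -> None:
        """Reproduce the change in HMD position and tilt in the virtual space."""
        position = self.pose.position
        if self._last_hmd is not None:
            # Apply the HMD displacement since the previous sample (1:1 scale).
            position = tuple(
                p + (now - before)
                for p, now, before in zip(
                    position, hmd.position, self._last_hmd.position
                )
            )
        # The camera tilt is linked directly to the HMD tilt.
        self.pose = Pose(position=position, tilt=hmd.tilt)
        self._last_hmd = hmd
```

The first call to `follow` only records the initial pose of the HMD, so the camera remains at the center until the HMD actually moves.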

The processor 210 of the computer 200 defines the field-of-view area 15 in the virtual space 11 based on the position and tilt (reference line of sight 16) of the virtual camera 14. The field-of-view area 15 corresponds to the area in the virtual space 11 that is visually recognized by the user 5 wearing the HMD 120. That is, the position of the virtual camera 14 can be said to be the viewpoint of the user 5 in the virtual space 11.

The line of sight of the user 5 detected by the gaze sensor 140 is a direction in the viewpoint coordinate system when the user 5 visually recognizes an object. The uvw field-of-view coordinate system of the HMD 120 is equal to the viewpoint coordinate system when the user 5 visually recognizes the monitor 130. The uvw field-of-view coordinate system of the virtual camera 14 is linked to the uvw field-of-view coordinate system of the HMD 120. Therefore, the HMD system 100 according to a certain aspect can consider the line of sight of the user 5 detected by the gaze sensor 140 as the line of sight of the user 5 in the uvw field of view coordinate system of the virtual camera 14.

[User's line of sight]
The determination of the line of sight of the user 5 will be described with reference to FIG. 5. FIG. 5 is a top view of the head of the user 5 who wears the HMD 120 according to an embodiment.

In one aspect, the gaze sensor 140 detects each line of sight of the user 5's right and left eyes. In a certain aspect, when the user 5 is looking near, the gaze sensor 140 detects the lines of sight R1 and L1. In another aspect, when the user 5 is looking far away, the gaze sensor 140 detects the lines of sight R2 and L2. In this case, the angle formed by the lines of sight R2 and L2 with respect to the roll axis w is smaller than the angle formed by the lines of sight R1 and L1 with respect to the roll axis w. The gaze sensor 140 transmits the detection result to the computer 200.

When the computer 200 receives the detected values of the lines of sight R1 and L1 from the gaze sensor 140 as the detection result of the line of sight, the computer 200 identifies the gaze point N1, which is the intersection of the lines of sight R1 and L1, based on the detected values. On the other hand, when the computer 200 receives the detected values of the lines of sight R2 and L2 from the gaze sensor 140, the computer 200 identifies the intersection of the lines of sight R2 and L2 as the gaze point. The computer 200 identifies the line of sight N0 of the user 5 based on the position of the identified gaze point N1. The computer 200 detects, for example, the direction of the straight line that passes through the gaze point N1 and the midpoint of the line segment connecting the right eye R and the left eye L of the user 5 as the line of sight N0. The line of sight N0 is the direction in which the user 5 actually directs the line of sight with both eyes, and corresponds to the direction in which the user 5 actually directs the line of sight with respect to the field-of-view region 15.
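
As a non-limiting illustration, the identification of the gaze point N1 and the line of sight N0 described above can be sketched as follows. The function names are hypothetical. Because measured lines of sight rarely intersect exactly, the sketch assumes that the midpoint of the shortest segment between the two lines is used as the gaze point; the present disclosure itself only refers to the intersection.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _along(origin, direction, t):
    return tuple(o + t * d for o, d in zip(origin, direction))


def gaze_point(right_eye, right_dir, left_eye, left_dir, eps=1e-9):
    """Approximate the intersection of the right and left lines of sight.

    Returns the midpoint of the closest approach of the two lines,
    or None when the lines are (nearly) parallel.
    """
    w = _sub(right_eye, left_eye)
    a = _dot(right_dir, right_dir)
    b = _dot(right_dir, left_dir)
    c = _dot(left_dir, left_dir)
    d = _dot(right_dir, w)
    e = _dot(left_dir, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None  # parallel lines of sight: no finite gaze point
    t = (b * e - c * d) / denom  # parameter along the right line of sight
    s = (a * e - b * d) / denom  # parameter along the left line of sight
    q_right = _along(right_eye, right_dir, t)
    q_left = _along(left_eye, left_dir, s)
    return tuple((x + y) / 2 for x, y in zip(q_right, q_left))


def line_of_sight(right_eye, left_eye, point):
    """Unit vector from the midpoint between the eyes toward the gaze point."""
    mid = tuple((x + y) / 2 for x, y in zip(right_eye, left_eye))
    v = _sub(point, mid)
    norm = math.sqrt(_dot(v, v))
    return tuple(x / norm for x in v)
```

With the eyes placed symmetrically about the origin and both lines of sight converging on a point straight ahead, `gaze_point` returns that point and `line_of_sight` returns the forward direction.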

In another aspect, the HMD system 100 may include a television broadcast receiving tuner. According to such a configuration, the HMD system 100 can display a television program in the virtual space 11.

In yet another aspect, the HMD system 100 may include a communication circuit for connecting to the Internet or a telephone function for connecting to a telephone line.

[Field-of-view region]
The field-of-view region 15 will be described with reference to FIGS. 6 and 7. FIG. 6 is a diagram showing a YZ cross section of the field-of-view region 15 viewed from the X direction in the virtual space 11. FIG. 7 is a diagram showing an XZ cross section of the field-of-view region 15 viewed from the Y direction in the virtual space 11.

As shown in FIG. 6, the field of view region 15 in the YZ cross section includes the region 18. The region 18 is defined by the position of the virtual camera 14, the reference line of sight 16, and the YZ cross section of the virtual space 11. The processor 210 defines a range including the polar angle α centered on the reference line of sight 16 in the virtual space as a region 18.

As shown in FIG. 7, the field of view region 15 in the XZ cross section includes the region 19. The region 19 is defined by the position of the virtual camera 14, the reference line of sight 16, and the XZ cross section of the virtual space 11. The processor 210 defines a range including the azimuth angle β centered on the reference line of sight 16 in the virtual space 11 as a region 19. The polar angle α and the azimuth angle β are determined according to the position of the virtual camera 14 and the inclination (orientation) of the virtual camera 14.
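
As a non-limiting illustration, whether a given direction falls within the regions 18 and 19 can be sketched as follows. The function name `in_field_of_view` is hypothetical. The sketch assumes that the direction is expressed in the coordinate system of the virtual camera 14 with the reference line of sight 16 along the third axis, and that α and β are full angles centered on the reference line of sight 16 (that is, ±α/2 and ±β/2); the present disclosure does not state whether the angles are full or half angles.

```python
import math


def in_field_of_view(direction, alpha_deg, beta_deg):
    """Return True if the direction lies inside both region 18 and region 19.

    direction: (u, v, w) in camera coordinates, with the reference line of
    sight along +w (assumption of this sketch).
    """
    u, v, w = direction
    if w <= 0:
        return False  # pointing away from the reference line of sight
    # Angle measured in the vertical cross section (region 18, angle alpha).
    vertical = math.degrees(math.atan2(v, w))
    # Angle measured in the horizontal cross section (region 19, angle beta).
    horizontal = math.degrees(math.atan2(u, w))
    return abs(vertical) <= alpha_deg / 2 and abs(horizontal) <= beta_deg / 2
```

Under these assumptions, a direction 45 degrees to the side of the reference line of sight is inside a region with β = 110 degrees, while a direction about 63 degrees above it is outside a region with α = 90 degrees.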

In a certain aspect, the HMD system 100 provides the user 5 with a field of view in the virtual space 11 by displaying the field-of-view image 17 on the monitor 130 based on the signal from the computer 200. The field-of-view image 17 is the portion of the panoramic image 13 that corresponds to the field-of-view region 15. When the user 5 moves the HMD 120 attached to the head, the virtual camera 14 also moves in conjunction with the movement, and the position of the field-of-view region 15 in the virtual space 11 changes. As a result, the field-of-view image 17 displayed on the monitor 130 is updated to the portion of the panoramic image 13 that overlaps the field-of-view region 15 in the direction in which the user 5 faces in the virtual space 11. The user 5 can thus visually recognize a desired direction in the virtual space 11.

As described above, the inclination of the virtual camera 14 corresponds to the line of sight (reference line of sight 16) of the user 5 in the virtual space 11, and the position where the virtual camera 14 is arranged corresponds to the viewpoint of the user 5 in the virtual space 11. Therefore, by changing the position or tilt of the virtual camera 14, the image displayed on the monitor 130 is updated, and the field of view of the user 5 is moved.

While wearing the HMD 120, the user 5 can visually recognize only the panoramic image 13 developed in the virtual space 11 without visually recognizing the real world. Therefore, the HMD system 100 can give the user 5 a high sense of immersion in the virtual space 11.

In a certain aspect, the processor 210 may move the virtual camera 14 in the virtual space 11 in conjunction with the movement of the user 5 wearing the HMD 120 in the real space. In this case, the processor 210 identifies an image region (field of view region 15) projected onto the monitor 130 of the HMD 120 based on the position and tilt of the virtual camera 14 in the virtual space 11.

In some aspects, the virtual camera 14 may include two virtual cameras, a virtual camera for providing an image for the right eye and a virtual camera for providing an image for the left eye. Appropriate parallax is set for the two virtual cameras so that the user 5 can recognize the three-dimensional virtual space 11. In another aspect, the virtual camera 14 may be realized by one virtual camera. In this case, an image for the right eye and an image for the left eye may be generated from the image obtained by one virtual camera. In the present embodiment, the virtual camera 14 includes two virtual cameras, and the roll axis (w) generated by synthesizing the roll axes of the two virtual cameras is adapted to the roll axis (w) of the HMD 120. The technical idea of the present disclosure is illustrated as being configured as such.
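
As a non-limiting illustration, the arrangement of the two virtual cameras and the synthesis of their roll axes can be sketched as follows. The function names are hypothetical. The sketch assumes that the parallax is realized by offsetting the two cameras by half a given distance to either side of the position of the virtual camera 14, and that "synthesizing" the roll axes means normalizing their sum; neither detail is specified in the present disclosure.

```python
import math


def stereo_camera_positions(center, right_axis, parallax):
    """Return (left, right) camera positions offset along right_axis.

    center: position of the virtual camera 14.
    right_axis: unit vector pointing to the right of the camera.
    parallax: distance between the two cameras (assumed parameter).
    """
    half = parallax / 2.0
    left = tuple(c - half * r for c, r in zip(center, right_axis))
    right = tuple(c + half * r for c, r in zip(center, right_axis))
    return left, right


def synthesize_roll_axis(roll_left, roll_right):
    """Combine two roll axes into one unit vector (normalized sum)."""
    summed = tuple(l + r for l, r in zip(roll_left, roll_right))
    norm = math.sqrt(sum(x * x for x in summed))
    return tuple(x / norm for x in summed)
```

The synthesized axis would then be aligned with the roll axis (w) of the HMD 120.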

[Controller]
An example of the controller 300 will be described with reference to FIG. 8. FIG. 8 is a diagram showing a schematic configuration of a controller 300 according to an embodiment.

As shown in FIG. 8, in some aspects, the controller 300 may include a right controller 300R and a left controller (not shown). The right controller 300R is operated by the right hand of the user 5. The left controller is operated by the left hand of the user 5. In a certain aspect, the right controller 300R and the left controller are symmetrically configured as separate devices. Therefore, the user 5 can freely move the right hand holding the right controller 300R and the left hand holding the left controller. In another aspect, the controller 300 may be an integrated controller that accepts operations of both hands. Hereinafter, the right controller 300R will be described.

The right controller 300R includes a grip 310, a frame 320, and a top surface 330. The grip 310 is configured to be gripped by the right hand of the user 5. For example, the grip 310 may be held by the palm of the user 5's right hand and three fingers (middle finger, ring finger, little finger).

The grip 310 includes buttons 340 and 350 and a motion sensor 420. The button 340 is arranged on the side surface of the grip 310 and accepts an operation by the middle finger of the right hand. The button 350 is arranged on the front surface of the grip 310 and accepts an operation by the index finger of the right hand. In one aspect, the buttons 340 and 350 are configured as trigger-type buttons. The motion sensor 420 is built into the housing of the grip 310. If the movement of the user 5 can be detected from around the user 5 by a camera or other device, the grip 310 need not include the motion sensor 420.

The frame 320 includes a plurality of infrared LEDs 360 arranged along its circumferential direction. While a program using the controller 300 is being executed, the infrared LEDs 360 emit infrared rays in accordance with the progress of the program. The infrared rays emitted from the infrared LEDs 360 can be used to detect the position and orientation (tilt, direction) of each of the right controller 300R and the left controller. In the example shown in FIG. 8, the infrared LEDs 360 are arranged in two rows, but the number of rows is not limited to that shown in FIG. 8. An arrangement of one row or of three or more rows may be used.

The top surface 330 includes buttons 370 and 380 and an analog stick 390. The buttons 370 and 380 are configured as push-type buttons and accept operations by the thumb of the user 5's right hand. In a certain aspect, the analog stick 390 accepts an operation in any direction through 360 degrees from the initial position (neutral position). The operation includes, for example, an operation for moving an object arranged in the virtual space 11.

In one aspect, the right controller 300R and the left controller include a battery for driving the infrared LED 360 and other components. Batteries include, but are not limited to, rechargeable, button type, dry cell type and the like. In another aspect, the right controller 300R and the left controller may be connected to, for example, the USB interface of the computer 200. In this case, the right controller 300R and the left controller do not require batteries.

As shown in the states (A) and (B) of FIG. 8, for example, the yaw, roll, and pitch directions are defined with respect to the right hand of the user 5. When the user 5 extends the thumb and the index finger, the direction in which the thumb extends is defined as the yaw direction, the direction in which the index finger extends is defined as the roll direction, and the direction perpendicular to the plane defined by the yaw direction axis and the roll direction axis is defined as the pitch direction.

[Server hardware configuration]
The server 600 according to the present embodiment will be described with reference to FIG. 9. FIG. 9 is a block diagram showing an example of the hardware configuration of the server 600 according to a certain embodiment. The server 600 includes a processor 610, a memory 620, a storage 630, an input/output interface 640, and a communication interface 650 as main components. Each component is connected to the bus 660.

The processor 610 executes a series of instructions contained in the program stored in the memory 620 or the storage 630 based on the signal given to the server 600 or the condition that a predetermined condition is satisfied. In some aspects, the processor 610 is implemented as a CPU, GPU, MPU, FPGA or other device.

Memory 620 temporarily stores programs and data. The program is loaded from storage 630, for example. The data includes data input to the server 600 and data generated by the processor 610. In one aspect, the memory 620 is realized as a RAM or other volatile memory.

Storage 630 permanently holds programs and data. The storage 630 is realized as, for example, a ROM, a hard disk device, a flash memory, or other non-volatile storage device. The program stored in the storage 630 may include a program for providing a virtual space in the HMD system 100, a simulation program, a game program, a user authentication program, and a program for realizing communication with the computer 200. The data stored in the storage 630 may include data, objects, and the like for defining the virtual space.

In another aspect, the storage 630 may be realized as a removable storage device such as a memory card. In yet another aspect, a configuration using programs and data stored in an external storage device may be used instead of the storage 630 built into the server 600. According to such a configuration, for example, in a scene where a plurality of HMD systems 100 are used such as an amusement facility, it is possible to update programs and data at once.

The input / output interface 640 communicates a signal with the input / output device. In some aspects, the input / output interface 640 is implemented using USB, DVI, HDMI® and other terminals. The input / output interface 640 is not limited to the above.

The communication interface 650 is connected to the network 2 and communicates with the computer 200 connected to the network 2. In some aspects, the communication interface 650 is implemented as, for example, a LAN or other wired communication interface, or a WiFi, Bluetooth, NFC or other wireless communication interface. The communication interface 650 is not limited to the above.

In one aspect, the processor 610 accesses the storage 630, loads one or more programs stored in the storage 630 into the memory 620, and executes a series of instructions contained in the program. The one or more programs may include an operating system for the server 600, an application program for providing the virtual space, game software that can be executed in the virtual space, and the like. The processor 610 may send a signal to the computer 200 to provide virtual space via the input / output interface 640.

[HMD control device]
The control device of the HMD 120 will be described with reference to FIG. 10. In certain embodiments, the control device is implemented by a computer 200 having a well-known configuration. FIG. 10 is a block diagram showing the module configuration of a computer 200 according to an embodiment.

As shown in FIG. 10, the computer 200 includes a control module 510, a rendering module 520, a memory module 530, and a communication control module 540. In some aspects, the control module 510 and the rendering module 520 are implemented by the processor 210. In another aspect, the plurality of processors 210 may operate as the control module 510 and the rendering module 520. The memory module 530 is realized by the memory 220 or the storage 230. The communication control module 540 is realized by the communication interface 250.

The control module 510 controls the virtual space 11 provided to the user 5. The control module 510 defines the virtual space 11 in the HMD system 100 by using the virtual space data representing the virtual space 11. The virtual space data is stored in, for example, the memory module 530. The control module 510 may generate virtual space data or acquire virtual space data from a server 600 or the like.

The control module 510 arranges an object in the virtual space 11 by using the object data representing the object. The object data is stored in, for example, the memory module 530. The control module 510 may generate object data or acquire object data from a server 600 or the like. The objects may include, for example, an avatar object that is the alter ego of the user 5, a character object, an operation object such as a virtual hand operated by the controller 300, and landscapes including forests and mountains, cityscapes, and animals arranged as the story of the game progresses.

The control module 510 arranges the avatar object of the user 5 of another computer 200 connected via the network 2 in the virtual space 11. In a certain aspect, the control module 510 arranges the avatar object of the user 5 in the virtual space 11. In a certain aspect, the control module 510 arranges an avatar object imitating the user 5 in the virtual space 11 based on the image including the user 5. In another aspect, the control module 510 arranges in the virtual space 11 an avatar object that has been selected by the user 5 from among a plurality of types of avatar objects (for example, an object imitating an animal or a deformed human object).

The control module 510 identifies the tilt of the HMD 120 based on the output of the HMD sensor 410. In another aspect, the control module 510 identifies the tilt of the HMD 120 based on the output of the sensor 190, which functions as a motion sensor. The control module 510 detects organs (for example, mouth, eyes, eyebrows) constituting the face of the user 5 from the images of the face of the user 5 generated by the first camera 150 and the second camera 160. The control module 510 detects the movement (shape) of each detected organ.

The control module 510 detects the line of sight of the user 5 in the virtual space 11 based on the signal from the gaze sensor 140. The control module 510 detects the viewpoint position (coordinate value in the XYZ coordinate system) at which the detected line of sight of the user 5 and the celestial sphere in the virtual space 11 intersect. More specifically, the control module 510 detects the viewpoint position based on the line of sight of the user 5 defined by the uvw coordinate system and the position and inclination of the virtual camera 14. The control module 510 transmits the detected viewpoint position to the server 600. In another aspect, the control module 510 may be configured to transmit line-of-sight information representing the line-of-sight of the user 5 to the server 600. In such a case, the viewpoint position can be calculated based on the line-of-sight information received by the server 600.
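
As a non-limiting illustration, the detection of the viewpoint position at which the line of sight intersects the celestial sphere can be sketched as follows. The function name is hypothetical. The sketch assumes that the celestial sphere is centered on the origin of the XYZ coordinate system and that the line of sight has already been converted from the uvw coordinate system into an XYZ direction using the inclination of the virtual camera 14.

```python
import math


def viewpoint_on_celestial_sphere(camera_pos, gaze_dir, radius):
    """Return the XYZ point where the gaze ray meets the celestial sphere.

    camera_pos: position of the virtual camera 14 (ray origin).
    gaze_dir: line of sight as an XYZ direction (need not be normalized).
    radius: radius of the celestial sphere, centered on the origin.
    Returns None if the ray does not reach the sphere.
    """
    norm = math.sqrt(sum(x * x for x in gaze_dir))
    d = tuple(x / norm for x in gaze_dir)
    # Solve |camera_pos + t * d|^2 = radius^2 for t (d is a unit vector).
    b = sum(p * x for p, x in zip(camera_pos, d))
    c = sum(p * p for p in camera_pos) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b + math.sqrt(disc)  # far root: the exit point seen from inside
    if t < 0:
        return None
    return tuple(p + t * x for p, x in zip(camera_pos, d))
```

In the alternative aspect in which only line-of-sight information is transmitted, the same calculation could be carried out on the server 600.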

The control module 510 reflects the movement of the HMD 120 detected by the HMD sensor 410 on the avatar object. For example, the control module 510 detects that the HMD 120 is tilted and tilts and arranges the avatar object. The control module 510 reflects the detected movement of the facial organ on the face of the avatar object arranged in the virtual space 11. The control module 510 receives the line-of-sight information of the other user 5 from the server 600 and reflects it in the line-of-sight of the avatar object of the other user 5. In a certain aspect, the control module 510 reflects the movement of the controller 300 on the avatar object and the operation object. In this case, the controller 300 includes a motion sensor, an acceleration sensor, a plurality of light emitting elements (for example, infrared LEDs), and the like for detecting the movement of the controller 300.

The control module 510 arranges, in the virtual space 11, an operation object for receiving the operation of the user 5 in the virtual space 11. By operating the operation object, the user 5 operates, for example, an object arranged in the virtual space 11. In a certain aspect, the operation object may include, for example, a hand object which is a virtual hand corresponding to the hand of the user 5. In a certain aspect, the control module 510 moves the hand object in the virtual space 11 so as to be linked to the movement of the user 5's hand in the real space based on the output of the motion sensor 420. In some aspects, the operation object can correspond to the hand portion of the avatar object.

When an object arranged in the virtual space 11 collides with another object, the control module 510 detects the collision. The control module 510 can detect, for example, the timing at which the collision area of one object and the collision area of another object come into contact, and performs a predetermined process when that detection is made. The control module 510 can also detect the timing at which two objects that were in contact separate from each other, and performs a predetermined process when that detection is made. The control module 510 can further detect a state in which objects are in contact with each other. For example, when the operation object touches another object, the control module 510 detects that the operation object has touched the other object and performs a predetermined process.
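
As a non-limiting illustration, the detection of the start of contact, the end of contact, and the continuing contact state can be sketched as follows. The names `CollisionArea`, `overlaps`, and `ContactTracker` are hypothetical, and the sketch assumes spherical collision areas; the present disclosure does not limit the shape of a collision area.

```python
import math
from dataclasses import dataclass


@dataclass
class CollisionArea:
    center: tuple   # (x, y, z) position in the virtual space
    radius: float   # spherical collision area (assumption of this sketch)


def overlaps(a: CollisionArea, b: CollisionArea) -> bool:
    """Two spherical collision areas touch when their centers are close enough."""
    return math.dist(a.center, b.center) <= a.radius + b.radius


class ContactTracker:
    """Tracks one pair of objects and reports contact transitions."""

    def __init__(self):
        self._touching = False

    def update(self, a: CollisionArea, b: CollisionArea) -> str:
        now = overlaps(a, b)
        if now and not self._touching:
            event = "touch_started"   # timing at which the areas come into contact
        elif not now and self._touching:
            event = "touch_ended"     # timing at which the objects separate
        elif now:
            event = "touching"        # objects remain in contact
        else:
            event = "apart"
        self._touching = now
        return event
```

The predetermined process for each case would be invoked according to the returned event, for example once per frame for the pair formed by the operation object and another object.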

In one aspect, the control module 510 controls the image display on the monitor 130 of the HMD 120. For example, the control module 510 arranges the virtual camera 14 in the virtual space 11. The control module 510 controls the position of the virtual camera 14 in the virtual space 11 and the inclination (orientation) of the virtual camera 14. The control module 510 defines the field of view 15 according to the inclination of the head of the user 5 wearing the HMD 120 and the position of the virtual camera 14. The rendering module 520 generates a visual field image 17 to be displayed on the monitor 130 based on the determined visual field region 15. The field of view image 17 generated by the rendering module 520 is output to the HMD 120 by the communication control module 540.

When the control module 510 detects, from the HMD 120, an utterance of the user 5 made using the microphone 170, the control module 510 identifies the computer 200 to which the voice data corresponding to the utterance is to be transmitted. The voice data is transmitted to the computer 200 identified by the control module 510. When the control module 510 receives voice data from another user's computer 200 via the network 2, the control module 510 outputs the voice (utterance) corresponding to the voice data from the speaker 180.

The memory module 530 holds data used by the computer 200 to provide the virtual space 11 to the user 5. In a certain aspect, the memory module 530 holds spatial information, object information, and user information.

Spatial information holds one or more templates defined to provide the virtual space 11.

The object information includes a plurality of panoramic images 13 constituting the virtual space 11 and object data for arranging the objects in the virtual space 11. The panoramic image 13 may include a still image and a moving image. The panoramic image 13 may include an image in the unreal space and an image in the real space. Examples of images in unreal space include images generated by computer graphics.

The user information holds a user ID that identifies the user 5. The user ID may be, for example, an IP (Internet Protocol) address or a MAC (Media Access Control) address set in the computer 200 used by the user. In another aspect, the user ID may be set by the user. The user information includes a program for operating the computer 200 as a control device of the HMD system 100 and the like.

The data and programs stored in the memory module 530 are input by the user 5 of the HMD 120. Alternatively, the processor 210 downloads a program or data from a computer (for example, a server 600) operated by a business operator that provides the content, and stores the downloaded program or data in the memory module 530.

The communication control module 540 may communicate with the server 600 and other information communication devices via the network 2.

In certain aspects, the control module 510 and the rendering module 520 may be implemented using, for example, Unity® provided by Unity Technologies. In another aspect, the control module 510 and the rendering module 520 can also be realized as a combination of circuit elements that realize each process.

The processing in the computer 200 is realized by the hardware and the software executed by the processor 210. Such software may be pre-stored in a hard disk or other memory module 530. The software may be stored on a CD-ROM or other computer-readable non-volatile data recording medium and distributed as a program product. Alternatively, the software may be provided as a downloadable program product by an information provider connected to the Internet or other networks. Such software is read from a data recording medium by an optical disk drive or other data reader, or downloaded from a server 600 or other computer via the communication control module 540, and then temporarily stored in the storage module. The software is read from the storage module by the processor 210 and stored in RAM in the form of an executable program. The processor 210 executes the program.

[Control structure of HMD system]
The control structure of the HMD set 110 will be described with reference to FIG. 11. FIG. 11 is a sequence chart showing a part of the processing performed in the HMD set 110 according to an embodiment.

As shown in FIG. 11, in step S1110, the processor 210 of the computer 200 specifies the virtual space data as the control module 510 and defines the virtual space 11.

In step S1120, the processor 210 initializes the virtual camera 14. For example, the processor 210 arranges the virtual camera 14 at a predetermined center 12 in the virtual space 11 in the work area of the memory, and directs the line of sight of the virtual camera 14 in the direction in which the user 5 is facing.

In step S1130, the processor 210 generates the field of view image data for displaying the initial field of view image as the rendering module 520. The generated visual field image data is output to the HMD 120 by the communication control module 540.

In step S1132, the monitor 130 of the HMD 120 displays the visual field image based on the visual field image data received from the computer 200. The user 5 wearing the HMD 120 can recognize the virtual space 11 when he / she visually recognizes the field of view image.

In step S1134, the HMD sensor 410 detects the position and tilt of the HMD 120 based on the plurality of infrared rays emitted from the HMD 120. The detection result is output to the computer 200 as motion detection data.

In step S1140, the processor 210 identifies the visual field direction of the user 5 wearing the HMD 120 based on the position and the inclination included in the motion detection data of the HMD 120.

In step S1150, the processor 210 executes the application program and arranges the object in the virtual space 11 based on the instruction included in the application program.

In step S1160, the controller 300 detects the operation of the user 5 based on the signal output from the motion sensor 420, and outputs the detection data representing the detected operation to the computer 200. In another aspect, the operation of the controller 300 by the user 5 may be detected based on an image from a camera arranged around the user 5.

In step S1170, the processor 210 detects the operation of the controller 300 by the user 5 based on the detection data acquired from the controller 300.

In step S1180, the processor 210 generates the field of view image data based on the operation of the controller 300 by the user 5. The generated visual field image data is output to the HMD 120 by the communication control module 540.

In step S1190, the HMD 120 updates the visual field image based on the received visual field image data, and displays the updated visual field image on the monitor 130.

[Avatar Object]
An avatar object according to the present embodiment will be described with reference to FIGS. 12A and 12B. These figures illustrate the avatar objects of the users 5 of the HMD sets 110A and 110B. Hereinafter, the user of the HMD set 110A will be referred to as a user 5A, the user of the HMD set 110B will be referred to as a user 5B, the user of the HMD set 110C will be referred to as a user 5C, and the user of the HMD set 110D will be referred to as a user 5D. A is appended to the reference sign of each component related to the HMD set 110A, B to the reference sign of each component related to the HMD set 110B, C to the reference sign of each component related to the HMD set 110C, and D to the reference sign of each component related to the HMD set 110D. For example, the HMD 120A is included in the HMD set 110A.

FIG. 12A is a schematic diagram showing a situation in which each HMD 120 provides the virtual space 11 to the user 5 in the network 2. The computers 200A to 200D provide the virtual spaces 11A to 11D to the users 5A to 5D via the HMDs 120A to 120D, respectively. In the example shown in FIG. 12A, the virtual space 11A and the virtual space 11B are composed of the same data. In other words, the computer 200A and the computer 200B share the same virtual space. In the virtual space 11A and the virtual space 11B, the avatar object 6A of the user 5A and the avatar object 6B of the user 5B exist. The avatar object 6A in the virtual space 11A and the avatar object 6B in the virtual space 11B are each shown wearing the HMD 120, but this is only for ease of explanation; in reality, these objects do not wear the HMD 120.

In one aspect, the processor 210A may place a virtual camera 14A that captures the field of view image 17A of the user 5A at the eye position of the avatar object 6A.

FIG. 12B is a diagram showing the field-of-view image 17A of the user 5A in FIG. 12A. The field-of-view image 17A is an image displayed on the monitor 130A of the HMD 120A and is generated by the virtual camera 14A. The avatar object 6B of the user 5B is displayed in the field-of-view image 17A. Although not specifically shown, the avatar object 6A of the user 5A is likewise displayed in the field-of-view image of the user 5B.

In the state of FIG. 12B, the user 5A can communicate with the user 5B through the virtual space 11A by dialogue. More specifically, the voice of the user 5A acquired by the microphone 170A is transmitted to the HMD 120B of the user 5B via the server 600, and is output from the speaker 180B provided in the HMD 120B. The voice of the user 5B is transmitted to the HMD 120A of the user 5A via the server 600, and is output from the speaker 180A provided in the HMD 120A.

The operation of the user 5B (the operation of the HMD 120B and the operation of the controller 300B) is reflected in the avatar object 6B arranged in the virtual space 11A by the processor 210A. As a result, the user 5A can recognize the operation of the user 5B through the avatar object 6B.

FIG. 13 is a sequence chart showing a part of the processing executed in the HMD system 100 according to the present embodiment. Although the HMD set 110D is not shown in FIG. 13, the HMD set 110D operates in the same manner as the HMD sets 110A, 110B, and 110C. In the following description as well, A is appended to the reference sign of each component related to the HMD set 110A, B to the reference sign of each component related to the HMD set 110B, C to the reference sign of each component related to the HMD set 110C, and D to the reference sign of each component related to the HMD set 110D.

In step S1310A, the processor 210A in the HMD set 110A acquires the avatar information for determining the operation of the avatar object 6A in the virtual space 11A. This avatar information includes information about the avatar such as motion information, face tracking data, and voice data. The motion information includes information indicating a temporal change in the position and inclination of the HMD 120A, information indicating the hand motion of the user 5A detected by the motion sensor 420A or the like, and the like. Examples of the face tracking data include data for specifying the position and size of each part of the face of the user 5A, data indicating the movement of each organ constituting the face of the user 5A, and line-of-sight data. Examples of the voice data include data indicating the voice of the user 5A acquired by the microphone 170A of the HMD 120A. The avatar information may include information that identifies the avatar object 6A or the user 5A associated with the avatar object 6A, information that identifies the virtual space 11A in which the avatar object 6A exists, and the like. Information that identifies the avatar object 6A and the user 5A includes a user ID. Information that identifies the virtual space 11A in which the avatar object 6A exists includes a room ID. The processor 210A transmits the avatar information acquired as described above to the server 600 via the network 2.
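
As a non-limiting illustration, the avatar information acquired in step S1310A can be sketched as the following data structure. All class and field names are hypothetical, as is the choice of JSON as the transmission format and of a base64 string for the voice data; the present disclosure specifies only the kinds of information included.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MotionInfo:
    hmd_position: tuple                 # position of the HMD
    hmd_tilt: tuple                     # inclination of the HMD
    hand_motion: dict = field(default_factory=dict)  # e.g. motion sensor output


@dataclass
class AvatarInfo:
    user_id: str                        # identifies the avatar object and user
    room_id: str                        # identifies the virtual space
    motion: MotionInfo
    face_tracking: dict = field(default_factory=dict)  # organ movement, line of sight
    voice_b64: str = ""                 # microphone audio, base64 text (assumption)

    def to_message(self) -> str:
        """Serialize the avatar information for transmission to the server."""
        return json.dumps(asdict(self))
```

A message built this way carries the user ID and room ID alongside the motion, face tracking, and voice data, so that the receiving side can associate it with the correct avatar object and virtual space.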

In step S1310B, the processor 210B in the HMD set 110B acquires the avatar information for determining the operation of the avatar object 6B in the virtual space 11B and transmits it to the server 600, as in the process in step S1310A. Similarly, in step S1310C, the processor 210C in the HMD set 110C acquires the avatar information for determining the operation of the avatar object 6C in the virtual space 11C and transmits it to the server 600.

In step S1320, the server 600 temporarily stores the avatar information received from each of the HMD set 110A, the HMD set 110B, and the HMD set 110C. The server 600 integrates the avatar information of all users (users 5A to 5C in this example) associated with the common virtual space 11 based on the user ID, room ID, and the like included in each piece of avatar information. Then, the server 600 transmits the integrated avatar information to all the users associated with the virtual space 11 at a predetermined timing. The synchronization process is thereby executed. Through this synchronization process, the HMD set 110A, the HMD set 110B, and the HMD set 110C can share each other's avatar information at substantially the same timing.
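The integration in step S1320 can be illustrated with a minimal sketch. The dictionary layout and the function name integrate_avatar_info are assumptions made for illustration; the description only states that each piece of avatar information carries a user ID and a room ID.

```python
from collections import defaultdict


def integrate_avatar_info(received):
    """Group received avatar information by room ID, keyed by user ID.

    `received` is a list of dicts with hypothetical keys "user_id" and
    "room_id"; the real message format is not given in the description.
    """
    rooms = defaultdict(dict)
    for info in received:
        # A later message from the same user replaces the earlier one.
        rooms[info["room_id"]][info["user_id"]] = info
    return {room_id: dict(users) for room_id, users in rooms.items()}


received = [
    {"user_id": "5A", "room_id": "room-1", "motion": {"position": (0.0, 0.0, 0.0)}},
    {"user_id": "5B", "room_id": "room-1", "motion": {"position": (1.0, 0.0, 2.0)}},
    {"user_id": "5C", "room_id": "room-1", "motion": {"position": (-1.0, 0.0, 2.0)}},
]
integrated = integrate_avatar_info(received)
```

In this sketch the server would then send integrated["room-1"] to every HMD set associated with that room at the predetermined timing.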

Subsequently, each HMD set 110A to 110C executes the process of steps S1330A to S1330C based on the avatar information transmitted from the server 600 to each HMD set 110A to 110C. The process of step S1330A corresponds to the process of step S1180 in FIG.

In step S1330A, the processor 210A in the HMD set 110A updates the information of the avatar object 6B and the avatar object 6C of the other users 5B and 5C in the virtual space 11A. Specifically, the processor 210A updates the position and orientation of the avatar object 6B in the virtual space 11 based on the motion information included in the avatar information transmitted from the HMD set 110B. For example, the processor 210A updates the information (position, orientation, etc.) of the avatar object 6B included in the object information stored in the memory module 530. Similarly, the processor 210A updates the information (position, orientation, etc.) of the avatar object 6C in the virtual space 11 based on the motion information included in the avatar information transmitted from the HMD set 110C.

In step S1330B, the processor 210B in the HMD set 110B updates the information of the avatar objects 6A, 6C of the users 5A, 5C in the virtual space 11B, as in the process in step S1330A. Similarly, in step S1330C, the processor 210C in the HMD set 110C updates the information of the avatar objects 6A, 6B of the users 5A, 5B in the virtual space 11C.

[Detailed configuration of module]
The details of the module configuration of the computer 200 will be described with reference to FIG. FIG. 14 is a block diagram showing a detailed configuration of a module of the computer 200 according to an embodiment.

As shown in FIG. 14, the control module 510 includes a virtual camera control module 1421, a field-of-view area determination module 1422, a reference line-of-sight identification module 1423, a facial organ detection module 1424, a motion detection module 1425, a virtual space definition module 1426, a virtual object generation module 1427, an operation object control module 1428, and an avatar control module 1429. The rendering module 520 includes a field of view image generation module 1438. The memory module 530 holds spatial information 1431, object information 1432, user information 1433, and face information 1434.

The virtual camera control module 1421 arranges the virtual camera 14 in the virtual space 11. The virtual camera control module 1421 controls the arrangement position of the virtual camera 14 in the virtual space 11 and the orientation (tilt) of the virtual camera 14. The field-of-view area determination module 1422 defines the field-of-view area 15 according to the orientation of the head of the user wearing the HMD 120 and the arrangement position of the virtual camera 14. The visual field image generation module 1438 generates a visual field image 17 to be displayed on the monitor 130 based on the determined visual field region 15.

The reference line-of-sight identification module 1423 identifies the line of sight of the user 5 based on the signal from the gaze sensor 140. The facial organ detection module 1424 detects organs (for example, mouth, eyes, eyebrows) constituting the face of the user 5 from the images of the face of the user 5 generated by the first camera 150 and the second camera 160. The motion detection module 1425 detects the motion (shape) of each organ detected by the facial organ detection module 1424. The control performed by the facial organ detection module 1424 and the motion detection module 1425 is described later with reference to FIGS. 15 to 18.

The virtual space definition module 1426 defines the virtual space 11 in the HMD system 100 by generating virtual space data representing the virtual space 11.

The virtual object generation module 1427 generates an object to be arranged in the virtual space 11. Objects can include, for example, landscapes, animals, etc., including forests, mountains, etc., which are arranged as the story of the game progresses.

The operation object control module 1428 arranges, in the virtual space 11, an operation object for receiving a user's operation in the virtual space 11. By manipulating the operation object, the user operates, for example, an object arranged in the virtual space 11. In some aspects, the operation object may include, for example, a hand object corresponding to the hand of the user wearing the HMD 120. In some aspects, the operation object may correspond to the hand portion of the avatar object described below.

The avatar control module 1429 generates data for arranging, in the virtual space 11, the avatar object of a user of another computer 200 connected via the network 2. In a certain aspect, the avatar control module 1429 generates data for arranging the avatar object of the user 5 in the virtual space 11. In one aspect, the avatar control module 1429 creates an avatar object that mimics the user 5 based on an image that includes the user 5. In another aspect, the avatar control module 1429 generates data for arranging, in the virtual space 11, an avatar object selected by the user 5 from among a plurality of types of avatar objects (for example, an object imitating an animal or a deformed human object).

The avatar control module 1429 reflects the movement of the HMD 120 detected by the HMD sensor 410 on the avatar object. For example, the avatar control module 1429 detects that the HMD 120 is tilted and generates data for tilting and arranging the avatar object. In one aspect, the avatar control module 1429 reflects the movement of the controller 300 on the avatar object. In this case, the controller 300 includes a motion sensor, an acceleration sensor, a plurality of light emitting elements (for example, infrared LEDs), and the like for detecting the movement of the controller 300. The avatar control module 1429 reflects the movement of the facial organ detected by the motion detection module 1425 on the face of the avatar object arranged in the virtual space 11. That is, the avatar control module 1429 reflects the movement of the face of the user 5A on the avatar object.

When any of the objects arranged in the virtual space 11 collides with another object, the control module 510 detects the collision. The control module 510 can detect, for example, the timing at which one object touches another object, and performs a predetermined process when that detection is made. The control module 510 can also detect the timing at which two touching objects separate, and performs a predetermined process when that detection is made. The control module 510 can further detect that one object is in contact with another object. Specifically, when the operation object touches another object, the operation object control module 1428 detects the contact and performs a predetermined process.

The memory module 530 holds data used by the computer 200 to provide the virtual space 11 to the user 5. In a certain aspect, the memory module 530 holds spatial information 1431, object information 1432, user information 1433, and face information 1434.

Spatial information 1431 holds one or more templates defined to provide the virtual space 11.

The object information 1432 holds the content to be reproduced in the virtual space 11, the object used in the content, and the information (for example, position information) for arranging the object in the virtual space 11. The content may include, for example, a game, content representing a landscape similar to that of the real world, and the like.

The user information 1433 holds a program for operating the computer 200 as a control device of the HMD system 100, an application program for using each content held in the object information 1432, and the like.

The face information 1434 holds templates stored in advance that the facial organ detection module 1424 uses to detect the facial organs of the user 5. In one aspect, the face information 1434 holds a mouth template 1435, an eye template 1436, and an eyebrow template 1437. Each template can be an image corresponding to one of the organs that make up the face. For example, the mouth template 1435 can be an image of the mouth. Each template may contain multiple images.

[Face tracking]
Hereinafter, a specific example of detecting a user's facial expression (face movement) will be described with reference to FIGS. 15 to 18. FIGS. 15 to 18 show, as an example, the detection of the movement of the mouth of the user 5. The detection method described with reference to FIGS. 15 to 18 is not limited to the movement of the mouth of the user 5 and can also be applied to detecting the movement of other organs (for example, eyes, eyebrows, nose, cheeks) constituting the face of the user 5.

FIG. 15 is a diagram illustrating control for detecting a mouth from a user's face image 1521. The face image 1521 generated by the first camera 150 includes the nose and mouth of the user 5.

The facial organ detection module 1424 identifies the mouth region 1531 from the face image 1521 by pattern matching using the mouth template 1435 stored in the face information 1434. In a certain aspect, the facial organ detection module 1424 sets a rectangular comparison area in the face image and, while changing the size, position, and angle of the comparison area, calculates the similarity between the image of the comparison area and the image of the mouth template 1435. The facial organ detection module 1424 may identify a comparison area for which a similarity greater than a predetermined threshold is calculated as the mouth region 1531.

The facial organ detection module 1424 can further determine whether a comparison area whose calculated similarity exceeds the threshold corresponds to the mouth region, based on the relative relationship between the position of that comparison area and the positions of other facial organs (for example, eyes, nose).
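As a rough illustration of the pattern matching described above, the following sketch slides a template over a grayscale image and keeps the best-scoring comparison area whose similarity exceeds a threshold. It is a simplified sketch under stated assumptions: only the position of the comparison area is varied (not its size or angle), the similarity measure (one minus the normalized mean absolute difference) is chosen for illustration, and the names similarity and find_region are hypothetical.

```python
def similarity(window, template):
    """Return a score in [0, 1]; 1.0 means all pixel values are identical."""
    total = 0
    count = 0
    for row_w, row_t in zip(window, template):
        for w, t in zip(row_w, row_t):
            total += abs(w - t)
            count += 1
    return 1.0 - total / (255.0 * count)


def find_region(image, template, threshold):
    """Return (row, col, score) of the best match above threshold, or None."""
    th, tw = len(template), len(template[0])
    best = None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            # Cut out the comparison area at the current position.
            window = [row[c:c + tw] for row in image[r:r + th]]
            score = similarity(window, template)
            if score > threshold and (best is None or score > best[2]):
                best = (r, c, score)
    return best


# Tiny 8-bit grayscale example with the template embedded at row 1, column 1.
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 50, 200, 10],
    [10, 50, 200, 50, 10],
    [10, 10, 10, 10, 10],
]
template = [
    [200, 50, 200],
    [50, 200, 50],
]
match = find_region(image, template, threshold=0.9)
```

A fuller implementation would also rescale and rotate the comparison area and check the candidate against the positions of the eyes and nose, as the description notes.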

The motion detection module 1425 detects a more detailed mouth shape from the mouth region 1531 detected by the facial organ detection module 1424.

FIG. 16 is a diagram (No. 1) for explaining the process of detecting the shape of the mouth by the motion detection module 1425. With reference to FIG. 16, the motion detection module 1425 sets a contour detection line 1641 for detecting the shape of the mouth (the contour of the lips) included in the mouth region 1531. A plurality of contour detection lines 1641 are set at predetermined intervals in a direction orthogonal to the height direction of the face.

The motion detection module 1425 can detect a change in the brightness value of the mouth region 1531 along each of the plurality of contour detection lines 1641, and can specify a position where the change in the brightness value is abrupt as a contour point. More specifically, the motion detection module 1425 can specify as a contour point a pixel whose luminance difference (that is, change in luminance value) from an adjacent pixel is equal to or greater than a predetermined threshold value. The brightness value of a pixel is obtained, for example, by summing the RGB values of the pixel with predetermined weights.
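The contour point detection along one contour detection line can be sketched as follows. The weights used here are the common Rec. 601 luma coefficients, which is an assumption: the description only says the weighting is predetermined. The function names and the sample pixel values are likewise illustrative.

```python
def luminance(rgb, weights=(0.299, 0.587, 0.114)):
    """Weighted sum of the R, G, and B values (weights are an assumption)."""
    r, g, b = rgb
    return weights[0] * r + weights[1] * g + weights[2] * b


def contour_points(line_pixels, threshold):
    """Indices whose luminance differs from the previous pixel by >= threshold."""
    values = [luminance(p) for p in line_pixels]
    return [
        i for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) >= threshold
    ]


# One detection line crossing skin, upper lip, mouth interior, lower lip, skin.
skin, lip, inside = (200, 150, 150), (120, 40, 40), (30, 10, 10)
line = [skin] * 3 + [lip] * 2 + [inside] * 2 + [lip] * 2 + [skin] * 3
points = contour_points(line, threshold=40)
```

With these sample values the abrupt luminance changes fall at indices 3, 5, 7, and 9, that is, at the skin-lip and lip-interior boundaries.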

The motion detection module 1425 identifies two types of contour points from the image corresponding to the mouth region 1531. The motion detection module 1425 identifies contour points 1642 corresponding to the outer contour of the mouth (lips) and contour points 1643 corresponding to the inner contour of the mouth (lips). In one aspect, the motion detection module 1425 may identify the contour points at both ends as the outer contour points 1642 when three or more contour points are detected on one contour detection line 1641. In this case, the motion detection module 1425 can identify contour points other than the outer contour point 1642 as the inner contour point 1643. Further, when two or less contour points are detected on one contour detection line 1641, the motion detection module 1425 can specify the detected contour points as the outer contour points 1642.
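The rule for separating outer and inner contour points on a single contour detection line can be written compactly. This is a minimal sketch of the aspect described above; the function name classify_contour_points is hypothetical, and the points are assumed to be ordered along the line.

```python
def classify_contour_points(points):
    """Split the contour points on one detection line into (outer, inner).

    With three or more points, the two end points are treated as outer
    contour points and the rest as inner contour points. With two or
    fewer points, all of them are treated as outer contour points.
    """
    if len(points) >= 3:
        return [points[0], points[-1]], list(points[1:-1])
    return list(points), []


outer, inner = classify_contour_points([3, 5, 7, 9])
closed_outer, closed_inner = classify_contour_points([4, 8])
```

In the second call the mouth is presumably closed on that line, so only the outer contour of the lips is found.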

FIG. 17 is a diagram (No. 2) for explaining a process in which the motion detection module 1425 detects the shape of the mouth. In FIG. 17, the outer contour points 1642 are shown as white circles and the inner contour points 1643 are shown as hatched circles.

The motion detection module 1425 identifies the mouth shape 1721 by interpolating between the inner contour points 1643. In certain aspects, the motion detection module 1425 may identify the mouth shape 1721 using a non-linear interpolation method such as spline interpolation. In another aspect, the motion detection module 1425 may specify the mouth shape 1721 by interpolating between the outer contour points 1642. In yet another aspect, the motion detection module 1425 may exclude contour points that deviate significantly from the assumed mouth shape (a predetermined shape that can be formed by the upper and lower lips of a person) and specify the mouth shape 1721 from the remaining contour points. In this way, the motion detection module 1425 can identify the motion (shape) of the user's mouth. The method for detecting the mouth shape 1721 is not limited to the above, and the motion detection module 1425 may detect the mouth shape 1721 by another method. In addition, the motion detection module 1425 can detect the motion of the user's eyes and eyebrows in the same manner. The motion detection module 1425 may be configured to be able to detect the shape of an organ such as a cheek or a nose.

FIG. 18 shows an example of the structure of face tracking data. The motion detection module 1425 generates face tracking data representing the facial expression of the user 5. The face tracking data represents the position coordinates of the feature points constituting the shape of each organ to be detected in the uvw visual field coordinate system. For example, the points m1, m2, ... shown in FIG. 18 correspond to the outer contour points 1642 constituting the mouth shape 1721. In a certain aspect, the face tracking data is a set of coordinate values in the uvw visual field coordinate system with the position of the first camera 150 as a reference (origin). In another aspect, the face tracking data is a set of coordinate values in a coordinate system with a feature point predetermined for each organ as a reference (origin). As an example, the points m1, m2, ... are coordinate values in a coordinate system whose origin is one of the outer contour points 1642 corresponding to a corner of the mouth.
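The two coordinate conventions for face tracking data can be illustrated with a short sketch. The dictionary layout, the sample coordinates, and the function name to_organ_origin are assumptions made for illustration; FIG. 18 itself is not reproduced here.

```python
def to_organ_origin(points, origin_index=0):
    """Re-express (u, v, w) feature points relative to one chosen feature point."""
    ou, ov, ow = points[origin_index]
    return [(u - ou, v - ov, w - ow) for (u, v, w) in points]


# Hypothetical camera-referenced data: m1 is taken to be a corner of the mouth.
face_tracking_data = {
    "mouth": [(12, 30, 5), (15, 28, 5), (18, 30, 5), (15, 32, 5)],
}

# The same mouth shape with the corner of the mouth (m1) as the origin.
mouth_relative = to_organ_origin(face_tracking_data["mouth"], origin_index=0)
```

Expressing the points relative to a feature point of the organ keeps the shape data unchanged when the whole face shifts in front of the camera.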

The computer 200 transmits the generated face tracking data to the server 600. The server 600 transfers this data to another computer 200 that communicates with the computer 200. The other computer 200 reflects the received face tracking data in the avatar object corresponding to the user of the computer 200 that transmitted the data.

In the example shown in FIG. 12A, the computer 200A receives face tracking data representing the facial expression of the user 5B from the computer 200B. The computer 200A reflects the received data on the avatar object 6B. As an example, the vertices of the polygons constituting the avatar object 6B include the vertices corresponding to the feature points of the face tracking data. The computer 200A moves the positions of the corresponding vertices based on the face tracking data. As a result, the facial expression of the user 5B is reflected in the avatar object 6B. As a result, the user 5A can recognize the facial expression of the user 5B via the avatar object 6B.

The facial expression of the avatar may be controlled not only according to the facial expression of the user but also according to the movement of the user's body. For example, if the movement of the limbs exceeds a threshold, the avatar's face may be controlled to a happy, smiling expression, and if the head droops forward, the avatar's face may be controlled to a sad expression with the outer ends of the eyebrows lowered.

[Face calibration]
In certain embodiments, the computer 200 can perform avatar face calibration. Calibration is a process of controlling the facial expression of the avatar to the standard state. The standard state of facial expression is, for example, expressionless with no emotions, but a specific facial expression such as a smile may be set as the standard state.

At the time of performing the calibration, the processor 210 of the computer 200 photographs the face of the user 5 by the first camera 150 and the second camera 160 of the HMD 120. At the time of shooting, the processor 210 can urge the user 5 to set the facial expression to the standard state by displaying a menu, voice guidance, or the like. The processor 210 can detect the facial expression of the user 5 from the captured image and generate face tracking data in the standard state.

The processor 210 updates the face tracking data of the current avatar with the generated standard state face tracking data. The processor 210 can control the facial expression of the avatar to the standard state by moving the positions of the vertices of the polygons of the avatar's face based on the updated face tracking data.

[Calibration control]
In certain embodiments, the computer 200 can calibrate the avatar's face in response to user instructions. Further, in a certain embodiment, the computer 200 can automatically execute the calibration when the execution timing of the calibration is detected without any instruction operation by the user.

FIG. 19 is a flowchart showing the processing executed by the processor 210 of the computer 200 for the execution of the automatic calibration. As an example, as shown in FIG. 12A, a process executed by the computer 200A communicating with another computer 200B will be described.

FIG. 20 shows a virtual space 11A defined by the computer 200A of the user 5A. As described above, the virtual space 11A holds the same data as the virtual space 11B defined by the computer 200B of the user 5B, and is therefore shared by the computers 200A and 200B. The computer 200A arranges the avatar 6A associated with the user 5A and the avatar 6B associated with the user 5B in the virtual space 11A.

The computer 200A prompts the user 5A to set the face in the standard state, detects the facial expression of the user 5A, and generates face tracking data in the standard state. The computer 200A controls the facial expression of the avatar 6A based on the face tracking data in the standard state. As a result, the expression in the standard state is reflected on the face of the avatar 6A. Since the face tracking data of the user 5B in the standard state is transmitted from the other computer 200B as the avatar information, the computer 200A reflects the facial expression of the user 5B in the standard state on the face of the avatar 6B.

After that, the computer 200A updates the face tracking data at a predetermined timing, and reflects the facial expressions of the users 5A and 5B on the avatars 6A and 6B. By communicating via the virtual space 11A, the user 5A can converse with the user 5B while recognizing the facial expression of the user 5B.

The face of the avatar 6A may reflect the displacement of the HMD 120A at the time of shooting, the deformation of the face caused by the user 5A touching it, and the like. Therefore, if the face tracking data is continuously updated, the face of the avatar 6A may change significantly from the standard state and collapse, and calibration may be required. In step S1951 shown in FIG. 19, the processor 210 of the computer 200A determines whether or not a predetermined calibration execution condition is satisfied.

One of the calibration execution conditions is that the field of view image corresponding to the field of view from the avatar 6B of the user 5B has switched from an image that includes the face of the avatar 6A to an image that does not include the face of the avatar 6A. The field of view from the avatar 6B means the field of view from the virtual camera 14 (virtual viewpoint) associated with the user 5B. With this execution timing, the calibration can be executed while the user 5B, the communication partner, cannot see the collapsed face of the avatar 6A. In addition, the calibration may be executed before the face of the avatar 6A changes significantly from the standard state, which can prevent the face from collapsing in the first place.

In a certain aspect, when the display mode of the field of view image is switched to a different display mode, the processor 210 determines that the image has been switched to an image that does not include the face of the avatar 6A. Other display modes include, for example, a darkened image provided to the HMDs 120A and 120B while the virtual space is being switched, a tutorial image explaining operation methods or the like, and a 360-degree moving image. Examples of switching the virtual space include the case where the data of the virtual spaces 11A and 11B are replaced and a new virtual space is defined, specifically, the case where a virtual space for dialogue is switched to a virtual space for a battle game. The 360-degree moving image is a moving image captured by a 360-degree camera. A full celestial sphere space can be defined as a virtual space from the 360-degree moving image. The processor 210 can detect the switching of the virtual space in the HMD 120B when, for example, new virtual space data is transmitted from the server 600 and an update is instructed.

Further, in a certain aspect, when an object is arranged between the avatar 6B and the face of the avatar 6A, the processor 210 determines that the image has been switched to an image that does not include the face of the avatar 6A. Even if the field of view image includes the avatar 6A, when an object is placed between the avatar 6B and the face of the avatar 6A, the face is covered by the object, so the field of view image does not include the face of the avatar 6A. When the processor 210 detects, for example, a positional relationship in which the object and the face of the avatar 6A are lined up on the line of sight of the avatar 6B, it can determine that the image has been switched to an image that does not include the face of the avatar 6A.
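One way to test whether an object lies between the virtual viewpoint of the avatar 6B and the face of the avatar 6A is a segment-versus-sphere check. This is a sketch under stated assumptions: the object is approximated by a bounding sphere, the face by a single point, and the function name blocks_line_of_sight is hypothetical. The description does not specify how the positional relationship is detected.

```python
def blocks_line_of_sight(viewpoint, face, center, radius):
    """Return True if a sphere intersects the segment from viewpoint to face."""
    d = [f - v for f, v in zip(face, viewpoint)]      # segment direction
    m = [c - v for c, v in zip(center, viewpoint)]    # viewpoint to sphere center
    dd = sum(x * x for x in d)
    if dd == 0:
        return False  # viewpoint and face coincide; nothing can lie between them
    # Parameter of the point on the segment closest to the sphere center,
    # clamped so that objects behind the viewer or behind the face are ignored.
    t = max(0.0, min(1.0, sum(a * b for a, b in zip(m, d)) / dd))
    closest = [v + t * x for v, x in zip(viewpoint, d)]
    dist_sq = sum((c - p) ** 2 for c, p in zip(center, closest))
    return dist_sq <= radius ** 2


viewpoint_6b = (0.0, 0.0, 0.0)
face_6a = (0.0, 0.0, 10.0)
menu_hides_face = blocks_line_of_sight(viewpoint_6b, face_6a, (0.0, 0.0, 5.0), 1.0)
tree_hides_face = blocks_line_of_sight(viewpoint_6b, face_6a, (5.0, 0.0, 5.0), 1.0)
```

Here the menu placed on the line of sight hides the face, while the tree standing to the side does not.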

Examples of the object include an object for operation such as a menu, an object such as a tree and a wall, and the like. The menu is a user interface (UI: User Interface) having a display area for notifications, function explanations, etc., and may also have an operation area for receiving user instructions as needed. Examples of the menu display form include icons, pop-up windows, widgets, and the like. The menu may be an object placed in the virtual spaces 11A and 11B, or it may be a two-dimensional image rendered in the visual field image.

FIG. 21 shows an example of a field of view image including a menu.
As shown in FIG. 20, when the avatar 6A is located on the line of sight 2016B from the avatar 6B, the field of view image 2117 displayed on the HMD 120B includes the avatar 6A. However, when the menu object is placed between the virtual viewpoint and the face of the avatar 6A, the view image 2117 includes the menu 2121 located in front of the face of the avatar 6A. Menu 2121 is a UI having a display area for asking for scene switching and an operation area for accepting selection of whether or not to switch. The face of avatar 6A is covered by menu 2121, and the field of view image 2117 provided to the HMD 120B does not include the face of avatar 6A.

One of the calibration execution conditions is that the amount of change in the facial expression of the avatar 6A from the standard state exceeds the threshold value. According to this execution condition, the calibration can be executed at the timing when the face actually changes significantly.

When a specific pattern of movement or voice of the avatar 6B associated with the user 5B is detected, the processor 210 can determine that the amount of change in the facial expression of the avatar 6A exceeds the threshold value. As a result, the calibration can be executed immediately in response to the communication partner's reaction to the change in the face. The user 5A, whose face has been pointed out by the communication partner, does not need to perform a calibration operation each time.

For example, when the processor 210 detects, from the field of view image corresponding to the field of view from the avatar 6A, a specific image pattern such as the avatar 6B pointing to its own face, a suspicious facial expression, or a laughing facial expression, it can determine that the amount of change in the facial expression of the avatar 6A exceeds the threshold. The field of view from the avatar 6A means the field of view from the virtual camera 14 (virtual viewpoint) associated with the user 5A. The processor 210 may extract the movement information of the avatar 6B from the avatar information transmitted from the computer 200B via the server 600, and detect, from the extracted movement information, a specific movement pattern pointing out the change in the face of the avatar 6A. Further, the processor 210 can analyze the voice data of the user 5B included in the avatar information of the avatar 6B and, when it recognizes a specific voice pattern such as "face" or "funny", determine that the amount of change in the facial expression of the avatar 6A exceeds the threshold.

The method for detecting the image pattern and the sound pattern is not particularly limited, and a known method can be used. Examples of the image pattern detection method include a method in which an image pattern to be detected is prepared as a model pattern and an image pattern similar to the feature amount of the image of the model pattern is searched for in the visual field image. Further, as a method of detecting a voice pattern, a method of morphologically analyzing voice data, identifying a phoneme by collating it with a phoneme model, and searching for a target voice pattern can be mentioned.
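The final search step of the voice pattern detection can be sketched as a keyword search. This sketch assumes that speech recognition has already converted the voice data of the user 5B into a text transcript, which replaces the phoneme-level collation described above with a much simpler technique; the function name and the pattern list are illustrative.

```python
import re


def detect_voice_patterns(transcript, patterns):
    """Return the target patterns that occur as whole words in the transcript."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    return [p for p in patterns if p in words]


# Hypothetical target patterns; the description gives "face" and "funny" as examples.
patterns = ("face", "funny")
hits = detect_voice_patterns("Your face looks funny", patterns)
no_hits = detect_voice_patterns("Nice weather today", patterns)
```

In this sketch a non-empty result would be treated as the communication partner pointing out the change in the face of the avatar 6A.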

When the avatar 6A is not included in the field of view of the avatar 6B, the user 5B cannot visually recognize the facial expression of the avatar 6A. Therefore, the processor 210 may determine whether or not the amount of change in the facial expression of the avatar 6A described above exceeds the threshold value only when the avatar 6A is located on the line of sight of the avatar 6B.

FIG. 22 shows an example of a field of view image including the avatar 6B.
As shown in FIG. 20, when the avatar 6B is located on the line of sight 2016A of the avatar 6A, the field of view image 2217 displayed on the HMD 120A includes the avatar 6B as shown in FIG.
As shown in FIG. 22, when the processor 210 detects the image pattern of the avatar 6B pointing to its own face in the field of view image 2217, it determines that the amount of change in the facial expression of the avatar 6A exceeds the threshold value. Alternatively, when the processor 210 detects, from the avatar information of the avatar 6B, the voice 2221 of the user 5B saying "the face is strange", which matches the voice patterns of "face" and "funny", the processor 210 determines that the amount of change in the facial expression of the avatar 6A exceeds the threshold.

The processor 210 may detect the amount of change in the facial expression of the avatar 6A included in the view image by image-analyzing the view image corresponding to the view from the avatar 6B. For example, the processor 210 detects each part of the face of the avatar 6A in the same manner as in the case of generating face tracking data, and determines whether or not the amount of change in the position, shape, or the like exceeds the threshold value. The field of view image of the avatar 6B can be obtained from the computer 200B of the user 5B.

One of the conditions for executing the calibration is that the facial expression of the user 5A in the standard state is detected. The processor 210 compares the face tracking data generated for an update with the face tracking data in the standard state, and if the similarity is equal to or higher than a certain value, it can determine that the facial expression of the user 5A is in the standard state. According to this execution condition, the calibration may be performed before the face changes significantly, which can prevent the face from collapsing. Further, since the calibration is executed when the user 5A intentionally sets the facial expression to the standard state, the user can easily instruct its execution.
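The comparison with the standard-state face tracking data can be sketched as follows. The similarity measure (the reciprocal of one plus the mean displacement of corresponding feature points) and the threshold of 0.8 are assumptions chosen for illustration; the description does not define how the similarity is computed.

```python
import math


def similarity_to_standard(current, standard):
    """Return a score in (0, 1]; 1.0 means every feature point coincides."""
    mean_displacement = sum(
        math.dist(c, s) for c, s in zip(current, standard)
    ) / len(standard)
    return 1.0 / (1.0 + mean_displacement)


def is_standard_state(current, standard, min_similarity):
    """True if the current expression is close enough to the standard state."""
    return similarity_to_standard(current, standard) >= min_similarity


standard = [(0, 0), (4, 0), (2, 1)]   # hypothetical feature points, standard state
shifted = [(0, 3), (4, 3), (2, 4)]    # same shape displaced by 3 units

same_score = similarity_to_standard(standard, standard)
shifted_score = similarity_to_standard(shifted, standard)
```

The same mean displacement could also serve as the amount of change from the standard state used by the threshold-based execution condition.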

When the above-mentioned calibration execution condition is satisfied (S1951: YES), the processor 210 decides to execute the calibration. The processor 210 may determine whether or not a specific one of the plurality of calibration execution conditions described above is satisfied. Further, the processor 210 may monitor a plurality of execution conditions and decide to execute the calibration when any one of them is satisfied. If the calibration execution condition is not satisfied in step S1951 (S1951: NO), this process ends.
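The decision in step S1951 when several execution conditions are monitored can be sketched as a loop over condition checks. The condition names, the state keys, and the function name satisfied_condition are hypothetical and stand in for the determinations described above.

```python
def satisfied_condition(conditions, state):
    """Return the name of the first satisfied execution condition, or None."""
    for name, check in conditions.items():
        if check(state):
            return name
    return None


# Each check reads values that the processor is assumed to have computed already.
conditions = {
    "face_hidden_from_partner": lambda s: not s["partner_view_includes_face"],
    "change_exceeds_threshold": lambda s: s["change_amount"] > s["change_threshold"],
    "standard_expression_detected": lambda s: s["similarity"] >= s["min_similarity"],
}

state = {
    "partner_view_includes_face": True,
    "change_amount": 0.7,
    "change_threshold": 0.5,
    "similarity": 0.2,
    "min_similarity": 0.8,
}
reason = satisfied_condition(conditions, state)
```

If the result is None, the process ends (S1951: NO); otherwise the calibration proceeds to the notification in step S1952.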

When the execution of the calibration is determined, in step S1952, the processor 210 performs a process of notifying the user that the calibration is being executed. For example, the processor 210 places a notification menu in front of the face of the avatar 6A. The menu may have not only a display area for notifying execution, but also a display area for prompting the user to make the standard state expression for calibration. Then, in step S1953, the processor 210 generates and updates the field of view image.

Through the notification process, the field of view image including the menu is displayed on the HMD 120A of the user 5A. The processor 210 may control the size of the menu so that it covers the face of the avatar 6A to be calibrated. The field of view image corresponding to the field of view from the avatar 6B can thereby be switched to a field of view image in which the face of the avatar 6A is covered with the menu.

In step S1954, the processor 210 calibrates the face of the avatar 6A. Specifically, the processor 210 fixes the face of the avatar 6A being calibrated to the face immediately before the calibration. The processor 210 detects the facial expression of the user 5A in the standard state, generates face tracking data in the standard state, and reflects it on the face of the avatar 6A. Instead of regenerating the face tracking data in the standard state, the processor 210 may control the facial expression of the avatar 6A based on the face tracking data in the standard state that was first generated.

In step S1955, the processor 210 transmits the avatar information of the avatar 6A in the virtual space 11A to the computer 200B via the server 600. Specifically, the server 600 integrates the received avatar information of the avatar 6A with the avatar information of the avatar 6B transmitted from the computer 200B, and transmits the integrated avatar information to the computers 200A and 200B. The processor 210 updates the positions, orientations, facial expressions, and the like of the avatars 6A and 6B on the virtual space 11A based on the integrated avatar information of the avatars 6A and 6B.
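The integration performed by the server 600 in step S1955 can be pictured as a simple merge. The sketch below is illustrative only; the function `integrate_avatar_info` and the dictionary layout keyed by avatar identifier are assumptions, since the embodiment does not define the format of the avatar information.

```python
from typing import Any, Dict

# Hypothetical layout: avatar identifier -> that avatar's information
AvatarInfo = Dict[str, Any]


def integrate_avatar_info(*per_computer: Dict[str, AvatarInfo]) -> Dict[str, AvatarInfo]:
    """Merge the avatar information received from each computer."""
    merged: Dict[str, AvatarInfo] = {}
    for info in per_computer:
        merged.update(info)
    return merged


# Example: information sent by computers 200A and 200B
from_200a = {"6A": {"position": (0.0, 0.0, 0.0), "expression": "standard"}}
from_200b = {"6B": {"position": (1.0, 0.0, 0.0), "expression": "smile"}}
integrated = integrate_avatar_info(from_200a, from_200b)
```

In this sketch the same integrated dictionary would be sent back to both computers, each of which then updates the avatars 6A and 6B from it.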

In step S1956, the processor 210 generates and updates the field of view image. Similarly, the avatar and the field of view image are updated on the computer 200B of the user 5B. With the update, a field of view image including the avatar 6A whose face is controlled to the standard state by calibration is displayed on the computer 200B.

FIG. 23 shows an example of a field of view image including the avatar 6A before and after calibration. The field of view images 2317A and 2317B shown in FIG. 23 are field of view images that correspond to the field of view 2016B shown in FIG. 20 and are displayed on the HMD 120B of the user 5B. The field of view image 2317A is the field of view image before calibration and, as shown in FIG. 23, includes the avatar 6A with the positions of the eyes, mouth, and the like shifted downward, so that the face deviates significantly from the standard state. After calibration, the field of view image 2317A is updated to the field of view image 2317B. The face of the avatar 6A in the field of view image 2317B is expressionless, in the standard state, and the collapse of the face has been corrected.

As described above, according to this calibration control, the processor 210 of the computer 200A executes the calibration when the execution condition of the calibration is satisfied. Therefore, the face of the avatar 6A can be calibrated at an appropriate timing without the user 5A operating an input device such as a keyboard. Since no operation is required, this is particularly convenient when the field of view is obstructed by the HMD 120A and it is difficult to operate an input device in the real space. Further, since the face collapse is corrected automatically, the user 5A can immerse himself or herself in the dialogue without worrying about the face collapse of the avatar 6A or about the operation for correcting it.

In order to prevent frequent calibration, the processor 210 may refrain from performing the calibration, even if the calibration execution condition is satisfied, when a certain period of time has not yet passed since the previous calibration.
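Such suppression of frequent calibration can be sketched as a simple minimum-interval check. The class name `CalibrationThrottle`, the parameter `min_interval_s`, and the interval value of 30 seconds are hypothetical; the embodiment only states that a certain period of time must have passed.

```python
from typing import Optional


class CalibrationThrottle:
    """Allow calibration only if enough time has passed since the last one."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self._last_s: Optional[float] = None  # time of previous calibration

    def allow(self, now_s: float) -> bool:
        if (self._last_s is not None
                and now_s - self._last_s < self.min_interval_s):
            return False  # too soon after the previous calibration
        self._last_s = now_s  # record this calibration
        return True


throttle = CalibrationThrottle(min_interval_s=30.0)  # assumed value
results = [throttle.allow(t) for t in (0.0, 10.0, 30.0, 45.0)]
```

The check is meant to be made only after the execution condition itself has been found to be satisfied, so a rejected request leaves the time of the previous calibration unchanged.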

Although the embodiments of the present disclosure have been described above, the technical scope of the present invention should not be construed as being limited by the description of the present embodiments. The embodiments are examples, and those skilled in the art will understand that various modifications can be made to them within the scope of the invention described in the claims. The technical scope of the present invention should be determined based on the scope of the invention described in the claims and the scope of equivalents thereof.

In the above embodiment, the virtual space (VR space) in which the user is immersed by the HMD has been described as an example, but a transmissive (see-through) HMD may be adopted as the HMD. In this case, a virtual experience in an augmented reality (AR: Augmented Reality) space or a mixed reality (MR: Mixed Reality) space may be provided to the user. In this case, instead of the operation object, an action on the target object in the virtual space may be generated based on the movement of the user's hand. Specifically, the processor may specify the coordinate information of the position of the user's hand in the real space, and may define the position of the target object in the virtual space in relation to the coordinate information in the real space. As a result, the processor can grasp the positional relationship between the user's hand in the real space and the target object in the virtual space, and can execute processing corresponding to the collision control and the like described above between the user's hand and the target object. It thus becomes possible to apply an action to the target object based on the movement of the user's hand.
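The positional relationship between the user's hand and the target object can be illustrated as follows. This is a sketch under stated assumptions: the mapping from real-space to virtual-space coordinates is taken to be a translation by `origin` followed by a uniform `scale`, and contact is judged with a spherical region of a given `radius`. None of these names or choices come from the embodiment.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def real_to_virtual(p: Vec3, origin: Vec3 = (0.0, 0.0, 0.0),
                    scale: float = 1.0) -> Vec3:
    """Map a real-space position to virtual-space coordinates (assumed mapping)."""
    return tuple((c - o) * scale for c, o in zip(p, origin))


def hand_touches_object(hand_real: Vec3, object_virtual: Vec3, radius: float,
                        origin: Vec3 = (0.0, 0.0, 0.0),
                        scale: float = 1.0) -> bool:
    """Judge contact between the hand and a target object in the virtual space."""
    hand_virtual = real_to_virtual(hand_real, origin, scale)
    return math.dist(hand_virtual, object_virtual) <= radius
```

Once both positions are expressed in the same coordinate system, the same kind of collision determination used for the operation object can be applied to the hand.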

(Structures)
The technical features disclosed above can be summarized as follows.

(Structure 1)
A program for causing a computer to execute: a step of defining a virtual space; a step of arranging a first avatar associated with a first user in the virtual space; a step of detecting a facial expression of the first user; a step of controlling a facial expression of the first avatar according to the facial expression of the first user; and a step (steps S1951 and S1954) of executing calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.

(Structure 2)
In (Structure 1), the program further causes the computer to execute a step of arranging a second avatar associated with a second user in the virtual space. One of the execution conditions of the calibration is that a field of view image corresponding to the field of view from the second avatar is switched from an image including the face of the first avatar to an image not including the face of the first avatar.

(Structure 3)
In (Structure 2), the step of executing the calibration determines that the field of view image has been switched to an image that does not include the face of the first avatar when an object is placed between the face of the second avatar and the face of the first avatar.

(Structure 4)
In (Structure 2), the step of executing the calibration determines that the field of view image has been switched to an image that does not include the face of the first avatar when the display is switched from the display mode of the field of view image to a different display mode.

(Structure 5)
In any of (Structure 1) to (Structure 4), one of the execution conditions of the calibration is that the amount of change in the facial expression of the first avatar from the standard state exceeds the threshold value.

(Structure 6)
In any of (Structure 1) to (Structure 5), one of the execution conditions of the calibration is that the facial expression of the first user in the standard state is detected.

(Structure 7)
A method executed by a computer, the method comprising: a step of defining a virtual space; a step of arranging a first avatar associated with a first user in the virtual space; a step of detecting a facial expression of the first user; a step of controlling a facial expression of the first avatar according to the facial expression of the first user; and a step of executing calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.

(Structure 8)
A computer comprising a memory for storing a program and a processor, wherein the processor reads the program and executes: a step of defining a virtual space; a step of arranging a first avatar associated with a first user in the virtual space; a step of detecting a facial expression of the first user; a step of controlling a facial expression of the first avatar according to the facial expression of the first user; and a step of executing calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.
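As an illustration of the execution condition in (Structure 5), the amount of change of the avatar's facial expression from the standard state can be compared with a threshold value. The sketch below is only one possible measure: it assumes that a face is described by 2D feature positions and takes the largest displacement of any feature as the amount of change. The function names and the data format are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]
FaceData = Dict[str, Point]  # hypothetical format: feature name -> position


def expression_change(current: FaceData, standard: FaceData) -> float:
    """Largest displacement of any feature from its standard-state position."""
    return max(math.dist(current[name], standard[name]) for name in standard)


def exceeds_threshold(current: FaceData, standard: FaceData,
                      threshold: float) -> bool:
    """True when the amount of change exceeds the threshold value."""
    return expression_change(current, standard) > threshold


standard_face = {"eye": (0.0, 0.0), "mouth": (0.0, 0.0)}
shifted_face = {"eye": (0.0, 0.0), "mouth": (3.0, 4.0)}
change = expression_change(shifted_face, standard_face)
```

Other measures, such as a sum or average over all features, would fit the same condition; the choice of the maximum here is an assumption of the example.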

2 ... network, 5 ... user, 6 ... avatar object, 11 ... virtual space, 12 ... center, 14 ... virtual camera, 15 ... view area, 100 ... HMD system, 110 ... HMD set, 130 ... monitor, 170 ... microphone, 180 ... speaker, 190 ... sensor, 200 ... computer, 210 ... processor, 220 ... memory, 230 ... storage, 240 ... input/output interface, 250 ... communication interface, 300 ... controller, 310 ... grip, 320 ... frame, 340, 350, 370, 380 ... button, 390 ... analog stick, 410 ... HMD sensor, 420 ... motion sensor, 430 ... display, 510 ... control module, 520 ... rendering module, 530 ... memory module, 540 ... communication control module, 600 ... server, 610 ... processor, 620 ... memory, 630 ... storage, 640 ... input/output interface, 650 ... communication interface, 1421 ... virtual camera control module, 1422 ... view area determination module, 1423 ... reference line-of-sight identification module, 1424 ... face organ detection module, 1425 ... motion detection module, 1426 ... virtual space definition module, 1427 ... virtual object generation module, 1428 ... operation object control module, 1429 ... avatar control module, 1438 ... field of view image generation module

Claims

A program for causing a computer to execute:
a step of defining a virtual space;
a step of arranging a first avatar associated with a first user in the virtual space;
a step of detecting a facial expression of the first user;
a step of controlling a facial expression of the first avatar according to the facial expression of the first user; and
a step of executing calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.
The program according to claim 1, further causing the computer to execute a step of arranging a second avatar associated with a second user in the virtual space,
wherein one of the execution conditions of the calibration is that a field of view image corresponding to the field of view from the second avatar is switched from an image including the face of the first avatar to an image not including the face of the first avatar.
The program according to claim 2, wherein the step of executing the calibration determines that the field of view image has been switched to an image that does not include the face of the first avatar when an object is placed between the second avatar and the face of the first avatar.
The program according to claim 2, wherein the step of executing the calibration determines that the field of view image has been switched to an image that does not include the face of the first avatar when the display is switched from the display mode of the field of view image to a different display mode.
The program according to any one of claims 1 to 4, wherein one of the execution conditions of the calibration is that the amount of change in the facial expression of the first avatar from the standard state exceeds a threshold value.
The program according to any one of claims 1 to 5, wherein one of the execution conditions of the calibration is that the facial expression of the first user in the standard state is detected.
A method executed by a computer, the method comprising:
a step of defining a virtual space;
a step of arranging a first avatar associated with a first user in the virtual space;
a step of detecting a facial expression of the first user;
a step of controlling a facial expression of the first avatar according to the facial expression of the first user; and
a step of executing calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.
A computer comprising a memory for storing a program and a processor,
wherein the processor reads the program and executes:
a step of defining a virtual space;
a step of arranging a first avatar associated with a first user in the virtual space;
a step of detecting a facial expression of the first user;
a step of controlling a facial expression of the first avatar according to the facial expression of the first user; and
a step of executing calibration for controlling the facial expression of the first avatar to a standard state when an execution condition of the calibration is satisfied.