WO2023079875A1

WO2023079875A1 - Information processing device

Info

Publication number: WO2023079875A1
Application number: PCT/JP2022/036695
Authority: WO
Inventors: 有希中村; 康夫森永; 充宏後藤; 達哉西▲崎▼; 怜央水田
Original assignee: 株式会社Ｎｔｔドコモ
Priority date: 2021-11-04
Filing date: 2022-09-30
Publication date: 2023-05-11

Abstract

This information processing device is provided with: a display control unit which, on a display device mounted on the user's head, displays multiple virtual objects arranged in a virtual space; a virtual object specifying unit which specifies a first virtual object from the multiple virtual objects on the basis of instruction information generated in response to a user operation; and a name specifying unit which, in the case that a name corresponding to the first virtual object specified by the virtual object specifying unit is stored in a storage device, specifies said corresponding name as a first name. The display control unit displays the first name on the display device.

Description

Information processing device

The present invention relates to an information processing device.

In AR (Augmented Reality) technology, the real environment perceived by the user is augmented by a computer. By using this technology, for example, it is possible to superimpose and display a virtual space on a real space visually recognized through AR glasses worn on the user's head.

In AR technology, tags are sometimes attached to virtual objects placed in the virtual space. For example, Patent Document 1 discloses performing face authentication processing on a captured image of a person captured by a head-mounted display, and attaching the name of the person, which is the face recognition result, as tag information to a face image extracted from the captured image.

On the other hand, regarding the technology for capturing the real space, Patent Document 2, for example, discloses a technology for setting a tag indicating an object included in the captured image. In the technique disclosed in Patent Document 2, it is possible to search for a photographed image in which an object indicated by the tag is captured using the tag.

Patent Document 1: JP 2019-012536 A
Patent Document 2: Japanese Patent No. 6908953

In AR technology, when a tag is set for a virtual object and the virtual object corresponding to the tag is specified using the tag, an increase in the number of pairs of virtual objects and tags makes it difficult for the user to remember the correspondence between each virtual object and each tag, and to memorize the tags themselves. As a result, if the user cannot remember the tag corresponding to the virtual object that the user wants to specify, the user cannot easily specify the virtual object.

An object of the present invention is to provide an information processing apparatus that can easily remind a user of a tag as a name for specifying a virtual object placed in a virtual space.

An information processing apparatus according to a preferred aspect of the present invention includes: a display control unit that displays a plurality of virtual objects arranged in a virtual space on a display device worn on the user's head; a virtual object specifying unit that specifies a first virtual object among the plurality of virtual objects based on instruction information generated in response to an action of the user; and a name specifying unit that, when a name corresponding to the first virtual object specified by the virtual object specifying unit is stored in a storage device, specifies the corresponding name as a first name, wherein the display control unit causes the display device to display the first name.

According to the present invention, it is possible to easily remind the user of the name for specifying the virtual object placed in the virtual space.

FIG. 1 is a diagram showing the overall configuration of an information processing system 1 according to the first embodiment.
FIG. 2 is a perspective view showing the appearance of the AR glasses 20.
FIG. 3 is a schematic diagram of the virtual space VS.
FIG. 4 is a schematic diagram of the virtual space VS.
FIG. 5 is a block diagram showing a configuration example of the AR glasses 20.
FIG. 6 is a block diagram showing a configuration example of a terminal device 10.
FIG. 7 is a diagram showing an example of first information IF1.
FIG. 8 is a diagram showing an example of second information IF2.
FIG. 9 is a functional block diagram showing the configuration of an identifying unit 113.
FIGS. 10A to 10C are explanatory diagrams of how to use the tag TG.
FIGS. 11A and 11B are explanatory diagrams of a first operation example of the display control unit 114.
FIGS. 12A and 12B are explanatory diagrams of a second operation example of the display control unit 114.
FIGS. 13A and 13B are explanatory diagrams of a third operation example of the display control unit 114.
FIGS. 14A and 14B are explanatory diagrams of a fourth operation example of the display control unit 114.
FIG. 15 is a block diagram showing a configuration example of a server 30.
FIG. 16 is a flowchart showing a first operation of the information processing system 1 according to the first embodiment.
FIG. 17 is a flowchart showing a second operation of the information processing system 1 according to the first embodiment.
FIG. 18 is a flowchart showing a third operation of the information processing system 1 according to the first embodiment.
FIG. 19 is a flowchart showing a fourth operation of the information processing system 1 according to the first embodiment.
A diagram showing the overall configuration of an information processing system 1A according to the second embodiment.
A block diagram showing a configuration example of a terminal device 10A.
A functional block diagram showing the configuration of an identifying unit 113A.
An explanatory diagram of a first operation example of the display control unit 114A.
An explanatory diagram of a second operation example of the display control unit 114A.
An explanatory diagram of a third operation example of the display control unit 114A.
Two explanatory diagrams showing functions of an updating unit 117.
A flowchart explaining a first operation of the information processing system 1A.
A flowchart explaining a second operation of the information processing system 1A.
A flowchart explaining a third operation of the information processing system 1A.

1: First Embodiment

Hereinafter, the configuration of an information processing system 1 including an information processing device according to a first embodiment of the present invention will be described with reference to FIGS. 1 to 19.

1.1: Configuration of First Embodiment

1.1.1: Overall Configuration

FIG. 1 is a diagram showing the overall configuration of an information processing system 1 according to the first embodiment of the present invention. The information processing system 1 is a system that provides a virtual space using AR technology to a user U1 wearing AR glasses 20, which will be described later.

The information processing system 1 includes a terminal device 10, AR glasses 20, and a server 30. The terminal device 10 and the AR glasses 20 are communicably connected to each other. Also, the terminal device 10 and the server 30 are communicably connected to each other via a communication network NET. FIG. 1 shows a total of three pairs of a terminal device 10 and AR glasses 20: the terminal device 10-1 and the AR glasses 20-1, the terminal device 10-2 and the AR glasses 20-2, and the terminal device 10-3 and the AR glasses 20-3. However, the number of pairs is merely an example, and the information processing system 1 can include any number of pairs of terminal devices 10 and AR glasses 20. Also, the terminal device 10 is an example of an information processing device.

The terminal device 10 is a device for displaying virtual objects arranged in a virtual space on the AR glasses 20 worn on the user's head. The virtual space is, for example, a celestial-sphere-shaped space. The virtual objects are, for example, virtual objects representing data such as still images, moving images, 3DCG models, HTML files, and text files, and virtual objects representing applications. Examples of text files include memos, source code, diaries, and recipes. Examples of applications include browsers, applications for using SNS, and applications for generating document files. Note that the terminal device 10 is preferably a mobile terminal device such as a smartphone or a tablet.

The AR glasses 20 are see-through wearable displays worn on the user's head. Under the control of the terminal device 10, the AR glasses 20 display a virtual object on the display panel provided for each of the binocular lenses. Note that the AR glasses 20 are an example of a display device.

The server 30 provides various data and cloud services to the terminal device 10 via the communication network NET.

1.1.2: Configuration of AR Glasses

FIG. 2 is a perspective view showing the appearance of the AR glasses 20. As shown in FIG. 2, the AR glasses 20 have temples 91 and 92, a bridge 93, body portions 94 and 95, and lenses 41L and 41R, like ordinary eyeglasses. An imaging device 27 is provided in the bridge 93. The imaging device 27 captures an image of the outside world and outputs imaging data representing the captured image. Also, each of the temples 91 and 92 is provided with a sound collection device 24 that collects sound. The sound collection device 24 outputs sound data representing the collected sound. Note that the position of the sound collection device 24 is not limited to the temples 91 and 92, and may be, for example, the bridge 93 or either of the body portions 94 and 95.

Each of the lenses 41L and 41R has a half mirror. The body portion 94 is provided with a liquid crystal panel or an organic EL panel for the left eye (hereinafter collectively referred to as a display panel) and an optical member for guiding light emitted from the display panel for the left eye to the lens 41L. The half mirror provided in the lens 41L transmits external light and guides it to the left eye, and reflects the light guided by the optical member to enter the left eye. The body portion 95 is provided with a right-eye display panel and an optical member that guides light emitted from the right-eye display panel to the lens 41R. The half mirror provided in the lens 41R transmits external light and guides it to the right eye, and reflects the light guided by the optical member to enter the right eye.

The display 29, which will be described later, includes a lens 41L, a left-eye display panel, a left-eye optical member, and a lens 41R, a right-eye display panel, and a right-eye optical member.

With the above configuration, the user can observe the image on the display panel superimposed on the state of the outside world. In addition, in the AR glasses 20, of the binocular images with parallax, the image for the left eye is displayed on the display panel for the left eye, and the image for the right eye is displayed on the display panel for the right eye. By using binocular parallax, the user U1 can perceive the displayed image as if it had depth and stereoscopic effect.

FIGS. 3 and 4 are schematic diagrams of the virtual space VS provided to the user U1 by using the AR glasses 20. As shown in FIG. 3, in the virtual space VS, virtual objects VO1 to VO5 representing various contents such as browsers, cloud services, images, and moving images are arranged. By walking around a public space while wearing the AR glasses 20 on which the virtual objects VO1 to VO5 arranged in the virtual space VS are displayed, the user U1 can experience the virtual space VS in that public space. As a result, the user U1 can act in the public space while receiving the benefits brought about by the virtual objects VO1 to VO5 placed in the virtual space VS.

Also, as shown in FIG. 4, it is possible for a plurality of users U1 to U3 to share the virtual space VS. By sharing the virtual space VS with a plurality of users U1-U3, the plurality of users U1-U3 share one or a plurality of virtual objects VO, and the users U1-U3 can communicate with each other.

FIG. 5 is a block diagram showing a configuration example of the AR glasses 20. The AR glasses 20 include a processing device 21, a storage device 22, a line-of-sight detection device 23, a sound collection device 24, a GPS device 25, a motion detection device 26, an imaging device 27, a communication device 28, and a display 29. The elements of the AR glasses 20 are interconnected by one or more buses for communicating information.

The processing device 21 is a processor that controls the entire AR glasses 20, and is configured using, for example, one or more chips. The processing device 21 is configured using, for example, a central processing unit (CPU) including an interface with peripheral devices, an arithmetic device, registers, and the like. Some or all of the functions of the processing device 21 may be realized by hardware such as a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array). The processing device 21 executes various processes in parallel or sequentially.

The storage device 22 is a recording medium readable and writable by the processing device 21, and stores a plurality of programs including the control program PR1 executed by the processing device 21.

The line-of-sight detection device 23 detects the line of sight of the user U1, and outputs line-of-sight data indicating the direction of the line of sight of the user U1 to the processing device 21. The line-of-sight detection device 23 may detect the line of sight by any method. For example, the line-of-sight data may be detected based on the position of the inner corner of the eye and the position of the iris.

The sound collection device 24 collects sound and outputs sound data based on the collected sound to the processing device 21.

The GPS device 25 receives radio waves from multiple satellites and generates position data from the received radio waves. The position data indicates the position of the AR glasses 20. The position data may be in any format as long as the position can be specified; for example, it indicates the latitude and longitude of the AR glasses 20. In this example, the position data is obtained from the GPS device 25, but the AR glasses 20 may acquire position data by any method. The acquired position data is output to the processing device 21.

The motion detection device 26 detects motion of the AR glasses 20 and outputs motion data to the processing device 21 . Examples of the motion detection device 26 include inertial sensors such as an acceleration sensor that detects acceleration and a gyro sensor that detects angular acceleration. The acceleration sensor detects acceleration in orthogonal X-, Y-, and Z-axes. The gyro sensor detects angular acceleration around the X-, Y-, and Z-axes. The motion detection device 26 can generate orientation information indicating the orientation of the AR glasses 20 based on the output information of the gyro sensor. The motion data includes acceleration data respectively indicating three-axis accelerations and angular acceleration data respectively indicating three-axis angular accelerations.

The imaging device 27 outputs imaging data obtained by imaging the outside world. The imaging device 27 includes, for example, a lens, an imaging element, an amplifier, and an AD converter. The light condensed through the lens is converted into an image pickup signal, which is an analog signal, by the image pickup device. The amplifier amplifies the imaging signal and outputs it to the AD converter. The AD converter converts the amplified imaging signal, which is an analog signal, into imaging data, which is a digital signal. The converted imaging data is output to the processing device 21 . The imaging data output to the processing device 21 is output to the terminal device 10 via the communication device 28 . The terminal device 10 recognizes various gestures of the user U1 based on the imaging data, and controls the terminal device 10 according to the recognized gestures. That is, the imaging device 27 functions as an input device for inputting instructions from the user U1, like a pointing device and a touch panel.

The communication device 28 is hardware as a transmission/reception device for communicating with other devices. The communication device 28 is also called, for example, a network device, a network controller, a network card, a communication module, or the like. The communication device 28 may include a connector for wired connection and an interface circuit corresponding to the connector. The communication device 28 may also have a wireless communication interface. Products conforming to wired LAN, IEEE1394, or USB can be used as connectors and interface circuits for wired connection. Also, as a wireless communication interface, there are products conforming to wireless LAN, Bluetooth (registered trademark), and the like.

The display 29 is a device that displays images. The display 29 displays various images under the control of the processing device 21 . The display 29 includes the lens 41L, the left-eye display panel, the left-eye optical member, and the lens 41R, the right-eye display panel, and the right-eye optical member, as described above. Various display panels such as a liquid crystal display panel and an organic EL display panel are preferably used as the display panel.

The processing device 21 functions as an acquisition unit 211 and a display control unit 212, for example, by reading the control program PR1 from the storage device 22 and executing it.

The acquisition unit 211 acquires the control signal from the terminal device 10 . More specifically, the acquisition unit 211 acquires a control signal for controlling display on the AR glasses 20 generated by a display control unit 114 provided in the terminal device 10 and described later.

The acquisition unit 211 also acquires line-of-sight data input from the line-of-sight detection device 23, audio data input from the sound collection device 24, position data input from the GPS device 25, motion data input from the motion detection device 26, and imaging data input from the imaging device 27. After that, the acquisition unit 211 outputs the acquired line-of-sight data, audio data, position data, motion data, and imaging data to the communication device 28.

The display control unit 212 controls display on the display 29 based on the control signal from the terminal device 10 acquired by the acquisition unit 211 .

1.1.3: Configuration of Terminal Device

FIG. 6 is a block diagram showing a configuration example of the terminal device 10. The terminal device 10 includes a processing device 11, a storage device 12, a communication device 13, a display 14, an input device 15, and an inertial sensor 16. The elements of the terminal device 10 are interconnected by one or more buses for communicating information. Note that the term "apparatus" in this specification may be replaced with another term such as a circuit, a device, or a unit.

The processing device 11 is a processor that controls the entire terminal device 10, and is configured using, for example, one or more chips. The processing unit 11 is configured using, for example, a central processing unit (CPU) including interfaces with peripheral devices, arithmetic units, registers, and the like. A part or all of the functions of the processing device 11 may be realized by hardware such as DSP, ASIC, PLD, and FPGA. The processing device 11 executes various processes in parallel or sequentially.

The storage device 12 is a recording medium readable and writable by the processing device 11, and stores a plurality of programs including a control program PR2 executed by the processing device 11, first information IF1, and second information IF2.

FIG. 7 is a diagram showing an example of the first information IF1. In the example shown in FIG. 7, the first information IF1 is tabular information. The first information IF1 associates, for each virtual object VO, identification information that uniquely identifies the virtual object VO, a tag TG corresponding to the virtual object VO, position information indicating the position of the virtual object VO in the celestial-sphere-shaped virtual space VS, and an image of the virtual object VO. The identification information is hereinafter referred to as ID. The position information is three-dimensional coordinates in the virtual space VS. If a virtual object VO does not have a corresponding tag TG, its tag TG column is blank. That is, in the first information IF1, some or all of the plurality of virtual objects VO arranged in the virtual space VS are associated one-to-one with the plurality of tags TG. Note that the tag TG is an example of the name of the virtual object VO. Also, the plurality of virtual objects VO arranged in the virtual space VS are the virtual objects VO that the user U1 can visually recognize by changing posture. A virtual object VO can be placed in the virtual space VS based on an instruction from the user U1. A virtual object VO may also be arranged in the virtual space VS when a predetermined condition is satisfied, without an instruction from the user U1. For example, K virtual objects VO out of J virtual objects VO are arranged in the virtual space VS. Of the K virtual objects VO placed in the virtual space VS, L virtual objects VO are assigned tags TG. J, K, and L are integers, and J≧K≧L. The first information IF1 is information about the K virtual objects VO. Note that the first information IF1 may be acquired from the server 30 via the communication device 13.

FIG. 8 is a diagram showing an example of the second information IF2. In the example shown in FIG. 8, the second information IF2 is tabular information. The second information IF2 associates the ID of each virtual object VO arranged in the virtual space VS with the attribute of the virtual object VO. The second information IF2 is information about the K virtual objects VO. Note that the second information IF2 may be acquired from the server 30 via the communication device 13. Here, an "attribute" is an item that classifies the content of each virtual object VO according to its features or properties. The virtual objects VO include data such as still images, moving images, 3DCG models, HTML files, and text files, as well as applications. In this example, attributes are assigned to each of the plurality of virtual objects VO arranged in the virtual space VS, but attributes may be assigned to only some of the plurality of virtual objects VO. An attribute may be common to two or more virtual objects VO. In contrast, no tag TG is common to two or more virtual objects VO. Therefore, attributes can be used to group virtual objects VO that have the same features or properties. In the example shown in FIG. 8, the virtual objects VO with ID=6, ID=7, and ID=8 are grouped by using the attribute "animal", for example. On the other hand, a single virtual object VO can be identified by using a tag TG. In the example shown in FIG. 7, the virtual object VO with ID=8 is identified by using the tag TG "#HORSE", for example.
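The difference between attributes (shared, used for grouping) and tags TG (unique, used for identification) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation described in this specification: the dictionary layout and the function names `group_by_attribute` and `find_by_tag` are hypothetical, and only the values for ID=6 to ID=8 are taken from the examples of FIG. 7 and FIG. 8.

```python
from collections import defaultdict

# Hypothetical in-memory form of the first information IF1 (ID -> tag TG).
# None means that the tag column is blank for that virtual object.
first_info = {6: None, 7: "#FOX", 8: "#HORSE"}

# Hypothetical in-memory form of the second information IF2 (ID -> attribute).
second_info = {6: "animal", 7: "animal", 8: "animal"}


def group_by_attribute(second_info):
    """Group the IDs of virtual objects that share the same attribute."""
    groups = defaultdict(list)
    for object_id, attribute in second_info.items():
        groups[attribute].append(object_id)
    return dict(groups)


def find_by_tag(first_info, tag):
    """Return the single ID whose tag matches, or None if no object has it."""
    matches = [object_id for object_id, t in first_info.items() if t == tag]
    if len(matches) > 1:
        # A tag is never shared by two or more virtual objects.
        raise ValueError(f"tag {tag!r} is not unique")
    return matches[0] if matches else None


print(group_by_attribute(second_info))
print(find_by_tag(first_info, "#HORSE"))
```

An attribute lookup may return several IDs, whereas a tag lookup returns at most one, which is why the tag can serve as a name for a single virtual object VO.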

Returning to FIG. 6, the communication device 13 is hardware as a transmission/reception device for communicating with other devices. The communication device 13 is also called, for example, a network device, a network controller, a network card, or a communication module. The communication device 13 may include a connector for wired connection and an interface circuit corresponding to the connector. Further, the communication device 13 may have a wireless communication interface. Products conforming to wired LAN, IEEE1394, or USB can be used as connectors and interface circuits for wired connection. Also, as a wireless communication interface, there are products conforming to wireless LAN, Bluetooth (registered trademark), and the like.

The display 14 is a device that displays images and character information. The display 14 displays various images under the control of the processing device 11 . For example, various display panels such as a liquid crystal display panel and an organic EL (Electro Luminescence) display panel are preferably used as the display 14 .

The input device 15 accepts operations from the user U1 wearing the AR glasses 20 on his head. For example, the input device 15 includes a pointing device such as a keyboard, touch pad, touch panel, or mouse. Here, when the input device 15 includes a touch panel, the input device 15 may also serve as the display 14 .

The inertial sensor 16 is a sensor that detects inertial force. The inertial sensor 16 includes, for example, one or more of an acceleration sensor, an angular velocity sensor, and a gyro sensor. The processing device 11 detects the orientation of the terminal device 10 based on the output information from the inertial sensor 16 . Further, the processing device 11 receives selection of the virtual object VO, input of characters, and input of instructions in the celestial sphere virtual space VS based on the orientation of the terminal device 10 . For example, the user U1 directs the central axis of the terminal device 10 toward a predetermined area of the virtual space VS, and operates the input device 15 to select the virtual object VO arranged in the predetermined area. The user U1's operation on the input device 15 is, for example, a double tap. By operating the terminal device 10 in this way, the user U1 can select the virtual object VO without looking at the input device 15 of the terminal device 10 .
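Selecting a virtual object VO by pointing the terminal device 10 can be sketched as follows. This is a hedged illustration only: it assumes that the user is at the origin of the virtual space VS, that the "predetermined area" is a cone of a fixed angle around the central axis of the terminal device 10, and that the function is called when the selection operation (for example, a double tap) is detected. The names `angle_between` and `select_object` and the 10-degree threshold are hypothetical.

```python
import math


def angle_between(a, b):
    """Return the angle in radians between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def select_object(axis, objects, max_angle_deg=10.0):
    """Pick the object closest to the pointing axis, within the angle limit.

    axis    -- direction of the central axis of the terminal device
    objects -- dict mapping object ID to its 3D position (user at the origin)
    Returns the selected ID, or None if nothing lies within the cone.
    """
    best_id, best_angle = None, math.radians(max_angle_deg)
    for object_id, position in objects.items():
        angle = angle_between(axis, position)
        if angle <= best_angle:
            best_id, best_angle = object_id, angle
    return best_id


# Placeholder positions for three virtual objects.
objects = {6: (1.0, 0.0, 2.0), 7: (0.0, 0.0, 2.0), 8: (-1.0, 0.0, 2.0)}
print(select_object((0.0, 0.0, 1.0), objects))
```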

When a virtual keyboard is arranged in the virtual space VS, characters are input by operating the input device 15 with the central axis of the terminal device 10 facing the key that the user U1 wants to input. Further, for example, when the user U1 presses down the input device 15 and moves the terminal device 10 left or right, a predetermined instruction is input.

As a result, the terminal device 10 functions as a portable controller that controls the virtual space VS.

The processing device 11 functions as an acquisition unit 111, an action recognition unit 112, an identification unit 113, a display control unit 114, a determination unit 115, and a voice recognition unit 116 by reading the control program PR2 from the storage device 12 and executing it.

The acquisition unit 111 acquires instruction information according to an action of the user U1 wearing the AR glasses 20 on the head. The instruction information is information that designates a specific virtual object VO.
Here, the action of the user U1 is, for example, an input to the terminal device 10 made by the user U1 using the input device 15. More specifically, the action of the user U1 may be pressing a specific part provided on the terminal device 10 as the input device 15. Alternatively, the action of the user U1 may be an operation using the terminal device 10 as a portable controller.

Alternatively, the action of the user U1 may be visual observation of the AR glasses 20 by the user U1. If the action of the user U1 is visual observation, the instruction information is the viewpoint of the user U1 on the AR glasses 20 . In this case, the instruction information is transmitted from the AR glasses 20 to the terminal device 10 .

Alternatively, the action of user U1 may be a gesture of user U1. As will be described later, the action recognition unit 112 recognizes various gestures of the user U1. The acquisition unit 111 may acquire instruction information according to various gestures of the user U1.

Also, the acquisition unit 111 acquires the first information IF1 and the second information IF2 from the server 30 by using the communication device 13 . The acquisition unit 111 stores the acquired first information IF1 and second information IF2 in the storage device 12 . Furthermore, the acquisition unit 111 acquires the first information IF1 and the second information IF2 from the storage device 12 .

The action recognition unit 112 recognizes various gestures of the user U1 based on the imaging data obtained from the AR glasses 20. More specifically, as described above, the imaging device 27 provided in the AR glasses 20 outputs imaging data obtained by imaging the outside world. When the imaging data includes a part of the body of the user U1 wearing the AR glasses 20 on the head, the action recognition unit 112 recognizes various gestures of the user U1 based on the imaging data acquired from the AR glasses 20.

The identification unit 113, as shown in FIG. 9, includes a virtual object specifying unit 113-1 and a name specifying unit 113-2. Based on the instruction information acquired by the acquisition unit 111, the virtual object specifying unit 113-1 specifies one virtual object VO among the plurality of virtual objects VO arranged in the virtual space VS. The one virtual object VO specified by the virtual object specifying unit 113-1 is hereinafter referred to as a first virtual object VO. Further, when the first information IF1 acquired by the acquisition unit 111 includes one tag TG corresponding to the first virtual object VO, the name specifying unit 113-2 specifies that tag TG. The one tag TG is an example of a first name.

The name specifying unit 113-2 specifies the tag TG corresponding to the first virtual object VO by referring to the first information IF1. In other words, when the tag TG corresponding to the first virtual object VO is stored in the storage device 12, the name specifying unit 113-2 specifies the corresponding tag TG as the first name.

Using the example of the first information IF1 shown in FIG. 7, assume that the virtual object specifying unit 113-1 has specified the virtual object VO with ID=7 based on the above instruction information. In this case, in the first information IF1, the tag TG corresponding to the virtual object VO with ID=7 is "#FOX", so the name specifying unit 113-2 specifies the tag TG "#FOX". On the other hand, assume that the virtual object specifying unit 113-1 has specified the virtual object VO with ID=6 based on the above instruction information. In this case, since no tag TG corresponding to the virtual object VO with ID=6 exists in the first information IF1, the name specifying unit 113-2 does not specify a tag TG.
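The lookup described above for ID=7 and ID=6 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the actual implementation: the record type `VirtualObjectRecord`, its field names, and the function `specify_first_name` are hypothetical, and the coordinates are made-up placeholders. Only the tags follow the example of FIG. 7.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VirtualObjectRecord:
    """One row of the first information IF1 (field names are assumptions)."""
    object_id: int
    tag: Optional[str]                    # None when the tag column is blank
    position: Tuple[float, float, float]  # 3D coordinates in the virtual space


# Tags follow the FIG. 7 example; the coordinates are placeholders.
FIRST_INFO = {
    6: VirtualObjectRecord(6, None, (1.0, 0.0, 2.0)),
    7: VirtualObjectRecord(7, "#FOX", (0.0, 0.0, 2.0)),
    8: VirtualObjectRecord(8, "#HORSE", (-1.0, 0.0, 2.0)),
}


def specify_first_name(first_info, first_object_id):
    """Return the tag stored for the first virtual object, or None.

    None is returned both when the object has no tag and when the ID is
    not present in the first information at all.
    """
    record = first_info.get(first_object_id)
    if record is None:
        return None
    return record.tag


print(specify_first_name(FIRST_INFO, 7))
print(specify_first_name(FIRST_INFO, 6))
```

The display control unit 114 would then display the returned tag only when it is not `None`.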

FIGS. 10A to 10C are explanatory diagrams of how to use the tag TG. As an example, as shown in FIG. 10A, assume that virtual objects VO6 to VO8 representing a deer, a fox, and a horse are arranged in a virtual space VS perceived by the user U1. It is also assumed that the character string "#FOX" is registered as the tag TG7 for the fox virtual object VO7. For example, as shown in FIG. 10B, the user U1 utters the character string indicated by the tag TG corresponding to a desired content while pressing a specific portion of the input device 15 provided in the terminal device 10. Here, as an example, it is assumed that the user U1 has uttered the character string "#FOX", which is the tag TG7 corresponding to the fox. By uttering the tag TG7, the user U1 can call up the virtual object VO7 corresponding to the uttered character string in the direction of the line of sight, as shown in FIG. 10C.
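The call-by-tag behavior of FIGS. 10A to 10C can be sketched as follows. This is an illustrative sketch only: speech recognition is outside its scope, so `spoken_text` is assumed to be the already recognized character string, the user is assumed to be at the origin of the virtual space VS, and the function name `call_object_by_tag`, the record layout, and the 2.0 placement distance are hypothetical.

```python
import math


def call_object_by_tag(first_info, spoken_text, gaze_direction, distance=2.0):
    """Move the object whose tag matches the utterance into the line of sight.

    first_info     -- dict mapping ID to {"tag": str or None, "position": tuple}
    spoken_text    -- recognized character string, e.g. "#FOX"
    gaze_direction -- non-zero 3D vector of the user's line of sight
    Returns the ID of the called object, or None if no tag matches.
    """
    norm = math.sqrt(sum(c * c for c in gaze_direction))
    if norm == 0:
        raise ValueError("gaze_direction must be non-zero")
    for object_id, record in first_info.items():
        if record["tag"] is not None and record["tag"] == spoken_text:
            # Place the object at a fixed distance along the gaze direction.
            record["position"] = tuple(distance * c / norm for c in gaze_direction)
            return object_id
    return None


first_info = {
    6: {"tag": None, "position": (1.0, 0.0, 2.0)},
    7: {"tag": "#FOX", "position": (5.0, 0.0, 0.0)},
}
called = call_object_by_tag(first_info, "#FOX", (0.0, 0.0, 1.0))
print(called, first_info[7]["position"])
```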

The display control unit 114 causes the AR glasses 20 as a display device to display a plurality of virtual objects VO placed in the virtual space VS. The display control unit 114 also causes the AR glasses 20 to display the tag TG specified by the name specifying unit 113-2. More specifically, the display control unit 114 generates image data to be displayed on the AR glasses 20 and transmits the generated image data to the AR glasses 20 via the communication device 13.

FIGS. 11A and 11B are explanatory diagrams of a first operation example of the display control unit 114. First, the acquisition unit 111 acquires, as a trigger, an operation signal generated by an operation on the input device 15 by the user U1. Then, as shown in FIG. 11A, the display control unit 114 divides the celestial-sphere-shaped virtual space VS into a plurality of regions R1 to R17 along a plurality of lines corresponding to the latitude and longitude lines of the celestial sphere. After that, the display control unit 114 flattens each of the divided regions R1 to R17. Furthermore, as shown in FIG. 11B, the display control unit 114 causes the display 29 of the AR glasses 20 to display a two-dimensional image SI obtained by planarly developing the flattened regions R1 to R17. Specifically, as viewed from the user U1 located at the center of the celestial sphere, the display control unit 114 arranges the regions in the two-dimensional image SI as follows. Regions R5 and R6, located to the right of the zenith region R17, are placed to the right of the region R17. Regions R13 and R14, located to the left of the zenith region R17, are placed to the left of the region R17. Regions R9 and R10, located in front of the zenith region R17, are placed below the region R17. Regions R1 and R2, located behind the zenith region R17, are placed above the region R17.
Further, regions R7 and R8, located to the front right of the zenith region R17, are placed to the lower right of the region R17. Regions R11 and R12, located to the front left of the zenith region R17, are placed to the lower left of the region R17. Regions R3 and R4, located to the rear right of the zenith region R17, are placed to the upper right of the region R17. Regions R15 and R16, located to the rear left of the zenith region R17, are placed to the upper left of the region R17. Furthermore, as shown in FIG. 11B, the up, down, left, and right directions of each of the regions R1 to R17 in the two-dimensional image SI match the up, down, left, and right directions seen by the user U1 when the user U1, located at the center of the celestial sphere, faces that region.
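The arrangement of the regions R1 to R17 around the zenith region R17 can be summarized as a small lookup table. The sketch below is illustrative only: the 3x3 grid of (column, row) offsets and the names `SI_LAYOUT` and `cell_of` are assumptions, since the description states only the relative placement (right, left, below, above, and the four diagonals).

```python
from typing import Dict, Tuple

# Assumed grid offsets (column, row) relative to the zenith region R17 at (0, 0).
# Column +1 means "right of R17", row +1 means "above R17" in the image SI.
SI_LAYOUT: Dict[Tuple[int, int], Tuple[str, ...]] = {
    (0, 0): ("R17",),           # zenith
    (1, 0): ("R5", "R6"),       # right of the user  -> right of R17
    (-1, 0): ("R13", "R14"),    # left of the user   -> left of R17
    (0, -1): ("R9", "R10"),     # in front           -> below R17
    (0, 1): ("R1", "R2"),       # behind             -> above R17
    (1, -1): ("R7", "R8"),      # front right        -> lower right
    (-1, -1): ("R11", "R12"),   # front left         -> lower left
    (1, 1): ("R3", "R4"),       # rear right         -> upper right
    (-1, 1): ("R15", "R16"),    # rear left          -> upper left
}


def cell_of(region: str) -> Tuple[int, int]:
    """Return the grid cell of the two-dimensional image SI that holds a region."""
    for cell, regions in SI_LAYOUT.items():
        if region in regions:
            return cell
    raise KeyError(region)


print(cell_of("R3"))  # cell for a rear-right region
```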

Then, the display control unit 114 causes the display 29 of the AR glasses 20 to display the tag TG specified by the name specifying unit 113-2 adjacent to the region R containing the virtual object VO specified by the virtual object specifying unit 113-1. In the example shown in FIG. 11B, the display control unit 114 displays the tag TG7 specified by the name specifying unit 113-2 adjacent to the region R3 containing the virtual object VO7 specified by the virtual object specifying unit 113-1.

Note that, as described above, the display control unit 114 planarly develops the plurality of regions R1 to R17 using an operation signal corresponding to the operation of the input device 15 by the user U1 as a trigger. However, the operation signal that triggers planar development is not limited to one generated in response to an operation on the input device 15. For example, as described above, the processing device 11 functions as the motion recognition unit 112. The operation signal may be generated according to a gesture of the user U1 detected by the processing device 11 functioning as the motion recognition unit 112. Alternatively, the operation signal may be generated according to the attitude of the terminal device 10.

FIGS. 12A and 12B are explanatory diagrams of a second operation example of the display control unit 114. As shown in FIG. 12A, during normal operation, the display control unit 114 causes the display 29 of the AR glasses 20 worn on the head of the user U1 to display the celestial-sphere-shaped virtual space VS in which the user U1 is positioned at the nadir. As shown in FIG. 12B, based on the operation signal generated by the operation of the input device 15 by the user U1, the display control unit 114 causes the display 29 of the AR glasses 20 to display the celestial-sphere-shaped virtual space VS as a reduced three-dimensional image TI. Then, the display control unit 114 displays the virtual object VO specified by the virtual object specifying unit 113-1 and the tag TG corresponding to the virtual object VO adjacent to each other on the three-dimensional image TI. In the example shown in FIG. 12B, the display control unit 114 displays the virtual object VO7 specified by the virtual object specifying unit 113-1 and the tag TG7 corresponding to the virtual object VO7 adjacent to each other on the three-dimensional image TI.

Note that, as described above, the display control unit 114 causes the display 29 to display the three-dimensional image TI based on the operation signal generated by the operation of the input device 15 by the user U1. However, the operation signal that triggers the display of the three-dimensional image TI is not limited to one generated in response to an operation on the input device 15. For example, as in the first operation example, the operation signal may be generated according to a gesture of the user U1 detected by the processing device 11 functioning as the motion recognition unit 112. Alternatively, the operation signal may be generated according to the attitude of the terminal device 10.

Returning to FIG. 6, the determination unit 115 determines whether or not the viewpoint of the user U1 is positioned within a specific virtual object VO displayed on the AR glasses 20 as a display device for a predetermined time or longer.

As described above, the AR glasses 20 are equipped with the line-of-sight detection device 23. The line-of-sight detection device 23 detects line-of-sight data based on, for example, the position of the inner corner of the eye and the position of the iris of the user U1. The determination unit 115 takes, as the viewpoint position of the user U1, the coordinates of the point where the line-of-sight direction indicated by the line-of-sight data intersects the spherical surface of the celestial sphere serving as the virtual space VS. As illustrated in FIG. 7, the first information IF1 includes position information of each first virtual object VO. The first virtual object VO is arranged at the position indicated by the position information. The determination unit 115 determines whether or not the viewpoint of the user U1 has been located within the area where the first virtual object VO is arranged for a predetermined time or longer.
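The viewpoint computation and the dwell-time determination can be sketched as below. The sketch rests on several assumptions that are not stated in the description: the user U1 is taken to be at the center of the sphere, the area of a virtual object VO is approximated by a ball of radius `max_distance` around its position, and gaze samples arrive as `(timestamp, point)` pairs. The function names are hypothetical.

```python
import math
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float, float]


def viewpoint_on_sphere(direction: Sequence[float], radius: float = 1.0) -> Point:
    """Point where a gaze ray cast from the sphere's center meets the sphere."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("gaze direction must be non-zero")
    x, y, z = (radius * c / norm for c in direction)
    return (x, y, z)


def dwelled(samples: Iterable[Tuple[float, Sequence[float]]],
            object_position: Sequence[float],
            max_distance: float,
            min_seconds: float) -> bool:
    """True if the viewpoint stayed within the object's area for min_seconds."""
    start = None  # time at which the current uninterrupted stay began
    for timestamp, point in samples:
        if math.dist(point, object_position) <= max_distance:
            if start is None:
                start = timestamp
            if timestamp - start >= min_seconds:
                return True
        else:
            start = None  # the viewpoint left the area, so the timer resets
    return False


print(viewpoint_on_sphere((0.0, 0.0, 2.0), radius=5.0))
```

Resetting the timer whenever the viewpoint leaves the area is one possible reading of "for a predetermined time or longer"; an implementation that tolerates brief glances away would be equally compatible with the text.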

FIGS. 13A and 13B are explanatory diagrams of a third operation example of the display control unit 114, in particular in cooperation with the determination unit 115. When it is determined that the viewpoint of the user U1 has been positioned within a specific virtual object VO in the virtual space VS for a predetermined time or longer, as shown in FIG. 13A, the display control unit 114 displays the tag TG corresponding to the virtual object VO in the neighborhood of the virtual object VO, as shown in FIG. 13B. Here, the “neighborhood” of the virtual object VO specifically means a range within a predetermined distance from the virtual object VO. In the example shown in FIGS. 13A and 13B, when it is determined that the viewpoint of the user U1 has been positioned within the specific virtual object VO7 for a predetermined time or longer, the display control unit 114 displays "#FOX", which is the tag TG7 corresponding to the virtual object VO7, in the neighborhood of the virtual object VO7.

Returning to FIG. 6, the voice recognition unit 116 recognizes the voice uttered by the user U1.

As described above, the AR glasses 20 are equipped with the sound pickup device 24. A voice uttered by the user U1 wearing the AR glasses 20 on the head is picked up by the sound pickup device 24 and converted into voice data. The voice data is output from the AR glasses 20 to the terminal device 10. The voice recognition unit 116 recognizes the contents of the utterance based on the voice data acquired from the AR glasses 20. More specifically, the voice recognition unit 116 converts the voice data into text data.

FIGS. 14A and 14B are explanatory diagrams of a fourth operation example of the display control unit 114, in particular in cooperation with the voice recognition unit 116. As shown in FIG. 14A, when the user U1 wearing the AR glasses 20 on the head utters an attribute of a virtual object VO, the voice recognition unit 116 recognizes the voice data representing the voice uttered by the user U1. When the recognition result by the voice recognition unit 116 indicates one attribute included in the second information IF2, the virtual object specifying unit 113-1 specifies, based on the second information IF2, one or more virtual objects VO corresponding to that attribute.

In the example shown in FIG. 14A, user U1 utters the word "animal" as the above attribute. The voice recognition unit 116 recognizes the voice uttered by the user U1 as the character string "animal". The virtual objects VO corresponding to the recognition result "animal" are a virtual object VO6 that is a 3D model of a deer, a virtual object VO7 that is a 3D model of a fox, and a virtual object VO8 that is a 3D model of a horse. Therefore, the virtual object specifying unit 113-1 specifies virtual objects VO6 to VO8 as virtual objects VO corresponding to the attribute "animal" among the plurality of virtual objects VO.

In addition, based on the first information IF1, the name specifying unit 113-2 specifies one or more tags TG, each corresponding to one of some or all of the one or more virtual objects VO specified by the virtual object specifying unit 113-1. In the example shown in FIG. 14A, the one or more virtual objects VO specified by the virtual object specifying unit 113-1 are the virtual object VO6, which is a 3D model of a deer, the virtual object VO7, which is a 3D model of a fox, and the virtual object VO8, which is a 3D model of a horse. Assume that there is no tag TG6 corresponding to the virtual object VO6, that the tag TG7 corresponding to the virtual object VO7 is "#FOX", and that the tag TG8 corresponding to the virtual object VO8 is "#HORSE". In this case, the "some" of the specified virtual objects VO are the virtual object VO7 and the virtual object VO8, and the tags TG corresponding to them are the tag TG7 "#FOX" and the tag TG8 "#HORSE". Therefore, the name specifying unit 113-2 specifies two tags TG, namely the tag TG7 "#FOX" and the tag TG8 "#HORSE". On the other hand, if "#DEER" is assigned as the tag TG6 corresponding to the virtual object VO6, the name specifying unit 113-2 specifies the tags TG corresponding to all of the one or more virtual objects VO specified by the virtual object specifying unit 113-1.

As shown in FIG. 14B, the display control unit 114 causes the AR glasses 20 as the display device to display the one or more virtual objects VO specified by the virtual object specifying unit 113-1 based on the second information IF2. Further, when a tag TG corresponding to a virtual object VO exists, the display control unit 114 displays the tag TG in association with the corresponding virtual object VO.

In the example shown in FIG. 14B, icons IC6 to IC8 indicating the virtual objects VO6 to VO8 specified by the virtual object specifying unit 113-1 are displayed in the popup P1. Further, in the popup P1, the tag TG7 "#FOX" is attached to the icon IC7, and the tag TG8 "#HORSE" is attached to the icon IC8.
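The attribute-based flow of this fourth operation example can be sketched as two lookups. The data layout is an assumption: the description does not give the structure of the second information IF2, so it is modeled here as a mapping from an attribute string to object IDs, and the popup P1 is reduced to a list of `(object ID, tag or None)` pairs. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed layouts: IF2 maps an attribute to object IDs, IF1 maps an ID to a tag.
SECOND_INFORMATION: Dict[str, List[int]] = {"animal": [6, 7, 8]}
FIRST_INFORMATION: Dict[int, str] = {7: "#FOX", 8: "#HORSE"}


def specify_objects(attribute: str, second_information: Dict[str, List[int]]) -> List[int]:
    """IDs of the virtual objects that carry the recognized attribute."""
    return list(second_information.get(attribute, []))


def popup_entries(object_ids: List[int],
                  first_information: Dict[int, str]) -> List[Tuple[int, Optional[str]]]:
    """Pair every specified object with its tag, or None when it has no tag."""
    return [(oid, first_information.get(oid)) for oid in object_ids]


ids = specify_objects("animal", SECOND_INFORMATION)
print(popup_entries(ids, FIRST_INFORMATION))
```

Keeping untagged objects in the list with `None` corresponds to the icon IC6 being shown in the popup P1 without a tag.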

Further, when the recognition result by the voice recognition unit 116 matches a tag TG included in the first information IF1, the display control unit 114 changes the display content regarding the first virtual object VO corresponding to the tag TG. As an example, as described with reference to FIGS. 10A to 10C, the display control unit 114 may move the virtual object VO corresponding to the tag TG into the line-of-sight direction of the user U1. Alternatively, when the virtual object VO corresponding to the tag TG is an application, the display control unit 114 may display a screen for selecting whether to start the application.
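The choice between the two display changes mentioned above can be sketched as a simple dispatch. Everything beyond the two behaviors themselves is assumed for illustration: the exact-match comparison, the action labels, the set of application IDs, and the "#MAIL" entry with ID=9 are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical sample data; only the "#FOX" tag for ID=7 follows the description.
FIRST_INFORMATION: Dict[int, str] = {7: "#FOX", 9: "#MAIL"}
APPLICATION_IDS: Set[int] = {9}  # assumed: object 9 represents an application


def display_change_for(recognized_text: str,
                       first_information: Dict[int, str],
                       application_ids: Set[int]) -> Optional[Tuple[str, int]]:
    """Choose a display change when the recognized text matches a stored tag."""
    for object_id, tag in first_information.items():
        if tag == recognized_text:
            if object_id in application_ids:
                return ("confirm_launch", object_id)
            return ("move_to_line_of_sight", object_id)
    return None  # no tag matched, so the display is left unchanged


print(display_change_for("#FOX", FIRST_INFORMATION, APPLICATION_IDS))
```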

Although not shown in FIG. 6, the terminal device 10 may include a GPS device similar to the GPS device 25 provided in the AR glasses 20. In this case, the AR glasses 20 do not have to be equipped with the GPS device 25.

1.1.4: Server Configuration

FIG. 15 is a block diagram showing a configuration example of the server 30. The server 30 comprises a processing device 31, a storage device 32, a communication device 33, a display 34, and an input device 35. The elements of the server 30 are interconnected by one or more buses for communicating information.

The processing device 31 is a processor that controls the entire server 30, and is configured using, for example, one or more chips. The processing device 31 is configured using, for example, a central processing unit (CPU) including interfaces with peripheral devices, arithmetic units, registers, and the like. A part or all of the functions of the processing device 31 may be realized by hardware such as a DSP, ASIC, PLD, or FPGA. The processing device 31 executes various processes in parallel or sequentially.

The storage device 32 is a recording medium readable and writable by the processing device 31, and stores a plurality of programs including the control program PR3 executed by the processing device 31, first information IF1, and second information IF2.

The communication device 33 is hardware as a transmission/reception device for communicating with other devices. The communication device 33 is also called, for example, a network device, a network controller, a network card, or a communication module. The communication device 33 may include a connector for wired connection and an interface circuit corresponding to the connector. Further, the communication device 33 may have a wireless communication interface. Products conforming to wired LAN, IEEE1394, and USB are examples of connectors and interface circuits for wired connection. Also, as a wireless communication interface, there are products conforming to wireless LAN, Bluetooth (registered trademark), and the like.

The display 34 is a device that displays images and character information. The display 34 displays various images under the control of the processing device 31. For example, various display panels such as a liquid crystal display panel and an organic EL display panel are preferably used as the display 34.

The input device 35 is a device that accepts operations by the administrator of the information processing system 1. For example, the input device 35 includes a keyboard, a touch pad, a touch panel, or a pointing device such as a mouse. Here, when the input device 35 includes a touch panel, the input device 35 may also serve as the display 34. In particular, the administrator of the information processing system 1 can use the input device 35 to input and edit the first information IF1 and the second information IF2.

The processing device 31 functions as an output unit 311 and an acquisition unit 312 by reading and executing the control program PR3 from the storage device 32, for example.

The output unit 311 outputs the first information IF1 and the second information IF2 stored in the storage device 32 to the terminal device 10 by using the communication device 33. In addition, the output unit 311 outputs to the terminal device 10 data necessary for the terminal device 10 to provide the virtual space VS to the user U1 wearing the AR glasses 20 on the head. The data includes data related to the virtual object VO itself and data related to an application (not shown) for using the cloud service.

The acquisition unit 312 acquires various data from the terminal device 10 by using the communication device 33. The data includes, for example, data indicating the operation content for the virtual object VO, which is input to the terminal device 10 by the user U1 wearing the AR glasses 20 on the head. Further, when the user U1 uses the above cloud service, the data includes input data to the above application.

1.2: Operation of First Embodiment

FIGS. 16 to 19 are flowcharts showing the operation of the information processing system 1 according to the first embodiment. The operation of the information processing system 1 will be described below with reference to FIGS. 16 to 19.

1.2.1: First Operation

FIG. 16 is a flowchart explaining the first operation of the information processing system 1.

In step S1, the processing device 11 acquires an operation signal by functioning as the acquisition unit 111. As described with reference to FIGS. 11A and 11B, the operation signal is a trigger for causing the display 29 provided in the AR glasses 20 to display a two-dimensional image SI obtained by planarly developing the celestial-sphere-shaped virtual space VS.

In step S2, the processing device 11 functions as the display control unit 114 to cause the display 29 provided in the AR glasses 20 to display a two-dimensional image SI obtained by planarly developing the celestial virtual space VS.

In step S3, the processing device 11 functions as the acquisition unit 111 to acquire instruction information. The instruction information is information for specifying the first virtual object VO among the plurality of virtual objects VO arranged in the virtual space VS.

In step S4, the processing device 11 functions as the virtual object specifying unit 113-1 to specify the first virtual object VO among the plurality of virtual objects VO based on the instruction information. As a result of specifying the first virtual object VO, the processing device 11 functioning as the virtual object specifying unit 113-1 outputs an ID corresponding to the first virtual object VO.

In step S5, the processing device 11 functions as the name specifying unit 113-2 to refer to the first information IF1 stored in the storage device 12. The processing device 11 then determines whether the tag TG corresponding to the first virtual object VO specified by the virtual object specifying unit 113-1 is included in the first information IF1. Specifically, the processing device 11 determines whether the tag TG corresponding to the ID output in step S4 is included in the first information IF1. If the tag TG is included in the first information IF1, that is, if the determination result in step S5 is YES, the processing device 11 functions as the name specifying unit 113-2 to specify the tag TG as the first name. After that, the process of step S6 is executed. If the tag TG is not included in the first information IF1, that is, if the determination result in step S5 is NO, the processing device 11 ends all the processes.

In step S6, the processing device 11 functions as the display control unit 114 to display the tag TG specified by the name specifying unit 113-2 on the display 29 of the AR glasses 20. More specifically, the processing device 11 functions as the display control unit 114 to display the specified tag TG in the two-dimensional image SI displayed on the display 29. Note that if the tag TG is not specified in step S5, the processing device 11 omits the processing of step S6.

1.2.2: Second Operation

FIG. 17 is a flowchart explaining the second operation of the information processing system 1.

In step S11, the processing device 11 acquires the operation signal by functioning as the acquisition unit 111. As described with reference to FIGS. 12A and 12B, the operation signal is a trigger for causing the display 29 provided in the AR glasses 20 to display a three-dimensional image TI obtained by reducing the celestial-sphere-shaped virtual space VS.

In step S12, the processing device 11 functions as the display control unit 114 to cause the display 29 provided in the AR glasses 20 to display a three-dimensional image TI obtained by reducing the celestial sphere-shaped virtual space VS.

In step S13, the processing device 11 acquires the instruction information by functioning as the acquisition unit 111. The instruction information is information for specifying the first virtual object VO among the plurality of virtual objects VO arranged in the virtual space VS.

In step S14, the processing device 11 functions as the virtual object specifying unit 113-1 to specify the first virtual object VO among the plurality of virtual objects VO based on the instruction information. As a result of specifying the first virtual object VO, the processing device 11 functioning as the virtual object specifying unit 113-1 outputs an ID corresponding to the first virtual object VO.

In step S15, the processing device 11 functions as the name specifying unit 113-2 to refer to the first information IF1 stored in the storage device 12. The processing device 11 then determines whether the tag TG corresponding to the first virtual object VO specified by the virtual object specifying unit 113-1 is included in the first information IF1. Specifically, the processing device 11 determines whether the tag TG corresponding to the ID output in step S14 is included in the first information IF1. If the tag TG is included in the first information IF1, that is, if the determination result in step S15 is YES, the processing device 11 functions as the name specifying unit 113-2 to specify the tag TG as the first name. After that, the process of step S16 is executed. If the tag TG is not included in the first information IF1, that is, if the determination result in step S15 is NO, the processing device 11 ends all the processes.

In step S16, the processing device 11 functions as the display control unit 114 to display the tag TG specified by the name specifying unit 113-2 on the display 29 of the AR glasses 20. More specifically, the processing device 11 functions as the display control unit 114 to display the specified tag TG in the three-dimensional image TI displayed on the display 29. Note that if the tag TG is not specified in step S15, the processing device 11 omits the process of step S16.

1.2.3: Third Operation

FIG. 18 is a flowchart explaining the third operation of the information processing system 1.

In step S21, the processing device 11 functions as the acquisition unit 111 to acquire, from the AR glasses 20, line-of-sight data related to the line of sight of the user U1. More specifically, the processing device 21 of the AR glasses 20 outputs the acquired line-of-sight data to the communication device 28 by functioning as the acquisition unit 211. The communication device 28 outputs the line-of-sight data acquired from the processing device 21 to the terminal device 10. The processing device 11 of the terminal device 10 functions as the acquisition unit 111 to acquire the line-of-sight data from the AR glasses 20 using the communication device 13.

In step S22, the processing device 11 functions as the determination unit 115 to determine whether or not the viewpoint of the user U1 has been positioned within the first virtual object VO displayed on the AR glasses 20 for a predetermined time or longer. More specifically, the processing device 11 functions as the determination unit 115 to acquire the viewpoint position of the user U1 as instruction information based on the line-of-sight data acquired in step S21. After that, the processing device 11, as the determination unit 115, determines whether or not the viewpoint of the user U1 has been positioned within the specific virtual object VO for a predetermined time or longer. When the determination result is true, that is, when the determination result of step S22 is "YES", the processing device 11 executes the process of step S23. When the determination result is false, that is, when the determination result of step S22 is "NO", the processing device 11 executes the process of step S21.

In step S23, the processing device 11 functions as the virtual object specifying unit 113-1 to specify the first virtual object VO among the plurality of virtual objects VO based on the determination result in step S22. More specifically, the processing device 11, by functioning as the virtual object specifying unit 113-1, specifies as the first virtual object VO the virtual object VO on which the viewpoint of the user U1 was determined in step S22 to have been positioned for the predetermined time or longer. As a result of specifying the first virtual object VO, the processing device 11 functioning as the virtual object specifying unit 113-1 outputs an ID corresponding to the first virtual object VO.

In step S24, the processing device 11 functions as the name specifying unit 113-2 to refer to the first information IF1 stored in the storage device 12. The processing device 11 then determines whether the tag TG corresponding to the first virtual object VO specified by the virtual object specifying unit 113-1 is included in the first information IF1. Specifically, the processing device 11 determines whether the tag TG corresponding to the ID output in step S23 is included in the first information IF1. If the tag TG is included in the first information IF1, that is, if the determination result in step S24 is YES, the processing device 11 functions as the name specifying unit 113-2 to specify the tag TG as the first name. After that, the process of step S25 is executed. If the tag TG is not included in the first information IF1, that is, if the determination result in step S24 is NO, the processing device 11 ends all the processes.

In step S25, the processing device 11 functions as the display control unit 114 to display the tag TG specified as the first name on the display 29 provided in the AR glasses 20. More specifically, the processing device 11 functions as the display control unit 114 to cause the display 29 to display the specified tag TG in the neighborhood of the first virtual object VO. Here, the “neighborhood” of the first virtual object VO specifically means a range within a predetermined distance from the first virtual object VO. Note that if the tag TG is not specified in step S24, the processing device 11 omits the processing of step S25.
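The control flow of steps S21 to S25, including the return to step S21 when the determination in step S22 is NO, can be sketched as a single loop. This is an illustration under assumptions: each gaze sample is reduced to `(timestamp, ID of the object under the viewpoint, or None)`, the first information IF1 is modeled as a dictionary, and "display" is represented by the returned pair. The function name is hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def third_operation(gaze_samples: Iterable[Tuple[float, Optional[int]]],
                    first_information: Dict[int, str],
                    min_seconds: float) -> Optional[Tuple[int, str]]:
    """Walk through S21 to S25 and return (object ID, tag) to be displayed."""
    current: Optional[int] = None  # object the viewpoint is currently on
    start = 0.0                    # time at which the viewpoint entered it
    for timestamp, object_id in gaze_samples:      # S21: acquire gaze data
        if object_id is None or object_id != current:
            current, start = object_id, timestamp  # a new target restarts timing
            continue                               # S22 NO: back to S21
        if timestamp - start < min_seconds:
            continue                               # S22 NO: back to S21
        tag = first_information.get(current)       # S23 and S24: ID, then lookup
        if tag is None:
            return None                            # S24 NO: end all processes
        return (current, tag)                      # S25: display near the object
    return None


samples = [(0.0, 7), (0.4, 7), (1.0, 7)]
print(third_operation(samples, {7: "#FOX"}, min_seconds=1.0))
```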

1.2.4: Fourth Operation

FIG. 19 is a flowchart explaining the fourth operation of the information processing system 1.

In step S31, the processing device 11 functions as the voice recognition unit 116 to recognize the voice uttered by the user U1. More specifically, the processing device 21 of the AR glasses 20 acquires voice data representing the voice of the user U1 from the sound pickup device 24 by functioning as the acquisition unit 211. The processing device 21 of the AR glasses 20 also outputs the acquired voice data to the communication device 28 by functioning as the acquisition unit 211. The communication device 28 outputs the voice data acquired from the processing device 21 to the terminal device 10. The processing device 11 of the terminal device 10 functions as the acquisition unit 111 to acquire the voice data from the AR glasses 20 using the communication device 13. Further, the processing device 11 of the terminal device 10 performs voice recognition on the voice data by functioning as the voice recognition unit 116. The character string obtained as the voice recognition result corresponds to the instruction information in the first to third operations described above. In this operation example, it is assumed that the character string obtained as the voice recognition result is a character string indicating an attribute.

In step S32, the processing device 11 specifies the virtual object VO by functioning as the virtual object specifying unit 113-1. More specifically, the processing device 11 refers to the second information IF2 stored in the storage device 12 by functioning as the virtual object specifying unit 113-1. Further, the processing device 11, by functioning as the virtual object specifying unit 113-1, specifies one or more virtual objects VO corresponding to the attribute that is indicated by the character string obtained as the voice recognition result and is included in the second information IF2. As a result of specifying the one or more virtual objects VO, the processing device 11 functioning as the virtual object specifying unit 113-1 outputs the IDs corresponding to the one or more virtual objects VO.

In step S33, the processing device 11 functions as the name specifying unit 113-2 to refer to the first information IF1 stored in the storage device 12. The processing device 11 then determines whether the tag TG corresponding to the first virtual object VO specified by the virtual object specifying unit 113-1 is included in the first information IF1. Specifically, the processing device 11 determines whether the tag TG corresponding to the ID output in step S32 is included in the first information IF1. If the tag TG is included in the first information IF1, that is, if the determination result in step S33 is YES, the processing device 11 functions as the name specifying unit 113-2 to specify the tag TG as the first name. After that, the process of step S34 is executed. If the tag TG is not included in the first information IF1, that is, if the determination result in step S33 is NO, the processing device 11 ends all the processes.

In step S34, the processing device 11 functions as the display control unit 114 to display the tag TG identified as the first nickname on the display 29 provided in the AR glasses 20. As an example, the processing device 11 causes the display 29 to display the identified tag TG attached to the virtual object VO within the popup P1. Note that if no tag TG is identified in step S33, the processing device 11 omits the processing of step S34.
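
Steps S32 and S33 can be pictured as two table lookups. The following is a minimal sketch under stated assumptions, not the implementation described in this document: the dictionary layout of the first information IF1 and the second information IF2, the attribute strings, and the function names are all hypothetical. Only the pairs ID=1 with "#WEB" and ID=8 with "#HORSE" come from the description.

```python
from typing import Dict, List, Optional

# Hypothetical layout of the second information IF2: virtual object ID -> attribute.
SECOND_INFO: Dict[int, str] = {1: "application", 2: "application", 8: "animal"}
# Hypothetical layout of the first information IF1: virtual object ID -> tag TG.
FIRST_INFO: Dict[int, str] = {1: "#WEB", 8: "#HORSE"}


def identify_objects_by_attribute(attribute: str) -> List[int]:
    """Step S32: return the IDs of virtual objects whose attribute matches."""
    return [vo_id for vo_id, attr in SECOND_INFO.items() if attr == attribute]


def identify_tags(vo_ids: List[int]) -> Dict[int, str]:
    """Step S33: keep only the objects that have a tag stored in IF1."""
    return {vo_id: FIRST_INFO[vo_id] for vo_id in vo_ids if vo_id in FIRST_INFO}


def attribute_operation(recognized_text: str) -> Optional[Dict[int, str]]:
    """Run S32 and S33; None models the 'NO' branch, where processing ends."""
    tags = identify_tags(identify_objects_by_attribute(recognized_text))
    return tags or None
```

In this sketch, the returned mapping is what step S34 would hand to the display control unit; an object without a stored tag (ID=2 here) is simply not shown with a tag.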

1.3: Effects of the First Embodiment

According to the above description, the terminal device 10 as an information processing device includes the display control unit 114, the virtual object identification unit 113-1, and the nickname identification unit 113-2. The display control unit 114 causes the AR glasses 20 as a display device worn on the head of the user U1 to display a plurality of virtual objects VO arranged in the virtual space VS. The virtual object identification unit 113-1 identifies the first virtual object VO from among the plurality of virtual objects VO based on instruction information generated in response to an action of the user U1. If the tag TG corresponding to the first virtual object VO identified by the virtual object identification unit 113-1 is stored in the storage device 12, the nickname identification unit 113-2 identifies the corresponding tag TG as the first nickname. The display control unit 114 causes the AR glasses 20 as the display device to display the tag TG as the first nickname.

With the above configuration, the terminal device 10 as an information processing device allows the user U1 wearing the AR glasses 20 on the head to easily recall the tag TG as the first nickname of the first virtual object VO arranged in the virtual space VS. Specifically, it may be difficult for the user U1 to grasp the correspondence between each virtual object VO and each tag TG and to memorize the tags TG. In such a case, the terminal device 10 causes the AR glasses 20 as a display device worn on the head of the user U1 to display the tag TG corresponding to the virtual object VO identified based on the instruction information generated in response to the action of the user U1, in association with that virtual object VO. By visually recognizing the displayed tag TG on the AR glasses 20, the user U1 can recall the tag TG corresponding to the virtual object VO.

Also, the display control unit 114 causes the AR glasses 20 as a display device to display a two-dimensional image SI obtained by planarly developing the virtual space VS. Further, the display control unit 114 displays the tag TG as the first name in association with the first virtual object VO within the two-dimensional image SI.

With the above-described configuration of the terminal device 10 as an information processing device, the user U1 can grasp the position in the virtual space VS of the virtual object VO that the user U1 wants to call, and can then recall the tag TG corresponding to that virtual object VO.

Also, the display control unit 114 causes the AR glasses 20 as a display device to display a three-dimensional image TI obtained by reducing the virtual space VS. Further, the display control unit 114 displays the tag TG as the first name in association with the first virtual object VO within the three-dimensional image TI.

Also, the above action of the user U1 is viewing the AR glasses 20 as a display device. The above instruction information indicates the viewpoint of the user U1 on the AR glasses 20 as a display device. The terminal device 10 as an information processing device further includes a determination unit 115. The determination unit 115 determines whether or not the viewpoint of the user U1 is positioned within the first virtual object VO displayed on the AR glasses 20 as a display device for a predetermined time or longer. When the determination unit 115 determines that the viewpoint of the user U1 is positioned within the first virtual object VO for the predetermined time or longer, the display control unit 114 causes the AR glasses 20 as a display device to display the tag TG as the first nickname corresponding to the first virtual object VO.
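
The dwell-time determination made by the determination unit 115 can be sketched as follows. This is only an illustration under assumptions: the gaze samples are taken to be (timestamp, x, y) tuples in display coordinates, the displayed virtual object is approximated by a rectangle, and the names `Rect` and `gaze_dwelled` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Rect:
    """Hypothetical display-space bounds of a displayed virtual object."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def gaze_dwelled(samples: Iterable[Tuple[float, float, float]],
                 bounds: Rect, threshold_s: float) -> bool:
    """Return True once the viewpoint stays inside `bounds` for `threshold_s` seconds.

    `samples` yields (timestamp_seconds, x, y) in ascending time order.
    """
    entered_at: Optional[float] = None
    for t, x, y in samples:
        if bounds.contains(x, y):
            if entered_at is None:
                entered_at = t  # viewpoint has just entered the object
            if t - entered_at >= threshold_s:
                return True
        else:
            entered_at = None  # leaving the object resets the timer
    return False
```

Resetting the timer whenever the viewpoint leaves the object is one possible reading of "for a predetermined time or longer"; an accumulated-time variant would be equally compatible with the text.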

With the above-described configuration of the terminal device 10 as the information processing device, the user U1 can recall the tag TG as the first nickname without performing any action on the AR glasses 20 as the display device other than an action related to the line of sight.

Also, the display control unit 114 changes the display content regarding the first virtual object VO when the recognition result of the voice uttered by the user U1 matches the tag TG as the first nickname.

With the above-described configuration of the terminal device 10 as an information processing device, the user U1 can, for example, move the first virtual object VO corresponding to the tag TG that matches the speech recognition result in the line-of-sight direction of the user U1. Alternatively, for example, if the first virtual object VO corresponding to the tag TG that matches the speech recognition result is an application, the user U1 can start the application.

Further, according to the above description, in the terminal device 10 as the information processing device, the virtual object identification unit 113-1 identifies one or more virtual objects VO based on the recognition result of a voice that is uttered by the user U1 and that represents at least one attribute of the plurality of virtual objects VO. The nickname identification unit 113-2 identifies a tag TG as a nickname corresponding to each of the one or more virtual objects VO identified by the virtual object identification unit 113-1. The display control unit 114 causes the AR glasses 20 as the display device to display the tag TG as the corresponding nickname identified by the nickname identification unit 113-2 in association with each of some or all of the one or more virtual objects VO.

With the above configuration of the terminal device 10 as an information processing device, the user U1 can narrow down the plurality of virtual objects VO arranged in the virtual space VS to one or more virtual objects VO corresponding to the same attribute. If the tags TG corresponding to the narrowed-down virtual objects VO include the tag TG corresponding to the virtual object VO that the user U1 wants to call, the user U1 can easily recall that tag TG.

2: Second Embodiment

A configuration of an information processing system 1A including an information processing device according to a second embodiment of the present invention will be described below with reference to FIGS. 20 to 29. In the following description, for simplicity, the same symbols are used for the same components as in the first embodiment, and the description of their functions may be omitted. Also, for simplicity, the following description mainly covers the differences between the second embodiment and the first embodiment.

2.1: Configuration of Second Embodiment

2.1.1: Overall Configuration

FIG. 20 is a diagram showing the overall configuration of an information processing system 1A according to the second embodiment of the present invention. The information processing system 1A differs from the information processing system 1 according to the first embodiment in that it includes a terminal device 10A instead of the terminal device 10.

2.1.2: Configuration of Terminal Device

FIG. 21 is a block diagram showing a configuration example of the terminal device 10A. The terminal device 10A differs from the terminal device 10 according to the first embodiment in that it includes a processing device 11A instead of the processing device 11 and a storage device 12A instead of the storage device 12.

The storage device 12A differs from the storage device 12 according to the first embodiment in that it is not essential to store the second information IF2 and that it stores the learning model LM1.

The learning model LM1 is a learning model used by the nickname identification unit 113-2A described later. Specifically, the learning model LM1 is a learning model for calculating the degree of similarity between a first word and a second word. As an example, the learning model LM1 converts the meaning of each word into a numerical vector and calculates the degree of similarity between the first word and the second word based on how closely the direction of the vector of the first word matches the direction of the vector of the second word. However, this similarity calculation method is merely an example, and the present invention is not limited to it. The learning model LM1 may use another method as long as it can calculate the degree of similarity between the first word and the second word.
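
One common way to measure how closely two vectors point in the same direction is cosine similarity. The sketch below shows that measure only; whether the learning model LM1 uses exactly this measure is not stated in this document, and the function name is hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two word vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)
```

The value ranges from -1 (opposite directions) through 0 (orthogonal) to 1 (same direction), so a larger value can be read as a higher degree of similarity.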

The learning model LM1 is generated by learning teacher data in the learning phase. The teacher data used to generate the learning model LM1 includes a plurality of pairs, each consisting of a set of a first word and a second word and a numerical value indicating the degree of similarity between them.

Also, the learning model LM1 is generated outside the terminal device 10A. In particular, learning model LM1 is preferably generated in server 30 . In this case, the terminal device 10A acquires the learning model LM1 from the server 30 via the communication network NET.

The processing device 11A reads the control program PR4 from the storage device 12A and executes it, thereby functioning as an acquisition unit 111, an action recognition unit 112, an identification unit 113A, a display control unit 114A, a speech recognition unit 116, and an update unit 117. Note that the acquisition unit 111, the action recognition unit 112, and the speech recognition unit 116 are the same as the corresponding functions of the processing device 11 according to the first embodiment, and their description is therefore omitted.

FIG. 22 is a functional block diagram showing the configuration of the identification unit 113A. The identification unit 113A differs from the identification unit 113 according to the first embodiment in that it includes a nickname identification unit 113-2A instead of the nickname identification unit 113-2.

The nickname identification unit 113-2A performs the same operation as the nickname identification unit 113-2 as the first operation. Specifically, when the tag TG corresponding to the first virtual object VO identified by the virtual object identification unit 113-1 is included in the first information IF1 stored in the storage device 12A, the nickname identification unit 113-2A identifies the corresponding tag TG as the first nickname.

In addition, as a second operation, the nickname identification unit 113-2A identifies, as first nicknames, a plurality of tags TG corresponding to some or all of the plurality of virtual objects VO arranged in the virtual space VS. More specifically, among the plurality of virtual objects VO placed in the virtual space VS, the nickname identification unit 113-2A identifies the tags TG of those virtual objects VO whose tags TG are included in the first information IF1.

Furthermore, as a third operation, when the recognition result of the first voice uttered by the user U1 is a second nickname that does not match any of the plurality of tags TG included in the first information IF1, the nickname identification unit 113-2A identifies, as the first nickname, the tag TG that is most similar to the second nickname among the plurality of tags TG included in the first information IF1.

More specifically, the nickname identification unit 113-2A inputs, into the learning model LM1 described above, the character string obtained as the recognition result of the first voice uttered by the user U1 and one tag TG among the plurality of tags TG included in the first information IF1. The learning model LM1 outputs the degree of similarity between that character string and the one tag TG. The nickname identification unit 113-2A performs the same operation for all the tags TG described in the first information IF1 to obtain the degree of similarity between the character string and each of the tags TG. Furthermore, the nickname identification unit 113-2A identifies the tag TG with the highest degree of similarity among all the tags TG.
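
The selection described above amounts to scoring every tag and keeping the maximum. The following sketch assumes the learning model LM1 can be treated as a function taking two strings and returning a number; the score table `TOY_SCORES` is a hand-written stand-in for illustration, not output of any real model, and all names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple


def most_similar_tag(recognized: str, tags: Iterable[str],
                     similarity: Callable[[str, str], float]) -> Tuple[str, float]:
    """Score every tag against the recognized string and return the best pair."""
    scored = [(tag, similarity(recognized, tag)) for tag in tags]
    if not scored:
        raise ValueError("the first information contains no tags")
    return max(scored, key=lambda pair: pair[1])


# Stand-in for the learning model LM1: illustrative scores only.
TOY_SCORES: Dict[Tuple[str, str], float] = {("DOG", "HORSE"): 0.7, ("DOG", "WEB"): 0.1}


def toy_similarity(first_word: str, second_word: str) -> float:
    return TOY_SCORES.get((first_word, second_word), 0.0)
```

With these illustrative scores, `most_similar_tag("DOG", ["WEB", "HORSE"], toy_similarity)` selects "HORSE", mirroring the "DOG" and "HORSE" example used in this description.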

Returning to FIG. 21, the display control unit 114A causes the AR glasses 20 as a display device to display the tag TG specified by the name specifying unit 113-2A.

FIG. 23 is an explanatory diagram of a first operation example of the display control unit 114A. As shown in FIG. 23, assume that the user U1 uttered "DOG" as the tag TG corresponding to the virtual object VO to be called, but "DOG", the recognition result of the uttered first voice, is a second nickname that is not included in the first information IF1. In this case, the nickname identification unit 113-2A identifies the tag TG "HORSE" as the tag TG most similar to the second nickname "DOG" among the plurality of tags TG contained in the first information IF1. After that, the display control unit 114A displays the popup P2 in the virtual space VS. Furthermore, the display control unit 114A displays, in the popup P2, a message asking the user U1 to confirm whether the tag TG that the user U1 originally intended to utter is "HORSE".

FIG. 24 is an explanatory diagram of a second operation example of the display control unit 114A. As shown in FIG. 24, assume that the user U1 uttered some character string as the tag TG corresponding to the virtual object VO that the user U1 wants to call, but the character string obtained as the recognition result of the uttered first voice is a second nickname that is not included in the first information IF1. In this case, the nickname identification unit 113-2A identifies a plurality of tags TG corresponding to some or all of the plurality of virtual objects VO placed in the virtual space VS based on the first information IF1. Then, the display control unit 114A displays the popup P3 in the virtual space VS. Further, the display control unit 114A displays, in the popup P3, a list in which the icons, that is, reduced representations, of the plurality of virtual objects VO corresponding to the identified tags TG are associated with those tags TG.

FIG. 25 is an explanatory diagram of a third operation example of the display control unit 114A. As shown in FIG. 25, assume that the user U1 uttered some character string as the tag TG corresponding to the virtual object VO that the user U1 wants to call, but the character string obtained as the recognition result of the uttered first voice is a second nickname that is not included in the first information IF1. In this case, the nickname identification unit 113-2A identifies a plurality of tags TG corresponding to some or all of the plurality of virtual objects VO placed in the virtual space VS based on the first information IF1. That is, the nickname identification unit 113-2A identifies all tags TG included in the first information IF1. Then, the display control unit 114A displays the plurality of tags TG in the vicinity of some or all of the plurality of virtual objects VO corresponding to the plurality of tags TG in the virtual space VS. Here, the "vicinity" of some or all of the plurality of virtual objects VO specifically means a range within a predetermined distance from each virtual object VO.
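
The "within a predetermined distance" condition can be checked with a plain Euclidean distance. This is a minimal sketch assuming that each virtual object VO and each tag TG has a single three-dimensional position in the virtual space VS; the function name and the coordinate convention are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def is_in_vicinity(tag_pos: Vec3, object_pos: Vec3, max_distance: float) -> bool:
    """True if the tag lies within `max_distance` of the virtual object."""
    return math.dist(tag_pos, object_pos) <= max_distance
```

A renderer could, for instance, place each tag at a fixed offset above its object and use this check to confirm that the offset stays within the predetermined distance.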

Assume that, after the display control unit 114A has executed the operation shown in the first operation example, the user U1 utters the tag TG identified by the nickname identification unit 113-2A. Alternatively, assume that, after the display control unit 114A has executed the operation shown in the second or third operation example, the user U1 utters one of the one or more tags TG identified by the nickname identification unit 113-2A. In the first to third operation examples, when the user U1 utters some character string as the tag TG for the virtual object VO to be called but that character string is not included in the first information IF1, the uttered voice is the first voice. When the user U1 thereafter utters one of the one or more tags TG identified by the nickname identification unit 113-2A, the uttered voice is the second voice. When the recognition result of the second voice uttered by the user U1 matches a tag TG identified by the nickname identification unit 113-2A, the display control unit 114A changes the display of the virtual object VO corresponding to the matching tag TG. Specifically, the display control unit 114A may move the virtual object VO corresponding to the matching tag TG in the line-of-sight direction of the user U1. Alternatively, if the virtual object VO corresponding to the matching tag TG is an application, the display control unit 114A may display a screen for selecting whether to activate the application.

Returning to FIG. 21, the update unit 117 associates the virtual object VO corresponding to a specific tag TG with a tag TG as the second nickname, which is the recognition result of the first voice, when the number of times that the recognition result of the first voice uttered by the user U1 is not included in the first information IF1 and the recognition result of the second voice then matches the specific tag TG included in the first information IF1 reaches a predetermined number. More specifically, as described in the first to third operation examples above, assume that the user U1 uttered some character string as the tag TG for the virtual object VO to be called, but the recognition result of the first voice produced by that utterance is not included in the first information IF1. Assume next that the user U1 utters one of the one or more tags TG identified by the nickname identification unit 113-2A, and the recognition result of the second voice produced by that utterance is included in the first information IF1. Assume further that this sequence, in which the recognition result of the first voice is not included in the first information IF1 and the recognition result of the second voice is subsequently included in the first information IF1, has occurred the predetermined number of times. In this case, the update unit 117 sets the character string obtained as the recognition result of the first voice as a new tag TG, and associates the virtual object VO corresponding to the tag TG obtained as the recognition result of the second voice with the new tag TG.

Here, the storage device 12A stores, in tabular form, the virtual object VO, the recognition result of the first voice, the recognition result of the second voice, and the number of times that the recognition result of the second voice was included in the first information IF1 after the first voice was uttered. By referring to this table, the processing device 11A, functioning as the update unit 117, determines whether the number of times that the recognition result of the first voice was not included in the first information IF1 and the recognition result of the second voice was included in the first information IF1 has reached the predetermined number. When the number reaches the predetermined number, the update unit 117 sets the character string obtained as the recognition result of the first voice as a new tag TG, and associates the virtual object VO corresponding to the character string obtained as the recognition result of the second voice with the new tag TG.

FIGS. 26A and 26B are explanatory diagrams showing the function of the update unit 117. As shown in FIG. 26A, it is assumed that the tag TG corresponding to the virtual object VO with ID=1 is "#WEB" and the tag TG corresponding to the virtual object VO with ID=8 is "#HORSE". Also, in the first to third operation examples described with reference to FIGS. 23 to 25, it is assumed that the recognition result of the first voice is "DOG", which is not included in the first information IF1, and the recognition result of the second voice is "HORSE", which is included in the first information IF1. In this case, the update unit 117 associates the virtual object VO with ID=8 corresponding to the tag TG "#HORSE" with the tag TG "#DOG", as shown in FIG. 26B. That is, the tags TG corresponding to the virtual object VO with ID=8 are two tags TG, "#HORSE" and "#DOG".
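
The counting behavior of the update unit 117 can be sketched as below. This is an illustration under assumptions rather than the implementation described here: the first information IF1 is modeled as a mapping from virtual object ID to a set of tags, the counting table is keyed by object ID and first-voice result, and the class and method names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Set, Tuple


class TagUpdater:
    """Counts 'unknown first voice, then known second voice' events per object."""

    def __init__(self, first_info: Dict[int, Set[str]], threshold: int) -> None:
        self.first_info = first_info  # virtual object ID -> set of tags TG
        self.threshold = threshold    # the predetermined number of times
        # (object ID, first-voice result) -> number of recorded events
        self.counts: DefaultDict[Tuple[int, str], int] = defaultdict(int)

    def record(self, first_result: str, second_result: str) -> bool:
        """Record one event; return True if a new tag was associated."""
        known = {tag for tags in self.first_info.values() for tag in tags}
        if first_result in known or second_result not in known:
            return False  # not the miss-then-match pattern
        vo_id = next(i for i, tags in self.first_info.items()
                     if second_result in tags)
        self.counts[(vo_id, first_result)] += 1
        if self.counts[(vo_id, first_result)] >= self.threshold:
            self.first_info[vo_id].add(first_result)  # new tag for this object
            return True
        return False
```

Starting from `{1: {"#WEB"}, 8: {"#HORSE"}}` with a threshold of 2, recording the pair ("#DOG", "#HORSE") twice leaves ID=8 with both "#HORSE" and "#DOG", which corresponds to the transition from FIG. 26A to FIG. 26B.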

2.2: Operation of Second Embodiment

FIGS. 27 to 29 are flowcharts showing the operation of the information processing system 1A according to the second embodiment. The operation of the information processing system 1A will be described below with reference to FIGS. 27 to 29.

2.2.1: First Operation

FIG. 27 is a flowchart describing the first operation of the information processing system 1A.

In step S41, the processing device 11A functions as the speech recognition unit 116 to recognize the voice uttered by the user U1. More specifically, the processing device 21 of the AR glasses 20 functions as the acquisition unit 211 to acquire voice data representing the voice of the user U1 from the sound pickup device 24. The processing device 21 also outputs the acquired voice data to the communication device 28 by functioning as the acquisition unit 211. The communication device 28 outputs the voice data acquired from the processing device 21 to the terminal device 10A. The processing device 11A of the terminal device 10A functions as the acquisition unit 111 to acquire the voice data from the AR glasses 20 using the communication device 13. Further, the processing device 11A functions as the speech recognition unit 116 to perform speech recognition on the voice data. The character string obtained as the speech recognition result corresponds to the instruction information in the first embodiment.

In step S42, the processing device 11A functions as the nickname identification unit 113-2A to determine whether the recognition result of the voice uttered by the user U1 corresponds to any of the tags TG that are included in the first information IF1 and correspond to the plurality of virtual objects VO. When the determination result is true, that is, when the determination result of step S42 is "YES", the processing device 11A executes the process of step S45. In this case, the recognition result of the voice uttered by the user U1 is an example of the first nickname described above. When the determination result is false, that is, when the determination result of step S42 is "NO", the processing device 11A executes the process of step S43. In this case, the recognition result of the voice uttered by the user U1 is an example of the second nickname described above.

In step S43, the processing device 11A functions as the nickname identification unit 113-2A to identify, as the first nickname, the tag TG that is most similar to the second nickname, that is, the recognition result of the voice of the user U1, among the plurality of tags TG included in the first information IF1, that is, the plurality of first nicknames.

In step S44, the processing device 11A functions as the display control unit 114A to display the most similar tag TG identified in step S43 in the virtual space VS. For example, the processing device 11A displays the pop-up P2 in the virtual space VS by functioning as the display control unit 114A. Further, the processing device 11A functions as the display control unit 114A to display a message for the user U1 to confirm the tag TG that the user U1 was originally trying to utter in the popup P2. After that, the processing device 11A executes the process of step S41.

In step S45, the processing device 11A functions as the display control unit 114A to change the display of the virtual object VO corresponding to the tag TG as the first nickname, which is the recognition result in step S42.

Note that, when the number of times that the recognition result of the first voice uttered by the user U1 is not included in the first information IF1 and the recognition result of the second voice matches a specific tag TG included in the first information IF1 reaches the predetermined number, the update unit 117 may, after step S45, associate the virtual object VO corresponding to the specific tag TG with the tag TG as the second nickname, which is the recognition result of the first voice.
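
The control flow of steps S41 to S45 can be summarized in a short loop. The sketch below replaces speech recognition with a pre-recorded sequence of recognized strings and models the first information IF1 as a non-empty mapping from tag to virtual object ID; these simplifications, the `similarity` parameter standing in for the learning model LM1, and all names are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def first_operation(utterances: Iterable[str], first_info: Dict[str, int],
                    similarity: Callable[[str, str], float]
                    ) -> Tuple[Optional[int], List[str]]:
    """Return (ID of the object whose display changes, tags suggested on the way).

    `first_info` maps tag -> virtual object ID and is assumed to be non-empty.
    """
    suggestions: List[str] = []
    for recognized in utterances:          # S41: next recognition result
        if recognized in first_info:       # S42: YES, a first nickname
            return first_info[recognized], suggestions  # S45: change display
        # S42: NO, a second nickname. S43: find the most similar tag.
        best = max(first_info, key=lambda tag: similarity(recognized, tag))
        suggestions.append(best)           # S44: tag shown in the popup P2
    return None, suggestions               # input ended without a match
```

Each pass through the loop corresponds to one return to step S41 after the popup is shown in step S44.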

2.2.2: Second Operation

FIG. 28 is a flowchart illustrating the second operation of the information processing system 1A.

In step S51, the processing device 11A recognizes the voice uttered by the user U1 by functioning as the voice recognition unit 116. Note that the details of the operation are the same as in step S41 in the first operation, so description thereof will be omitted.

In step S52, the processing device 11A functions as the nickname identification unit 113-2A to determine whether the recognition result of the voice uttered by the user U1 corresponds to any of the plurality of tags TG included in the first information IF1. When the determination result is true, that is, when the determination result of step S52 is "YES", the processing device 11A executes the process of step S55. In this case, the recognition result of the voice uttered by the user U1 is an example of the first nickname described above. When the determination result is false, that is, when the determination result of step S52 is "NO", the processing device 11A executes the process of step S53. In this case, the recognition result of the voice uttered by the user U1 is an example of the second nickname described above.

In step S53, the processing device 11A functions as the nickname identification unit 113-2A to identify, based on the first information IF1, the plurality of tags TG corresponding to some or all of the plurality of virtual objects VO arranged in the virtual space VS.

In step S54, the processing device 11A functions as the display control unit 114A to display the popup P3 in the virtual space VS. Furthermore, by functioning as the display control unit 114A, the processing device 11A displays, in the popup P3, a list in which the icons, that is, reduced representations, of the plurality of virtual objects VO corresponding to the plurality of tags TG identified in step S53 are associated with those tags TG. After that, the processing device 11A executes the process of step S51.

In step S55, the processing device 11A functions as the display control unit 114A to change the display of the virtual object VO corresponding to the tag TG as the first nickname, which is the recognition result in step S52.

As in the first operation shown in FIG. 27, when the number of times that the recognition result of the first voice uttered by the user U1 is not included in the first information IF1 and the recognition result of the second voice matches a specific tag TG included in the first information IF1 reaches a predetermined number, the update unit 117 may, after step S55, associate the virtual object VO corresponding to the specific tag TG with the tag TG as the second nickname, which is the recognition result of the first voice.

2.2.3: Third Operation

FIG. 29 is a flowchart illustrating the third operation of the information processing system 1A.

In step S61, the processing device 11A recognizes the voice uttered by the user U1 by functioning as the voice recognition unit 116. The details of the operation are the same as those in step S41 in the first operation and step S51 in the second operation, and thus description thereof is omitted.

In step S62, the processing device 11A functions as the nickname identification unit 113-2A to determine whether the recognition result of the voice uttered by the user U1 corresponds to any of the plurality of tags TG included in the first information IF1. When the determination result is true, that is, when the determination result of step S62 is "YES", the processing device 11A executes the process of step S65. In this case, the recognition result of the voice uttered by the user U1 is an example of the first nickname described above. When the determination result is false, that is, when the determination result of step S62 is "NO", the processing device 11A executes the process of step S63. In this case, the recognition result of the voice uttered by the user U1 is an example of the second nickname described above.

In step S63, the processing device 11A functions as the nickname identification unit 113-2A to identify, based on the first information IF1, the plurality of tags TG corresponding to some or all of the plurality of virtual objects VO arranged in the virtual space VS.

In step S64, the processing device 11A functions as the display control unit 114A to display the plurality of tags TG identified in step S63 in the vicinity of some or all of the plurality of virtual objects VO corresponding to those tags TG. Here, the "vicinity" of some or all of the plurality of virtual objects VO specifically means a range within a predetermined distance from each virtual object VO. After that, the processing device 11A executes the process of step S61.

In step S65, the processing device 11A functions as the display control unit 114A to change the display of the virtual object VO corresponding to the tag TG as the first nickname, which is the recognition result in step S62.

As in the first operation shown in FIG. 27 and the second operation shown in FIG. 28, when the number of times that the recognition result of the first voice uttered by the user U1 is not included in the first information IF1 and the recognition result of the second voice matches a specific tag TG included in the first information IF1 reaches a predetermined number, the update unit 117 may, after step S65, associate the virtual object VO corresponding to the specific tag TG with the tag TG as the second nickname, which is the recognition result of the first voice.

2.3: Effects of the Second Embodiment

According to the above description, the terminal device 10A as an information processing device includes the display control unit 114A and the nickname identification unit 113-2A. The display control unit 114A causes the AR glasses 20 as a display device worn on the head of the user U1 to display a plurality of virtual objects VO arranged in the virtual space VS. If the recognition result of the first voice uttered by the user U1 is a second nickname that does not match any of the plurality of first nicknames corresponding to the plurality of virtual objects VO, the nickname identification unit 113-2A identifies the first nickname that is most similar to the second nickname among the plurality of first nicknames. The display control unit 114A causes the AR glasses 20 as a display device to display the first nickname identified by the nickname identification unit 113-2A.

With the above-described configuration of the terminal device 10A as an information processing device, even if the speech recognition result of the user U1's utterance does not correspond to any tag TG included in the first information IF1, the user U1 can recall a tag TG included in the first information IF1. In particular, the user U1 can recall the one tag TG that is most similar to the recognition result of the voice that the user U1 uttered.

Further, after the first nickname identified by the nickname identification unit 113-2A is displayed on the AR glasses 20 as the display device, when the recognition result of the second voice uttered by the user U1 matches the identified first nickname, the display control unit 114A changes the display of the virtual object VO corresponding to the identified first nickname among the plurality of virtual objects VO. The terminal device 10A as an information processing device further includes an update unit 117 that associates the virtual object VO with the second nickname when the number of times the recognition result of the first voice is the second nickname reaches a predetermined number.

With the above configuration, when the number of times the user U1 calls a certain virtual object VO by a second nickname not included in the first information IF1 reaches a predetermined number of times, the terminal device 10A as an information processing device can associate the second nickname with the virtual object VO.
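The counting behavior of the update unit 117 can be sketched as below. The class name `UpdateUnit`, the method `record`, and the dictionary layout of the tags are hypothetical names chosen for illustration; the embodiment only states that the association is made when a predetermined count is reached.

```python
from collections import defaultdict


class UpdateUnit:
    """Associates a second nickname with a virtual object after repeated use."""

    def __init__(self, threshold):
        # `threshold` stands in for the predetermined number of times.
        self.threshold = threshold
        # (object_id, second_nickname) -> number of occurrences so far
        self.counts = defaultdict(int)

    def record(self, tags, object_id, second_nickname):
        """Count one occurrence; return True when the association is made.

        `tags` maps an object id to the list of nicknames registered for it
        (an assumed stand-in for the first information).
        """
        key = (object_id, second_nickname)
        self.counts[key] += 1
        names = tags.setdefault(object_id, [])
        if self.counts[key] >= self.threshold and second_nickname not in names:
            names.append(second_nickname)
            return True
        return False
```

With a threshold of 3, the first two calls to `record` for the same object and nickname only increment the counter, and the third call registers the nickname.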

Also, according to the above description, the terminal device 10A as an information processing device includes the display control unit 114A and the nickname specifying unit 113-2A. The display control unit 114A causes the AR glasses 20, as a display device worn on the head of the user U1, to display a plurality of virtual objects VO arranged in the virtual space VS. The nickname specifying unit 113-2A specifies a plurality of first nicknames corresponding to some or all of the plurality of virtual objects VO. When the recognition result of the voice uttered by the user U1 is a second nickname that does not match any of the plurality of first nicknames, the display control unit 114A displays each of the plurality of first nicknames specified by the nickname specifying unit 113-2A in association with the corresponding virtual object VO among the some or all of the plurality of virtual objects VO.

With the above configuration, even if the speech recognition result of the user U1's utterance does not correspond to any tag TG included in the first information IF1, the terminal device 10A as an information processing device allows the user U1 to recall the tags TG included in the first information IF1. In particular, the user U1 can visually check all the tags TG included in the first information IF1 within the virtual space VS.
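The fallback of showing every registered nickname next to its virtual object can be sketched as follows. The function name `nicknames_to_display` and the use of `None` for an object without a tag are assumptions made for this illustration.

```python
def nicknames_to_display(recognized, first_info):
    """Return the {object_id: nickname} labels to show after an utterance.

    `first_info` maps each virtual object id to its registered nickname,
    or to None when the object has no nickname (an assumed layout).
    """
    # Only objects that actually have a nickname can be labeled.
    tagged = {oid: tag for oid, tag in first_info.items() if tag is not None}

    # The utterance matched a registered nickname: no fallback labels needed.
    if recognized in tagged.values():
        return {}

    # No match: show every registered nickname beside its own object.
    return tagged
```

Objects without a nickname are skipped, which corresponds to displaying nicknames for "some or all" of the virtual objects.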

Further, after the plurality of first nicknames specified by the nickname specifying unit 113-2A are displayed on the AR glasses 20 as the display device, if the recognition result of the second voice uttered by the user U1 matches any of the specified plurality of first nicknames, the display control unit 114A changes the display of the virtual object VO corresponding to the matching first nickname among the plurality of virtual objects VO. The terminal device 10A as an information processing device further includes an update unit 117 that associates the virtual object VO with the second nickname when the number of times the recognition result of the first voice is the second nickname reaches a predetermined number of times.

3: Modifications
The present disclosure is not limited to the embodiments illustrated above. Specific modifications are exemplified below. Two or more aspects arbitrarily selected from the following examples may be combined.

3.1: Modification 1
The terminal device 10 according to the first embodiment includes a speech recognition unit 116 as a function of the processing device 11. Similarly, the terminal device 10A according to the second embodiment includes a speech recognition unit 116 as a function of the processing device 11A. However, the terminal devices 10 and 10A do not have to include the speech recognition unit 116. Specifically, the speech recognition unit 116 may be a device external to the terminal devices 10 and 10A that is communicably connected to the terminal devices 10 and 10A. In this case, a speech recognition device corresponding to the speech recognition unit 116 may exist on the cloud and be communicably connected to the terminal devices 10 and 10A via the communication network NET.

3.2: Modification 2
The terminal device 10 according to the first embodiment includes an acquisition unit 111 as a function of the processing device 11. The acquisition unit 111 acquires the first information IF1 and the second information IF2 from the storage device 12. Similarly, the terminal device 10A according to the second embodiment includes an acquisition unit 111 as a function of the processing device 11A. The acquisition unit 111 acquires the first information IF1 from the storage device 12A. However, the source from which the acquisition unit 111 acquires the first information IF1 and the second information IF2 need not be the storage device 12 or 12A. Specifically, the acquisition unit 111 may acquire the first information IF1 and the second information IF2 directly from the server 30.

3.3: Modification 3
The terminal device 10 according to the first embodiment includes a motion recognition unit 112 as a function of the processing device 11. Similarly, the terminal device 10A according to the second embodiment includes a motion recognition unit 112 as a function of the processing device 11A. The motion recognition unit 112 recognizes gestures of the user U1. However, the method of recognizing gestures of the user U1 is not limited to this. For example, the AR glasses 20 may include a motion recognition unit similar to the motion recognition unit 112 and thereby recognize gestures of the user U1.

3.4: Modification 4
The terminal device 10 according to the first embodiment includes a nickname specifying unit 113-2 as a function of the processing device 11. Similarly, the terminal device 10A according to the second embodiment includes a nickname specifying unit 113-2A as a function of the processing device 11A. The nickname specifying units 113-2 and 113-2A specify a tag TG as a nickname corresponding to the virtual object VO specified by the virtual object specifying unit 113-1. On the other hand, the nickname specifying units 113-2 and 113-2A do not specify a tag TG when there is no tag TG corresponding to the virtual object VO specified by the virtual object specifying unit 113-1. In this case, the processing devices 11 and 11A may have a function of setting a new tag TG for a virtual object VO that does not have a corresponding tag TG.

3.5: Modification 5
In the information processing system 1 according to the first embodiment, the terminal device 10 and the AR glasses 20 are implemented separately. Similarly, in the information processing system 1A according to the second embodiment, the terminal device 10A and the AR glasses 20 are implemented separately. However, the method of implementing the terminal device 10 or 10A and the AR glasses 20 in the embodiments of the present invention is not limited to this. For example, the terminal device 10 or 10A and the AR glasses 20 may be realized within a single housing by providing the AR glasses 20 with the same functions as the terminal device 10 or 10A.

3.6: Modification 6
The terminal device 10A according to the second embodiment includes the update unit 117 as a function of the processing device 11A. The update unit 117 counts the number of times that the recognition result of the first voice uttered by the user U1 is not included in the first information IF1 and the recognition result of the second voice matches a specific tag TG included in the first information IF1. When this count reaches the predetermined number of times, the update unit 117 associates the second nickname, which is the recognition result of the first voice, as a tag TG with the virtual object VO corresponding to the specific tag TG. That is, the update unit 117 associates a plurality of tags TG with one virtual object VO. However, the operation of the update unit 117 is not limited to this. For example, instead of associating a plurality of tags TG with one virtual object VO, the update unit 117 may associate one tag TG with one virtual object VO, so that one virtual object VO and one tag TG form a set.
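The two association policies described above can be contrasted in a short sketch. The function name `associate_tag` and the `one_to_one` flag are hypothetical. In addition, the text does not say what happens to the existing tag in the one-to-one variant; the sketch assumes that the new tag replaces it.

```python
def associate_tag(tags, object_id, new_tag, one_to_one=False):
    """Associate `new_tag` with a virtual object under one of two policies.

    `tags` maps an object id to its list of tags (an assumed layout).
    """
    if one_to_one:
        # Assumed reading of the modification: keep exactly one tag per
        # object, so the new tag replaces whatever was registered before.
        tags[object_id] = [new_tag]
    else:
        # Behavior of the second embodiment: one object may hold several tags.
        existing = tags.setdefault(object_id, [])
        if new_tag not in existing:
            existing.append(new_tag)
    return tags
```

Keeping several tags lets the user keep calling the object by either name, while the one-to-one policy keeps the stored information smaller at the cost of discarding the earlier name.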

4: Others
(1) In the above-described embodiments, the storage device 12, the storage device 12A, the storage device 22, and the storage device 32 are exemplified as ROM and RAM, but they may be flexible disks, magneto-optical disks (for example, compact discs, digital versatile discs, Blu-ray discs), smart cards, flash memory devices (for example, cards, sticks, key drives), CD-ROMs (Compact Disc-ROMs), registers, removable disks, hard disks, floppy disks, magnetic strips, databases, servers, or other suitable storage media. Also, the program may be transmitted from the communication network NET via an electric communication line.

(2) In the embodiments described above, the information, signals, etc. described may be represented using any of a variety of different technologies. For example, data, instructions, commands, information, signals, bits, symbols, chips, etc. that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.

(3) In the above-described embodiments, input/output information and the like may be stored in a specific location (for example, memory), or may be managed using a management table. Input/output information and the like can be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.

(4) In the above-described embodiment, the determination may be made by a value (0 or 1) represented using 1 bit, or by a true/false value (Boolean: true or false). Alternatively, it may be performed by numerical comparison (for example, comparison with a predetermined value).

(5) The order of the processing procedures, sequences, flowcharts, etc. exemplified in the above embodiments may be changed as long as there is no contradiction. For example, the methods described in this disclosure present elements of the various steps using a sample order, and are not limited to the specific order presented.

(6) Each function illustrated in FIGS. 1 to 29 is realized by any combination of at least one of hardware and software. The method of realizing each functional block is not particularly limited. That is, each functional block may be implemented using one device that is physically or logically coupled, or may be implemented using two or more physically or logically separated devices that are connected directly or indirectly (e.g., by wire or wirelessly). A functional block may be implemented by combining software with the one device or the plurality of devices.

(7) The programs illustrated in the above embodiments, whether referred to as software, firmware, middleware, microcode, hardware description language, or by any other name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, and the like.

In addition, software, instructions, information, etc. may be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using at least one of wired technology (coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), etc.) and wireless technology (infrared, microwave, etc.), at least one of these wired and wireless technologies is included within the definition of transmission medium.

(8) In each of the above aspects, the terms "system" and "network" are used interchangeably.

(9) Information, parameters, etc. described in this disclosure may be expressed using absolute values, may be expressed using relative values from a predetermined value, or may be expressed using other corresponding information.

(10) In the above-described embodiments, the terminal device 10, the terminal device 10A, and the server 30 may be mobile stations (MS). A mobile station may also be referred to by those skilled in the art as a subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or some other suitable term. Also, in the present disclosure, terms such as "mobile station", "user terminal", "user equipment (UE)", and "terminal" may be used interchangeably.

(11) In the above-described embodiments, the terms "connected," "coupled," or any variation thereof mean any direct or indirect connection or coupling between two or more elements, and can include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other. Couplings or connections between elements may be physical, logical, or a combination thereof. For example, "connection" may be replaced with "access". As used in this disclosure, two elements can be considered to be "connected" or "coupled" to each other using at least one of one or more wires, cables, and printed electrical connections and, as some non-limiting and non-exhaustive examples, using electromagnetic energy having wavelengths in the radio frequency region, the microwave region, and the optical (both visible and invisible) region.

(12) In the above-described embodiments, the phrase "based on" does not mean "based only on," unless expressly specified otherwise. In other words, the phrase "based on" means both "based only on" and "based at least on."

(13) The terms "judging" and "determining" as used in this disclosure may encompass a wide variety of actions. "Judging" and "determining" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (e.g., looking up in a table, database, or other data structure), or ascertaining, as having "judged" or "determined". "Judging" and "determining" may also include regarding receiving (e.g., receiving information), transmitting (e.g., transmitting information), input, output, or accessing (e.g., accessing data in memory) as having "judged" or "determined". In addition, "judging" and "determining" may include regarding resolving, selecting, choosing, establishing, comparing, and the like as having "judged" or "determined". In other words, "judging" and "determining" may include regarding some action as having "judged" or "determined". Also, "judging (determining)" may be read as "assuming", "expecting", "considering", or the like.

(14) In the above-described embodiments, where "include," "including," and variations thereof are used, these terms are intended to be inclusive, in the same manner as the term "comprising." Furthermore, the term "or" as used in this disclosure is not intended to be an exclusive OR.

(15) In this disclosure, where articles such as "a", "an", and "the" in English have been added by translation, the disclosure may include the case where the nouns following these articles are plural.

(16) In the present disclosure, the term "A and B are different" may mean "A and B are different from each other." The term may also mean that "A and B are different from C". Terms such as "separate," "coupled," etc. may also be interpreted in the same manner as "different."

(17) Each aspect/embodiment described in the present disclosure may be used alone, may be used in combination, or may be switched in accordance with execution. In addition, notification of predetermined information (for example, notification of "being X") is not limited to explicit notification and may be performed implicitly (for example, by not notifying the predetermined information).

Although the present disclosure has been described in detail above, it is clear to those skilled in the art that the present disclosure is not limited to the embodiments described in this disclosure. The present disclosure can be practiced with modifications and variations without departing from the spirit and scope of the present disclosure as defined by the claims. Accordingly, the description of the present disclosure is for illustrative purposes and is not meant to be limiting in any way on the present disclosure.

Reference Signs List

1, 1A... Information processing system
10, 10A... Terminal device
11, 11A... Processing device
12, 12A... Storage device
13... Communication device
14... Display
15... Input device
20... AR glasses
21... Processing device
22... Storage device
23... Line-of-sight acquisition device
24... Sound pickup device
25... GPS device
26... Motion detection device
27... Imaging device
28... Communication device
29... Display
30... Server
31... Processing device
32... Storage device
33... Communication device
34... Display
35... Input device
41L, 41R... Lens
91, 92... Temple
93... Bridge
94, 95... Body
111... Acquisition unit
112... Motion recognition unit
113, 113A... Specifying unit
113-1... Virtual object specifying unit
113-2, 113-2A... Nickname specifying unit
114, 114A... Display control unit
115... Determination unit
116... Speech recognition unit
117... Update unit
211... Acquisition unit
212... Display control unit
311... Output unit
312... Acquisition unit
IF1... First information
IF2... Second information
LM1... Learning model
P1, P2, P3... Popup
PR1, PR2, PR3, PR4... Control program
R... Area
TG... Tag
U1, U2, U3... User
VO... Virtual object

Claims

1. An information processing device comprising:
a display control unit that causes a display device worn on a user's head to display a plurality of virtual objects arranged in a virtual space;
a virtual object specifying unit that specifies a first virtual object among the plurality of virtual objects based on instruction information generated according to an action of the user; and
a nickname specifying unit that, when a nickname corresponding to the first virtual object specified by the virtual object specifying unit is stored in a storage device, specifies the corresponding nickname as a first nickname,
wherein the display control unit causes the display device to display the first nickname.
2. The information processing device according to claim 1, wherein the display control unit causes the display device to display a two-dimensional image obtained by planarizing the virtual space, and causes the first nickname to be displayed in association with the first virtual object in the two-dimensional image.
3. The information processing device according to claim 1, wherein the display control unit causes the display device to display a three-dimensional image obtained by reducing the virtual space, and causes the first nickname to be displayed in association with the first virtual object in the three-dimensional image.
4. The information processing device according to claim 1, wherein the action of the user is viewing the display device, and the instruction information indicates a viewpoint of the user on the display device,
the information processing device further comprising a determination unit that determines whether the viewpoint is positioned within the first virtual object displayed on the display device for a predetermined time or longer,
wherein, when the determination unit determines that the viewpoint is positioned within the first virtual object for the predetermined time or longer, the display control unit causes the display device to display the first nickname corresponding to the first virtual object.
5. The information processing device according to any one of claims 1 to 4, wherein, when the recognition result of a voice uttered by the user matches the first nickname, the display control unit changes the display content regarding the first virtual object.
6. An information processing device comprising:
a display control unit that causes a display device worn on a user's head to display a plurality of virtual objects arranged in a virtual space;
a virtual object specifying unit that specifies one or more virtual objects based on a recognition result of a voice that is uttered by the user and that represents at least one attribute of the plurality of virtual objects; and
a nickname specifying unit that specifies a nickname corresponding to each of some or all of the one or more virtual objects,
wherein the display control unit causes the display device to display the corresponding nickname specified by the nickname specifying unit in association with each of the some or all of the virtual objects.
7. An information processing device comprising:
a display control unit that causes a display device worn on a user's head to display a plurality of virtual objects arranged in a virtual space; and
a nickname specifying unit that, when the recognition result of a first voice uttered by the user is a second nickname that does not match any of a plurality of first nicknames corresponding to the plurality of virtual objects, specifies the first nickname most similar to the second nickname among the plurality of first nicknames,
wherein the display control unit causes the display device to display the first nickname specified by the nickname specifying unit.
8. The information processing device according to claim 7, wherein, after the first nickname specified by the nickname specifying unit is displayed on the display device, when the recognition result of a second voice uttered by the user matches the specified first nickname, the display control unit changes the display of the virtual object corresponding to the specified first nickname among the plurality of virtual objects,
the information processing device further comprising an update unit that associates the virtual object with the second nickname when the number of times that the recognition result of the first voice is the second nickname reaches a predetermined number of times.
9. An information processing device comprising:
a display control unit that causes a display device worn on a user's head to display a plurality of virtual objects arranged in a virtual space; and
a nickname specifying unit that specifies a plurality of first nicknames corresponding to some or all of the plurality of virtual objects,
wherein, when the recognition result of a voice uttered by the user is a second nickname that does not match any of the plurality of first nicknames, the display control unit displays each of the plurality of first nicknames specified by the nickname specifying unit in association with the corresponding virtual object among the some or all of the plurality of virtual objects.
10. The information processing device according to claim 9, wherein, after the plurality of first nicknames specified by the nickname specifying unit are displayed on the display device, when the recognition result of a second voice uttered by the user matches any of the specified plurality of first nicknames, the display control unit changes the display of the virtual object corresponding to the matching first nickname among the plurality of virtual objects,
the information processing device further comprising an update unit that associates the virtual object with the second nickname when the number of times that the recognition result of the first voice is the second nickname reaches a predetermined number of times.