WO2023210158A1 - Eyeglass-type display device and display system - Google Patents

Eyeglass-type display device and display system

Info

Publication number
WO2023210158A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
display
glasses
unit
display device
Prior art date
Application number
PCT/JP2023/007812
Other languages
French (fr)
Japanese (ja)
Inventor
信貴 松嶌
勇一 水越
Original Assignee
株式会社Nttドコモ
Priority date
Filing date
Publication date
Application filed by 株式会社Nttドコモ
Publication of WO2023210158A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • the present invention relates to a glasses-type display device and a display system.
  • An HMD (Head Mounted Display) device is generally known that displays a virtual object representing additional information, such as explanatory text regarding a real object, superimposed on an object (hereinafter referred to as a real object) in a real-world view.
  • Some HMD devices, such as AR (Augmented Reality) glasses or MR (Mixed Reality) glasses, display virtual objects superimposed on real space without blocking the user's field of view (for example, Patent Document 1 and Patent Document 2).
  • Patent Document 1: JP 2022-029429 A; Patent Document 2: JP 2014-093050 A
  • With HMD devices that display virtual objects without blocking the user's field of view, there is a problem that immersion or convenience may be impaired if there is unnecessary light in the area that includes the real object the user is gazing at (hereinafter referred to as the attention area).
  • One conceivable solution is an attachment that blocks the entire field of view seen by the user through the HMD device; however, if the entire field of view is always blocked, the user cannot visually grasp the situation in the surrounding real space.
  • It is also conceivable to perform display control such as blocking light from the real space and displaying a virtual object only in an attention area specified by the user using a mouse or keyboard; however, using a mouse or keyboard to specify the attention area reduces user convenience.
  • a glasses-type display device including a transmissive display section on which a virtual object is displayed includes a specifying section and a display control section.
  • the specifying unit specifies an attention area including a real object that the user is gazing at in the user's field of view based on the voice uttered by the user.
  • the display control unit displays a virtual object corresponding to the real object in a display area of the display unit that overlaps the attention area of the field of view.
  • a display system includes a glasses-type display device that is attached to a user's head and includes a transmissive display section on which a virtual object is displayed, a specifying section, and a display control section.
  • the specifying unit specifies an attention area including a real object that the user is gazing at in the user's field of view based on the voice uttered by the user.
  • the display control unit displays a virtual object corresponding to the real object in a display area of the display unit that overlaps the attention area of the field of view.
  • the virtual object can be displayed over the attention area without impairing the user's convenience.
  • FIG. 1 is a block diagram illustrating a configuration example of a display system 1 according to an embodiment of the present disclosure.
  • FIG. 2 is a diagram showing an example of a poster P on which real objects are arranged.
  • FIG. 3 is a diagram showing an example of a management table TBL in the present disclosure.
  • FIG. 4 is a block diagram showing a configuration example of a mobile device 10.
  • FIG. 5 is a flowchart showing the flow of a display method executed by the processing device 18 of the mobile device 10 according to the program PR1.
  • FIG. 6 is a block diagram showing a configuration example of a glasses-type display device 20.
  • FIG. 7 is a diagram for explaining the operation of this embodiment.
  • FIG. 8 is a diagram for explaining the operation of this embodiment.
  • FIG. 9 is a diagram illustrating an example of an image viewed by user U in Modification 1.
  • FIG. 10 is a diagram illustrating an example of an image viewed by user U in Modification 2.
  • FIG. 1 is a block diagram showing a configuration example of a display system 1 according to an embodiment of the present disclosure.
  • the display system 1 includes a mobile device 10 and a glasses-type display device 20.
  • the glasses-type display device 20 is attached to the head of the user U.
  • the glasses-type display device 20 is an HMD device that displays virtual objects that do not exist in real space without blocking the field of view of the user U wearing the glasses-type display device 20.
  • the glasses-type display device 20 has an imaging function.
  • the glasses-type display device 20 mounted on the head of the user U uses an imaging function to capture an image of real space corresponding to the field of view of the user U.
  • An object existing in the field of view of the user U, that is, a real object, appears in the image captured by the glasses-type display device 20 worn on the head of the user U.
  • the mobile device 10 is, for example, a smartphone.
  • the mobile device 10 is worn on the user U's body.
  • the mobile device 10 is attached to the body of the user U by hanging from the neck using a strap or the like.
  • the mobile device 10 has a sound collection function.
  • the mobile device 10 worn on the user's body collects the voice emitted by the user U using a sound collection function.
  • the mobile device 10 is connected by wire to a glasses-type display device 20 that is worn on the head.
  • the mobile device 10 may be connected to the eyeglass-type display device 20 wirelessly.
  • the mobile device 10 acquires image data representing an image captured by the glasses-type display device 20 from the glasses-type display device 20 .
  • the mobile device 10 is not limited to a smartphone, and may be, for example, a tablet or a notebook personal computer.
  • the mobile device 10 communicates with the management device 30 via the communication network NW.
  • the mobile device 10 transmits the image data acquired from the glasses-type display device 20 to the management device 30.
  • the management device 30 is a server device that provides a location recognition service and a content management service in AR.
  • the position recognition service is a service that specifies the position of the glasses-type display device 20 in the global coordinate system based on an image captured by the imaging function of the glasses-type display device 20.
  • Specific implementation modes of the location recognition service include a mode using an AR tag or a mode using a distribution of feature points extracted from an image, such as SLAM (Simultaneous Localization and Mapping).
  • the content management service is a service that distributes information regarding virtual objects corresponding to one or more real objects visible from the position of the glasses-type display device 20 in the global coordinate system to the glasses-type display device 20.
  • the management device 30 stores in advance, in association with a position in the global coordinate system, virtual object information representing an image of the virtual object corresponding to each of one or more real objects visible from that position, and area information indicating the position and size of the display area in which the virtual object is to be displayed.
  • the management device 30 specifies the position of the glasses-type display device 20 based on the image data received from the mobile device 10 via the communication network NW, and specifies one or more real objects visible from that position. Then, virtual object information and area information corresponding to each of the identified one or more real objects are sent back to the mobile device 10.
  • the mobile device 10 causes the glasses-type display device 20 to display an image of the virtual object according to the virtual object information and area information received from the management device 30. As a result, the virtual object appears superimposed on the real space in the eyes of the user U.
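The exchange between the mobile device 10 and the management device 30 can be pictured as follows. This is a minimal, hedged sketch of the two services; every name, signature, and data shape below is an illustrative assumption (the patent prescribes no implementation), and the localization and visibility lookups are passed in as opaque callables.

```python
# Hypothetical sketch of the management device's position recognition and
# content management services. All names and signatures are assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class TableEntry:
    identification: str              # e.g. pronunciation of the real object
    virtual_object_image: bytes      # image of the translated character string
    area: tuple[int, int, int, int]  # display area: (x, y, width, height)

def content_management_service(
    image_data: bytes,
    localize: Callable[[bytes], tuple[float, float, float]],      # AR tag / SLAM-based position recognition
    visible_objects: Callable[[tuple[float, float, float]], list[str]],
    table: dict[str, TableEntry],
) -> list[TableEntry]:
    """Localize the glasses-type display device from a captured image and
    return the table entries for the real objects visible from that position."""
    position = localize(image_data)          # position recognition service
    object_ids = visible_objects(position)   # real objects visible from the position
    return [table[oid] for oid in object_ids if oid in table]
```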
  • the real space in this embodiment is, for example, a venue for a poster session at a research presentation such as an academic conference.
  • the real objects in this embodiment are, for example, each English word on a poster displayed in a poster session venue and in which research content is written in English.
  • FIG. 2 is a diagram showing an example of posters P displayed in the poster session venue.
  • the virtual object in this embodiment is a character string representing a Japanese translation of an English word written on the poster P.
  • the management device 30 transmits virtual object information and area information to the mobile device 10 by transmitting the management table TBL shown in FIG. 3 to the mobile device 10.
  • the management table TBL stores virtual object information and area information about a virtual object corresponding to the real object in association with identification information for identifying the real object.
  • the identification information in this embodiment is character string data representing the pronunciation of the real object identified by the identification information. For example, assume that the English word "Patients" is a real object. The identification information in this case is the katakana string "ペイシェンツ" ("Peishentsu"). Katakana are phonetic characters used in Japanese to represent the pronunciation of foreign words.
  • the virtual object information corresponding to this identification information represents an image of the Japanese character string "病人", which is the Japanese translation of the English word "Patients".
  • the background of the Japanese character string is painted over with a predetermined color such as white to block light from real space.
  • the area information stored in the management table TBL in association with the identification information "ペイシェンツ" represents an area that overlaps with the real object "Patients".
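To make the table's shape concrete, the following sketch shows one possible in-memory layout of the management table TBL for the "Patients" entry described above; the field names, file name, and coordinate values are assumptions for illustration only, not the patent's wording.

```python
# Illustrative layout of the management table TBL. Keys are identification
# information (the pronunciation of the real object); values hold virtual
# object information and area information. All field names and numbers are assumptions.
management_table = {
    "ペイシェンツ": {                                   # katakana reading of "Patients"
        "virtual_object": "patients_translation.png",   # image of the Japanese string "病人"
                                                         # drawn on an opaque white background
        "area": {"x": 120, "y": 40, "width": 180, "height": 36},  # overlaps the word "Patients"
    },
}
```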
  • FIG. 4 is a block diagram showing a configuration example of the mobile device 10.
  • the mobile device 10 includes an input device 11, an output device 12, a microphone 13, a communication device 14, a communication device 15, a storage device 17, a processing device 18, and a bus 19.
  • the input device 11, the output device 12, the microphone 13, the communication device 14, the communication device 15, the storage device 17, and the processing device 18 are interconnected by a bus 19 for communicating information.
  • the bus 19 may be configured using a single bus, or may be configured using different buses for each device.
  • the input device 11 includes a touch panel.
  • the input device 11 may include a plurality of operation keys in addition to a touch panel.
  • the input device 11 may include a plurality of operation keys without including a touch panel.
  • the input device 11 receives operations performed by the user U.
  • the output device 12 includes a display. The touch panel of the input device 11 is stacked on the display of the output device 12.
  • the output device 12 displays various information.
  • the microphone 13 picks up user U's voice.
  • the microphone 13 generates sound data indicating the waveform of the collected sound and outputs it to the processing device 18 .
  • the user U's attention area is specified based on the user's U voice picked up by the microphone 13.
  • the communication device 14 is hardware (transmission/reception device) for communicating with the management device 30 via the communication network NW.
  • the communication device 14 is also called, for example, a network device, a network controller, a network card, a communication module, or the like.
  • the communication device 14 transmits the image data given from the processing device 18 to the management device 30. Further, the communication device 14 supplies the management table TBL received from the management device 30 to the processing device 18. Note that the communication device 14 may communicate with the management device 30 without going through the communication network NW.
  • the communication device 15 is hardware (transmission/reception device) for communicating with the eyeglass-type display device 20 by wire.
  • the communication device 15 supplies the image data received from the glasses-type display device 20 to the processing device 18 .
  • the communication device 15 transmits image data provided from the processing device 18 to the glasses-type display device 20.
  • the communication device 15 may communicate with the glasses-type display device 20 wirelessly.
  • the storage device 17 is a recording medium that can be read by the processing device 18.
  • the storage device 17 includes, for example, nonvolatile memory and volatile memory.
  • Nonvolatile memories include, for example, ROM (Read Only Memory), EPROM (Erasable Programmable Read Only Memory), and EEPROM (Electrically Erasable Programmable Read Only Memory).
  • the volatile memory is, for example, RAM (Random Access Memory).
  • the storage device 17 stores in advance a program PR1 that causes the processing device 18 to execute the method of specifying a region of interest according to the present disclosure. Furthermore, the management table TBL received from the management device 30 is written into the storage device 17 by the processing device 18 .
  • the processing device 18 includes one or more CPUs (Central Processing Units). One or more CPUs are an example of one or more processors. Each of the processor and CPU is an example of a computer.
  • the processing device 18 reads the program PR1 from the storage device 17.
  • the processing device 18 operating according to the program PR1 transmits the image data received from the glasses-type display device 20 using the communication device 15 to the management device 30 using the communication device 14. Further, the processing device 18 operating according to the program PR1 writes the management table TBL received from the management device 30 into the storage device 17 using the communication device 14.
  • the processing device 18 operating according to the program PR1 functions as the speech recognition section 181, the identification section 182, and the display control section 183 shown in FIG. 4. That is, the speech recognition section 181, the identification section 182, and the display control section 183 in FIG. 4 are software modules realized by operating the processing device 18 according to software.
  • the speech recognition unit 181 converts the speech represented by the sound data generated by the microphone 13 into a character string. That is, the speech recognition unit 181 performs speech recognition on the user U's speech according to a predetermined speech recognition algorithm. Existing technology may be adopted as appropriate for the speech recognition algorithm.
  • the voice recognition unit 181 generates recognized character string data representing the result of voice recognition of the user U's voice, that is, a character string of one or more words uttered by the user.
  • the identification unit 182 identifies the attention area based on the recognition result from the voice recognition unit 181. More specifically, the identification unit 182 refers to the management table TBL to determine whether or not the character string represented by the recognized character string data generated by the speech recognition unit 181 matches any identification information stored in the management table TBL. When the character string represented by the recognized character string data matches some identification information, the identification unit 182 identifies, as the attention area, the area represented by the area information corresponding to that identification information.
  • the display control unit 183 causes a display area of the display unit of the glasses-type display device 20 that overlaps with the attention area specified by the specifying unit 182 to display a virtual object corresponding to the attention area. More specifically, the display control unit 183 generates image data representing an image in which the image represented by the virtual object information corresponding to the matching identification information is arranged in the area indicated by the area information corresponding to that identification information. Then, the display control unit 183 transmits the image data to the eyeglass-type display device 20 using the communication device 15, thereby causing the eyeglass-type display device 20 to display the image represented by the image data.
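Read together, the specifying unit 182 and the display control unit 183 amount to a table lookup followed by an image composition. The following is a hedged sketch under the table layout assumed above; the use of Pillow for compositing and every function name here are illustrative assumptions, since the patent names no library.

```python
# Hypothetical sketch of the per-utterance logic in the mobile device 10:
# recognized character string -> attention area -> image data for the glasses.
from PIL import Image

def specify_attention_area(recognized_text: str, table: dict) -> dict | None:
    """Specifying unit 182: return the area info of the entry whose
    identification information matches the recognized character string."""
    entry = table.get(recognized_text)
    return entry["area"] if entry else None

def build_display_image(recognized_text: str, table: dict,
                        display_size: tuple[int, int] = (1280, 720)) -> Image.Image | None:
    """Display control unit 183: place the virtual object image in the
    attention area and return the frame to be sent to the display unit."""
    entry = table.get(recognized_text)
    if entry is None:
        return None                                       # no matching identification info
    area = entry["area"]
    frame = Image.new("RGBA", display_size, (0, 0, 0, 0))  # fully transparent elsewhere
    virtual_object = Image.open(entry["virtual_object"]).convert("RGBA")
    virtual_object = virtual_object.resize((area["width"], area["height"]))
    frame.paste(virtual_object, (area["x"], area["y"]))
    return frame
```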
  • the processing device 18 operating according to the program PR1 executes the display method shown in FIG. 5 every time sound data is output from the microphone 13. As shown in FIG. 5, this display method includes each process of step SA110 to step SA140.
  • In step SA110, the processing device 18 functions as the speech recognition unit 181.
  • In step SA110, the processing device 18 generates recognized character string data by performing voice recognition on the voice represented by the sound data output from the microphone 13.
  • In step SA120 and step SA130, the processing device 18 functions as the specifying unit 182.
  • In step SA120, the processing device 18 refers to the management table TBL to determine whether the character string represented by the recognized character string data generated in step SA110 matches any identification information stored in the management table TBL.
  • In step SA130, the processing device 18 specifies, as the attention area, the area indicated by the area information stored in the management table TBL in association with the identification information that matches the character string represented by the recognized character string data generated in step SA110. If the determination result in step SA120 is "No", that is, if the character string represented by the recognized character string data does not match any identification information, the processing device 18 does not execute the processing from step SA130 onward, and this display method ends.
  • In step SA140, the processing device 18 functions as the display control unit 183.
  • In step SA140, the processing device 18 acquires the virtual object information stored in the management table TBL in association with the identification information that matches the character string represented by the recognized character string data generated in step SA110, and generates image data representing an image in which a virtual object representing that virtual object information is placed in the attention area specified in step SA130. Then, the processing device 18 supplies the generated image data to the eyeglass-type display device 20, thereby causing the eyeglass-type display device 20 to display the image represented by the image data.
  • FIG. 6 is a block diagram showing a configuration example of the eyeglass-type display device 20.
  • the eyeglass-type display device 20 includes a display section 2a, a communication device 2b, an imaging device 2c, a storage device 2d, a processing device 2e, and a bus 2f.
  • the display unit 2a, the communication device 2b, the imaging device 2c, the storage device 2d, and the processing device 2e are interconnected by a bus 2f for communicating information.
  • the bus 2f may be configured using a single bus, or may be configured using different buses for each element such as a device.
  • the display section 2a is a transmissive display section that transmits light. Light representing real space is transmitted through the display section 2a.
  • the display unit 2a displays images of virtual objects under the control of the processing device 2e.
  • the display section 2a is located in front of the user's U left and right eyes.
  • the user U wearing the glasses-type display device 20 visually recognizes the real space represented by the light transmitted through the display section 2a and the image of the virtual object displayed on the display section 2a.
  • the display unit 2a includes a left eye lens, a left eye display panel, a left eye optical member, a right eye lens, a right eye display panel, and a right eye optical member.
  • the display panel for the left eye and the display panel for the right eye are, for example, a liquid crystal panel or an organic EL (Electro Luminescence) panel.
  • the display panel for the left eye displays an image represented by image data provided from the processing device 2e.
  • the left eye optical member is an optical member that guides light emitted from the left eye display panel to the left eye lens.
  • the display panel for the right eye displays an image represented by image data provided from the processing device 2e.
  • the right eye optical member is an optical member that guides light emitted from the right eye display panel to the right eye lens.
  • Each of the left eye lens and the right eye lens has a half mirror.
  • the half mirror included in the left eye lens guides the light representing the real space to the left eye of the user U by transmitting the light representing the real space. Further, the half mirror included in the left eye lens reflects the light guided by the left eye optical member to the user U's left eye.
  • the half mirror included in the right eye lens guides the light representing the real space to the right eye of the user U by transmitting the light representing the real space.
  • the half mirror included in the right eye lens reflects the light guided by the right eye optical member to the user U's right eye.
  • the communication device 2b is hardware (transmission/reception device) for communicating with the mobile device 10 by wire.
  • the communication device 2b may communicate with the mobile device 10 wirelessly.
  • the glasses-type display device 20 has a glasses-shaped frame that supports a left eye lens and a right eye lens, and the imaging device 2c (for example, a camera) is provided on a bridge of the frame.
  • the imaging device 2c captures an image of the real space that the user U wearing the glasses-type display device 20 sees through the glasses-type display device 20, that is, an image of the user's U field of view under the control of the processing device 2e.
  • the imaging device 2c outputs image data representing the captured image to the processing device 2e.
  • the storage device 2d is a recording medium that can be read by the processing device 2e.
  • the storage device 2d, like the storage device 17, includes nonvolatile memory and volatile memory.
  • the storage device 2d stores a program PR2.
  • the processing device 2e includes one or more CPUs.
  • the processing device 2e reads the program PR2 from the storage device 2d.
  • the processing device 2e functions as an operation control unit 2e1 by executing the program PR2.
  • the operation control unit 2e1 controls the operation of the eyeglass-type display device 20.
  • the operation control unit 2e1 transmits the image data output from the imaging device 2c to the mobile device 10 using the communication device 2b. Further, the operation control unit 2e1 supplies image data received from the mobile device 10 via the communication device 2b to the display unit 2a.
  • the display section 2a displays an image represented by the image data supplied from the operation control section 2e1.
  • the image represented by the image data transmitted by the mobile device 10 is an image in which a virtual object is placed in the attention area in the user U's field of view. Since this image is displayed on the display unit 2a, user U's eyes see an image of the real space in which the virtual object is superimposed on the attention area in the user U's field of view.
  • FIG. 7 shows an example of the field of view A that the user U sees through the glasses-type display device 20.
  • when the user U utters the word "Patients", the voice of the user U is collected by the microphone 13 of the mobile device 10, and based on the recognition result of the voice by the voice recognition unit 181, the area corresponding to "Patients" is specified as the attention area.
  • image data representing an image in which a virtual object VOB corresponding to "Patients" (the Japanese translation of "Patients") is placed in the attention area is generated and supplied to the glasses-type display device 20.
  • as a result, the user U's eyes see an image of the real space in which the virtual object VOB is superimposed on the attention area in the field of view A.
  • in this embodiment, the user U can specify an attention area that includes the real object that the user U is gazing at, and a virtual object is displayed overlapping the attention area specified by the user U.
  • the user U can specify the attention area by voice, so there is no need to operate a mouse or keyboard when specifying the attention area, and convenience is not reduced. In this way, according to the present embodiment, it is possible to have the user U specify an attention area in the field of view and to display a virtual object in the attention area specified by the user U, without reducing the convenience of the user U.
  • B. Modifications: The present disclosure is not limited to the embodiment illustrated above. Specific aspects of modification are as follows. Two or more aspects arbitrarily selected from the examples below may be combined.
  • B-1 Modification 1
  • the display control unit 183 may display a translucent image in a portion of the display area of the display unit 2a other than the attention area.
  • when a translucent image is displayed in a part of the display area of the display unit 2a other than the attention area, a part of the light that passes from the real space through the part other than the attention area and enters the eyes of the user U is blocked.
  • the portion of the user's field of vision seen through the glasses-type display device 20 other than the area of interest becomes blurred, and the user U's sense of immersion is improved.
  • in FIG. 9, the blurring of areas other than the attention area is represented by diagonal hatching.
  • the glasses-type display device 20 may be configured to be able to partially control the transmittance of the left eye lens and the right eye lens. In this case, by controlling the transmittance, blocking of light that passes through a portion other than the region of interest may be realized.
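One possible shape of the display control for this modification is sketched below, using the same compositing assumption as the earlier sketch; the translucent grey value and the rectangle-based attention area are arbitrary illustrative choices.

```python
# Hypothetical sketch for Modification 1: partially block light from the real
# space everywhere except the attention area by drawing a translucent layer.
from PIL import Image, ImageDraw

def dim_outside_attention_area(display_size: tuple[int, int], area: dict) -> Image.Image:
    """Return an overlay that is translucent outside the attention area and
    fully transparent inside it."""
    overlay = Image.new("RGBA", display_size, (128, 128, 128, 160))   # translucent grey
    box = (area["x"], area["y"],
           area["x"] + area["width"], area["y"] + area["height"])
    ImageDraw.Draw(overlay).rectangle(box, fill=(0, 0, 0, 0))          # clear the attention area
    return overlay
```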
  • the identification information may be a series of numbers assigned to each of the plurality of real objects.
  • the display control unit 183 may cause the glasses-type display device 20 to display identification information in association with each of the plurality of real objects, as shown in FIG. 10.
  • a series of numbers (circled numbers in the illustrated example) are assigned to each of a plurality of real objects.
  • when the identification information is a series of numbers, the user U specifies the attention area by speaking the number assigned to the real object of interest. For example, if the English word of interest is "Department", the number given to "Department" is "3", so user U says "san" in Japanese or "three" in English to specify the attention area.
  • the identification unit 182 identifies the region of interest based on the recognized number.
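Under this modification the recognized utterance must first be mapped to a serial number before the area can be looked up. The sketch below is one hedged way to do that; the word lists are illustrative and deliberately tiny.

```python
# Hypothetical sketch for Modification 2: identification information is a serial
# number, so the spoken word is normalized to a number before the area lookup.
SPOKEN_NUMBERS = {
    "いち": 1, "に": 2, "さん": 3,      # Japanese readings
    "one": 1, "two": 2, "three": 3,     # English words
}

def specify_area_by_number(recognized_word: str, numbered_areas: dict[int, dict]) -> dict | None:
    """Return the area info assigned to the spoken serial number, if any."""
    number = SPOKEN_NUMBERS.get(recognized_word.strip().lower())
    return numbered_areas.get(number) if number is not None else None
```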
  • the display control unit 183 may cause the glasses-type display device 20 to display a virtual object for instructing the user U to start a dictionary application (for example, an English-Japanese dictionary application).
  • the processing device 18 starts the English-Japanese dictionary application when sound data indicating the virtual object is provided from the microphone 13.
  • the audio that specifies the English word of interest is not limited to the audio that pronounces the English word, but may be the audio that reads out the letters that make up the English word in the order in which they are arranged in the word. For example, if the English word of interest is "World", user U specifies the attention area by reading out the letters "W, o, r, l, d" in order.
  • the identification information may be character string information in which the pronunciations of the alphabets constituting the English word are arranged in the order in which they are arranged in the English word.
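For this spelled-out variant, the identification information can be built mechanically from per-letter readings, as in the sketch below; the katakana readings chosen here are an assumption about how the letter pronunciations might be written.

```python
# Hypothetical sketch of building identification information from the readings
# of each letter, in the order they appear in the English word.
LETTER_READINGS = {
    "w": "ダブリュー", "o": "オー", "r": "アール", "l": "エル", "d": "ディー",
}

def spelled_identification(word: str) -> str:
    """Concatenate the readings of each letter of the word in order."""
    return " ".join(LETTER_READINGS[ch] for ch in word.lower())

# Example: spelled_identification("World") -> "ダブリュー オー アール エル ディー"
```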
  • real objects in the present disclosure are not limited to English words, but may be words in other languages such as French, German, or Chinese.
  • the placement destination of a real object in the present disclosure is not limited to a poster, but may be a magazine, book, or newspaper, or may be a menu board at a restaurant, an instruction manual for equipment, various documents, or a signboard.
  • the glasses-type display device 20 may include a detection device that detects the user's U line of sight.
  • the specifying unit 182 roughly specifies the range occupied by the attention area in the field of view of the user U based on the line of sight detected by the detection device, and the display control unit 183 performs display control to emphasize the range.
  • the display control unit 183 may cause the display unit 2a to display an image in which the range occupied by the region of interest is expanded, or may cause the display unit 2a to display an image of a frame line surrounding the range.
  • the eyeglass-type display device 20 may be configured to allow the user U to specify an area of interest by touching a part of the field of view A in FIG. 7 with a fingertip of the user U.
  • the display control unit 183 may detect the area designation by the user U based on the image captured by the imaging device 2c, and highlight the detected area as a candidate for translation.
  • the program PR1 is stored in the storage device 17 of the mobile device 10, but the program PR1 may be manufactured or sold separately.
  • the provider of the program PR1 may write the program PR1 on a computer-readable recording medium such as a flash ROM and distribute it, or may distribute it by making it downloadable via a telecommunications line.
  • the speech recognition unit 181, identification unit 182, and display control unit 183 in the above embodiment were software modules. However, any one, a plurality, or all of the voice recognition unit 181, the identification unit 182, and the display control unit 183 may be a hardware module. Specific examples of the hardware module include DSP (Digital Signal Processor), ASIC (Application Specific Integrated Circuit), PLD (Programmable Logic Device), FPGA (Field Programmable Gate Array), and the like.
  • in the embodiment described above, the mobile device 10 has the voice recognition section 181, the identification section 182, and the display control section 183; however, the glasses-type display device 20 or the management device 30 may instead include the voice recognition section 181, the identification section 182, and the display control section 183.
  • the voice recognition unit 181, the identification unit 182, and the display control unit 183 may be distributed and provided in any two or all of the glasses-type display device 20, the mobile device 10, and the management device 30. Note that since the voice recognition process by the voice recognition unit 181 is a process with a high processing load, it is preferable that the voice recognition unit 181 be provided in the management device 30 or the mobile device 10 rather than the glasses-type display device 20. Furthermore, the voice recognition unit 181 is preferably provided in the management device 30 rather than the mobile device 10.
  • the storage device 17 and the storage device 2d are exemplified above as ROM, RAM, etc., but each may instead be a flexible disk, a magneto-optical disk, an optical disc (for example, a compact disc, a digital versatile disc, or a Blu-ray disc), a smart card, a flash memory device (for example, a card, a stick, or a key drive), a CD-ROM (Compact Disc-ROM), a register, a removable disk, a hard disk, a floppy disk, a magnetic strip, a database, a server, or other suitable storage medium.
  • the information, signals, etc. described may be represented using any of a variety of different technologies.
  • data, instructions, commands, information, signals, bits, symbols, chips, etc. may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.
  • the input/output information may be stored in a specific location (for example, memory) or may be managed using a management table. Information etc. to be input/output may be overwritten, updated, or additionally written. The output information etc. may be deleted. The input information etc. may be transmitted to other devices.
  • the determination may be made based on a value represented by 1 bit (0 or 1), may be made based on a truth value (Boolean: true or false), or may be made by numerical comparison (for example, comparison with a predetermined value).
  • each function illustrated in FIG. 4 is realized by an arbitrary combination of at least one of hardware and software.
  • the method for realizing each functional block is not particularly limited. That is, each functional block may be realized using one physically or logically coupled device, or may be realized by connecting two or more physically or logically separated devices directly or indirectly (for example, by wire or wirelessly) and using these plural devices.
  • the functional block may be realized by combining software with the one device or the plurality of devices.
  • the programs exemplified in the embodiments described above, whether referred to as software, firmware, middleware, microcode, hardware description language, or by any other name, should be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, and the like.
  • software, instructions, information, etc. may be sent and received via a transmission medium.
  • for example, when software is transmitted from a website, server, or other remote source using wired technology (coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), etc.) and/or wireless technology (infrared, microwave, etc.), these wired and/or wireless technologies are included within the definition of transmission medium.
  • the information, parameters, etc. described in this disclosure may be expressed using absolute values, may be expressed using relative values from a predetermined value, or may be expressed using other corresponding information.
  • the mobile device includes a mobile station (MS).
  • a mobile station may also be referred to by those skilled in the art as a subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or some other suitable term. Further, in the present disclosure, terms such as "mobile station," "user terminal," "user equipment (UE)," and "terminal" may be used interchangeably.
  • the terms "connected" and "coupled," or any variation of these, refer to any direct or indirect connection or coupling between two or more elements, and may include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other.
  • the bonds or connections between elements may be physical, logical, or a combination thereof.
  • connection may be replaced with "access.”
  • as some non-limiting and non-exhaustive examples, two elements may be considered "connected" or "coupled" to each other using one or more electrical wires, cables, and/or printed electrical connections, or using electromagnetic energy having wavelengths in the radio frequency, microwave, and optical (both visible and invisible) regions.
  • the terms "determining" and "deciding" used in this disclosure may encompass a wide variety of operations.
  • "determining" and "deciding" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (for example, looking up in a table, a database, or another data structure), or ascertaining, as having "determined" or "decided".
  • "determining" and "deciding" may include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in memory) as having "determined" or "decided".
  • "determining" and "deciding" may include regarding resolving, selecting, choosing, establishing, comparing, and the like as having "determined" or "decided".
  • "determining" and "deciding" may include regarding some action as having been "determined" or "decided".
  • "determining (deciding)" may be read as "assuming", "expecting", "considering", and the like.
  • the glasses-type display device including the transmissive display section 2a on which a virtual object is displayed may include a specifying section 182 and a display control section 183.
  • the specifying unit 182 specifies a region of interest that includes a real object that the user U is gazing at in the user's field of view based on the voice uttered by the user U.
  • the display control unit 183 causes a virtual object corresponding to the real object to be displayed in a display area of the display unit 2a that overlaps with the attention area in the user's U field of view.
  • the virtual object can be displayed in the specified attention area without impairing the user U's convenience.
  • the field of view of the user U in the second aspect may include a plurality of real objects. Furthermore, in the second aspect, one real object that the user U is gazing at may be specified from a plurality of real objects by the voice uttered by the user U.
  • the glasses-type display device according to the second aspect may further include a voice recognition unit 181 that recognizes the voice uttered by the user U.
  • the identification unit 182 in the glasses-type display device according to the second aspect may identify the attention area based on the recognition result of the voice recognition unit 181.
  • the glasses-type display device according to the second aspect can specify the attention area based on the recognition result of the voice recognition unit 181 for the user's U voice.
  • the display control unit 183 in the glasses-type display device according to the third aspect (an example of the second aspect) may display, in the display area of the display unit 2a, identification information that uniquely identifies each of the plurality of real objects in association with each of the plurality of real objects. Further, in the glasses-type display device according to the third aspect, when the voice recognition unit 181 recognizes the identification information corresponding to any one of the plurality of real objects, the identification unit 182 may specify the attention area based on the identification information recognized by the voice recognition unit 181.
  • the glasses-type display device according to the third aspect can specify the attention area based on the voice of the user U indicating any of the plurality of pieces of identification information displayed in the display area of the display unit 2a in association with each of the plurality of real objects.
  • the display system includes a glasses-type display device that is attached to the head of the user U and includes a transmissive display section 2a on which a virtual object is displayed, an identification section 182, and a display control section 183.
  • the specifying unit 182 specifies a region of interest that includes a real object that the user U is gazing at in the user's field of view based on the voice uttered by the user U.
  • the display control unit 183 causes a virtual object corresponding to the real object to be displayed in a display area of the display unit 2a that overlaps with the attention area in the user's U field of view.
  • the virtual object can be displayed in the specified attention area without impairing the user U's convenience.
  • DESCRIPTION OF REFERENCE SIGNS: 1...Display system, 10...Mobile device, 20...Glasses-type display device, 11...Input device, 12...Output device, 13...Microphone, 14, 15, 2b...Communication device, 17, 2d...Storage device, 18, 2e...Processing device, 181...Voice recognition unit, 182...Specifying unit, 183...Display control unit, 2a...Display unit, 2c...Imaging device, 2e1...Operation control unit, TBL...Management table, U...User.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)

Abstract

This eyeglass-type display device comprises a transparent display unit, a specification unit, and a display control unit. The specification unit specifies a focus area, which includes a real object gazed at by a user within the user's field of view through the display unit, on the basis of voices spoken by the user of the eyeglass-type display device. The display control unit displays a virtual object corresponding to the real object gazed at by the user on a display area that overlaps with the focus area within the user's field of view through the display unit.

Description

Eyeglass-type display device and display system
The present invention relates to a glasses-type display device and a display system.
An HMD (Head Mounted Display) device is generally known that displays a virtual object representing additional information, such as explanatory text regarding a real object, superimposed on an object (hereinafter referred to as a real object) in a real-world view. Some HMD devices, such as AR (Augmented Reality) glasses or MR (Mixed Reality) glasses, display virtual objects superimposed on real space without blocking the user's field of view (for example, Patent Document 1 and Patent Document 2).
Patent Document 1: JP 2022-029429 A; Patent Document 2: JP 2014-093050 A
For HMD devices that display virtual objects without blocking the user's field of view, there is a problem that immersion or convenience may be impaired if there is unnecessary light in the area that includes the real object the user is gazing at (hereinafter referred to as the attention area). In order to solve this problem, it is conceivable to use an attachment that blocks the entire field of view seen by the user through the HMD device. However, if the entire field of view is always blocked, there is a problem in that the user cannot visually grasp the situation in the surrounding real space. It is also conceivable to perform display control such as blocking light from the real space and displaying a virtual object only in an attention area specified by the user using a mouse or keyboard. However, using a mouse or keyboard to specify the attention area reduces user convenience.
According to a preferred aspect of the present disclosure, a glasses-type display device including a transmissive display section on which a virtual object is displayed includes a specifying section and a display control section. The specifying unit specifies an attention area including a real object that the user is gazing at in the user's field of view, based on the voice uttered by the user. The display control unit displays a virtual object corresponding to the real object in a display area of the display unit that overlaps the attention area of the field of view.
A display system according to a preferred aspect of the present disclosure includes a glasses-type display device that is attached to a user's head and includes a transmissive display section on which a virtual object is displayed, a specifying section, and a display control section. The specifying unit specifies an attention area including a real object that the user is gazing at in the user's field of view, based on the voice uttered by the user. The display control unit displays a virtual object corresponding to the real object in a display area of the display unit that overlaps the attention area of the field of view.
According to the present disclosure, since the attention area is specified based on the user's voice, the virtual object can be displayed over the attention area without impairing the user's convenience.
FIG. 1 is a block diagram illustrating a configuration example of a display system 1 according to an embodiment of the present disclosure. FIG. 2 is a diagram showing an example of a poster P on which real objects are arranged. FIG. 3 is a diagram showing an example of a management table TBL in the present disclosure. FIG. 4 is a block diagram showing a configuration example of a mobile device 10. FIG. 5 is a flowchart showing the flow of a display method executed by the processing device 18 of the mobile device 10 according to the program PR1. FIG. 6 is a block diagram showing a configuration example of a glasses-type display device 20. FIG. 7 is a diagram for explaining the operation of this embodiment. FIG. 8 is a diagram for explaining the operation of this embodiment. FIG. 9 is a diagram illustrating an example of an image viewed by user U in Modification 1. FIG. 10 is a diagram illustrating an example of an image viewed by user U in Modification 2.
A. Embodiment
FIG. 1 is a block diagram showing a configuration example of a display system 1 according to an embodiment of the present disclosure. As shown in FIG. 1, the display system 1 includes a mobile device 10 and a glasses-type display device 20. The glasses-type display device 20 is attached to the head of the user U. The glasses-type display device 20 is an HMD device that displays virtual objects that do not exist in real space without blocking the field of view of the user U wearing the glasses-type display device 20. The glasses-type display device 20 has an imaging function. The glasses-type display device 20 mounted on the head of the user U uses the imaging function to capture an image of the real space corresponding to the field of view of the user U. An object existing in the field of view of the user U, that is, a real object, appears in the image captured by the glasses-type display device 20 worn on the head of the user U.
The mobile device 10 is, for example, a smartphone. The mobile device 10 is worn on the user U's body. The mobile device 10 is attached to the body of the user U by hanging from the neck using a strap or the like. The mobile device 10 has a sound collection function. The mobile device 10 worn on the user's body collects the voice uttered by the user U using the sound collection function. Furthermore, the mobile device 10 is connected by wire to the glasses-type display device 20 that is worn on the head. The mobile device 10 may be connected to the eyeglass-type display device 20 wirelessly. The mobile device 10 acquires image data representing an image captured by the glasses-type display device 20 from the glasses-type display device 20. Note that the mobile device 10 is not limited to a smartphone, and may be, for example, a tablet or a notebook personal computer.
Furthermore, the mobile device 10 communicates with the management device 30 via the communication network NW. The mobile device 10 transmits the image data acquired from the glasses-type display device 20 to the management device 30. The management device 30 is a server device that provides a position recognition service and a content management service in AR.
The position recognition service is a service that specifies the position of the glasses-type display device 20 in the global coordinate system based on an image captured by the imaging function of the glasses-type display device 20. Specific implementation modes of the position recognition service include a mode using an AR tag and a mode using a distribution of feature points extracted from an image, such as SLAM (Simultaneous Localization and Mapping). The content management service is a service that distributes, to the glasses-type display device 20, information regarding virtual objects corresponding to one or more real objects visible from the position of the glasses-type display device 20 in the global coordinate system.
The management device 30 stores in advance, in association with a position in the global coordinate system, virtual object information representing an image of the virtual object corresponding to each of one or more real objects visible from that position, and area information indicating the position and size of the display area in which the virtual object is to be displayed. The management device 30 specifies the position of the glasses-type display device 20 based on the image data received from the mobile device 10 via the communication network NW, and specifies one or more real objects visible from that position. Then, virtual object information and area information corresponding to each of the identified one or more real objects are sent back to the mobile device 10. The mobile device 10 causes the glasses-type display device 20 to display an image of the virtual object according to the virtual object information and area information received from the management device 30. As a result, the virtual object appears superimposed on the real space in the eyes of the user U.
The real space in this embodiment is, for example, a venue for a poster session at a research presentation such as an academic conference. The real objects in this embodiment are, for example, the English words on a poster that is displayed in the poster session venue and on which research content is written in English. FIG. 2 is a diagram showing an example of a poster P displayed in the poster session venue. The virtual object in this embodiment is a character string representing a Japanese translation of an English word written on the poster P. In this embodiment, the management device 30 transmits virtual object information and area information to the mobile device 10 by transmitting the management table TBL shown in FIG. 3 to the mobile device 10.
As shown in FIG. 3, the management table TBL stores, in association with identification information for identifying a real object, virtual object information and area information about the virtual object corresponding to that real object. The identification information in this embodiment is character string data representing the pronunciation of the real object identified by the identification information. For example, assume that the English word "Patients" is a real object. The identification information in this case is the katakana string "ペイシェンツ" ("Peishentsu"). Katakana are phonetic characters used in Japanese to represent the pronunciation of foreign words. The virtual object information corresponding to this identification information represents an image of the Japanese character string "病人", which is the Japanese translation of the English word "Patients". In the image represented by the virtual object information, the background of the Japanese character string is painted over with a predetermined color such as white to block light from the real space. Further, the area information stored in the management table TBL in association with the identification information "ペイシェンツ" represents an area that overlaps with the real object "Patients".
FIG. 4 is a block diagram showing a configuration example of the mobile device 10. As shown in FIG. 4, the mobile device 10 includes an input device 11, an output device 12, a microphone 13, a communication device 14, a communication device 15, a storage device 17, a processing device 18, and a bus 19. The input device 11, the output device 12, the microphone 13, the communication device 14, the communication device 15, the storage device 17, and the processing device 18 are interconnected by the bus 19 for communicating information. The bus 19 may be configured using a single bus, or may be configured using different buses between devices.
The input device 11 includes a touch panel. The input device 11 may include a plurality of operation keys in addition to the touch panel. The input device 11 may include a plurality of operation keys without including a touch panel. The input device 11 receives operations performed by the user U. The output device 12 includes a display. The touch panel of the input device 11 is laminated on the display of the output device 12. The output device 12 displays various kinds of information.
The microphone 13 picks up the voice of the user U. The microphone 13 generates sound data representing the waveform of the picked-up sound and outputs the sound data to the processing device 18. Although details will be described later, in this embodiment the attention area of the user U is specified based on the voice of the user U picked up by the microphone 13.
The communication device 14 is hardware (a transmission/reception device) for communicating with the management device 30 via the communication network NW. The communication device 14 is also called, for example, a network device, a network controller, a network card, or a communication module. The communication device 14 transmits image data supplied from the processing device 18 to the management device 30. The communication device 14 also supplies the management table TBL received from the management device 30 to the processing device 18. Note that the communication device 14 may communicate with the management device 30 without going through the communication network NW.
The communication device 15 is hardware (a transmission/reception device) for communicating with the glasses-type display device 20 by wire. The communication device 15 supplies image data received from the glasses-type display device 20 to the processing device 18. The communication device 15 also transmits image data supplied from the processing device 18 to the glasses-type display device 20. Note that the communication device 15 may communicate with the glasses-type display device 20 wirelessly.
The storage device 17 is a recording medium readable by the processing device 18. The storage device 17 includes, for example, a nonvolatile memory and a volatile memory. The nonvolatile memory is, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable Read Only Memory), or an EEPROM (Electrically Erasable Programmable Read Only Memory). The volatile memory is, for example, a RAM (Random Access Memory). The storage device 17 stores in advance a program PR1 that causes the processing device 18 to execute the attention-area specifying method of the present disclosure. The management table TBL received from the management device 30 is also written into the storage device 17 by the processing device 18.
The processing device 18 includes one or more CPUs (Central Processing Units). The one or more CPUs are an example of one or more processors. Each of the processor and the CPU is an example of a computer. The processing device 18 reads the program PR1 from the storage device 17. The processing device 18 operating according to the program PR1 transmits, using the communication device 14, the image data received from the glasses-type display device 20 via the communication device 15 to the management device 30. The processing device 18 operating according to the program PR1 also writes the management table TBL received from the management device 30 via the communication device 14 into the storage device 17.
The processing device 18 operating according to the program PR1 also functions as the speech recognition unit 181, the specifying unit 182, and the display control unit 183 shown in FIG. 4. That is, the speech recognition unit 181, the specifying unit 182, and the display control unit 183 in FIG. 4 are software modules realized by causing the processing device 18 to operate according to software.
The speech recognition unit 181 converts the speech represented by the sound data generated by the microphone 13 into a character string. That is, the speech recognition unit 181 performs speech recognition on the voice of the user U according to a predetermined speech recognition algorithm. An existing technique may be adopted as appropriate for the speech recognition algorithm. The speech recognition unit 181 generates recognized character string data representing the result of the speech recognition of the voice of the user U, that is, a character string of one or more words uttered by the user.
The specifying unit 182 specifies the attention area based on the recognition result of the speech recognition unit 181. More specifically, the specifying unit 182 refers to the management table TBL to determine whether the character string represented by the recognized character string data generated by the speech recognition unit 181 matches any of the pieces of identification information stored in the management table TBL. When the character string represented by the recognized character string data matches any of the pieces of identification information, the specifying unit 182 specifies, as the attention area, the area indicated by the area information corresponding to the matching identification information.
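A minimal sketch of this lookup, reusing the Entry/TBL structure assumed in the earlier sketch, might look as follows; the function name and the whitespace normalization are illustrative assumptions only.

    def specify_attention_area(recognized_text, tbl):
        # Compare the recognized character string with the stored identification
        # information; on a match, the corresponding area becomes the attention area.
        entry = tbl.get(recognized_text.strip())
        if entry is None:
            return None  # no identification information matched
        return entry.area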
The display control unit 183 causes a virtual object corresponding to the attention area specified by the specifying unit 182 to be displayed in the display area of the display unit of the glasses-type display device 20 that overlaps that attention area. More specifically, the display control unit 183 generates image data representing an image in which the image represented by the virtual object information corresponding to the identification information that matches the character string represented by the recognized character string data is placed in the area indicated by the area information corresponding to that identification information. The display control unit 183 then transmits the image data to the glasses-type display device 20 using the communication device 15, thereby causing the glasses-type display device 20 to display the image represented by the image data.
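The image composition performed by the display control unit 183 can be pictured with the following sketch, which pastes the virtual-object image onto an otherwise transparent frame at the position given by the area information. The use of the Pillow library and an RGBA frame is an assumption for illustration, not a requirement of the disclosure.

    from PIL import Image  # assumed image library

    def compose_frame(display_size, virtual_object_path, area):
        # Fully transparent frame: real space stays visible except where the
        # virtual object (with its opaque background) is drawn.
        frame = Image.new("RGBA", display_size, (0, 0, 0, 0))
        x, y, w, h = area
        obj = Image.open(virtual_object_path).convert("RGBA").resize((w, h))
        frame.paste(obj, (x, y))
        return frame  # sent to the glasses-type display device 20 as image data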
The processing device 18 operating according to the program PR1 also executes the display method shown in FIG. 5 every time sound data is output from the microphone 13. As shown in FIG. 5, this display method includes the processes of steps SA110 to SA140.
In step SA110, the processing device 18 functions as the speech recognition unit 181. In step SA110, the processing device 18 generates recognized character string data by performing speech recognition on the speech represented by the sound data output from the microphone 13.
In steps SA120 and SA130, the processing device 18 functions as the specifying unit 182. In step SA120, the processing device 18 refers to the management table TBL to determine whether the character string represented by the recognized character string data generated in step SA110 matches any of the pieces of identification information stored in the management table TBL.
When the determination result of step SA120 is "Yes", that is, when the character string represented by the recognized character string data matches any of the pieces of identification information, the processing device 18 executes the process of step SA130. In step SA130, the processing device 18 specifies, as the attention area, the area indicated by the area information stored in the management table TBL in association with the identification information that matches the character string represented by the recognized character string data generated in step SA110. When the determination result of step SA120 is "No", that is, when the character string represented by the recognized character string data does not match any of the pieces of identification information, the processing device 18 ends this display method without executing the processes from step SA130 onward.
In step SA140, which follows step SA130, the processing device 18 functions as the display control unit 183. In step SA140, the processing device 18 acquires the virtual object information stored in the management table TBL in association with the identification information that matches the character string represented by the recognized character string data generated in step SA110, and generates image data representing an image in which the virtual object represented by that virtual object information is placed in the attention area specified in step SA130. The processing device 18 then supplies the generated image data to the glasses-type display device 20, thereby causing the glasses-type display device 20 to display the image represented by the image data.
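Read as a single procedure, steps SA110 to SA140 might be organized as in the sketch below, which reuses the helpers assumed in the earlier sketches; recognize() stands in for whatever speech recognition algorithm is adopted, and glasses.show() is a placeholder for transmitting the image data to the glasses-type display device 20.

    def display_method(sound_data, tbl, display_size, glasses):
        text = recognize(sound_data)            # SA110: speech recognition (assumed helper)
        entry = tbl.get(text.strip())           # SA120: match against identification information
        if entry is None:
            return                              # "No": end without further processing
        area = entry.area                       # SA130: specify the attention area
        frame = compose_frame(display_size, entry.virtual_object_image, area)  # SA140
        glasses.show(frame)                     # display the composed image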
FIG. 6 is a block diagram showing a configuration example of the glasses-type display device 20. The glasses-type display device 20 includes a display unit 2a, a communication device 2b, an imaging device 2c, a storage device 2d, a processing device 2e, and a bus 2f. The display unit 2a, the communication device 2b, the imaging device 2c, the storage device 2d, and the processing device 2e are interconnected by the bus 2f for communicating information. The bus 2f may be configured using a single bus, or may be configured using a different bus for each pair of elements such as devices.
The display unit 2a is a transmissive display unit that transmits light. Light representing the real space passes through the display unit 2a. The display unit 2a displays an image of a virtual object under the control of the processing device 2e. When the user U wears the glasses-type display device 20, the display unit 2a is positioned in front of the left eye and the right eye of the user U. The user U wearing the glasses-type display device 20 visually recognizes the real space represented by the light transmitted through the display unit 2a and the image of the virtual object displayed on the display unit 2a.
More specifically, the display unit 2a includes a lens for the left eye, a display panel for the left eye, an optical member for the left eye, a lens for the right eye, a display panel for the right eye, and an optical member for the right eye. The display panel for the left eye and the display panel for the right eye are, for example, liquid crystal panels or organic EL (Electro Luminescence) panels. The display panel for the left eye displays an image represented by image data supplied from the processing device 2e. The optical member for the left eye guides the light emitted from the display panel for the left eye to the lens for the left eye. Similarly, the display panel for the right eye displays an image represented by image data supplied from the processing device 2e. The optical member for the right eye guides the light emitted from the display panel for the right eye to the lens for the right eye.
Each of the lens for the left eye and the lens for the right eye has a half mirror. The half mirror of the lens for the left eye guides the light representing the real space to the left eye of the user U by transmitting that light. The half mirror of the lens for the left eye also reflects the light guided by the optical member for the left eye toward the left eye of the user U. The half mirror of the lens for the right eye guides the light representing the real space to the right eye of the user U by transmitting that light. The half mirror of the lens for the right eye reflects the light guided by the optical member for the right eye toward the right eye of the user U.
The communication device 2b is hardware (a transmission/reception device) for communicating with the mobile device 10 by wire. The communication device 2b may communicate with the mobile device 10 wirelessly.
The glasses-type display device 20 has a glasses-shaped frame that supports the lens for the left eye and the lens for the right eye, and the imaging device 2c (for example, a camera) is provided on the bridge of the frame. Under the control of the processing device 2e, the imaging device 2c captures an image of the real space that the user U wearing the glasses-type display device 20 sees through the glasses-type display device 20, that is, an image of the field of view of the user U. The imaging device 2c outputs image data representing the captured image to the processing device 2e.
The storage device 2d is a recording medium readable by the processing device 2e. Like the storage device 17, the storage device 2d includes a nonvolatile memory and a volatile memory. The storage device 2d stores a program PR2. The processing device 2e includes one or more CPUs. The processing device 2e reads the program PR2 from the storage device 2d. By executing the program PR2, the processing device 2e functions as the operation control unit 2e1.
The operation control unit 2e1 controls the operation of the glasses-type display device 20. The operation control unit 2e1 transmits the image data output from the imaging device 2c to the mobile device 10 using the communication device 2b. The operation control unit 2e1 also supplies the image data received from the mobile device 10 via the communication device 2b to the display unit 2a. The display unit 2a displays the image represented by the image data supplied from the operation control unit 2e1. As described above, the image represented by the image data transmitted by the mobile device 10 is an image in which the virtual object is placed in the attention area in the field of view of the user U. Since this image is displayed on the display unit 2a, the eyes of the user U see an image of the real space in which the virtual object is superimposed on the attention area in the field of view of the user U.
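On the glasses side, the relaying role of the operation control unit 2e1 could be sketched roughly as below; the camera, link, and display objects and their method names are placeholders assumed only for this illustration.

    def operation_control_loop(camera, link, display):
        while True:
            captured = camera.capture()  # image of the field of view of the user U
            link.send(captured)          # forward to the mobile device 10
            frame = link.receive()       # image with the virtual object placed in the attention area
            if frame is not None:
                display.show(frame)      # display unit 2a overlays it on the real space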
For example, assume that a user U wearing the glasses-type display device 20 on the head and the mobile device 10 on the body is looking, through the glasses-type display device 20, at the poster P shown in FIG. 2, and that the management table TBL shown in FIG. 3 is stored in the storage device 17 of the mobile device 10. FIG. 7 shows an example of the field of view A that the user U sees through the glasses-type display device 20. When the English word to which the user U looking at the field of view A shown in FIG. 7 pays attention is "Patients", the user U pronounces "Patients" (that is, utters "ペイシェンツ").
The voice of the user U is picked up by the microphone 13 of the mobile device 10, and based on the result of recognition of this voice by the speech recognition unit 181, the area corresponding to "Patients" is specified as the attention area. Image data representing an image in which the virtual object corresponding to "Patients" (the Japanese translation of "Patients") is placed in this attention area is transmitted from the mobile device 10 to the glasses-type display device 20. As a result of the image represented by this image data being displayed on the display unit 2a, as shown in FIG. 8, the eyes of the user U see an image of the real space in which the virtual object VOB is superimposed on the attention area in the field of view A.
As described above, according to the present embodiment, it is possible to have the user U designate the attention area including the real object that the user U is gazing at, and to display the virtual object so as to overlap the attention area designated by the user U.
In addition, in the present embodiment, the user U can designate the attention area by voice, and there is no need to operate a mouse or a keyboard when designating the attention area. Since there is no need to operate a mouse or a keyboard when designating the attention area, convenience is not reduced. Thus, according to the present embodiment, it is possible to have the user U designate the attention area in the field of view and to display the virtual object in the attention area designated by the user U without reducing the convenience of the user U.
B: Modifications
The present disclosure is not limited to the embodiments illustrated above. Specific modes of modification are as follows. Two or more modes arbitrarily selected from the following examples may be combined.
B-1: Modification 1
In order not to divert the attention of the user U from the attention area, the display control unit 183 may display a translucent image in the portion of the display area of the display unit 2a other than the attention area. When a translucent image is displayed in the portion of the display area of the display unit 2a other than the attention area, part of the light that passes from the real space through the portion other than the attention area and enters the eyes of the user U is blocked. As a result, as shown in FIG. 9, the portion of the field of view that the user sees through the glasses-type display device 20 other than the attention area becomes blurred, and the sense of immersion of the user U is improved. In FIG. 9, the blurring of the portion other than the attention area is represented by diagonal hatching. Note that the glasses-type display device 20 may be configured so that the transmittance of the lens for the left eye and the lens for the right eye can be partially controlled. In this case, the blocking of light passing through the portion other than the attention area may be realized by controlling the transmittance.
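A rough way to realize such a translucent mask in software is sketched below, again assuming Pillow-style calls; the alpha value and colors are illustrative choices only.

    from PIL import Image, ImageDraw  # assumed image library

    def compose_masked_frame(display_size, virtual_object_path, area, alpha=160):
        # Semi-transparent white everywhere except the attention area, so the
        # surrounding real space is dimmed rather than fully blocked.
        frame = Image.new("RGBA", display_size, (255, 255, 255, alpha))
        x, y, w, h = area
        ImageDraw.Draw(frame).rectangle((x, y, x + w, y + h), fill=(0, 0, 0, 0))
        obj = Image.open(virtual_object_path).convert("RGBA").resize((w, h))
        frame.paste(obj, (x, y))
        return frame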
B-2: Modification 2
The identification information may be a series of numbers, one assigned to each of the plurality of real objects. In this case, as shown in FIG. 10, the display control unit 183 may cause the glasses-type display device 20 to display the identification information in association with each of the plurality of real objects. In the example shown in FIG. 10, a series of numbers (circled numbers in the illustrated example) are assigned to the plurality of real objects, respectively. When the identification information is a series of numbers, the user U designates the attention area by uttering the number assigned to the real object of interest. For example, when the English word of interest is "Department" and the number assigned to "Department" is "3", the user U designates the attention area by uttering "san" in Japanese or "three" in English. In this case, when identification information (a number) corresponding to any of the plurality of real objects is recognized by the speech recognition unit 181, the specifying unit 182 specifies the attention area based on the recognized number.
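Under this modification the table would be keyed by the label number rather than by a pronunciation string. The sketch below, with a deliberately partial mapping from spoken number words to integers, is an assumption used only to illustrate the idea.

    SPOKEN_NUMBERS = {"one": 1, "two": 2, "three": 3, "san": 3}  # assumed, partial mapping

    def specify_by_number(recognized_text, numbered_tbl):
        word = recognized_text.strip().lower()
        number = SPOKEN_NUMBERS.get(word)
        if number is None and word.isdigit():
            number = int(word)
        entry = numbered_tbl.get(number)  # numbered_tbl: label number -> Entry (assumed)
        return None if entry is None else entry.area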
B-3: Modification 3
The display control unit 183 may cause the glasses-type display device 20 to display a virtual object for allowing the user U to instruct activation of a dictionary application (for example, an English-Japanese dictionary application). In this case, the processing device 18 activates the English-Japanese dictionary application when sound data indicating that virtual object is supplied from the microphone 13. Further, the voice that designates the English word of interest is not limited to a voice that pronounces the English word, and may be a voice that reads out the letters constituting the English word in the order in which they appear in the word. For example, when the English word of interest is "World", the user U designates the attention area by uttering "double-u, o, ar, el, dee" (the names of the letters W, O, R, L, D). The identification information in this case may be character string information in which the pronunciations of the letters constituting the English word are arranged in the order in which they appear in the word. Note that the real object in the present disclosure is not limited to an English word, and may be a word in another language such as French, German, or Chinese. Further, the place where the real object in the present disclosure is arranged is not limited to a poster, and may be a magazine, a book, or a newspaper, or may be a menu at a restaurant, an instruction manual for a device or the like, various documents, a signboard, or the like.
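One conceivable way to accept a spelled-out word, sketched below under the assumption that the recognizer returns the letter names separated by spaces and that a word-keyed table is available, is to map the letter names back to letters and then reuse the word lookup.

    LETTER_NAMES = {"double-u": "w", "o": "o", "ar": "r", "el": "l", "dee": "d"}  # assumed, partial

    def specify_by_spelling(recognized_text, word_tbl):
        letters = [LETTER_NAMES.get(name, "") for name in recognized_text.lower().split()]
        word = "".join(letters)        # e.g. "world"
        entry = word_tbl.get(word)     # word_tbl keyed by the spelled word (assumed)
        return None if entry is None else entry.area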
B-4: Modification 4
The glasses-type display device 20 may include a detection device that detects the line of sight of the user U. In this case, the specifying unit 182 may roughly specify, based on the line of sight detected by the detection device, the range occupied by the attention area in the field of view of the user U, and the display control unit 183 may perform display control for emphasizing that range. For example, the display control unit 183 may cause the display unit 2a to display an image in which the range occupied by the attention area is enlarged, or may cause the display unit 2a to display an image of a frame line surrounding that range.
The glasses-type display device 20 may also be configured so that the user U can designate the attention area by an operation of touching a partial area of the field of view A in FIG. 7 with a fingertip. In this case, the display control unit 183 may detect the designation of the area by the user U based on the image captured by the imaging device 2c, and may highlight the detected area as a candidate for translation.
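As a hedged sketch of the gaze-based variant, the helper below simply picks the table entry whose area center is closest to an estimated gaze point; the gaze coordinates and the distance criterion are assumptions for illustration and do not reflect any particular line-of-sight detection device.

    def nearest_area_to_gaze(gaze_xy, tbl):
        gx, gy = gaze_xy

        def center_distance(entry):
            x, y, w, h = entry.area
            return (x + w / 2 - gx) ** 2 + (y + h / 2 - gy) ** 2

        best = min(tbl.values(), key=center_distance, default=None)
        return None if best is None else best.area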
B-5: Modification 5
In the above embodiment, the program PR1 is stored in the storage device 17 of the mobile device 10, but the program PR1 may be manufactured or sold by itself. When the program PR1 is sold, the provider of the program PR1 may distribute the program PR1 by writing it on a computer-readable recording medium such as a flash ROM, or may distribute it by download via a telecommunication line.
B-6: Modification 6
The speech recognition unit 181, the specifying unit 182, and the display control unit 183 in the above embodiment are software modules. However, any one, two or more, or all of the speech recognition unit 181, the specifying unit 182, and the display control unit 183 may be hardware modules. Specific examples of the hardware module include a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array).
B-7: Modification 7
In the above embodiment, the mobile device 10 has the speech recognition unit 181, the specifying unit 182, and the display control unit 183, but the glasses-type display device 20 or the management device 30 may have the speech recognition unit 181, the specifying unit 182, and the display control unit 183. The speech recognition unit 181, the specifying unit 182, and the display control unit 183 may also be distributed over any two or all of the glasses-type display device 20, the mobile device 10, and the management device 30. Note that since the speech recognition processing by the speech recognition unit 181 imposes a high processing load, the speech recognition unit 181 is preferably provided in the management device 30 or the mobile device 10 rather than in the glasses-type display device 20. Furthermore, the speech recognition unit 181 is preferably provided in the management device 30 rather than in the mobile device 10.
C: Others
(1) In the above embodiment, a ROM, a RAM, and the like are exemplified as the storage device 17 and the storage device 2d, but the storage device 17 and the storage device 2d may be a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory device (for example, a card, a stick, or a key drive), a CD-ROM (Compact Disc-ROM), a register, a removable disk, a hard disk, a floppy (registered trademark) disk, a magnetic strip, a database, a server, or another appropriate storage medium.
(2) In the above embodiments, the information, signals, and the like described may be represented using any of a variety of different technologies. For example, data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.
(3) In the above embodiments, input and output information and the like may be stored in a specific location (for example, a memory) or may be managed using a management table. Input and output information and the like may be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.
(4) In the above embodiments, the determination may be made based on a value represented by one bit (0 or 1), may be made based on a Boolean value (true or false), or may be made by a comparison of numerical values (for example, a comparison with a predetermined value).
(5) The order of the processing procedures, sequences, flowcharts, and the like exemplified in the above embodiments may be changed as long as no contradiction arises. For example, for the methods described in the present disclosure, elements of the various steps are presented using an exemplary order, and the methods are not limited to the specific order presented.
(6) Each function exemplified in FIG. 4 is realized by an arbitrary combination of at least one of hardware and software. The method for realizing each functional block is not particularly limited. That is, each functional block may be realized using one device that is physically or logically coupled, or may be realized by directly or indirectly connecting (for example, by wire or wirelessly) two or more devices that are physically or logically separated and using these plural devices. A functional block may also be realized by combining software with the one device or the plural devices.
(7) The programs exemplified in the above embodiments, regardless of whether they are called software, firmware, middleware, microcode, hardware description language, or by another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.
Software, instructions, information, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using at least one of wired technologies (such as coaxial cable, optical fiber cable, twisted pair, and digital subscriber line (DSL)) and wireless technologies (such as infrared rays and microwaves), at least one of these wired technologies and wireless technologies is included within the definition of the transmission medium.
(8) In each of the above embodiments, the terms "system" and "network" are used interchangeably.
(9) The information, parameters, and the like described in the present disclosure may be represented using absolute values, may be represented using relative values from a predetermined value, or may be represented using other corresponding information.
(10) In the above embodiments, the mobile device may be a mobile station (MS). A mobile station may be called, by those skilled in the art, a subscriber station, a mobile unit, a subscriber unit, a wireless unit, a remote unit, a mobile device, a wireless device, a wireless communication device, a remote device, a mobile subscriber station, an access terminal, a mobile terminal, a wireless terminal, a remote terminal, a handset, a user agent, a mobile client, a client, or some other appropriate term. In the present disclosure, the terms "mobile station", "user terminal", "user equipment (UE)", "terminal", and the like may be used interchangeably.
(11) In the above embodiments, the terms "connected" and "coupled", and any variations thereof, mean any direct or indirect connection or coupling between two or more elements, and can include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other. The coupling or connection between elements may be physical, logical, or a combination thereof. For example, "connection" may be read as "access". As used in the present disclosure, two elements can be considered to be "connected" or "coupled" to each other by using at least one of one or more electrical wires, cables, and printed electrical connections, and, as some non-limiting and non-exhaustive examples, by using electromagnetic energy having wavelengths in the radio frequency region, the microwave region, and the optical (both visible and invisible) region.
(12) In the above embodiments, the statement "based on" does not mean "based only on" unless otherwise specified. In other words, the statement "based on" means both "based only on" and "based at least on".
(13) As used in the present disclosure, the terms "determining" and "deciding" may encompass a wide variety of operations. "Determining" and "deciding" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (for example, looking up in a table, a database, or another data structure), or ascertaining, as "determining" or "deciding". "Determining" and "deciding" may also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in a memory) as "determining" or "deciding". "Determining" and "deciding" may further include regarding resolving, selecting, choosing, establishing, comparing, or the like as "determining" or "deciding". That is, "determining" and "deciding" may include regarding some operation as having been "determined" or "decided". "Determining (deciding)" may also be read as "assuming", "expecting", "considering", or the like.
(14) In the above embodiments, when "include", "including", and variations thereof are used, these terms, like the term "comprising", are intended to be inclusive. Furthermore, the term "or" used in the present disclosure is not intended to be an exclusive OR.
(15) In the present disclosure, when articles are added by translation, for example "a", "an", and "the" in English, the present disclosure may include the case where the nouns following these articles are plural.
(16) In the present disclosure, the phrase "A and B are different" may mean "A and B are different from each other". The phrase may also mean "A and B are each different from C". Terms such as "separated" and "coupled" may be interpreted in the same manner as "different".
(17) Each aspect/embodiment described in the present disclosure may be used alone, may be used in combination, or may be switched and used in accordance with execution. Notification of predetermined information (for example, notification of "being X") is not limited to being performed explicitly, and may be performed implicitly (for example, by not performing notification of the predetermined information).
D: Aspects understood from the above embodiments or modifications
The present disclosure has been described in detail above, but it is obvious to those skilled in the art that the present disclosure is not limited to the embodiments described in the present disclosure. The present disclosure can be implemented with modifications and variations without departing from the spirit and scope of the present disclosure defined by the recitations of the claims. Accordingly, the description of the present disclosure is for the purpose of illustrative explanation and does not have any restrictive meaning with respect to the present disclosure. The following aspects are understood from at least one of the above embodiments or modifications.
A glasses-type display device according to a first aspect, which includes a transmissive display unit 2a on which a virtual object is displayed, may include a specifying unit 182 and a display control unit 183. The specifying unit 182 specifies an attention area including a real object that the user U is gazing at in the field of view of the user U, based on a voice uttered by the user U. The display control unit 183 causes a virtual object corresponding to the real object to be displayed in the display area of the display unit 2a that overlaps the attention area in the field of view of the user U. According to the glasses-type display device of the first aspect, the user U can designate the real object to be gazed at by voice, so that the virtual object can be displayed in the designated attention area without impairing the convenience of the user U.
In a second aspect (an example of the first aspect), the field of view of the user U may include a plurality of real objects. In the second aspect, the one real object that the user U gazes at may be designated from among the plurality of real objects by the voice uttered by the user U. The glasses-type display device according to the second aspect may further include a speech recognition unit 181 that recognizes the voice uttered by the user U. The specifying unit 182 in the glasses-type display device according to the second aspect may specify the attention area based on the recognition result of the speech recognition unit 181. The glasses-type display device according to the second aspect can specify the attention area based on the result of recognition of the voice of the user U by the speech recognition unit 181.
In the glasses-type display device according to a third aspect (an example of the second aspect), the display control unit 183 may cause identification information that uniquely identifies each of the plurality of real objects to be displayed in the display area of the display unit 2a in association with each of the plurality of real objects. In the glasses-type display device according to the third aspect, when identification information corresponding to any of the plurality of real objects is recognized by the speech recognition unit 181, the specifying unit 182 may specify the attention area based on the identification information recognized by the speech recognition unit 181. The glasses-type display device according to the third aspect can specify the attention area based on the voice of the user U indicating any of the plural pieces of identification information displayed in the display area of the display unit 2a in association with the plurality of real objects.
A display system according to a fourth aspect includes a glasses-type display device that is worn on the head of a user U and includes a transmissive display unit 2a on which a virtual object is displayed, a specifying unit 182, and a display control unit 183. The specifying unit 182 specifies an attention area including a real object that the user U is gazing at in the field of view of the user U, based on a voice uttered by the user U. The display control unit 183 causes a virtual object corresponding to the real object to be displayed in the display area of the display unit 2a that overlaps the attention area in the field of view of the user U. According to the display system of the fourth aspect, the user U can designate the real object to be gazed at by voice, so that the virtual object can be displayed in the designated attention area without impairing the convenience of the user U.
DESCRIPTION OF SYMBOLS: 1...display system, 10...mobile device, 20...glasses-type display device, 11...input device, 12...output device, 13...microphone, 14, 15, 2b...communication device, 17, 2d...storage device, 18, 2e...processing device, 181...speech recognition unit, 182...specifying unit, 183...display control unit, 19, 2f...bus, 2a...display unit, 2e1...operation control unit, PR1, PR2...program.

Claims (4)

  1.  A glasses-type display device comprising a transmissive display unit on which a virtual object is displayed, the glasses-type display device comprising:
     a specifying unit that specifies, based on a voice uttered by a user, an attention area including a real object that the user is gazing at in a field of view of the user; and
     a display control unit that causes a virtual object corresponding to the real object to be displayed in a display area of the display unit that overlaps the attention area of the field of view.
  2.  The glasses-type display device according to claim 1, wherein
     the field of view includes a plurality of real objects,
     the real object that the user gazes at is designated from among the plurality of real objects by the voice uttered by the user,
     the glasses-type display device further comprises a speech recognition unit that recognizes the voice uttered by the user, and
     the specifying unit specifies the attention area based on a recognition result of the speech recognition unit.
  3.  The glasses-type display device according to claim 2, wherein
     the display control unit causes identification information that uniquely identifies each of the plurality of real objects to be displayed in the display area in association with each of the plurality of real objects, and
     when identification information corresponding to any of the plurality of real objects is recognized by the speech recognition unit, the specifying unit specifies the attention area based on the identification information recognized by the speech recognition unit.
  4.  A display system comprising:
     a glasses-type display device that is worn on a head of a user and comprises a transmissive display unit on which a virtual object is displayed;
     a specifying unit that specifies, based on a voice uttered by the user, an attention area including a real object that the user is gazing at in a field of view of the user; and
     a display control unit that causes a virtual object corresponding to the real object to be displayed in a display area of the display unit that overlaps the attention area of the field of view.
PCT/JP2023/007812 2022-04-28 2023-03-02 Eyeglass-type display device and display system WO2023210158A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-074563 2022-04-28
JP2022074563 2022-04-28

Publications (1)

Publication Number Publication Date
WO2023210158A1 true WO2023210158A1 (en) 2023-11-02

Family

ID=88518461

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/007812 WO2023210158A1 (en) 2022-04-28 2023-03-02 Eyeglass-type display device and display system

Country Status (1)

Country Link
WO (1) WO2023210158A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000194532A (en) * 1998-12-24 2000-07-14 Casio Comput Co Ltd Object selection processor and storage medium
JP2014093050A (en) * 2012-11-06 2014-05-19 Sony Corp Image display device, image display method, and computer program
JP2016218868A (en) * 2015-05-22 2016-12-22 富士通株式会社 Display control method, information processor, and display control program
JP2017091433A (en) * 2015-11-17 2017-05-25 セイコーエプソン株式会社 Head-mounted type display device, method of controlling head-mounted type display device, and computer program


Similar Documents

Publication Publication Date Title
US10319382B2 (en) Multi-level voice menu
US11462213B2 (en) Information processing apparatus, information processing method, and program
US20190158927A1 (en) Smart closed caption positioning system for video content
JP6392374B2 (en) Head mounted display system and method for operating head mounted display device
EP3193328A1 (en) Method and device for performing voice recognition using grammar model
EP2940556A1 (en) Command displaying method and command displaying device
EP3479588A1 (en) Augmented reality device and operation thereof
US9336779B1 (en) Dynamic image-based voice entry of unlock sequence
KR20150058286A (en) Leveraging head mounted displays to enable person-to-person interactions
KR102193029B1 (en) Display apparatus and method for performing videotelephony using the same
US20180107651A1 (en) Unsupported character code detection mechanism
US20150339855A1 (en) Laser pointer selection for augmented reality devices
US20160277707A1 (en) Message transmission system, message transmission method, and program for wearable terminal
US10761694B2 (en) Extended reality content exclusion
JPWO2013077110A1 (en) Translation apparatus, translation system, translation method and program
US20190155617A1 (en) Automated setting customization using real-time user data
US11120219B2 (en) User-customized computer-automated translation
CN106228191A (en) A kind of control text overlength detection device and method
WO2023210158A1 (en) Eyeglass-type display device and display system
JP6869809B2 (en) Image estimator
US20230048330A1 (en) In-Vehicle Speech Interaction Method and Device
WO2020075358A1 (en) Information processing device, information processing method, and program
CN110991431A (en) Face recognition method, device, equipment and storage medium
CN107241548B (en) Cursor control method, cursor control device, terminal and storage medium
US20220108624A1 (en) Reader assistance method and system for comprehension checks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23795906

Country of ref document: EP

Kind code of ref document: A1