WO2021230180A1 - Information processing device, display device, presentation method, and program - Google Patents
Information processing device, display device, presentation method, and program
- Publication number
- WO2021230180A1 (application PCT/JP2021/017640)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text image
- display device
- presentation mode
- voice
- arrival direction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Definitions
- This disclosure relates to information processing devices, display devices, presentation methods, and programs.
- Hearing aids are widely used as a device to assist hearing.
- Hearing aid wearers may have diminished ability to grasp the direction of arrival of sound due to diminished auditory function.
- When the direction of arrival of a voice cannot be grasped, it is difficult to hold a conversation.
- The purpose of this disclosure is to make the direction of arrival of a voice easy to recognize.
- an information processing device includes means for acquiring sound collected by a plurality of microphones.
- the information processing device includes means for estimating the arrival direction of the acquired voice.
- the information processing device comprises means for generating a text image corresponding to the acquired voice.
- the information processing apparatus includes means for determining the presentation mode of the text image with reference to the estimated arrival direction.
- the information processing apparatus comprises means for presenting a text image in a determined presentation mode.
- FIG. 1 is a schematic diagram showing the configuration of the display device of this embodiment; FIG. 2 is a schematic diagram of a glass-type display device that is an example of that display device; FIG. 3 is an explanatory drawing of the outline of this embodiment; FIG. 4 is a flowchart showing an example of the presentation process of this embodiment; FIG. 5 is a diagram for explaining the collection of the utterance sound emitted from a speaker; FIG. 6 is a diagram for explaining the arrival direction of an utterance sound; FIG. 7 is a schematic diagram showing a presentation example of the glass-type display device; FIG. 8 is a diagram for explaining the field of view of a wearer; FIG. 9 is a schematic diagram showing the configuration of the display device of Modification 1.
- FIG. 1 is a schematic view showing the configuration of the display device of the present embodiment.
- FIG. 2 is a schematic diagram of a glass-type display device which is an example of the display device shown in FIG.
- The display device 1 shown in FIG. 1 is configured to collect sound and to display a text image corresponding to the collected sound in a presentation mode that depends on the direction of arrival of the sound (an example of "presentation").
- The form of the display device 1 includes, for example, at least one of the following: a glass-type display device; a mobile terminal; a conference system.
- the display device 1 includes a plurality of microphones 101, a display 102, and a controller 10.
- the microphones 101 are arranged at a predetermined distance from each other.
- When the display device 1 is a glass-type display device, the display device 1 has a right temple 21, a right twist 22, a bridge 23, a left twist 24, a left temple 25, and a rim 26.
- the microphone 101-1 is arranged on the right temple 21.
- the microphone 101-2 is arranged on the right twist 22.
- the microphone 101-3 is arranged on the bridge 23.
- the microphone 101-4 is arranged on the left twist 24.
- the microphone 101-5 is arranged on the left temple 25.
- The microphone 101 collects, for example, at least one of the following sounds: the sound of speech by a person; the sound of the environment in which the display device 1 is used (hereinafter referred to as "environmental sound").
- the display 102 is a transparent member (for example, at least one of glass, plastic, and a half mirror). In this case, the display 102 is arranged at a position visible to the user wearing the glass-type display device.
- the displays 102-1 to 102-2 are supported by the rim 26.
- the display 102-1 is arranged so as to be located in front of the user's right eye when the user wears the display device 1.
- the display 102-2 is arranged so as to be located in front of the user's left eye when the user wears the display device 1.
- the display 102 presents (for example, displays) an image according to the control from the controller 10.
- the method by which the display 102 presents an image is not limited, and any existing method may be used.
- an image corresponding to the image light is projected onto the display 102-1 from a projector (not shown) arranged behind the right temple 21.
- An image corresponding to the image light is projected onto the display 102-2 from a projector (not shown) arranged on the back side of the left temple 25.
- the display 102-1 and the display 102-2 present an image. The user can visually recognize the image and at the same time visually recognize the scenery transmitted through the display 102-1 and the display 102-2.
- the controller 10 is an information processing device that controls the display device 1.
- the controller 10 is connected to the microphone 101 and the display 102 by wire or wirelessly.
- When the display device 1 is a glass-type display device as shown in FIG. 2, the controller 10 is arranged, for example, inside the right temple 21.
- the controller 10 includes a storage device 11, a processor 12, an input / output interface 13, and a communication interface 14.
- the storage device 11 is configured to store programs and data.
- The storage device 11 is, for example, a combination of a ROM (Read Only Memory), a RAM (Random Access Memory), and a storage (for example, a flash memory or a hard disk).
- The program includes, for example, the following: an OS (Operating System) program; an application program that executes information processing.
- The data includes, for example, the following: a database referenced in information processing; data obtained by executing information processing (that is, the execution result of information processing).
- the processor 12 is configured to realize the function of the controller 10 by activating the program stored in the storage device 11.
- the processor 12 is an example of a computer.
- By activating a program stored in the storage device 11, the processor 12 realizes a function of presenting an image representing text corresponding to the utterance sound collected by the microphone 101 (hereinafter referred to as a "text image") at a predetermined position on the display 102.
- The input / output interface 13 acquires at least one of the following: a voice signal collected by the microphone 101; a user's instruction input from an input device connected to the glass-type display device 1.
- The input device is, for example, a drive button, a keyboard, a pointing device, a touch panel, a remote controller, a switch, or a combination thereof.
- the input / output interface 13 is configured to output information to an output device connected to the display device 1.
- the output device is, for example, a display 102.
- the communication interface 14 is configured to control communication between the display device 1 and an external device (for example, a server or a mobile terminal) (not shown).
- FIG. 3 is an explanatory diagram of an outline of the present embodiment.
- the wearer P1 who wears the display device 1 has a conversation with the speakers P2 to P4.
- the microphone 101 collects the utterance sounds of the speakers P2 to P4.
- the controller 10 estimates the arrival direction of the collected utterance sound.
- the controller 10 determines the text corresponding to the utterance sound by analyzing the audio signal corresponding to the collected utterance sound.
- the controller 10 generates text images T1 to T3 corresponding to the determined text.
- the controller 10 determines the presentation mode of each of the text images T1 to T3 according to the arrival direction of the utterance sound.
- The controller 10 presents the text images T1 to T3 on the displays 102-1 to 102-2 in the determined presentation mode.
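The outline above (estimate the arrival direction, determine the text, generate the text image, decide the presentation mode, present) can be sketched as a small pipeline. This is only an illustrative skeleton: the names `Presentation`, `present_utterances`, and every callable parameter are hypothetical and do not appear in the source, and each stage is injected as a function because the source does not fix any particular implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass(frozen=True)
class Presentation:
    """Hypothetical record: recognized text plus how it should be shown."""
    text: str
    angle_deg: float
    mode: str


def present_utterances(
    signals: Any,
    estimate_directions: Callable[[Any], Sequence[float]],
    extract: Callable[[Any, float], Any],
    recognize: Callable[[Any], str],
    decide_mode: Callable[[str, float], Presentation],
    present: Callable[[Presentation], None],
) -> List[Presentation]:
    """Run one pass of the outlined flow over multi-microphone signals."""
    shown = []
    for angle in estimate_directions(signals):   # arrival direction per speaker
        isolated = extract(signals, angle)       # signal coming from that direction
        text = recognize(isolated)               # text for that signal
        item = decide_mode(text, angle)          # presentation mode from direction
        present(item)                            # hand over to the display
        shown.append(item)
    return shown
```

Keeping the stages as parameters means a stub recognizer or a fixed list of angles can stand in while the display logic is being tried out.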
- FIG. 4 is a flowchart showing an example of the presentation process of the present embodiment.
- FIG. 5 is a diagram for explaining the collection of utterance sounds emitted from the speaker.
- FIG. 6 is a diagram for explaining the arrival direction of the utterance sound.
- FIG. 7 is a schematic diagram showing a presentation example of the glass-type display device of FIG.
- FIG. 8 is a diagram for explaining the field of view of the wearer.
- Each microphone 101 collects the utterance sound emitted from the speaker.
- The microphones 101-1 to 101-5, arranged on the right temple 21, the right twist 22, the bridge 23, the left twist 24, and the left temple 25 of the display device 1, collect the utterance sounds that arrive through the paths shown in FIG.
- the microphones 101-1 to 101-5 convert the collected utterance sound into an audio signal.
- the controller 10 executes acquisition (S110) of the audio signal converted by the microphone 101.
- the processor 12 acquires an audio signal including an utterance sound emitted from at least one of the speakers P2, P3, and P4 transmitted from the microphones 101-1 to 101-5.
- the audio signals transmitted from the microphones 101-1 to 101-5 include spatial information based on the path through which the utterance sound has progressed.
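One common form of such spatial information is the difference in arrival time between two microphones. The sketch below is an assumption for illustration, not something the source specifies: it estimates the delay by cross-correlation and converts it to an angle for a far-field source. The names `delay_samples` and `angle_from_delay`, the speed of sound constant, and the sign convention (a positive lag means the first signal arrives later) are all hypothetical choices.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def delay_samples(delayed: np.ndarray, reference: np.ndarray) -> int:
    """Lag (in samples) by which `delayed` trails `reference`."""
    corr = np.correlate(delayed, reference, mode="full")
    # Index len(reference) - 1 of the full correlation corresponds to zero lag.
    return int(np.argmax(corr)) - (len(reference) - 1)


def angle_from_delay(lag: int, sample_rate: float, spacing_m: float) -> float:
    """Far-field angle in degrees from the axis perpendicular to the mic pair."""
    sin_theta = SPEED_OF_SOUND * lag / (sample_rate * spacing_m)
    # Clip so that measurement noise cannot push arcsin out of its domain.
    return float(np.degrees(np.arcsin(np.clip(sin_theta, -1.0, 1.0))))
```

With more than two microphones, the same pairwise delays over several pairs give redundant measurements, which is part of why arrays estimate direction more robustly than a single pair.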
- After step S110, the controller 10 executes estimation of the arrival direction (S111).
- the storage device 11 stores the arrival direction estimation model.
- the arrival direction estimation model describes the correlation between the spatial information contained in the voice signal and the arrival direction of the utterance sound.
- Any existing method may be used as the arrival direction estimation method used in the arrival direction estimation model, for example MUSIC (Multiple Signal Classification) or ESPRIT (Estimation of Signal Parameters via Rotational Invariance Techniques).
- The processor 12 estimates the direction of arrival of the utterance sound collected by the microphones 101-1 to 101-5 by inputting the audio signals received from the microphones 101-1 to 101-5 into the arrival direction estimation model stored in the storage device 11. At this time, the processor 12 estimates, for example, the angle of deviation from an axis on which the front is zero degrees as the arrival direction of the utterance sound. In the example shown in FIG. 6, the processor 12 estimates the arrival direction of the utterance sound emitted from the speaker P2 as an angle A1 to the right of the axis. The processor 12 estimates the arrival direction of the utterance sound emitted from the speaker P3 as an angle A2 to the left of the axis. The processor 12 estimates the arrival direction of the utterance sound emitted from the speaker P4 as an angle A3 to the left of the axis.
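As a rough illustration of direction-of-arrival estimation with MUSIC, the following sketch works on narrowband complex snapshots from a uniform linear array. The array geometry (half-wavelength spacing), the function names `steering_vector` and `music_doa`, and the convention that positive angles lie to the right of the front axis are assumptions made for the example; the source does not describe the model at this level of detail.

```python
import numpy as np


def steering_vector(angle_deg: float, n_mics: int, spacing_wl: float = 0.5) -> np.ndarray:
    """Array response to a plane wave; spacing is given in wavelengths."""
    phase = 2j * np.pi * spacing_wl * np.arange(n_mics) * np.sin(np.deg2rad(angle_deg))
    return np.exp(phase)


def music_doa(snapshots: np.ndarray, n_sources: int, spacing_wl: float = 0.5) -> np.ndarray:
    """Estimate `n_sources` angles (degrees) from snapshots of shape (n_mics, n_samples)."""
    grid = np.arange(-90.0, 90.5, 0.5)
    n_mics, n_samples = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n_samples
    _, vecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    noise = vecs[:, : n_mics - n_sources]      # eigenvectors of the noise subspace
    steer = np.stack([steering_vector(a, n_mics, spacing_wl) for a in grid], axis=1)
    # The pseudo-spectrum peaks where the steering vector is orthogonal to the noise subspace.
    spectrum = 1.0 / (np.linalg.norm(noise.conj().T @ steer, axis=0) ** 2 + 1e-12)
    inner = spectrum[1:-1]
    peaks = np.flatnonzero((inner > spectrum[:-2]) & (inner >= spectrum[2:])) + 1
    best = peaks[np.argsort(spectrum[peaks])[::-1][:n_sources]]
    return np.sort(grid[best])
```

The number of sources has to be smaller than the number of microphones, so a five-microphone frame as described here could in principle separate up to four simultaneous speakers under these idealized assumptions.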
- After step S111, the controller 10 executes audio signal extraction (S112).
- the beamforming model is stored in the storage device 11.
- the beamforming model describes the correlation between a given direction and the parameters for forming a directivity with a beam in this direction.
- the parameter for forming the directivity is a parameter related to amplifying or attenuating a plurality of audio signals.
- the processor 12 inputs the estimated arrival direction into the beamforming model stored in the storage device 11 to calculate the parameters for forming the directivity having the beam in the arrival direction.
- the processor 12 inputs the calculated angle A1 into the beamforming model and calculates the parameters for forming the directivity having the beam in the direction of the angle A1 to the right from the axis.
- the processor 12 inputs the calculated angle A2 into the beamforming model and calculates the parameters for forming the directivity having the beam in the direction of the angle A2 to the left from the axis.
- the processor 12 inputs the calculated angle A3 into the beamforming model and calculates the parameters for forming the directivity having the beam in the direction of the angle A3 to the left from the axis.
- the processor 12 amplifies or attenuates the audio signal transmitted from the microphones 101-1 to 101-5 with the parameters calculated for the angle A1.
- the processor 12 synthesizes the amplified or attenuated audio signal to extract the audio signal for the utterance sound coming from the angle A1 from the received audio signal.
- the processor 12 amplifies or attenuates the audio signal transmitted from the microphones 101-1 to 101-5 with the parameters calculated for the angle A2.
- the processor 12 synthesizes the amplified or attenuated audio signal to extract the audio signal for the utterance sound coming from the angle A2 from the received audio signal.
- the processor 12 amplifies or attenuates the audio signal transmitted from the microphones 101-1 to 101-5 with the parameters calculated for the angle A3.
- the processor 12 synthesizes the amplified or attenuated audio signal to extract the audio signal for the utterance sound coming from the angle A3 from the received audio signal.
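The extraction described above (weight each microphone signal, then combine) resembles a delay-and-sum beamformer. The sketch below swaps in that standard narrowband technique with complex weights as a stand-in; the source only states that the parameters amplify or attenuate the signals, so the weight formula, the half-wavelength linear array, and the names `steering_vector`, `beamform_weights`, and `extract_direction` are assumptions.

```python
import numpy as np


def steering_vector(angle_deg: float, n_mics: int, spacing_wl: float = 0.5) -> np.ndarray:
    """Array response to a plane wave; spacing is given in wavelengths."""
    phase = 2j * np.pi * spacing_wl * np.arange(n_mics) * np.sin(np.deg2rad(angle_deg))
    return np.exp(phase)


def beamform_weights(angle_deg: float, n_mics: int, spacing_wl: float = 0.5) -> np.ndarray:
    """Per-microphone weights giving unit gain toward `angle_deg`."""
    return steering_vector(angle_deg, n_mics, spacing_wl) / n_mics


def extract_direction(snapshots: np.ndarray, angle_deg: float, spacing_wl: float = 0.5) -> np.ndarray:
    """Combine weighted microphone signals of shape (n_mics, n_samples) into one signal."""
    weights = beamform_weights(angle_deg, snapshots.shape[0], spacing_wl)
    # Conjugating the weights aligns the phases of the wanted direction before summing.
    return weights.conj() @ snapshots
```

Calling `extract_direction` once per estimated angle (A1, A2, A3 in the text) yields one signal per speaker, with sound from the other directions attenuated rather than removed.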
- After step S112, the controller 10 executes voice recognition (S113).
- the voice recognition model is stored in the storage device 11.
- the speech recognition model describes the correlation between the speech signal and the text for the speech signal.
- the speech recognition model is, for example, a trained model learned by machine learning.
- the processor 12 inputs the extracted voice signal into the voice recognition model stored in the storage device 11, and determines the text corresponding to the input voice signal.
- the processor 12 determines the text corresponding to the input voice signal by inputting the voice signals extracted for the angles A1 to A3 into the voice recognition model.
- After step S113, the controller 10 executes image generation (S114).
- the processor 12 generates a text image based on the determined text.
- After step S114, the controller 10 executes the determination of the presentation mode (S115).
- the processor 12 determines how the text image is presented on the display 102.
- the processor 12 determines the position corresponding to the arrival direction of the audio signal related to the text image as the presentation position of the text image.
- the processor 12 determines the type of the text image to be presented (an example of the "presentation mode") according to the arrival direction.
- The processor 12 determines the presentation position of the text image T1, generated based on the voice signal extracted for the direction of the angle A1 to the right of the axis, to be a position in the direction corresponding to the angle A1 and in the predetermined elevation angle direction.
- The processor 12 sets the position on the display 102-1, on the right side of the glass-type display device, in the direction corresponding to the angle A1 and in the predetermined elevation angle direction as the presentation position of the text image T1. Further, the processor 12 determines to present the text image T1 so that the text image T1 is formed at a predetermined distance from the wearer P1.
- The processor 12 determines the presentation position of the text image T2, generated based on the voice signal extracted for the direction of the angle A2 to the left of the axis, to be a position in the direction corresponding to the angle A2 and in the predetermined elevation angle direction.
- The processor 12 sets the position on the display 102-2, on the left side of the glass-type display device, in the direction corresponding to the angle A2 and in the predetermined elevation angle direction as the presentation position of the text image T2. Further, the processor 12 determines to present the text image T2 so that the text image T2 is formed at a predetermined distance from the wearer P1.
- The processor 12 determines the presentation position of the text image T3, generated based on the voice signal extracted for the direction of the angle A3 to the left of the axis, to be a position in the direction corresponding to the angle A3 and in the predetermined elevation angle direction.
- The processor 12 sets the position on the display 102-2, on the left side of the glass-type display device, in the direction corresponding to the angle A3 and in the predetermined elevation angle direction as the presentation position of the text image T3. Further, the processor 12 determines to present the text image T3 so that the text image T3 is formed at a predetermined distance from the wearer P1.
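The mapping from arrival direction to presentation position might look like the sketch below. It is a minimal illustration under stated assumptions: positive angles are to the right of the front axis, each display covers a hypothetical 45-degree half field of view with a linear angle-to-position mapping, and the vertical position is a fixed fraction standing in for the predetermined elevation angle. `Placement`, `place_text`, and the parameter defaults are not from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    display: str  # "102-1" (right eye) or "102-2" (left eye)
    x: float      # 0.0 = left edge of that display, 1.0 = right edge
    y: float      # fixed fraction standing in for the predetermined elevation


def place_text(azimuth_deg: float, half_fov_deg: float = 45.0, elevation_y: float = 0.25) -> Placement:
    """Choose a display and a horizontal position for a text image."""
    # Directions outside the assumed field of view are pinned to the nearest edge.
    clamped = max(-half_fov_deg, min(half_fov_deg, azimuth_deg))
    if clamped >= 0:
        return Placement("102-1", clamped / half_fov_deg, elevation_y)
    return Placement("102-2", 1.0 + clamped / half_fov_deg, elevation_y)
```

For instance, `place_text(20.0)` selects display 102-1 and `place_text(-30.0)` selects display 102-2, matching the right and left assignments of T1 and of T2 and T3 in the text.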
- the processor 12 determines a predetermined position as the presentation position of the text images T1 to T3.
- the processor 12 determines to present the text images T1 to T3 in a format including at least one of a character string and a symbol corresponding to the direction of arrival of the voice signal relating to the text image (an example of the "presentation mode").
- After step S115, the controller 10 executes image presentation (S116).
- the processor 12 presents the text image on the display 102 in the determined presentation mode.
- the processor 12 presents the text image T1 at a position of the display 102-1 in the direction corresponding to the angle A1 and in the predetermined elevation angle direction.
- the processor 12 presents the text image T2 at a position on the display 102-2 in the direction corresponding to the angle A2 and in the predetermined elevation angle direction.
- the processor 12 presents the text image T3 at a position on the display 102-2 in the direction corresponding to the angle A3 and in the predetermined elevation angle direction.
- The humanoid figures shown by broken lines on the displays 102-1 to 102-2 in FIG. 7 are supplementary representations of the speakers whom the wearer P1 can see through the displays 102-1 to 102-2; they are not presented on the displays 102-1 to 102-2.
- The processor 12 presents the text image T1 at a predetermined position on the display 102-1 in a format that includes at least one of a character string and a symbol corresponding to the direction corresponding to the angle A1.
- The processor 12 presents the text image T2 at a predetermined position on the display 102-2 in a format that includes at least one of a character string and a symbol corresponding to the direction corresponding to the angle A2.
- The processor 12 presents the text image T3 at a predetermined position on the display 102-2 in a format that includes at least one of a character string and a symbol corresponding to the direction corresponding to the angle A3.
- For example, a text image based on the utterance sound from a speaker on the left may include the characters "left" or a symbol suggestive of "left", and a text image based on the utterance sound from a speaker on the right may include the characters "right" or a symbol suggestive of "right".
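The fixed-position variant, in which the direction is conveyed by a character string or symbol inside the text image, can be sketched as follows. The arrow symbols, the bracketed caption layout, the 10-degree tolerance, and the extra "front" label are assumptions added for illustration; the source only mentions "left" and "right". The names `direction_label` and `format_caption` are hypothetical.

```python
from typing import Tuple


def direction_label(azimuth_deg: float, front_tolerance_deg: float = 10.0) -> Tuple[str, str]:
    """Return (word, symbol) for an arrival angle; positive angles are to the right."""
    if azimuth_deg > front_tolerance_deg:
        return "right", "\u2192"   # rightwards arrow
    if azimuth_deg < -front_tolerance_deg:
        return "left", "\u2190"    # leftwards arrow
    return "front", "\u2191"       # upwards arrow, assumed label for near-frontal speech


def format_caption(text: str, azimuth_deg: float) -> str:
    """Prefix recognized text with a direction word and symbol."""
    word, symbol = direction_label(azimuth_deg)
    return f"[{symbol} {word}] {text}"
```

A caption built this way keeps its meaning even when every text image is shown at the same predetermined position.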
- the speaker P2 speaks to the wearer P1 of the glass-type display device 1 as shown in FIG.
- The text image T1, which is the conversation content, is presented together with the speaker P2, who is visually recognized through the display 102-1.
- The text image T2, which is the conversation content spoken by the speaker P3, is presented to the wearer P1 together with the speaker P3, who is visually recognized through the display 102-2.
- The text image T3, which is the conversation content spoken by the speaker P4, is presented to the wearer P1 together with the speaker P4, who is visually recognized through the display 102-2.
- a text image corresponding to the utterance sound is presented in a presentation mode according to the arrival direction of the utterance sound.
- the wearer of the display device 1 can easily recognize the direction of arrival of the utterance sound.
- the presentation mode is such that the image is presented at a position corresponding to the arrival direction of the utterance sound. This makes it easier to recognize the direction of arrival of the utterance sound.
- the audio signal corresponding to the estimated arrival direction is extracted from the acquired audio signal. This makes it possible to accurately recognize the direction of arrival of the utterance sound.
- the display device is applied to at least one form of a glass type display device, a mobile terminal, and a conference system. This makes it possible to easily recognize the direction of arrival of the utterance sound in various uses.
- Modification 1 shows an example in which the display device 1 is connected to a microphone module including a plurality of microphones 101.
- FIG. 9 is a schematic view showing the configuration of the display device of the first modification.
- the communication interface 14 is connected to the microphone module 101a.
- the microphone 101 is not arranged on the frame of the glass-type display device 1.
- the microphone module 101a includes a plurality of microphones 101.
- the microphones 101 are arranged at a predetermined distance from each other.
- The microphone module 101a is attached to any of the following parts of the body: the head; the collar; the chest; the waist; other parts on the wearer's center line. When the microphone module 101a is worn by the wearer, it communicates with the controller 10 via the communication interface 14.
- the controller 10 executes steps S110 to S116 and presents the text images T1 to T3 on the displays 102-1 to 102-2 in the same manner as in FIG.
- According to the first modification, even with the glass-type display device 1 on which the microphone 101 is not arranged, it becomes possible to present a text image corresponding to the sound collected by the microphone 101 in a mode corresponding to the arrival direction.
- Modification 2 shows an example in which the display device 1 includes a mobile terminal.
- FIG. 10 is a schematic diagram showing the display device of the modification 2 and the presentation example of the display device.
- the mobile terminal of FIG. 10 is an example of the display device 1.
- The mobile terminal includes, for example, any of the following: smartphones; tablet terminals; mobile devices with displays; personal computers (for example, laptop computers).
- The controller 10 executes steps S110 to S116 in the same manner as in FIG.
- the text images T1 to T3 are presented at positions on the display 102 in the direction corresponding to the arrival direction of the utterance sound.
- If the microphone module 101a is connected to the mobile terminal, it is possible to present a text image corresponding to the utterance sound collected by the microphone 101 in a presentation mode according to the arrival direction.
- Modification 3 shows an example in which the display device 1 includes a camera.
- FIG. 11 is a schematic view showing the configuration of the display device of the modification 3.
- the display device 1a includes a microphone 101, a display 102, a camera 103, and a controller 10a.
- the camera 103 is arranged so that the speaker is included in the shooting area.
- the camera 103 shoots in a predetermined direction and generates a shooting signal.
- the controller 10a is an information processing device that controls the display device 1a.
- the controller 10a is connected to the microphone 101, the display 102, and the camera 103 by wire or wirelessly.
- the controller 10a includes a storage device 11, a processor 12a, an input / output interface 13a, and a communication interface 14.
- the processor 12a is configured to realize the function of the controller 10a by activating the program stored in the storage device 11.
- the processor 12a is an example of a computer.
- By activating a program stored in the storage device 11, the processor 12a realizes a function of presenting a text image of the utterance sound collected by the microphone 101 at a predetermined position on the display 102, superimposed on an image corresponding to the shooting signal generated by the camera 103 (hereinafter referred to as a "captured image").
- The input / output interface 13a acquires at least one of the following: a voice signal collected by the microphone 101; a shooting signal taken by the camera 103; a user's instruction input from an input device connected to the display device 1.
- The input device is, for example, a drive button, a keyboard, a pointing device, a touch panel, a remote controller, a switch, or a combination thereof.
- the input / output interface 13a is configured to output information to an output device connected to the display device 1.
- the output device is, for example, a display 102.
- the controller 10a executes steps S110 to S113 in the same manner as in FIG.
- After step S113, the controller 10a executes image generation (S114).
- the controller 10a converts the shooting signal generated by the camera 103 into a shooting image.
- the controller 10a generates a text image as in FIG.
- After step S114, the controller 10a executes the determination of the presentation mode (S115).
- The processor 12a determines how the text image and the captured image are presented on the display 102. For example, the processor 12a determines the position corresponding to the arrival direction of the audio signal related to the text image as the presentation position of the text image, and determines the type of the text image to be presented according to the arrival direction, as in FIG. The processor 12a determines the presentation position of the captured image and the type of the captured image to be presented according to the arrival direction.
- After step S115, the controller 10a executes the image presentation (S116).
- the processor 12a superimposes the text image generated in step S114 on the captured image and presents it on the display 102 in the determined presentation mode.
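Superimposing a text image on a captured image can be illustrated with simple alpha blending on arrays. This is a sketch under assumptions: images are H x W x 3 `uint8` arrays, the text image carries a separate per-pixel opacity mask in the range 0 to 1, and the name `superimpose` and its parameters are hypothetical. The source does not say how the superimposition is performed.

```python
import numpy as np


def superimpose(captured: np.ndarray, text_img: np.ndarray, alpha: np.ndarray,
                top: int, left: int) -> np.ndarray:
    """Blend `text_img` onto a copy of `captured` with its top-left corner at (top, left)."""
    height, width = text_img.shape[:2]
    if top < 0 or left < 0 or top + height > captured.shape[0] or left + width > captured.shape[1]:
        raise ValueError("text image does not fit inside the captured image")
    out = captured.astype(np.float32)              # astype returns a new array
    region = out[top:top + height, left:left + width]
    weight = alpha.astype(np.float32)[..., None]   # broadcast the mask over color channels
    region[:] = weight * text_img + (1.0 - weight) * region
    return np.rint(out).astype(np.uint8)
```

The `top` and `left` arguments are where a position derived from the arrival direction would be supplied.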
- the first example of the display device of Modification 3 shows an example in which the display device 1a includes a glass type display device.
- FIG. 12 is a schematic diagram showing a first example of the display device of the modified example 3 and a presentation example of the display device.
- FIG. 13 is a schematic diagram showing a shooting range by the camera shown in FIG.
- the camera 103 is arranged on the bridge 23 so as to capture an area including the wearer's field of view.
- the camera 103 is set so that the shooting range includes the field of view of the wearer.
- the solid line represents the shooting range by the camera 103
- the broken line represents the field of view of the wearer.
- the camera 103 is capable of capturing a view that is in the field of view of the wearer.
- the controller 10a executes steps S110 to S114 shown in FIG. 4, as described in the modified example 3.
- After step S114, the controller 10a executes the determination of the presentation mode (S115).
- The processor 12a determines the presentation position of the text image T1, generated based on the voice signal extracted for a predetermined arrival direction, to be a position in the direction corresponding to that arrival direction and in the predetermined elevation angle direction. That is, the processor 12a sets the position on the display 102-1 in the direction corresponding to the arrival direction and in the predetermined elevation angle direction as the presentation position of the text image T1. Further, the processor 12a determines to present the text image T1 so that the text image T1 is formed at a predetermined distance from the wearer. The processor 12a likewise determines the presentation positions of the text images T2 and T3, generated based on the voice signals extracted for predetermined arrival directions, to be positions in the directions corresponding to those arrival directions and in the predetermined elevation angle direction.
- The processor 12a sets the positions on the display 102-2 in the directions corresponding to the arrival directions and in the predetermined elevation angle direction as the presentation positions of the text images T2 and T3. Further, the processor 12a determines to present the text images T2 and T3 so that the text images T2 and T3 are formed at a predetermined distance from the wearer. The processor 12a determines the presentation position of the captured image based on the imaging direction of the camera 103. Further, the processor 12a determines to present the captured image so that the captured image is formed at a predetermined distance from the wearer.
- After step S115, the controller 10a executes image presentation (S116). Specifically, the processor 12a superimposes the text image generated in step S114 on the captured image and presents it on the display 102 in the determined presentation mode.
- the processor 12a presents the captured image on the display 102-1 and the display 102-2.
- the image I1 of the speaker P2, captured as shown in FIG. 12, is presented on the display 102-1.
- the images I2 and I3 of the speakers P3 and P4 are presented on the display 102-2.
- the processor 12a superimposes and presents the text image T1 on the captured image at a position on the display 102-1 in the direction corresponding to the arrival direction of the utterance sound and in the predetermined elevation angle direction.
- the processor 12a superimposes the text images T2 to T3 on the captured image and presents the text images T2 to T3 at positions on the display 102-2 in the direction corresponding to the arrival direction of the utterance sound and in the predetermined elevation angle direction.
- the wearer P1 of the display device 1a is presented with the text image T1, which is the conversation content spoken by the speaker P2, together with the image I1 representing the speaker P2.
- the wearer P1 is presented with the text image T2, which is the conversation content spoken by the speaker P3, together with the image I2 representing the speaker P3.
- the wearer P1 is presented with the text image T3, which is the conversation content spoken by the speaker P4, together with the image I3 representing the speaker P4.
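The position determination of step S115 for the two-display case can be sketched as follows. This is a minimal illustration, not the method of the embodiment: it assumes that the display 102-1 covers arrival directions to the left of the wearer's front (negative azimuth) and the display 102-2 covers those to the right, and that positions are normalized to the range 0.0 to 1.0. The names `Placement` and `place_text_image`, the angular spans, and the default elevation angle are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    display: str  # "102-1" or "102-2"
    x: float      # 0.0 = left edge of that display, 1.0 = right edge
    y: float      # 0.0 = bottom edge, 1.0 = top edge


def place_text_image(azimuth_deg: float,
                     elevation_deg: float = 10.0,
                     h_span_deg: float = 20.0,
                     v_span_deg: float = 30.0) -> Placement:
    """Map an arrival direction to a display and a normalized position.

    Assumption: each display spans h_span_deg horizontally, and the two
    displays meet at azimuth 0 (straight ahead of the wearer).
    """
    # Clamp directions outside the displayable range to the nearest edge.
    az = max(-h_span_deg, min(h_span_deg, azimuth_deg))
    if az < 0.0:
        display, x = "102-1", (az + h_span_deg) / h_span_deg
    else:
        display, x = "102-2", az / h_span_deg
    # The predetermined elevation angle fixes the vertical position.
    y = max(0.0, min(1.0, 0.5 + elevation_deg / v_span_deg))
    return Placement(display, x, y)
```

For example, `place_text_image(-20.0)` places the text image at the left edge of the display 102-1, and `place_text_image(10.0)` places it at the horizontal center of the display 102-2.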
- FIG. 14 is a schematic diagram showing a second example of the display device of the modified example 3 and a presentation example of the display device.
- the microphone 101 is not arranged in the frame of the glasses-type display device 1a.
- the controller 10a executes steps S110 to S116 shown in FIG. 4, as described in the first example of the display device of the modification example 3.
- the image I1 is presented on the display 102-1, and the images I2 and I3 are presented on the display 102-2. Further, the text image T1 is superimposed and presented at the position on the display 102-1 corresponding to the arrival direction. Further, the text images T2 and T3 are superimposed and presented at the positions on the display 102-2 corresponding to their arrival directions.
- FIG. 15 is a schematic diagram showing a third example of the display device of the modified example 3 and a presentation example of the display device.
- a camera arranged on the back surface of the arrangement surface of the display 102 is used so as to capture an area including the field of view of the user P1.
- the controller 10a executes steps S110 to S114 shown in FIG. 4, as described in the first example of the display device of the modification 3.
- After step S114, the controller 10a determines the presentation mode (step S115).
- the processor 12a determines the presentation position of the text images T1 to T3 generated based on the audio signal extracted in the predetermined arrival direction as the position in the direction corresponding to the arrival direction. According to the example adopted for the mobile terminal, the processor 12a sets the position of the display 102 of the mobile terminal in the direction corresponding to the arrival direction as the presentation position of the text images T1 to T3. Further, the processor 12a determines to present the text images T1 to T3. The processor 12a determines the presentation position of the captured image based on the imaging direction of the camera 103. Further, the processor 12a determines to present the captured image.
- After step S115, the controller 10a executes the image presentation (S116).
- the processor 12a superimposes the text image generated in step S114 on the captured image and presents it on the display 102 in the determined presentation mode.
- the processor 12a presents the captured image on the display 102.
- the speaker images I1 to I3 taken as shown in FIG. 15 are presented on the display 102.
- the processor 12a presents the text images T1 to T3 at positions on the display 102 of the mobile terminal in the direction corresponding to the arrival direction of the utterance sound.
- the text image T1, which is the conversation content spoken by the speaker P2, is presented to the user P1 of the display device 1a together with the image I1 representing the speaker P2.
- the text image T2, which is the conversation content spoken by the speaker P3, is presented to the user P1 together with the image I2 representing the speaker P3.
- the text image T3, which is the conversation content spoken by the speaker P4, is presented to the user P1 together with the image I3 representing the speaker P4.
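For the single-display case, one way to turn an arrival direction into a horizontal position on the captured image is a pinhole-camera projection. The sketch below is an assumption for illustration only: it supposes that the azimuth is measured from the optical axis of the camera 103 (positive to the right) and that the microphone array and the camera share the same origin. The function name `azimuth_to_pixel_x` and its parameters are hypothetical.

```python
import math
from typing import Optional


def azimuth_to_pixel_x(azimuth_deg: float,
                       image_width_px: int,
                       horizontal_fov_deg: float) -> Optional[float]:
    """Project an arrival azimuth onto the horizontal pixel axis.

    Returns None when the direction lies outside the camera's field of view.
    """
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    az = math.radians(azimuth_deg)
    if abs(az) > half_fov:
        return None
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = (image_width_px / 2.0) / math.tan(half_fov)
    # Pinhole model: offset from the image center is f * tan(azimuth).
    return image_width_px / 2.0 + focal_px * math.tan(az)
```

A direction outside the field of view returns `None`, so the caller can decide whether to clamp the text image to the screen edge or to omit it.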
- FIG. 16 is a schematic diagram showing a fourth example of the display device of the modified example 3 and a presentation example of the display device.
- the conference system is a system that presents the utterance sound collected during the conference to the display as a text image at a position corresponding to the arrival direction.
- the display 102 is arranged at a position where the conference participants can see it.
- the camera 103 is arranged at a position where the conference participants can be photographed. In the example shown in FIG. 16, the camera 103 is located above the display 102. The camera 103 photographs the conference participants P2 to P4 who are having a conference.
- the microphone module 101a is placed in one of the following positions:
  - On the conference table
  - In midair, suspended from the ceiling
- When the microphone module 101a is placed in a predetermined position, regulation is carried out with the controller 10a.
- the controller 10a executes steps S110 to S116 shown in FIG. 4, as described in the third example of the display device of the modification 3.
- the processor 12a presents the captured image on the display 102.
- the images I1 to I3 obtained by capturing the conference participants P2 to P4 are presented on the display 102.
- the processor 12a presents the text images T1 to T3 at positions on the display 102 in the direction corresponding to the arrival direction of the utterance sound.
- the text image T1, which is the conversation content spoken by the conference participant P2, is presented together with the image I1 representing the conference participant P2.
- the text image T2, which is the conversation content spoken by the conference participant P3, is presented together with the image I2 representing the conference participant P3.
- the text image T3, which is the conversation content spoken by the conference participant P4, will be presented together with the image I3 representing the conference participant P4.
- the captured image is presented, and the text image corresponding to the utterance sound collected by the microphone 101 can be presented, in a presentation mode according to the arrival direction, in association with the speaker image included in the captured image. This makes it possible to improve the visibility of the relationship between the sound source (for example, the speaker) and the text image.
- Modification 4 shows an example in which the function of the controller is realized by the server device.
- FIG. 17 is a schematic view showing the configuration of the display device of the modified example 4.
- the display device 1b includes a plurality of microphones 101, a display 102, and a server device 10b.
- the server device 10b is an information processing device that controls the display device 1b.
- the server device 10b is connected to the network by wire or wirelessly.
- the server device 10b includes a storage device 11, a processor 12b, an input / output interface 13, and a communication interface 14b.
- the processor 12b is configured to realize the function of the server device 10b by activating the program stored in the storage device 11.
- the processor 12b is an example of a computer.
- the processor 12b realizes a function of activating a program stored in the storage device 11 to present a text image based on the utterance sound collected by the microphone 101 to a predetermined position on the display 102.
- the communication interface 14b is configured to control communication via a network between the display device 1b, the microphone 101, and the display 102.
- the server device 10b executes steps S110 to S116 in the same manner as in FIG.
- the text image corresponding to the utterance sound collected by the microphone 101 can be presented in a presentation mode according to the arrival direction.
- Modification 5 shows an example in which the display device of modification 4 includes a camera.
- FIG. 18 is a schematic view showing the configuration of the display device of the modified example 5.
- FIG. 19 is a schematic diagram of a conference system, which is an example of the display device shown in FIG.
- the display device 1c includes a plurality of microphones 101, a display 102, a camera 103, and a server device 10c.
- the server device 10c is a device that controls the display device 1c.
- the server device 10c is connected to the network by wire or wirelessly.
- the server device 10c includes a storage device 11, a processor 12c, an input / output interface 13, and a communication interface 14c.
- the processor 12c is configured to realize the function of the server device 10c by activating the program stored in the storage device 11.
- the processor 12c is an example of a computer.
- the processor 12c realizes a function of activating a program stored in the storage device 11 to present a text image based on the utterance sound collected by the microphone 101 to a predetermined position on the display 102.
- the communication interface 14c is configured to control communication via a network between the display device 1c and the microphone 101, the display 102, and the camera 103.
- a conference held remotely is photographed and the utterance sound of the conference is collected.
- the conference system presents the captured image on the display and presents the text image based on the utterance sound at the position of the display according to the arrival direction of the utterance sound.
- a conference held remotely is referred to as a remote conference.
- the display 102 is arranged at a position visible to at least one of the following persons:
  - A person who participates in the remote conference
  - A person who monitors the remote conference
- the camera 103 is arranged at a position where a remote conference can be photographed. According to the example shown in FIG. 19, the camera 103 captures the conference participants P2 to P4 participating in the remote conference. The camera 103 shoots and generates a shooting signal. The camera 103 transmits a shooting signal to the server device 10c via the network.
- the microphone module 101a is placed in one of the following positions, from which it can collect the utterance sound of the remote conference:
  - On the conference table
  - In midair, suspended from the ceiling
- When the microphone module 101a is placed in a predetermined position, regulation is performed with the server device 10c.
- the server device 10c executes steps S110 to S116 in the same manner as in FIG.
- the processor 12c presents the captured image on the display 102.
- the images I1 to I3 obtained by capturing the conference participants P2 to P4 are presented on the display 102.
- the processor 12c presents the text images T1 to T3 at positions on the display 102 in the direction corresponding to the arrival direction of the utterance sound.
- the text image T1, which is the conversation content spoken by the conference participant P2, is presented together with the image I1 representing the conference participant P2.
- the text image T2, which is the conversation content spoken by the conference participant P3, is presented together with the image I2 representing the conference participant P3.
- the text image T3, which is the conversation content spoken by the conference participant P4, will be presented together with the image I3 representing the conference participant P4.
- the captured image is presented, and the text image corresponding to the utterance sound collected by the microphone 101 can be presented, in a presentation mode according to the arrival direction, in association with the speaker image included in the captured image.
- the display device 1 may be realized by any method as long as the image can be presented to the user.
- the display device 1 can be realized by, for example, the following implementation method.
  - A display using an optical element (for example, a light guide plate) based on an HOE (holographic optical element) or a DOE (diffractive optical element)
  - Liquid crystal display
  - Retinal projection display
  - LED (light emitting diode) display
  - Organic EL (electroluminescence) display
  - Laser display
  - A display that uses optical elements (for example, a lens, mirror, diffraction grating, liquid crystal, MEMS mirror, or HOE) to guide the light emitted from a light emitter
- In particular, a retinal projection display makes it easy for even a person with low vision to observe an image. Therefore, it is possible to make a person with both hearing loss and amblyopia more easily aware of the arrival direction of the utterance sound.
- The case where the display device 1a includes the camera 103 has been described as an example, but the present embodiment can also be applied to the case where the display device 1 includes a sensor other than a camera.
- the sensor is, for example, at least one of the following:
  - Human presence sensor
  - TOF (Time Of Flight) sensor
  - Millimeter wave radar
  - LiDAR (Light Detection And Ranging)
  - Image sensor
- the input / output interface 13 acquires a sensing signal generated by the sensor.
- the processor 12 determines the presentation mode of the text image in step S115 based on the acquired sensing signal. This makes it possible to improve the accuracy with which the text image is presented.
- the sensing signal is, for example, a shooting signal obtained when a camera equipped with an image sensor photographs the region from which the plurality of microphones collect sound.
- The present embodiment is also applicable when the processors 12a and 12c determine the presentation position of the text image in association with the image of a speaker located within a predetermined range from the arrival direction of the utterance sound. Specifically, for example, the processors 12a and 12c determine the presentation position of the captured image based on the imaging direction of the camera 103. The processors 12a and 12c associate the arrival direction of the utterance sound with the position of the speaker included in the captured image. The processors 12a and 12c then determine the presentation position of the text images T1 to T3, which are generated from the audio signal extracted for a predetermined arrival direction, as a position in the vicinity of the speaker associated with that arrival direction.
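The association between an arrival direction and a speaker in the captured image can be sketched as a nearest-neighbor search with an angular threshold. This is an illustrative assumption rather than the procedure of the embodiment: the speaker positions and directions are assumed to come from some image analysis step, and the names `Speaker`, `match_speaker`, `text_anchor`, the 15 degree range, and the 40 pixel offset are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Speaker:
    label: str          # e.g. "P2"
    azimuth_deg: float  # direction of the speaker as seen in the captured image
    x_px: int           # position of the speaker in the captured image
    y_px: int


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def match_speaker(arrival_deg: float,
                  speakers: Sequence[Speaker],
                  max_diff_deg: float = 15.0) -> Optional[Speaker]:
    """Return the speaker closest to the arrival direction, if within range."""
    best, best_diff = None, max_diff_deg
    for speaker in speakers:
        diff = angular_difference(arrival_deg, speaker.azimuth_deg)
        if diff <= best_diff:
            best, best_diff = speaker, diff
    return best


def text_anchor(speaker: Speaker, offset_px: int = 40) -> Tuple[int, int]:
    """Place the text image just above the speaker (image y grows downward)."""
    return (speaker.x_px, speaker.y_px - offset_px)
```

When no speaker lies within the predetermined range, `match_speaker` returns `None`, and a caller could fall back to placing the text image purely by arrival direction.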
- an example of extracting an amplified or attenuated audio signal by beamforming has been described as a method of extracting an audio signal, but the scope of the present embodiment is not limited to this.
- the extraction of the audio signal of the present embodiment can also be realized by the following methods:
  - Frost beamformer
  - Adaptive filter beamforming (for example, generalized sidelobe canceller)
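As a point of reference for the beamforming-based extraction discussed above, the following is a minimal delay-and-sum beamformer in NumPy. It is deliberately the simplest fixed beamformer, not the Frost beamformer or the generalized sidelobe canceller, and it is not taken from the embodiment. It assumes a linear microphone array on the x axis, a far-field source, azimuth measured from the broadside of the array, and circular (FFT-based) delays. All function names and the speed-of-sound constant are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(mic_x_m, azimuth_deg):
    """Arrival-time offset (s) of a far-field plane wave at each microphone."""
    mic_x = np.asarray(mic_x_m, dtype=float)
    # A microphone displaced toward the source receives the wave earlier.
    return -mic_x * np.sin(np.radians(azimuth_deg)) / SPEED_OF_SOUND


def delay_and_sum(signals, mic_x_m, azimuth_deg, fs):
    """Extract the signal arriving from azimuth_deg (signals: mics x samples)."""
    signals = np.asarray(signals, dtype=float)
    n = signals.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    delays = steering_delays(mic_x_m, azimuth_deg)
    # Advance each channel by its arrival delay so the target adds in phase.
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft((spectra * phase).mean(axis=0), n=n)


def simulate_plane_wave(source, mic_x_m, azimuth_deg, fs):
    """Synthesize microphone signals for a single far-field source."""
    source = np.asarray(source, dtype=float)
    n = source.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    delays = steering_delays(mic_x_m, azimuth_deg)
    # Delay the source spectrum by each microphone's arrival-time offset.
    shifted = np.fft.rfft(source)[None, :] * np.exp(
        -2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(shifted, n=n, axis=1)
```

Steering toward the true direction of a simulated source reconstructs it, while steering elsewhere attenuates it. Adaptive methods such as those listed above additionally suppress interferers, which this sketch does not attempt.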
- the present embodiment is also applicable to the case where the presentation mode includes, for example, the following:
  - Font
  - Character color
  - Pictogram
- Instead of presenting the text image at a position corresponding to the arrival direction of the utterance sound, the processor 12 may present the text image on the display 102 in a color, font, or the like according to the arrival direction.
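One possible way to give each arrival direction its own character color is to map the azimuth onto the hue circle. The sketch below uses the standard-library `colorsys` module. The mapping itself (0 degrees to red, full saturation and brightness) and the function name `direction_to_rgb` are assumptions made for illustration; the embodiment does not specify how colors are chosen.

```python
import colorsys
from typing import Tuple


def direction_to_rgb(azimuth_deg: float) -> Tuple[int, int, int]:
    """Map an arrival azimuth (degrees) to an 8-bit RGB color via its hue."""
    # Wrap the angle into [0, 360) and normalize to the [0, 1) hue range.
    hue = (azimuth_deg % 360.0) / 360.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))
```

With this mapping, utterances arriving from opposite directions receive complementary colors, for example red for 0 degrees and cyan for 180 degrees.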
- The example in which a text is created from a voice signal by voice recognition has been described.
- the processor 12 may estimate an attribute of the speaker (hereinafter referred to as "speaker attribute") by, for example, voice analysis of the utterance sound collected by the microphone 101 or image analysis of an image captured by the camera 103. Speaker attributes include, for example:
  - Mood
  - Gender
  - Age
- Based on the estimated speaker attributes, the processor 12 determines the presentation mode of the text image, for example, the font, the character color, and the pictogram. As a result, the wearer of the display device 1 can easily recognize the speaker attribute.
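The step from estimated speaker attributes to a presentation mode can be pictured as a simple lookup. The following sketch is hypothetical throughout: the attribute labels, the style table, the age threshold, and the names `SpeakerAttributes`, `TextStyle`, and `style_for` are invented for illustration, and the estimation of the attributes themselves is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerAttributes:
    mood: str    # hypothetical labels such as "happy", "angry", "neutral"
    gender: str  # carried along but not used by this sketch
    age: int     # estimated age in years


@dataclass(frozen=True)
class TextStyle:
    font: str
    color: str      # character color as a hex string
    pictogram: str  # name of a pictogram shown next to the text


# Hypothetical mood-to-style table; unknown moods fall back to a default.
MOOD_STYLE = {
    "happy": ("#E6A700", "smile"),
    "angry": ("#D32F2F", "frown"),
}


def style_for(attrs: SpeakerAttributes) -> TextStyle:
    """Choose font, character color, and pictogram from speaker attributes."""
    color, pictogram = MOOD_STYLE.get(attrs.mood, ("#FFFFFF", "none"))
    # Assumed rule: a rounded font for children, a sans-serif font otherwise.
    font = "rounded" if attrs.age < 13 else "sans-serif"
    return TextStyle(font, color, pictogram)
```

Keeping the table separate from the function makes it straightforward to swap in a different set of moods or styles without touching the selection logic.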
- The present embodiment is also applicable when the captured image captured by the camera 103 is not transmitted to the server device 10c via the network. In this case, the captured image captured by the camera 103 is presented on the display 102.
- the processor 12 may apply voice analysis processing to the input voice signal, the voice signal being processed, or the processed voice signal, so as to extract the voice of the utterance sound from the acquired voice.
- When the acquired voice includes sound other than the utterance sound (for example, environmental sound), the processing for the environmental sound is omitted, so that the processing load of the information processing apparatus can be suppressed.
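The load reduction described above depends on discarding non-speech audio before the later steps run. The sketch below shows only the simplest stand-in for such a filter, a per-frame energy gate in NumPy. It is an assumption for illustration: an energy threshold separates quiet frames from loud ones, not speech from environmental sound, so an actual voice analysis process would need a proper speech detector. The function name `speech_frames`, the frame length, and the threshold are hypothetical.

```python
import numpy as np


def speech_frames(signal, frame_len=400, threshold_db=-40.0):
    """Keep only the frames whose RMS level exceeds threshold_db (dBFS).

    Trailing samples that do not fill a whole frame are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = signal.size // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Guard against log10(0) for completely silent frames.
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-12))
    return frames[level_db > threshold_db]
```

Only the frames returned by this gate would be passed on to direction estimation and speech recognition, which is where the processing load is saved.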
- steps S111 to S115 in FIG. 5 are executed by the processor of the server.
- a means for acquiring sound collected by a plurality of microphones 101 (for example, a processor 12 for executing step S110) is provided.
- a means for estimating the arrival direction of the acquired voice (for example, a processor 12 for executing step S111) is provided.
- a means for generating a text image corresponding to the acquired voice (for example, a processor 12 for executing step S114) is provided.
- a means for determining the presentation mode of the text image (for example, the processor 12 for executing step S115) with reference to the estimated arrival direction is provided.
- An information processing device (e.g., controller 10) comprising means for presenting a text image (e.g., a processor 12 performing step S116) in a determined presentation mode.
- (Appendix 2) The information processing apparatus according to (Appendix 1), wherein the means for determining the presentation mode determines the presentation mode in which the text image is presented at a position corresponding to the estimated arrival direction.
- a means for extracting the voice corresponding to the estimated arrival direction from the acquired voice (for example, the processor 12 for executing step S112) is provided.
- the information processing device according to (Appendix 1) or (Appendix 2), wherein the means for generating a text image is to generate a text image corresponding to the extracted voice.
- (Appendix 4) It is equipped with a means for estimating speaker attributes by analyzing the acquired voice.
- the information processing apparatus according to any one of (Appendix 1) to (Appendix 3), wherein the means for determining the presentation mode determines the presentation mode of the text image with reference to the estimated speaker attribute.
- the speaker attribute can be easily recognized.
- a means for example, an input / output interface 13 for acquiring a sensing signal relating to the sensing of a region collected by a plurality of microphones by using a sensor is provided.
- the information processing apparatus according to any one of (Appendix 1) to (Appendix 4), wherein the means for determining the presentation mode determines the presentation mode of the text image with reference to the acquired sensing signal.
- the accuracy of presenting the text image can be improved.
- the accuracy of presenting the text image can be improved.
- a means for acquiring a shooting signal in which a region is shot (for example, an input / output interface 13a) is provided.
- a means for converting the acquired shooting signal into a shooting image (for example, a processor 12 for executing step S114) is provided.
- the information processing device according to any one of (Appendix 1) to (Appendix 5), wherein the means for presenting the text image is superposed on the captured image and presented.
- (Appendix 8) It is equipped with a means to estimate the speaker attribute by analyzing the shooting signal.
- the information processing apparatus according to (Appendix 6) or (Appendix 7), wherein the means for determining the presentation mode determines the presentation mode of the text image with reference to the estimated speaker attribute.
- the speaker attribute can be easily recognized.
- (Appendix 9) It is equipped with a means for extracting the voice of the utterance sound emitted from a person from the acquired voice.
- the means of estimating the arrival direction is to estimate the arrival direction of the extracted voice and
- the means for generating a text image is to generate a text image corresponding to the extracted voice.
- the information processing apparatus according to any one of (Appendix 1) to (Appendix 8).
- a means for acquiring sound collected by a plurality of microphones 101 (for example, a processor 12 for executing step S110) is provided.
- a means for estimating the arrival direction of the acquired voice (for example, a processor 12 for executing step S111) is provided.
- a means for generating a text image corresponding to the acquired voice (for example, a processor 12 for executing step S114) is provided.
- a means for determining the presentation mode of the text image (for example, the processor 12 for executing step S115) with reference to the estimated arrival direction is provided.
- a means for presenting a text image (e.g., a processor 12 performing step S116) in a determined presentation mode.
- (Appendix 13) A program for causing a computer (for example, a processor 12) to realize the means according to any one of (Appendix 1) to (Appendix 12).
- a step (for example, step S110) for acquiring the sound collected by a plurality of microphones is provided.
- a step (for example, step S111) for estimating the arrival direction of the acquired voice is provided.
- a step (for example, step S114) for generating a text image corresponding to the acquired voice is provided.
- a step (for example, step S115) for determining the presentation mode of the text image with reference to the estimated arrival direction is provided.
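The flow of steps enumerated in the appendices can be summarized as a small pipeline in which each step is an injected function. This is a structural sketch only, under the assumption that each step can be treated as a pure function. The names `Caption` and `run_once` and all parameter names are hypothetical, and the speech recognition step is included because the description states that text is created by voice recognition, although its step number does not appear in this section.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Caption:
    text: str                      # recognized text to render as a text image
    arrival_deg: float             # estimated arrival direction
    position: Tuple[float, float]  # presentation position on the display


def run_once(audio: Any,
             estimate_directions: Callable[[Any], Iterable[float]],
             extract: Callable[[Any, float], Any],
             recognize: Callable[[Any], str],
             determine_position: Callable[[float], Tuple[float, float]],
             ) -> List[Caption]:
    """Run the caption pipeline once on already acquired audio (S110)."""
    captions = []
    for direction in estimate_directions(audio):   # S111: arrival directions
        voice = extract(audio, direction)          # S112: per-direction voice
        text = recognize(voice)                    # speech recognition
        if not text:
            continue                               # nothing to present
        position = determine_position(direction)   # S115: presentation mode
        # S114 would render the text as an image; S116 would present it.
        captions.append(Caption(text, direction, position))
    return captions
```

Because every step is passed in, the same skeleton applies whether the steps run on the controller 10a or on a server device, which mirrors the split described in Modifications 4 and 5.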
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- General Health & Medical Sciences (AREA)
- User Interface Of Digital Computer (AREA)
- Studio Devices (AREA)
- Controls And Circuits For Display Device (AREA)
- Circuit For Audible Band Transducer (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022521892A JP7820732B2 (ja) | 2020-05-11 | 2021-05-10 | Information processing device, display device, presentation method, and program |
| JP2025181402A JP2026012872A (ja) | 2020-05-11 | 2025-10-28 | Information processing device, display device, presentation method, and program |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2020082945 | 2020-05-11 | ||
| JP2020-082945 | 2020-05-11 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021230180A1 (ja) | 2021-11-18 |
Family
ID=78525808
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/017640 Ceased WO2021230180A1 (ja) | 2020-05-11 | 2021-05-10 | Information processing device, display device, presentation method, and program |
Country Status (2)
| Country | Link |
|---|---|
| JP (2) | JP7820732B2 |
| WO (1) | WO2021230180A1 |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
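| WO2023157963A1 (ja) * | 2022-02-21 | 2023-08-24 | ピクシーダストテクノロジーズ株式会社 | Information processing device, information processing method, and program |
| WO2023249073A1 (ja) * | 2022-06-23 | 2023-12-28 | ピクシーダストテクノロジーズ株式会社 | Information processing device, display device, information processing method, and program |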
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2012059121A (ja) * | 2010-09-10 | 2012-03-22 | Softbank Mobile Corp | Eyeglass-type display device |
| US20150088500A1 (en) * | 2013-09-24 | 2015-03-26 | Nuance Communications, Inc. | Wearable communication enhancement device |
| WO2016075782A1 (ja) * | 2014-11-12 | 2016-05-19 | 富士通株式会社 | Wearable device, display control method, and display control program |
| JP2019057047A (ja) * | 2017-09-20 | 2019-04-11 | 株式会社東芝 | Display control system, display control method, and program |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2002135642A (ja) | 2000-10-24 | 2002-05-10 | Atr Onsei Gengo Tsushin Kenkyusho:Kk | Speech translation system |
| JP2018185362A (ja) | 2017-04-24 | 2018-11-22 | 富士ソフト株式会社 | Robot and control method therefor |
2021
- 2021-05-10 JP JP2022521892A patent/JP7820732B2/ja active Active
- 2021-05-10 WO PCT/JP2021/017640 patent/WO2021230180A1/ja not_active Ceased
2025
- 2025-10-28 JP JP2025181402A patent/JP2026012872A/ja active Pending
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
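| WO2023157963A1 (ja) * | 2022-02-21 | 2023-08-24 | ピクシーダストテクノロジーズ株式会社 | Information processing device, information processing method, and program |
| JP7399413B1 (ja) * | 2022-02-21 | 2023-12-18 | ピクシーダストテクノロジーズ株式会社 | Information processing device, information processing method, and program |
| WO2023249073A1 (ja) * | 2022-06-23 | 2023-12-28 | ピクシーダストテクノロジーズ株式会社 | Information processing device, display device, information processing method, and program |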
Also Published As
| Publication number | Publication date |
|---|---|
| JP2026012872A (ja) | 2026-01-27 |
| JP7820732B2 (ja) | 2026-02-26 |
| JPWO2021230180A1 | 2021-11-18 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21804908; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022521892; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21804908; Country of ref document: EP; Kind code of ref document: A1 |