JP6467922B2

JP6467922B2 - Head-mounted display device, head-mounted display device control method, information system, and computer program

Info

Publication number: JP6467922B2
Application number: JP2015000618A
Authority: JP
Inventors: 薫千代; 津田 敦也; 高野 正秀
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2015-01-06
Filing date: 2015-01-06
Publication date: 2019-02-13
Anticipated expiration: 2035-01-06
Also published as: JP2016127463A

Description

The present invention relates to a head-mounted display device.

A head-mounted display device (head mounted display, HMD), which is a display device worn on the head, is known. A head-mounted display device, for example, generates image light using a liquid crystal display and a light source, and guides the generated image light to the user's eyes using a projection optical system or a light guide plate, thereby allowing the user to visually recognize a virtual image. There are two types of head-mounted display devices: a transmissive type, in which the user can see the outside scene in addition to the virtual image, and a non-transmissive type, in which the user cannot see the outside scene. Transmissive head-mounted display devices include an optical transmissive type and a video transmissive type. An information processing device such as a head-mounted display device may also be equipped with a microphone that acquires external sound.

For example, Patent Document 1 discloses a technique in which, when content recorded on a recording medium is shown to a user of a head-mounted display device, the sound included in the content is separated into human voice and environmental sound, a text image representing the environmental sound is generated, and the generated text image is shown to the user. Patent Document 2 discloses a head-mounted display device that specifies the direction of a sound source relative to the user of the head-mounted display device and displays the sound acquired from the sound source as a character image in association with the direction of the sound source.

Patent Document 1: JP 2011-250100 A
Patent Document 2: JP 2014-120963 A

However, with the technique described in Patent Document 1, the user can adjust how external sound, as distinct from the sound of the content, is heard only by operating the device to adjust the volume of the content being viewed. If the volume of the content is high and is not adjusted, external sound may be hard to hear. With the technique described in Patent Document 2, the direction of the sound source and the sound it emits are associated with each other and shown to the user as a character image, but there has been a demand to show the sound as a character image together with information about the sound source other than its direction. In addition, improved usability and the like have been desired in conventional head-mounted display devices.

The present invention has been made to solve at least part of the problems described above, and can be realized in the following forms.
One form of the present invention provides a transmissive head-mounted display device. This head-mounted display device includes: a sound acquisition unit that acquires sound; a sound source specifying unit that specifies a sound source using identification information of the sound source that emits the sound; a conversion unit that converts the acquired sound into a character image; an image display unit capable of displaying an image and transmitting an outside scene; and a display image setting unit that causes the image display unit to display a corresponding image associated with the specified sound source and the converted character image in association with each other. The sound source specifying unit determines whether the mouth of a speaker who is the sound source is moving. When the speaker's mouth is moving, the conversion unit converts the sound into a character image; when the speaker's mouth is not moving, the conversion unit does not convert the sound into a character image.
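The mouth-movement condition described above can be illustrated with a minimal sketch. It assumes that image analysis and speech recognition have already produced their results; the names `Frame`, `mouth_moving`, `recognized_text`, and `to_character_image` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    """One observation of a speaker (all fields are illustrative assumptions)."""
    speaker_id: str        # identification information of the sound source
    mouth_moving: bool     # assumed result of analysing the captured image
    recognized_text: str   # assumed output of a speech recognizer


def to_character_image(frame: Frame) -> Optional[str]:
    """Return the text to display, or None when the sound is not converted."""
    if not frame.mouth_moving:
        # The speaker's mouth is not moving, so no character image is produced.
        return None
    return f"{frame.speaker_id}: {frame.recognized_text}"
```

In this sketch the returned string stands in for the character image; an actual device would render it on the image display unit.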

(1) According to one form of the present invention, a transmissive head-mounted display device is provided. This head-mounted display device includes: a sound acquisition unit that acquires sound; a sound source specifying unit that specifies a sound source using identification information of the sound source that emits the sound; a conversion unit that converts the acquired sound into a character image; an image display unit capable of displaying an image and transmitting an outside scene; and a display image setting unit that causes the image display unit to display a corresponding image associated with the specified sound source and the converted character image in association with each other. With this form of head-mounted display device, the user can not only hear external sound but also recognize the sound visually as a character image, which improves convenience for the user. Moreover, because the character image is displayed on the image display unit together with the identification information of the sound source, convenience for the user is further improved.

(2) In the head-mounted display device of the above form, the sound acquisition unit may specify the direction of the sound source, and the display image setting unit may cause the image display unit to display the character image and the corresponding image in association with the specified direction. With this form of head-mounted display device, the display image setting unit can display the character image and the corresponding image on the image display unit in association with the specified sound source, and the user can recognize the character image in closer association with the sound source.
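One way to associate a character image with the specified direction is to map the direction of the sound source to a horizontal display position. The following is only a sketch: the azimuth convention (degrees, 0 straight ahead, negative to the left), the 40-degree field of view, the 960-pixel display width, and the function name `direction_to_x` are all assumptions made for illustration.

```python
def direction_to_x(azimuth_deg: float, fov_deg: float = 40.0, width_px: int = 960) -> int:
    """Map a sound-source azimuth to a horizontal pixel position.

    Directions outside the assumed field of view are clamped to the nearest
    edge of the display area.
    """
    half = fov_deg / 2.0
    clamped = max(-half, min(half, azimuth_deg))
    # Linear mapping: -half maps to pixel 0, +half maps to pixel width_px - 1.
    return round((clamped + half) / fov_deg * (width_px - 1))
```

Clamping keeps the text on screen even when the speaker is outside the assumed field of view; a real device might instead show an arrow or similar cue.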

(3) In the head-mounted display device of the above form, when there are a plurality of sound sources, the display image setting unit may set at least one of the display mode and the display position, on the image display unit, of at least one of the corresponding image and the character image, based on the identification information set for each sound source. With this form of head-mounted display device, even when sounds from a plurality of sound sources are acquired by the sound acquisition unit, the user sees character images in which the sound emitted by each sound source is associated with that sound source, which further improves convenience for the user.
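As a rough illustration of setting the display mode and position per sound source, the sketch below gives each distinct identifier its own style. The style table, its keys, and the function name `assign_styles` are hypothetical; the specification only states that display mode and position may depend on the identification information.

```python
from itertools import cycle

# Hypothetical display styles (colour and screen anchor) used for illustration.
STYLES = [
    {"color": "white", "anchor": "top-left"},
    {"color": "yellow", "anchor": "top-right"},
    {"color": "cyan", "anchor": "bottom-left"},
]


def assign_styles(source_ids):
    """Give each distinct sound source ID a style, reusing styles if they run out."""
    assignment = {}
    palette = cycle(STYLES)
    for source_id in source_ids:
        if source_id not in assignment:
            assignment[source_id] = next(palette)
    return assignment
```

For example, `assign_styles(["A", "B", "A"])` yields two entries, so repeated utterances by the same speaker keep the same appearance.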

(4) The head-mounted display device of the above form may further include an acquired voice storage unit that, when there are a plurality of sound sources, stores the identification information set for each of the plurality of sound sources in association with the sound of each of the plurality of sound sources. With this form of head-mounted display device, compared with simply storing the sound acquired by the sound acquisition unit, each sound is stored so that it can be distinguished by the sound source that emitted it, which improves convenience for the user.
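Storing sound in association with identification information can be sketched as a simple keyed store. The class name `AcquiredVoiceStore`, its methods, and the use of raw `bytes` for audio are assumptions for illustration, not details from the specification.

```python
from collections import defaultdict


class AcquiredVoiceStore:
    """Keeps recorded audio clips grouped by sound source identifier."""

    def __init__(self):
        self._clips = defaultdict(list)

    def record(self, source_id: str, samples: bytes) -> None:
        """Store one clip under the identifier of the source that emitted it."""
        self._clips[source_id].append(samples)

    def clips_for(self, source_id: str) -> list:
        """Return a copy of the clips for one source (empty if none recorded)."""
        return list(self._clips.get(source_id, []))
```

Because clips are keyed by source, the audio of one speaker can later be retrieved without separating it again from the other speakers' audio.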

(5) In the head-mounted display device of the above form, the display image setting unit may determine, based on the volume of the acquired sound, at least one of the size of the character image that is converted and displayed on the image display unit and whether the sound is displayed on the image display unit as a character image at all. With this form of head-mounted display device, because the acquired volume and the character image displayed on the image display unit are related, the user can recognize external sound in closer association with its character image, which further improves convenience for the user.
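The volume-dependent behaviour can be sketched as a mapping from volume to text size, with a threshold below which nothing is displayed. The decibel thresholds, the point sizes, the linear interpolation, and the function name `text_size_for_volume` are illustrative assumptions; the specification does not give concrete values.

```python
from typing import Optional


def text_size_for_volume(volume_db: float,
                         min_db: float = 40.0,
                         max_db: float = 80.0,
                         min_pt: int = 12,
                         max_pt: int = 36) -> Optional[int]:
    """Return a font size in points, or None if the sound is too quiet to display."""
    if volume_db < min_db:
        return None  # below the assumed threshold: do not display a character image
    # Interpolate linearly between min_pt and max_pt, capping at max_db.
    ratio = min(1.0, (volume_db - min_db) / (max_db - min_db))
    return round(min_pt + ratio * (max_pt - min_pt))
```

With these assumed defaults, louder speech is drawn larger and sound under 40 dB is suppressed.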

(6) The head-mounted display device of the above form may further include an imaging unit that images the outside scene, and a line-of-sight direction detection unit that, when the sound source is a human, detects the line-of-sight direction of that human based on the captured image of the outside scene. The display image setting unit may set at least one of the display mode and the display position, on the image display unit, of at least one of the corresponding image and the character image, based on the detected line-of-sight direction. With this form of head-mounted display device, the character image seen by the user differs depending on whether or not the human who is the sound source is speaking to the user, so the user can recognize the content of the sound in association with the intention of that human.

(7) The head-mounted display device of the above form may further include an acceleration detection unit that detects the acceleration of the image display unit. The display image setting unit may fix the display position of at least one of the corresponding image and the character image displayed on the image display unit in accordance with the detected acceleration of the image display unit. With this form of head-mounted display device, even if the user of the head-mounted display device is, for example, walking, the position of the image displayed on the image display unit 20 does not shake, which makes the corresponding image and the character image easier for the user to see.

(8) The head-mounted display device of the above form may further include a communication unit that transmits and receives information to and from other devices, and an acquired voice storage unit that stores the acquired sound and the identification information of the sound source that emitted the acquired sound. The display image setting unit may transmit and receive the stored sound and identification information to and from other devices via the communication unit. With this form of head-mounted display device, information such as sound sources and the sounds they emit can be shared across the entire system that includes the head-mounted display device.

(9) The head-mounted display device of the above form may further include a position specifying unit that specifies the position of the image display unit. The communication unit may receive position information of another head-mounted display device from another device, and the display image setting unit may set the corresponding image based on the specified position of the image display unit and the received position information and cause the image display unit to display it. With this form of head-mounted display device, the corresponding image shown to the user lets the user recognize the positional relationship between the user and the user of the other head-mounted display device, which further improves convenience for the user.
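The positional relationship between two head-mounted display devices could be derived as a distance and a bearing, which a corresponding image might then depict. This sketch assumes positions are already expressed as planar (x, y) coordinates in metres with +y pointing north; that coordinate convention and the function name `relative_position` are assumptions, not details from the specification.

```python
import math


def relative_position(own_xy, other_xy):
    """Return (distance, bearing_deg) from this device to the other device.

    The bearing is measured clockwise from the +y axis, in the range [0, 360).
    """
    dx = other_xy[0] - own_xy[0]
    dy = other_xy[1] - own_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance, bearing
```

A device that only has latitude and longitude would first need to project them into such a planar frame, which is outside the scope of this sketch.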

(10) In the head-mounted display device of the above form, the sound source specifying unit may specify the language of the acquired sound, and the display image setting unit may set at least one of the display mode and the display position, on the image display unit, of at least one of the corresponding image and the character image, based on the specified language. With this form of head-mounted display device, the character image and the corresponding image displayed on the image display unit change according to the language specified by the sound source specifying unit, so the difference in language can be conveyed to the user as visual information, which improves convenience for the user.

Not all of the plurality of constituent elements of each form of the present invention described above are essential. To solve some or all of the problems described above, or to achieve some or all of the effects described in this specification, some of the constituent elements may, as appropriate, be changed, deleted, or replaced with new constituent elements, and part of their limiting content may be deleted. Likewise, to solve some or all of the problems described above, or to achieve some or all of the effects described in this specification, some or all of the technical features included in one form of the present invention described above may be combined with some or all of the technical features included in another form of the present invention described above to constitute an independent form of the present invention.

For example, one form of the present invention can be realized as a device including one or more of five elements: the sound acquisition unit, the sound source specifying unit, the conversion unit, the image display unit, and the display image setting unit. That is, the device may or may not have the sound acquisition unit. The device may or may not have the sound source specifying unit. The device may or may not have the conversion unit. The device may or may not have the image display unit. The device may or may not have the display image setting unit. The sound acquisition unit may, for example, acquire sound. The sound source specifying unit may, for example, specify a sound source using identification information of the sound source that emits the sound. The conversion unit may, for example, convert the acquired sound into a character image. The image display unit may, for example, be capable of displaying an image and transmitting an outside scene. The display image setting unit may, for example, cause the image display unit to display the corresponding image associated with the specified sound source and the converted character image in association with each other. Such a device can be realized, for example, as a head-mounted display device, but can also be realized as a device other than a head-mounted display device. Such a form can solve at least one of various problems, such as improving and simplifying the operability of the device, integrating the device, and improving convenience for the user of the device. Some or all of the technical features of each form of the head-mounted display device described above can be applied to this device.

The present invention can also be realized in various forms other than a head-mounted display device. For example, it can be realized as a control method for a head-mounted display device, an information system having a head-mounted display device, a computer program for realizing the control method for a head-mounted display device and the information system, a recording medium on which that computer program is recorded, or a data signal that includes that computer program and is embodied in a carrier wave.

An explanatory diagram showing the external configuration of a head-mounted display device (HMD).
A block diagram functionally showing the configuration of the HMD.
A schematic diagram showing an example of the identification information stored in the identification information storage unit.
An explanatory diagram showing how image light is emitted by the image light generation unit.
A flowchart of the character image display process.
An explanatory diagram showing the field of view seen by the user when a corresponding image is displayed in the maximum image display area.
An explanatory diagram showing the field of view seen by the user when a character image converted from sound is displayed in the maximum image display area.
An explanatory diagram showing the field of view seen by the user when a character image converted from sound is displayed in the maximum image display area in a modification.
An explanatory diagram showing the external configuration of part of an information system including the HMD of a modification.
An explanatory diagram showing the external configuration of the HMD in a modification.

A. Embodiment:
A-1. Configuration of the head-mounted display device:
FIG. 1 is an explanatory diagram showing the external configuration of a head-mounted display device 100 (HMD 100). The HMD 100 is a display device worn on the head, and is also called a head mounted display (HMD). The HMD 100 of this embodiment is an optical transmissive head-mounted display device that allows the user to see a virtual image and, at the same time, see the outside scene directly. In this specification, the virtual image that the user sees by means of the HMD 100 is also referred to as a "display image" for convenience.

The HMD 100 includes an image display unit 20 that allows the user to see a virtual image while worn on the user's head, and a control unit 10 (controller 10) that controls the image display unit 20.

The image display unit 20 is a wearable body worn on the user's head, and has the shape of a pair of glasses in this embodiment. The image display unit 20 includes a right holding unit 21, a right display driving unit 22, a left holding unit 23, a left display driving unit 24, a right optical image display unit 26, a left optical image display unit 28, a camera 61, and a microphone 63. The right optical image display unit 26 and the left optical image display unit 28 are arranged so as to be positioned in front of the user's right and left eyes, respectively, when the user wears the image display unit 20. One end of the right optical image display unit 26 and one end of the left optical image display unit 28 are connected to each other at a position corresponding to the area between the user's eyebrows when the user wears the image display unit 20.

The right holding unit 21 is a member that extends from the end ER, which is the other end of the right optical image display unit 26, to a position corresponding to the side of the user's head when the user wears the image display unit 20. Similarly, the left holding unit 23 is a member that extends from the end EL, which is the other end of the left optical image display unit 28, to a position corresponding to the side of the user's head when the user wears the image display unit 20. The right holding unit 21 and the left holding unit 23 hold the image display unit 20 on the user's head in the manner of the temples of a pair of glasses.

The right display driving unit 22 and the left display driving unit 24 are disposed on the side facing the user's head when the user wears the image display unit 20. Hereinafter, the right holding unit 21 and the left holding unit 23 are also collectively referred to simply as the "holding units", the right display driving unit 22 and the left display driving unit 24 as the "display driving units", and the right optical image display unit 26 and the left optical image display unit 28 as the "optical image display units".

The display driving units 22 and 24 include liquid crystal displays 241 and 242 (hereinafter also referred to as "LCDs 241 and 242"), projection optical systems 251 and 252, and the like (see FIG. 2). The configuration of the display driving units 22 and 24 is described in detail later. The optical image display units 26 and 28, as optical members, include light guide plates 261 and 262 (see FIG. 2) and light control plates. The light guide plates 261 and 262 are formed of a light-transmissive resin material or the like, and guide the image light output from the display driving units 22 and 24 to the user's eyes. Each light control plate is a thin, plate-shaped optical element arranged so as to cover the front side of the image display unit 20, which is the side opposite to the user's eyes. The light control plates protect the light guide plates 261 and 262 and suppress damage to them and the adhesion of dirt. In addition, adjusting the light transmittance of the light control plates adjusts the amount of external light entering the user's eyes, and thus how easily the virtual image can be seen. The light control plates can be omitted.

The camera 61 is a stereo camera, with one camera disposed at each of the end ER of the right optical image display unit 26 and the end EL of the left optical image display unit 28. As described in detail later, when an image of a QR code (registered trademark) determined to be the same as a QR code stored in advance in the control unit 10 is detected in an image captured by the camera 61, the human determined to be carrying that QR code is specified as the speaker, that is, the sound source.
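The matching step described here can be sketched as a lookup of decoded QR payloads against pre-registered entries. The sketch assumes that QR codes in the captured image have already been decoded into strings by some separate decoder; the payload values, the table `REGISTERED`, and the function name `identify_speakers` are hypothetical.

```python
# Hypothetical pre-registered identification information:
# decoded QR payload -> speaker name.
REGISTERED = {
    "QR-0001": "Speaker A",
    "QR-0002": "Speaker B",
}


def identify_speakers(decoded_payloads, registered=REGISTERED):
    """Return the names of registered speakers whose QR codes were detected.

    Payloads that do not match any stored entry are ignored, so unknown
    people are not specified as sound sources.
    """
    return [registered[p] for p in decoded_payloads if p in registered]
```

For example, `identify_speakers(["QR-0002", "unknown"])` returns `["Speaker B"]`.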

The microphone 63 acquires external sound. The microphone 63 is disposed on the outer side of the right display driving unit 22, that is, the side opposite to the side facing the user when the user wears the image display unit 20. The sound acquired by the microphone 63 undergoes various kinds of processing by a voice processing unit 170 of the control unit 10, described later.

The image display unit 20 further includes a connection unit 40 for connecting the image display unit 20 to the control unit 10. The connection unit 40 includes a main body cord 48 connected to the control unit 10, a right cord 42, a left cord 44, and a connecting member 46. The right cord 42 and the left cord 44 are the two cords into which the main body cord 48 branches. The right cord 42 is inserted into the housing of the right holding unit 21 from the tip AP in the extending direction of the right holding unit 21, and is connected to the right display driving unit 22. Similarly, the left cord 44 is inserted into the housing of the left holding unit 23 from the tip AP in the extending direction of the left holding unit 23, and is connected to the left display driving unit 24. The connecting member 46 is provided at the branch point between the main body cord 48 and the right cord 42 and left cord 44, and has a jack for connecting an earphone plug 30. A right earphone 32 and a left earphone 34 extend from the earphone plug 30.

The image display unit 20 and the control unit 10 transmit various signals via the connection unit 40. Connectors (not shown) that fit each other are provided at the end of the main body cord 48 opposite the connecting member 46 and at the control unit 10. The control unit 10 and the image display unit 20 are connected or disconnected by fitting or releasing the connector of the main body cord 48 and the connector of the control unit 10. For the right cord 42, the left cord 44, and the main body cord 48, a metal cable or an optical fiber, for example, can be used.

The control unit 10 is a device for controlling the HMD 100. The control unit 10 includes a determination key 11, a lighting unit 12, a display switching key 13, a track pad 14, a luminance switching key 15, a direction key 16, a menu key 17, and a power switch 18. The determination key 11 detects a pressing operation and outputs a signal that confirms the operation performed on the control unit 10. The lighting unit 12 indicates the operating state of the HMD 100 by its light emission state. The operating state of the HMD 100 includes, for example, whether the power is on or off. An LED, for example, is used as the lighting unit 12. The display switching key 13 detects a pressing operation and outputs, for example, a signal that switches the display mode of content video between 3D and 2D. The track pad 14 detects the operation of the user's finger on its operation surface and outputs a signal corresponding to what was detected. Various types of track pad, such as electrostatic, pressure-sensing, and optical types, can be used as the track pad 14. The luminance switching key 15 detects a pressing operation and outputs a signal that increases or decreases the luminance of the image display unit 20. The direction key 16 detects a pressing operation on the keys corresponding to the up, down, left, and right directions, and outputs a signal corresponding to what was detected. The power switch 18 switches the power state of the HMD 100 by detecting a slide operation of the switch.

FIG. 2 is a block diagram functionally showing the configuration of the HMD 100. As shown in FIG. 2, the control unit 10 includes a storage unit 120, a power supply 130, an operation unit 135, a wireless communication unit 132, a CPU 140, an interface 180, and a transmission unit 51 (Tx51) and a transmission unit 52 (Tx52). The operation unit 135 accepts operations by the user and is composed of the determination key 11, the display switching key 13, the track pad 14, the luminance switching key 15, the direction key 16, the menu key 17, and the power switch 18.

The power supply 130 supplies power to each part of the HMD 100. A secondary battery, for example, can be used as the power supply 130. The wireless communication unit 132 performs wireless communication with other devices, such as a content server, a television, or a personal computer, in accordance with a predetermined wireless communication standard such as wireless LAN or Bluetooth.

The storage unit 120 includes a ROM that stores computer programs, a RAM that the CPU 140 uses when writing and reading various computer programs, a voice recording unit 122, and an identification information storage unit 124. When control for recording the external voice acquired by the microphone 63 is performed, the voice recording unit 122 stores the acquired voice as data. The identification information storage unit 124 stores sound source identification information for identifying a sound source. In the present embodiment, the identification information storage unit 124 stores, as identification information held in advance, information such as the name of a person who is a speaker serving as a sound source, in the form of a QR code. The voice recording unit 122 corresponds to the acquired voice storage unit in the claims.

FIG. 3 is a schematic diagram showing an example of the identification information stored in the identification information storage unit 124. As shown in FIG. 3, the identification information storage unit 124 stores personal information in association with each QR code serving as identification information. For each code (figure) serving as a QR code, the stored personal information includes the name of the speaker as the sound source, the speaker's gender and affiliation, a telephone number at which the speaker can be contacted, an e-mail address at which the speaker can receive mail, and a face photograph.
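The association between a QR code and the personal information of FIG. 3 can be pictured as a keyed lookup table. The following is a minimal sketch under stated assumptions, not the embodiment's actual data layout: the class names `PersonalInfo` and `IdentificationStore`, the field names, and the use of the decoded QR payload string as the key are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PersonalInfo:
    """One row of the table in FIG. 3 (field names are assumptions)."""
    name: str
    gender: str
    affiliation: str
    phone: str
    email: str
    face_photo: str  # hypothetical: a path or key for the stored face photograph


class IdentificationStore:
    """Maps a decoded QR payload to the personal information registered for it."""

    def __init__(self) -> None:
        self._by_code: Dict[str, PersonalInfo] = {}

    def register(self, qr_payload: str, info: PersonalInfo) -> None:
        # Registering the same payload again overwrites the earlier entry.
        self._by_code[qr_payload] = info

    def lookup(self, qr_payload: str) -> Optional[PersonalInfo]:
        # Returns None when the QR code is not one of the stored codes.
        return self._by_code.get(qr_payload)
```

A lookup that returns `None` corresponds to a QR code in the captured image that does not match any stored code.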

The CPU 140 reads the computer programs stored in the ROM of the storage unit 120 and writes them to and reads them from the RAM of the storage unit 120, thereby functioning as an operating system 150 (OS 150), a display control unit 190, a voice processing unit 170, an image processing unit 160, a line-of-sight direction specifying unit 168, a sound source identification unit 161, a conversion unit 169, and an image setting unit 165.

The display control unit 190 generates control signals that control the right display drive unit 22 and the left display drive unit 24. Specifically, using the control signals, the display control unit 190 individually controls drive ON/OFF of the right LCD 241 by the right LCD control unit 211, drive ON/OFF of the right backlight 221 by the right backlight control unit 201, drive ON/OFF of the left LCD 242 by the left LCD control unit 212, drive ON/OFF of the left backlight 222 by the left backlight control unit 202, and so on. The display control unit 190 thereby controls the generation and emission of image light by each of the right display drive unit 22 and the left display drive unit 24. For example, the display control unit 190 causes both the right display drive unit 22 and the left display drive unit 24 to generate image light, causes only one of them to generate image light, or causes neither of them to generate image light. The generation of image light by the image display unit 20 is also referred to as "displaying an image".

The display control unit 190 transmits the control signals for the right LCD control unit 211 and the left LCD control unit 212 via the transmission units 51 and 52, respectively. The display control unit 190 also transmits the respective control signals for the right backlight control unit 201 and the left backlight control unit 202.

The image processing unit 160 acquires the image signal included in the content and transmits the acquired image signal to the reception units 53 and 54 of the image display unit 20 via the transmission units 51 and 52. As necessary, the image processing unit 160 may perform image processing on the image data, such as resolution conversion, various color tone corrections such as luminance and saturation adjustment, and keystone correction.

The voice processing unit 170 acquires the audio signal included in the content, amplifies the acquired audio signal, and supplies it to a speaker (not shown) in the right earphone 32 and a speaker (not shown) in the left earphone 34, which are connected to the connecting member 46. When, for example, the Dolby (registered trademark) system is adopted, the audio signal is processed, and the right earphone 32 and the left earphone 34 each output a different sound, for example with a changed frequency. The voice processing unit 170 also performs various kinds of processing on the external voice acquired by the microphone 63. The voice processing unit 170 transmits the acquired voice to the conversion unit 169 as an audio signal. The voice processing unit 170 corresponds to the voice identification unit in the claims, and the voice processing unit 170 and the microphone 63 correspond to the voice acquisition unit in the claims.

The conversion unit 169 analyzes the waveform of the audio signal transmitted from the voice processing unit 170, performs speech recognition, and then converts the result into a character image corresponding to the transmitted audio signal. The conversion unit 169 can set the font, size, and other attributes of the character image individually for each sound source, described later, when performing the conversion.

The sound source identification unit 161 detects, in the image captured by the camera 61, an image of a QR code that is determined to be the same as a QR code stored in the identification information storage unit 124. The sound source identification unit 161 detects the QR code stored in the identification information storage unit 124 in the captured image by applying pattern matching or a statistical identification method to the captured image. Using the face photograph stored in the identification information storage unit 124 in association with the detected QR code, the sound source identification unit 161 performs pattern matching or the like on the image captured by the camera 61 and identifies the speaker by image recognition. After identifying the speaker, the sound source identification unit 161 performs pattern matching or the like on the image of the speaker in the captured image and thereby identifies the speaker's mouth as the sound source. Even when an image of a QR code corresponding to a QR code stored in the identification information storage unit 124 is detected in the captured image, the speaker's mouth may be outside the imaging range, in which case the sound source cannot be identified. In that case, in other embodiments, the position of the QR code detected in the captured image may be identified as the sound source. The sound source identification unit 161 and the camera 61 correspond to the sound source specifying unit in the claims.
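The choice between the speaker's mouth and the QR code position as the sound source can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the function name `locate_sound_source`, the pixel-coordinate tuples, and the `allow_qr_fallback` flag are assumptions, and the QR code and mouth detections are taken as already computed inputs.

```python
from typing import Optional, Tuple

Point = Tuple[int, int]  # hypothetical (x, y) pixel position in the captured image


def locate_sound_source(
    qr_pos: Optional[Point],
    mouth_pos: Optional[Point],
    allow_qr_fallback: bool = False,
) -> Optional[Point]:
    """Pick the position treated as the sound source.

    qr_pos            -- position of a stored QR code found in the image, or None
    mouth_pos         -- position of the identified speaker's mouth, or None
    allow_qr_fallback -- True models the "other embodiments" in which the QR code
                         position stands in for a mouth outside the imaging range
    """
    if qr_pos is None:
        return None  # no registered QR code was detected, so no speaker is identified
    if mouth_pos is not None:
        return mouth_pos  # the mouth itself is the sound source
    return qr_pos if allow_qr_fallback else None
```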

The line-of-sight direction specifying unit 168 specifies the line-of-sight direction of the speaker identified by the sound source identification unit 161 by performing pattern matching or the like on the image captured by the camera 61. In the present embodiment, the line-of-sight direction specifying unit 168 specifies the speaker's line-of-sight direction from the image of the speaker's eyes in the captured image. In other embodiments, the orientation of the speaker's head, determined from the image of the speaker's head, may be specified as the line-of-sight direction. The line-of-sight direction specifying unit 168 and the camera 61 correspond to the line-of-sight direction detecting unit in the claims.

The image setting unit 165 causes the image display unit 20 to display the name stored in the identification information storage unit 124 in association with the identified sound source, together with the character image into which the acquired voice has been converted. The angle of view of the camera 61 is set in advance to match the field of view of the user of the HMD 100 when the user's line of sight faces forward. The image setting unit 165 causes the image display unit 20 to display, at the position of the sound source identified in the captured image, the character image into which the acquired external voice has been converted and the name of the identified sound source as a character image. The image setting unit 165 corresponds to the display image setting unit in the claims.

The interface 180 is an interface for connecting various external devices OA, which serve as content sources, to the control unit 10. Examples of the external device OA include a personal computer (PC), a mobile phone terminal, and a game terminal. A USB interface, a micro USB interface, a memory card interface, or the like can be used as the interface 180.

The image display unit 20 includes the right display drive unit 22, the left display drive unit 24, a right light guide plate 261 serving as the right optical image display unit 26, a left light guide plate 262 serving as the left optical image display unit 28, the camera 61, and the microphone 63.

The right display drive unit 22 includes a reception unit 53 (Rx53), a right backlight control unit 201 (right BL control unit 201) and a right backlight 221 (right BL 221) that function as a light source, a right LCD control unit 211 and a right LCD 241 that function as a display element, and a right projection optical system 251. The right backlight control unit 201, the right LCD control unit 211, the right backlight 221, and the right LCD 241 are also collectively referred to as the "image light generation unit".

The reception unit 53 functions as a receiver for serial transmission between the control unit 10 and the image display unit 20. The right backlight control unit 201 drives the right backlight 221 based on an input control signal. The right backlight 221 is a light emitter such as an LED or an electroluminescence (EL) element. The right LCD control unit 211 drives the right LCD 241 based on the clock signal PCLK, the vertical synchronization signal VSync, the horizontal synchronization signal HSync, and the right-eye image data input via the reception unit 53. The right LCD 241 is a transmissive liquid crystal panel in which a plurality of pixels are arranged in a matrix.

The right projection optical system 251 is composed of a collimating lens that turns the image light emitted from the right LCD 241 into a parallel light beam. The right light guide plate 261, serving as the right optical image display unit 26, guides the image light output from the right projection optical system 251 to the user's right eye RE while reflecting it along a predetermined optical path. The right projection optical system 251 and the right light guide plate 261 are also collectively referred to as the "light guide unit".

The left display drive unit 24 has the same configuration as the right display drive unit 22. The left display drive unit 24 includes a reception unit 54 (Rx54), a left backlight control unit 202 (left BL control unit 202) and a left backlight 222 (left BL 222) that function as a light source, a left LCD control unit 212 and a left LCD 242 that function as a display element, and a left projection optical system 252. The left backlight control unit 202, the left LCD control unit 212, the left backlight 222, and the left LCD 242 are also collectively referred to as the "image light generation unit". The left projection optical system 252 is composed of a collimating lens that turns the image light emitted from the left LCD 242 into a parallel light beam. The left light guide plate 262, serving as the left optical image display unit 28, guides the image light output from the left projection optical system 252 to the user's left eye LE while reflecting it along a predetermined optical path. The left projection optical system 252 and the left light guide plate 262 are also collectively referred to as the "light guide unit".

FIG. 4 is an explanatory diagram showing how image light is emitted by the image light generation unit. By driving the liquid crystal at each pixel position arranged in a matrix, the right LCD 241 changes the transmittance of the light passing through it and thereby modulates the illumination light IL emitted from the right backlight 221 into effective image light PL that represents an image. The same applies to the left side. As shown in FIG. 4, the present embodiment adopts a backlight system, but the image light may instead be emitted using a front light system or a reflection system.

A-2. Character image display process:
FIG. 5 is a flowchart of the character image display process. In the character image display process, the CPU 140 converts the voice uttered by a speaker, who is the sound source identified in the captured image, into a character image, and causes the image display unit 20 to display the converted character image together with an identification image corresponding to the speaker's identification information stored in the identification information storage unit 124.

In the character image display process, the CPU 140 first waits for the operation unit 135 to accept a predetermined operation that starts the character image display process (step S11). When the CPU 140 determines that the operation unit 135 has not accepted the predetermined operation (step S11: NO), it continues to wait for the predetermined operation (step S11). When the CPU 140 determines that the operation unit 135 has accepted the predetermined operation (step S11: YES), it captures an image of the outside scene using the camera 61 (step S13).

The sound source identification unit 161 detects, in the image captured by the camera 61, an image of a QR code that is determined to be the same as a QR code stored as identification information in the identification information storage unit 124 (step S15). When the sound source identification unit 161 does not detect an image of a QR code stored in the identification information storage unit 124 in the captured image (step S15: NO), it continues to wait for a QR code image to be detected in the captured image (step S15). When the sound source identification unit 161 detects an image of a QR code stored in the identification information storage unit 124 in the captured image (step S15: YES), it identifies, in the image captured by the camera 61, an image that is determined to be the same as the face photograph included in the personal information stored in the identification information storage unit 124 in association with the detected QR code (step S17). In the image display maximum area PN, which is the largest area in which the image display unit 20 can display an image, the image setting unit 165 displays, as a corresponding image, a character image representing the name included in the personal information near the face detected in the captured image. In the present embodiment, when the sound source identification unit 161 cannot detect an image that is determined to be the same as the face photograph corresponding to the QR code, it treats the QR code as not having been detected. In other embodiments, the QR code may be identified in place of the face photograph.

FIG. 6 is an explanatory diagram showing the visual field VR that the user sees when the corresponding image is displayed in the image display maximum area PN. As shown in FIG. 6, the user sees the outside scene SC transmitted through the optical image display units 26 and 28 and the image IMG1, which is the corresponding image displayed in the image display maximum area PN. The outside scene SC includes a teacher TE giving a lecture at a school, a plurality of students ST listening to the lecture, and characters that the teacher TE has written on a whiteboard WB. An ID card on which the identification information QR1 (a QR code) of the teacher TE is printed is attached to the chest of the teacher TE's clothing. The sound source identification unit 161 detects the identification information QR1 in the captured image, and the image setting unit 165 displays, in the image display maximum area PN, the family name taken from the personal information that corresponds to the identification information QR1 stored in the identification information storage unit 124, as the image IMG1. The image setting unit 165 displays the image IMG1 in the image display maximum area PN near the mouth of the teacher TE, who is the speaker.

When the image IMG1, which is the corresponding image, is displayed in the image display maximum area PN (step S19 in FIG. 5), the CPU 140 accepts, via the operation unit 135, an operation specifying whether to record the voice acquired by the microphone 63 (step S21). When the CPU 140 accepts the predetermined operation (step S21: YES), it records the voice of the teacher TE acquired by the voice processing unit 170 and the microphone 63 in the identification information storage unit 124 as data in association with the identification information (step S23). After the voice is stored in the identification information storage unit 124 (step S23), the sound source identification unit 161 determines whether the mouth of the teacher TE, who is the sound source, is moving (step S25). Likewise, when the predetermined operation for storing the voice is not accepted in step S21 (step S21: NO), the sound source identification unit 161 determines whether the mouth of the teacher TE, who is the sound source, is moving (step S25). When it is determined that the mouth of the teacher TE is not moving (step S25: NO), the conversion unit 169 determines that, even if the microphone 63 is acquiring voice, the voice is not an utterance of the teacher TE, and does not convert the voice into a character image; the sound source identification unit 161 waits to detect a state in which the mouth of the teacher TE is moving (step S25). When it is determined that the mouth of the teacher TE, who is the sound source, is moving (step S25: YES), the conversion unit 169 converts the voice acquired by the microphone 63 into a character image (step S27).
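The gating of steps S25 and S27, in which voice is converted only while the speaker's mouth is moving, can be sketched as below. This is an illustration under stated assumptions: the text does not say how mouth movement is judged, so the per-frame "mouth openness" values, the `threshold` parameter, and the function names are hypothetical, and `recognize` stands in for whatever speech recognizer the conversion unit uses.

```python
from typing import Callable, Optional, Sequence


def is_mouth_moving(openness: Sequence[float], threshold: float = 0.05) -> bool:
    """Assumed heuristic: the mouth counts as moving when its openness,
    measured over recent frames, varies by more than `threshold`."""
    if len(openness) < 2:
        return False  # a single frame cannot show movement
    return max(openness) - min(openness) > threshold


def caption_if_speaking(
    openness: Sequence[float],
    audio: bytes,
    recognize: Callable[[bytes], str],
) -> Optional[str]:
    """Step S25 -> S27: convert the audio only when the speaker's mouth moves.

    Returns None when the mouth is still, so that sound picked up by the
    microphone is not attributed to the identified speaker.
    """
    if not is_mouth_moving(openness):
        return None
    return recognize(audio)
```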

Next, the line-of-sight direction specifying unit 168 specifies the line-of-sight direction of the teacher TE in the captured image (step S29). When the utterance of the teacher TE, who is the sound source, is converted into a character image, the image setting unit 165 sets the display mode of the character image and the position at which the character image is displayed in the image display maximum area PN according to the specified line-of-sight direction (step S31).

FIG. 7 is an explanatory diagram showing the visual field VR that the user sees when the character image TX1 into which the voice has been converted is displayed in the image display maximum area PN. FIG. 7 shows the character image TX1 that is displayed in the image display maximum area PN when the line of sight of the teacher TE, who is the sound source, is directed toward the user of the HMD 100. Because the specified line-of-sight direction of the teacher TE is toward the user, the image setting unit 165 displays the character image TX1, into which the teacher TE's utterance "Let's begin the lecture. Regarding 1, ..." has been converted, in a portion close to the center of the image display maximum area PN. In the present embodiment, the image setting unit 165 also makes the font size of the converted character image proportional to the loudness of the voice acquired by the microphone 63 (for example, in decibels (dB)) when displaying it in the image display maximum area PN. Consequently, in the character image TX1, the font of the character image converted from "Let's begin the lecture.", which was acquired as a loud voice, is larger than the font of the character image converted from "Regarding 1, ...", which was acquired as a quiet voice. In the present embodiment, the portion close to the center of the image display maximum area PN means the left portion when the teacher TE, the sound source in the captured image, is positioned on the right side, and the lower portion when the teacher TE is positioned on the upper side.
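The two display rules above, font size proportional to loudness and placement toward the center, can be sketched as follows. This is a minimal sketch under stated assumptions: the proportionality constant `points_per_db`, the clamping limits, and the function names are hypothetical, and the mirror cases (a source on the left or lower side) are an extrapolation that the text does not state.

```python
from typing import Tuple


def font_size_for_volume(
    volume_db: float,
    points_per_db: float = 0.5,  # assumed proportionality constant
    min_pt: float = 12.0,        # assumed lower limit so quiet speech stays legible
    max_pt: float = 48.0,        # assumed upper limit so loud speech fits the area
) -> float:
    """Font size proportional to the measured loudness, clamped to a usable range."""
    return max(min_pt, min(max_pt, volume_db * points_per_db))


def caption_region(source_xy: Tuple[float, float],
                   frame_size: Tuple[float, float]) -> Tuple[str, str]:
    """Choose the side of the display area that lies toward its center.

    Image coordinates are assumed, with y increasing downward, so a source in
    the upper half has y < height / 2.
    """
    x, y = source_xy
    width, height = frame_size
    horizontal = "left" if x > width / 2 else "right"   # source on the right -> left portion
    vertical = "lower" if y < height / 2 else "upper"   # source in the upper part -> lower portion
    return horizontal, vertical
```

With these defaults a 60 dB utterance is drawn at 30 pt and a 40 dB utterance at 20 pt, which mirrors the larger font used for the louder phrase in FIG. 7.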

When the character image TX1 is displayed in the image display maximum area PN (step S29 in FIG. 5), the operation unit 135 waits to accept a predetermined operation for ending the character image display process (step S31). When the predetermined operation is accepted (step S31: YES), the CPU 140 ends the character image display process.

When the predetermined operation is not accepted in step S31 (step S31: NO), the sound source identification unit 161 determines whether the position of the teacher TE, the identified sound source, has changed (step S33). In the present embodiment, the sound source identification unit 161 determines that the position of the teacher TE has changed when the position in the captured image of the ID card bearing the identification information QR1 of the teacher TE has changed. When it is determined in step S33 that the position of the teacher TE, the sound source, has not changed (step S33: NO), the CPU 140 repeats the processing from step S25 onward. When it is determined in step S33 that the position of the teacher TE, the sound source, has changed (step S33: YES), the CPU 140 repeats the processing from step S15 onward.
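The branching of the character image display process can be summarized as a transition table. This is only a reading aid built from the step labels in the text, not the flowchart of FIG. 5 itself. Because the text uses the labels S29 and S31 both for the line-of-sight steps and for the display and end-check steps, the sketch gives the end check the hypothetical label `END_CHECK`; the names `TRANSITIONS` and `walk` are likewise assumptions.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# (current step, outcome) -> next step; an outcome of None marks an unconditional move.
TRANSITIONS: Dict[Tuple[str, Optional[str]], str] = {
    ("S11", "NO"): "S11",         # keep waiting for the start operation
    ("S11", "YES"): "S13",        # capture the outside scene
    ("S13", None): "S15",
    ("S15", "NO"): "S15",         # no registered QR code found yet
    ("S15", "YES"): "S17",        # match the face photograph
    ("S17", None): "S19",         # display the name image
    ("S19", None): "S21",
    ("S21", "YES"): "S23",        # record the voice
    ("S21", "NO"): "S25",
    ("S23", None): "S25",
    ("S25", "NO"): "S25",         # mouth not moving: wait
    ("S25", "YES"): "S27",        # convert voice to a character image
    ("S27", None): "S29",         # specify the line-of-sight direction
    ("S29", None): "S31",         # set display mode and position
    ("S31", None): "END_CHECK",   # hypothetical label for the end-operation check
    ("END_CHECK", "YES"): "END",
    ("END_CHECK", "NO"): "S33",   # has the sound source moved?
    ("S33", "NO"): "S25",
    ("S33", "YES"): "S15",
}


def walk(start: str, outcomes: Sequence[str]) -> List[str]:
    """Follow the table from `start`, consuming one outcome per decision step."""
    path, step, pending = [start], start, list(outcomes)
    while step != "END":
        if (step, None) in TRANSITIONS:
            step = TRANSITIONS[(step, None)]
        elif pending:
            step = TRANSITIONS[(step, pending.pop(0))]
        else:
            break  # no more scripted outcomes
        path.append(step)
    return path
```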

As described above, in the HMD 100 of the present embodiment, the sound source identification unit 161 detects the identification information QR1 in the image captured by the camera 61. The image setting unit 165 displays, in the image display maximum area PN of the image display unit 20, the image IMG1 of the name associated with the detected identification information QR1 and the character image TX1 into which the conversion unit 169 has converted the voice. In the HMD 100 of the present embodiment, the user can therefore not only hear the external voice but also perceive it as visual information in the form of the character image TX1, which improves convenience for the user. The effect is particularly large when the user is hard of hearing, because the voice can be presented to the user as a character image. In addition, because the character image TX1 is displayed on the image display unit 20 together with the identification information of the speaker who is the sound source, convenience for the user is further improved.

In the HMD 100 of the present embodiment, the sound source identification unit 161 also specifies the direction from the user to the sound source by detecting the identification information QR1 in the image captured by the camera 61. The image setting unit 165 can therefore display the character image TX1 and the image IMG1, which is the corresponding image, on the image display unit 20 in association with the identified sound source, and the user can perceive the character image TX1 as more closely linked to the sound source.

In the HMD 100 of the present embodiment, the voice recording unit 122 stores the voice of the teacher TE, acquired by the sound processing unit 170 and the microphone 63, in association with the identification information. Compared with simply storing the voice acquired by the microphone 63, the HMD 100 of the present embodiment stores each voice in a way that distinguishes its speaker, which improves convenience for the user.

In the HMD 100 of the present embodiment, the image setting unit 165 displays the converted character image on the image display unit 20 with a font size proportional to the volume of the voice acquired by the microphone 63. Because the acquired volume is reflected in the character image displayed on the image display unit 20, the user can relate the character image more closely to the external voice, and convenience for the user is further improved.
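The proportional relationship between volume and font size can be sketched as follows. This is a minimal illustration, not the implementation in the specification: the function name, the reference volume, the reference font size, and the clamping range are all assumptions introduced here.

```python
def font_size_for_volume(volume, ref_volume=60.0, ref_pt=24.0,
                         min_pt=12.0, max_pt=72.0):
    """Return a font size proportional to the measured volume.

    All parameters are hypothetical: `volume` is a non-negative level in
    arbitrary units, `ref_volume` is the level that maps to `ref_pt`, and
    the result is clamped so that the text stays legible.
    """
    if volume <= 0:
        return min_pt
    size = ref_pt * (volume / ref_volume)  # proportional scaling
    return max(min_pt, min(max_pt, size))
```

The clamp is a design choice added for the sketch; the specification states only that the two quantities are proportional.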

In the HMD 100 of the present embodiment, when converting the speech of the teacher TE, who is the sound source, into a character image, the image setting unit 165 sets the display position of the character image on the image display unit 20 according to the specified line-of-sight direction of the teacher TE. The character image seen by the user therefore differs depending on whether or not the speaker is talking toward the user, so the user can understand the content of the utterance together with the speaker's intention.

B. Modifications:
The invention is not limited to the above embodiment and can be implemented in various forms without departing from its gist. For example, the following modifications are possible.

B-1. Modification 1:
The above embodiment describes a single sound source, the teacher TE, but the invention can also be applied when voices are acquired from a plurality of sound sources. For example, suppose the sound source identification unit 161 detects, in the captured image, two QR code images stored in the identification information storage unit 124 as two different pieces of identification information, and specifies the mouths of the two speakers who wear the detected QR codes on their chests as ID cards. The image setting unit 165 may then treat the acquired voice as the voice of the speaker whose mouth is moving and display the corresponding character image in the image display maximum area PN. In this case, the image setting unit 165 may display each character image at a position associated with the specified position of the respective speaker's mouth and may set a different font color for each speaker. In other words, the image setting unit 165 sets the display mode and display position of the character image in the image display maximum area PN for each speaker's voice, distinguished by the detected identification information. In this modification, even when the microphone 63 acquires the voices of a plurality of speakers, the user sees each character image associated with its speaker, which further improves convenience for the user.
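One way to picture the per-speaker handling in Modification 1 is the sketch below. It assumes that an upstream image-recognition step has already produced, for each detected identification code, a mouth position and a flag saying whether the mouth is moving. The data layout, the palette, and the function names are hypothetical and do not come from the specification.

```python
from itertools import cycle

PALETTE = ["white", "yellow", "cyan", "magenta"]  # assumed font colors


def assign_colors(speaker_ids):
    """Give each detected identification code its own font color."""
    return dict(zip(speaker_ids, cycle(PALETTE)))


def attribute_utterance(text, speakers, colors):
    """Attach an utterance to the one speaker whose mouth is moving.

    `speakers` is a list of dicts with the hypothetical keys "id",
    "mouth_xy" and "mouth_moving". Returns None when zero or several
    mouths are moving, because the attribution is then ambiguous.
    """
    moving = [s for s in speakers if s["mouth_moving"]]
    if len(moving) != 1:
        return None
    speaker = moving[0]
    return {
        "speaker_id": speaker["id"],
        "text": text,
        "position": speaker["mouth_xy"],  # caption is tied to the mouth position
        "color": colors[speaker["id"]],
    }
```

Returning None in the ambiguous case is a choice made for the sketch; the specification does not say how overlapping speech is resolved.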

In the above embodiment, the sound source identification unit 161 specifies the identification information and the mouth of the speaker, who is the sound source, in the image captured by the camera 61 by image recognition such as pattern matching. The acquisition of identification information and the specification of the sound source are not limited to this and can be modified in various ways. For example, the microphone 63 may be a directional microphone, and the sound source identification unit 161 may specify the direction from the user to the sound source by specifying the direction of the acquired voice. The sound source identification unit 161 may also specify the sound source by exchanging the speaker's position information obtained with a GPS module or the like. The sound source identification unit 161 may also exchange identification information by communicating, via the wireless communication unit 132, with an HMD 100 or another device worn by the speaker. The wireless communication unit 132 corresponds to the communication unit in the claims.

The acquired voice may also be identified by, for example, its language (such as Japanese or English), so that each speaker is associated with the voice of that speaker. When voices in a plurality of languages are acquired from a plurality of speakers in this way, the sound source identification unit 161 may translate those languages into one specific language, and the image setting unit 165 may display the speech of the speakers in the image display maximum area PN as character images in that language. In this HMD 100, the character image displayed in the image display maximum area PN changes according to the language specified by the sound source identification unit 161, so the user can view character images translated into the language the user understands best, which improves convenience for the user. When the operation unit 135 is operated, or when a voice relating to an operation is acquired, the sound source identification unit 161 may select the target language according to the operation, or the image setting unit 165 may display character images translated into a plurality of languages in the image display maximum area PN. The image setting unit 165 may also display, in the image display maximum area PN, character images translated into a specific language associated with the identification information of the user of the HMD 100. Depending on the language specified by the sound source identification unit 161, the image setting unit 165 may set not only whether translation is performed but also the position of the character image in the image display maximum area PN. For example, without any translation by the conversion unit 169, the image setting unit 165 may display the character image in a small font in the peripheral part of the image display maximum area PN, away from the center, when the acquired voice is in the native language of the user of the HMD 100 (for example, Japanese), and in a large font near the center of the image display maximum area PN when it is in any other language. Alternatively, the conversion unit 169 may convert only voices in a preset specific language (for example, English) into the native language, and the image setting unit 165 may then display only the converted character images in the image display maximum area PN.
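The language-dependent layout in the last example can be sketched as a simple rule. The language codes, the region names, and the font labels below are placeholders chosen for illustration, and language detection itself is assumed to happen elsewhere.

```python
def caption_style_for_language(detected_lang, native_lang="ja"):
    """Choose where and how large to draw a caption from its language.

    Native-language speech goes to the periphery in a small font, and
    any other language goes near the center in a large font. The return
    values are hypothetical labels, not identifiers from the specification.
    """
    if detected_lang == native_lang:
        return {"region": "periphery", "font": "small"}
    return {"region": "center", "font": "large"}
```

For a user whose native language is Japanese, `caption_style_for_language("en")` selects the central, large-font style.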

The above embodiment uses a QR code as an example of identification information, but the identification information can be modified in various ways. For example, marker-type identifiers other than QR codes may be used, such as barcodes, AR markers, ID markers, NyID markers, DataMatrix markers, frame markers, split markers, markers whose information capacity is increased by adding color to any of these identifiers, markers created from multidimensional codes, and characters readable by OCR or the like. Identification information also need not be provided in advance. For example, the camera 61 of the HMD 100 may capture an image of the speaker to be identified, and authentication based on the captured image data may be used in place of the identification information.

FIG. 8 is an explanatory diagram showing the visual field VR seen by the user when the character image TX2, into which the voice is converted in the modification, is displayed in the image display maximum area PN. In contrast to the visual field VR shown in FIG. 7, FIG. 8 shows the character image TX2 displayed in the image display maximum area PN when the line of sight of the teacher TE, who is the sound source, is not directed toward the user of the HMD 100. As shown in FIG. 8, the teacher TE is looking at the whiteboard WB rather than at the user, and in this case the image setting unit 165 displays the voice of the teacher TE in the lower part of the image display maximum area PN. When the teacher TE is not looking at the user, the image setting unit 165 also displays the character image in a font of the same size regardless of the volume of the acquired voice of the teacher TE. The font and display position of the character image that are changed according to the line-of-sight direction of the teacher TE can be modified in various ways. Regardless of the line of sight of the teacher TE, the CPU 140 can change the font size and color settings of the character image displayed in the image display maximum area PN when the operation unit 135 receives a predetermined operation.
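The gaze-dependent behavior described for FIG. 8 can be sketched as follows. When the speaker is not facing the user, the caption goes to the lower part of the display area with a fixed font size. When the speaker is facing the user, this sketch assumes the caption is placed at the speaker's mouth and scaled with volume. The display dimensions, margin, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CaptionLayout:
    x: float
    y: float
    font_pt: float


def layout_caption(facing_user, mouth_xy, volume,
                   pn_width=960, pn_height=540, bottom_margin=54,
                   fixed_pt=24.0, ref_volume=60.0):
    """Place a caption according to whether the speaker faces the user."""
    if facing_user:
        # Assumed behavior: tie the caption to the mouth, scale with volume.
        return CaptionLayout(mouth_xy[0], mouth_xy[1],
                             fixed_pt * volume / ref_volume)
    # Speaker looks elsewhere: lower part of the area, volume is ignored.
    return CaptionLayout(pn_width / 2, pn_height - bottom_margin, fixed_pt)
```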

B-2. Modification 2:
FIG. 9 is an explanatory diagram showing the external configuration of part of an information system 500 that includes an HMD 100a of the modification. The information system 500 includes the HMD 100a and a server 300 that exchanges various kinds of information with the HMD 100a. This modification differs from the above embodiment in that the server 300, instead of the HMD, has the counterparts of the identification information storage unit 124 of the storage unit 120, the conversion unit 169, and the image setting unit 165 that the HMD 100 of the above embodiment had. Accordingly, the information system 500 includes an identification information storage unit 324 that stores identification information, a CPU 340, and an information transmission/reception unit 332. The CPU 340 has a conversion unit 369 that converts voice into character images and an image setting unit 365 that sets the image to be displayed on the image display unit 20a of the HMD 100a.

In this modification, as in the above embodiment, when the camera 61 of the HMD 100a captures the outside scene, the captured image information is transmitted via the wireless communication unit 132 of the HMD 100a to the information transmission/reception unit 332 of the server 300. The CPU 340 collates the received information against the identification information stored in the identification information storage unit 324 and transmits the collation result via the information transmission/reception unit 332 to the wireless communication unit 132 of the HMD 100a, and the sound source identification unit 161 thereby specifies the sound source. When the microphone 63 of the HMD 100a acquires a voice, the CPU 140a transmits the image of the sound source captured by the camera 61 and the voice to the server 300 via the wireless communication unit 132. Based on the sound source image and voice received via the information transmission/reception unit 332, the image setting unit 365 sets the display mode of the image to be displayed on the image display unit 20a of the HMD 100a and transmits information on that display mode to the HMD 100a via the information transmission/reception unit 332. The CPU 140a of the HMD 100a displays the image in the image display maximum area PN of the image display unit 20a based on the display mode information received via the wireless communication unit 132.
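The division of work between the HMD 100a and the server 300 can be pictured with the in-process sketch below. It replaces the wireless link with direct method calls and replaces speech recognition with a ready-made transcript, so it shows only the order of the exchange. The class names, method names, and message fields are assumptions, not part of the specification.

```python
class Server:
    """Stands in for the server 300 (storage unit 324 plus CPU 340)."""

    def __init__(self, id_store):
        self.id_store = id_store  # detected code -> speaker name

    def collate(self, detected_code):
        # Collation against the stored identification information.
        return self.id_store.get(detected_code)

    def set_display(self, speaker_name, transcript):
        # Role of the image setting unit 365: decide the display mode.
        return {"label": speaker_name, "text": transcript}


class Hmd:
    """Stands in for the HMD 100a."""

    def __init__(self, server):
        self.server = server
        self.speaker = None

    def on_frame(self, detected_code):
        # Captured-image information goes to the server; the result comes back.
        self.speaker = self.server.collate(detected_code)

    def on_voice(self, transcript):
        # Voice goes to the server; the display mode comes back.
        if self.speaker is None:
            return None
        return self.server.set_display(self.speaker, transcript)
```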

In this modification, some of the functions of the HMD 100 in the above embodiment are provided in the server 300. Information such as the sound sources and the voices they emit can therefore be shared across the information system 500 as a whole. Which functions of the above embodiment are provided in the server 300 or elsewhere in the information system 500 can be modified in various ways. For example, the camera 61 and the microphone 63 may be installed at the venue. A plurality of HMDs 100 and the server 300 included in the information system 500 may also have the same functions redundantly.

B-3. Modification 3:
In the above embodiment, the line-of-sight direction specifying unit 168 specifies the orientation of the speaker by performing pattern matching on the image captured by the camera 61, but the method of specifying the speaker's orientation can be modified in various ways. For example, the line-of-sight direction specifying unit 168 may specify the position of a speaker carrying a device capable of exchanging information, by using a GPS module or communication such as a LAN, and, if the device carried by the speaker is equipped with a gyro sensor or the like, may specify the relationship in position and orientation between the user of the HMD 100 and the speaker. The line-of-sight direction specifying unit 168 may also specify, via the wireless communication unit 132, the position of another HMD 100 equipped with a GPS module and the position of the HMD 100 in which the unit itself is installed, and thereby specify the relative positional relationship between the user and the user of the other HMD 100. The image setting unit 165 may set a corresponding image associated with the identification information of the other HMD 100 according to the specified relative positional relationship and display the set corresponding image in the image display maximum area PN. As an example of a corresponding image set according to the relative positional relationship, when the distance between the user and another user is equal to or greater than a predetermined value, the image setting unit 165 may display an image of the other user's face or the other user's name in the image display maximum area PN so that the user can recognize the other user. When the image display unit 20 is equipped with a gyro sensor or the like that specifies the orientation of the HMD 100, the image setting unit 165 may set the corresponding image to be displayed in the image display maximum area PN according to the orientation of the user and the orientation of the other user. For example, when another user is near the user but is determined to be behind the user, the image setting unit 165 may treat the other user as being at least the predetermined distance away and display the face image or name as the corresponding image in the image display maximum area PN, as described above. In this modification, the corresponding image lets the user recognize the positional relationship between the user and other users, which further improves convenience for the user.
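The decision of when to show another user's face image or name can be sketched as below. The sketch assumes planar coordinates in meters, a heading measured counterclockwise from the positive x axis, and a "behind" test of more than 90 degrees off the heading. None of these conventions, nor the threshold value, is given in the specification.

```python
import math


def should_show_identity(user_xy, user_heading_deg, other_xy, threshold_m=10.0):
    """Return True when the other user's face image or name should be shown.

    Shown when the other user is at least `threshold_m` away, or is
    behind the user (treated like the far case, as in the text above).
    """
    dx = other_xy[0] - user_xy[0]
    dy = other_xy[1] - user_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angle between the user's heading and the other user, in [-180, 180).
    relative = (bearing - user_heading_deg + 180.0) % 360.0 - 180.0
    behind = abs(relative) > 90.0
    return distance >= threshold_m or behind
```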

The setting of the position at which the image setting unit 165 displays the converted character image in the image display maximum area PN is not limited to the example of the above embodiment and can be modified in various ways. For example, the image setting unit 165 may use pattern matching or the like on the captured image to recognize an image of characters written on the whiteboard WB and display the character image TX1 or the like at a position that does not overlap the recognized characters. Independently of any images of characters, the image setting unit 165 may display the character image at a position in the image display maximum area PN that does not overlap a preset image, or may display no character image at all in the image display maximum area PN when a specific image is included in the captured image. In addition, words specified in advance may be registered, for example names that involve privacy or text subject to a business confidentiality obligation, so that even when the voice of a registered word is acquired, the image setting unit 165 omits only that registered word from the character image displayed in the image display maximum area PN. Conversely, when the voice of a specific word is acquired, the image setting unit 165 may display that word with emphasis in the image display maximum area PN. The image setting unit 165 may also set whether and how the character image of an acquired voice is displayed in the image display maximum area PN according to, for example, security levels set for the user of the HMD and for the speaker registered in the identification information storage unit 124 as the sound source.
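The handling of registered words can be sketched on the recognized text as follows. The sketch drops hidden words and wraps emphasized words in asterisks as a stand-in for whatever visual emphasis the display applies. The function name and the plain substring matching are assumptions made for illustration.

```python
import re


def apply_word_rules(text, hidden=(), emphasized=()):
    """Omit registered words and mark specific words for emphasis."""
    for word in hidden:
        text = re.sub(re.escape(word), "", text)
    for word in emphasized:
        text = re.sub(re.escape(word), lambda m: f"*{m.group(0)}*", text)
    # Collapse the gaps left behind by removed words.
    return " ".join(text.split())
```

Plain substring matching would also remove a hidden word that appears inside a longer word, so a real system would likely match on word or morpheme boundaries instead.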

The image setting unit 165 may also set the character image to be displayed in the image display maximum area PN in association with a measured distance to the speaker. For example, the image setting unit 165 may change the font size of the character image displayed in the image display maximum area PN according to the distance to the speaker. When a QR code or the speaker cannot be clearly recognized in the image captured by the camera 61 because of poor visibility, the sound source identification unit 161 may, for example, specify an optical beacon, transmitted and received as optical information, as the identification information and specify the distance to the speaker by infrared light, so that the image setting unit 165 can display the acquired voice as a character image in the image display maximum area PN.

B-4. Modification 4:
The HMD 100 may also include an acceleration sensor that detects the acceleration of the image display unit 20 and a gyro sensor that detects its angular velocity. In this case, the image setting unit 165 may display the corresponding image and the character image in the image display maximum area PN so as to cancel the detected acceleration and angular velocity. That is, the image setting unit 165 can use the detected acceleration and angular velocity to fix the display position of the corresponding image and the character image when displaying them in the image display maximum area PN. In this modification, the image displayed on the image display unit 20 is fixed with respect to the captured image of the outside scene seen by the user, so the position of the displayed image does not shake even while the user is walking, and the corresponding image and the character image are easier for the user to see.
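A minimal sketch of canceling head rotation with gyro readings is shown below. It handles angular velocity only and uses a small-angle conversion from degrees to pixels. The sign conventions, the pixels-per-degree factor, and the function name are assumptions, and acceleration handling is omitted.

```python
def stabilized_position(pos_xy, yaw_rate_dps, pitch_rate_dps, dt_s,
                        px_per_deg=20.0):
    """Shift an overlay opposite to the measured head rotation.

    Assumed conventions: positive yaw turns the head to the right, so
    the overlay moves left; positive pitch tilts the head up, so the
    overlay moves down (screen y grows downward).
    """
    x, y = pos_xy
    x -= yaw_rate_dps * dt_s * px_per_deg
    y += pitch_rate_dps * dt_s * px_per_deg
    return (x, y)
```

Calling this once per frame with the latest gyro sample keeps the overlay roughly fixed relative to the outside scene, although integration drift would need correcting in practice.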

B-5. Modification 5:
In the above embodiment, the operation unit 135 is formed in the control unit 10, but the form of the operation unit 135 can be modified in various ways. For example, the operation unit 135 may be a user interface provided separately from the control unit 10. In this case, because the operation unit 135 is separate from the control unit 10, in which the power supply 130 and other components are formed, it can be made smaller, and operability for the user improves. Although the camera 61 is arranged in the image display unit 20 in the above embodiment, the camera 61 may be configured separately from the image display unit 20 as long as it can capture the outside scene SC. The HMD 100 may also be one in which the CPU 140 and the power supply 130 that constitute the control unit 10 are all mounted in the image display unit 20. Such an HMD 100 can be made more compact because it has no controller separate from the image display unit 20. Alternatively, a CPU 140 may be mounted in each of the control unit 10 and the image display unit 20, so that the control unit 10 can be used as a stand-alone controller and the image display unit 20 as a stand-alone display device.

For example, the image light generation unit may include an organic EL (Organic Electro-Luminescence) display and an organic EL control unit. The image light generation unit may also use, in place of the LCD, an LCOS (liquid crystal on silicon; LCoS is a registered trademark) device, a digital micromirror device, or the like. The invention can also be applied to, for example, a laser retinal projection type HMD 100.

The HMD 100 may also be, for example, a head-mounted display in which the optical image display unit covers only part of the user's eye, in other words, one in which the optical image display unit does not completely cover the user's eye. The HMD 100 may also be a so-called monocular head-mounted display. Although the HMD 100 has been described as a binocular optical transmissive type, the invention is equally applicable to other types of head-mounted display devices, such as a video transmissive type.

The HMD 100 may also be used as a display device solely for displaying images based on image signals received from another device. Specifically, it may be used as a display device equivalent to the monitor of a desktop PC, for example by receiving an image signal from the desktop PC and displaying the image in the image display maximum area PN of the image display unit 20.

The HMD 100 may also be used so as to function as part of a system. For example, the HMD 100 may be used as a device that executes some of the functions of a system that includes an aircraft. The system in which the HMD 100 is used is not limited to one that includes an aircraft and may be a system that includes an automobile, a bicycle, or the like.

The earphones may be of an ear-hook type or a headband type, or may be omitted. The HMD may also be configured, for example, as a head-mounted display device mounted in a vehicle such as an automobile or an airplane, or as a head-mounted display device built into body protection gear such as a helmet.

B-6. Modification 6:
The configuration of the HMD 100 in the above embodiment is only an example and can be modified in various ways. For example, the direction key 16 provided on the control unit 10 may be omitted, or another operation interface such as an operation stick may be provided in addition to the direction key 16 and the track pad 14. The control unit 10 may also be configured so that input devices such as a keyboard and a mouse can be connected to it, and may accept input from the keyboard and the mouse.

As the image display unit, another type of image display unit, for example one worn like a hat, may be adopted in place of the image display unit 20, which is worn like eyeglasses. The earphones 32 and 34 can be omitted as appropriate.

FIG. 10 is an explanatory diagram showing the external configuration of HMDs in the modification. In the example of FIG. 10(A), the differences from the HMD 100 shown in FIG. 1 are that the image display unit 20x includes a right optical image display unit 26x in place of the right optical image display unit 26 and a left optical image display unit 28x in place of the left optical image display unit 28. The right optical image display unit 26x is formed smaller than the optical member of the above embodiment and is arranged diagonally above the user's right eye when the HMD 100x is worn. Similarly, the left optical image display unit 28x is formed smaller than the optical member of the above embodiment and is arranged diagonally above the user's left eye when the HMD 100x is worn. In the example of FIG. 10(B), the differences from the HMD 100 shown in FIG. 1 are that the image display unit 20y includes a right optical image display unit 26y in place of the right optical image display unit 26 and a left optical image display unit 28y in place of the left optical image display unit 28. The right optical image display unit 26y is formed smaller than the optical member of the above embodiment and is arranged diagonally below the user's right eye when the head-mounted display is worn. The left optical image display unit 28y is formed smaller than the optical member of the above embodiment and is arranged diagonally below the user's left eye when the head-mounted display is worn. It is thus sufficient for the optical image display unit to be arranged near the user's eye. The size of the optical member forming the optical image display unit is also arbitrary, and the HMD 100 can be realized in a form in which the optical image display unit covers only part of the user's eye, in other words, does not completely cover the user's eye.

In the above embodiment, the HMD 100 may guide image light representing the same image to the user's left and right eyes so that the user visually recognizes a two-dimensional image, or may guide image light representing different images to the user's left and right eyes so that the user visually recognizes a three-dimensional image.

In the above embodiment, part of the configuration realized by hardware may be replaced with software, and conversely, part of the configuration realized by software may be replaced with hardware. For example, in the above embodiment the image processing unit 160 and the voice processing unit 170 are realized by the CPU 140 reading and executing a computer program, but these functional units may instead be realized by hardware circuits.

When some or all of the functions of the present invention are realized by software, the software (computer program) can be provided in a form stored on a computer-readable recording medium. In the present invention, the "computer-readable recording medium" is not limited to portable recording media such as flexible disks and CD-ROMs, but also includes internal storage devices in a computer, such as various types of RAM and ROM, and external storage devices fixed to a computer, such as hard disks.

In the above embodiment, as shown in FIGS. 1 and 4, the control unit 10 and the image display unit 20 are formed as separate components, but the configuration of the control unit 10 and the image display unit 20 is not limited to this and can be modified in various ways. For example, all or some of the components formed in the control unit 10 may be formed inside the image display unit 20. The power supply 130 of the above embodiment may be formed as a standalone, replaceable unit, and components formed in the control unit 10 may be duplicated in the image display unit 20. For example, the CPU 140 shown in FIG. 2 may be formed in both the control unit 10 and the image display unit 20, or the functions may be divided between the CPU 140 formed in the control unit 10 and a CPU formed in the image display unit 20.

The present invention is not limited to the above embodiments and modifications, and can be realized in various configurations without departing from its spirit. For example, the technical features in the embodiments and modifications that correspond to the technical features in each aspect described in the Summary of the Invention may be replaced or combined as appropriate in order to solve some or all of the problems described above, or to achieve some or all of the effects described above. Any technical feature that is not described as essential in this specification may be deleted as appropriate.

Description of symbols
10 ... Control unit
11 ... Determination key
12 ... Lighting unit
13 ... Display switching key
14 ... Track pad
15 ... Luminance switching key
16 ... Direction key
17 ... Menu key
18 ... Power switch
20 ... Image display unit (image display unit)
21 ... Right holding unit
22 ... Right display drive unit
23 ... Left holding unit
24 ... Left display drive unit
26 ... Right optical image display unit
28 ... Left optical image display unit
30 ... Earphone plug
32 ... Right earphone
34 ... Left earphone
40 ... Connection unit
42 ... Right cord
44 ... Left cord
46 ... Connecting member
48 ... Main body cord
51, 52 ... Transmission unit
53, 54 ... Reception unit
61 ... Camera (sound source specifying unit, line-of-sight direction detection unit)
63 ... Microphone (voice acquisition unit)
100 ... Head-mounted display device (HMD)
120 ... Storage unit
122 ... Voice recording unit (acquired voice storage unit)
124 ... Identification information storage unit
130 ... Power supply
132 ... Wireless communication unit (communication unit)
135 ... Operation unit
140 ... CPU
150 ... Operating system
160 ... Image processing unit
161 ... Sound source identification unit (sound source specifying unit)
165 ... Image setting unit (display image setting unit)
168 ... Line-of-sight direction specifying unit (line-of-sight direction detection unit)
169 ... Conversion unit (conversion unit)
170 ... Voice processing unit (voice acquisition unit, voice identification unit)
180 ... Interface
190 ... Display control unit
201 ... Right backlight control unit
202 ... Left backlight control unit
211 ... Right LCD control unit
212 ... Left LCD control unit
221 ... Right backlight
222 ... Left backlight
241 ... Right LCD
242 ... Left LCD
251 ... Right projection optical system
252 ... Left projection optical system
261 ... Right light guide plate
262 ... Left light guide plate
300 ... Server
324 ... Identification information storage unit
332 ... Information transmission/reception unit
340 ... CPU
365 ... Image setting unit
369 ... Conversion unit
500 ... Information system
PCLK ... Clock signal
VSync ... Vertical synchronization signal
HSync ... Horizontal synchronization signal
IMG1 ... Image (corresponding image)
OA ... External device
WB ... Whiteboard
SC ... Outside scene
RE ... Right eye
LE ... Left eye
TE ... Teacher (sound source)
IL ... Illumination light
PL ... Image light
PN ... Maximum image display area
VR ... Visual field
ST ... Student
QR1 ... Identification information (identification information)
TX1, TX2 ... Character image (character image)

Claims

A transmissive head-mounted display device comprising:
a voice acquisition unit that acquires sound;
a sound source specifying unit that specifies a sound source that emits the sound, using identification information of the sound source;
a conversion unit that converts the acquired sound into a character image;
an image display unit capable of displaying an image and transmitting an outside scene; and
a display image setting unit that causes the image display unit to display a corresponding image associated with the specified sound source and the converted character image in association with each other,
wherein the sound source specifying unit determines whether or not the mouth of a speaker serving as the sound source is moving, the conversion unit converts the sound into the character image when the speaker's mouth is moving, and the conversion unit does not convert the sound into the character image when the speaker's mouth is not moving.

The head-mounted display device according to claim 1, wherein
the voice acquisition unit specifies the direction of the sound source, and
the display image setting unit causes the image display unit to display the character image and the corresponding image in association with the specified direction.

The head-mounted display device according to claim 1 or 2, wherein,
when there are a plurality of speakers as the sound source, the display image setting unit sets, based on the identification information set for each speaker, at least one of a display mode and a display position on the image display unit of at least one of the corresponding image and the character image.

The head-mounted display device according to any one of claims 1 to 3, further comprising:
an acquired voice storage unit that, when there are a plurality of the sound sources, stores the identification information set for each of the plurality of sound sources and the sound of each of the plurality of sound sources in association with each other.

The head-mounted display device according to any one of claims 1 to 4, wherein
the display image setting unit determines, based on the volume of the acquired sound, at least one of the size of the character image that is converted and displayed on the image display unit and whether or not to display the character image on the image display unit.

The head-mounted display device according to any one of claims 1 to 5, further comprising:
an imaging unit that images an outside scene; and
a line-of-sight direction detection unit that, when the sound source is a human, detects the line-of-sight direction of the human serving as the sound source based on the captured image of the outside scene, wherein
the display image setting unit sets, based on the detected line-of-sight direction, at least one of a display mode and a display position on the image display unit of at least one of the corresponding image and the character image.

The head-mounted display device according to any one of claims 1 to 6, further comprising:
an acceleration detection unit that detects the acceleration of the image display unit, wherein
the display image setting unit fixes, in accordance with the detected acceleration of the image display unit, the display position of at least one of the corresponding image and the character image displayed on the image display unit.

The head-mounted display device according to any one of claims 1 to 7, further comprising:
a communication unit that transmits and receives information to and from another device; and
an acquired voice storage unit that stores the acquired sound and the identification information of the sound source that emitted the acquired sound, wherein
the display image setting unit transmits and receives the stored sound and the identification information to and from the other device via the communication unit.

The head-mounted display device according to claim 8, further comprising:
a position specifying unit that specifies the position of the image display unit, wherein
the communication unit receives, from the other device, position information of another head-mounted display device, and
the display image setting unit sets the corresponding image based on the specified position of the image display unit and the received position information, and causes the image display unit to display the corresponding image.

The head-mounted display device according to any one of claims 1 to 9, wherein
the sound source specifying unit specifies the language of the acquired sound, and
the display image setting unit sets, based on the specified language of the sound, at least one of a display mode and a display position on the image display unit of at least one of the corresponding image and the character image.

The head-mounted display device according to claim 1, wherein
the display image setting unit makes the size of the character image displayed on the image display unit proportional to the volume of the sound acquired by the voice acquisition unit.

The head-mounted display device according to claim 1, wherein
the display image setting unit causes the image display unit to display an image of the speaker's name together with the character image.

  The head-mounted display device according to claim 1,
  The sound source specifying unit specifies a language of the speaker's voice,
  The conversion unit translates the voice of the speaker into a specific language and converts it into the character image,
  The display image setting unit displays the character image in a peripheral portion excluding the center of the image display unit when the language of the speaker's voice is the native language of the user of the head-mounted display device. The head-mounted display device displays the character image on the peripheral portion of the image display unit when the language of the speaker's voice is not the native language of the user of the head-mounted display device.

The head-mounted display device according to claim 1, wherein
the display image setting unit sets, in accordance with a security level setting between the user of the head-mounted display device and the speaker, which serves as the identification information, a display mode that determines whether or not the character image is displayed on the image display unit.

  The head-mounted display device according to claim 1, further comprising:
  an imaging unit that images an outside scene; and
  a line-of-sight direction detection unit that detects the line-of-sight direction of the speaker based on the captured image of the outside scene, wherein
  the display image setting unit sets the display position of the character image on the image display unit in accordance with the line-of-sight direction of the speaker.

A control method for a transmissive head-mounted display device having a voice acquisition unit that acquires sound and an image display unit capable of displaying an image and transmitting an outside scene, the control method comprising:
a sound source specifying step of specifying a sound source that emits the sound, using identification information of the sound source;
a conversion step of converting the acquired sound into a character image; and
a step of causing the image display unit to display a corresponding image associated with the specified sound source and the converted character image in association with each other,
wherein, in the sound source specifying step, it is determined whether or not the mouth of a speaker serving as the sound source is moving, the sound is converted into the character image in the conversion step when the speaker's mouth is moving, and the sound is not converted into the character image in the conversion step when the speaker's mouth is not moving.

An information system including a transmissive head-mounted display device having an image display unit capable of displaying an image and transmitting an outside scene, and a communication unit that transmits and receives information to and from another device, the information system comprising:
a voice acquisition unit that acquires sound;
a sound source specifying unit that specifies a sound source that emits the sound, using identification information of the sound source;
a conversion unit that converts the acquired sound into a character image in which the sound is represented as an image; and
an image information transmission unit that associates a corresponding image associated with the specified sound source with the converted character image and transmits them to the communication unit,
wherein the sound source specifying unit determines whether or not the mouth of a speaker serving as the sound source is moving, the conversion unit converts the sound into the character image when the speaker's mouth is moving, and the conversion unit does not convert the sound into the character image when the speaker's mouth is not moving.

A computer program for a transmissive head-mounted display device having a voice acquisition unit that acquires sound and an image display unit capable of displaying an image and transmitting an outside scene, the computer program causing a computer to realize:
a sound source specifying function of specifying a sound source that emits the sound, using identification information of the sound source;
a conversion function of converting the acquired sound into a character image; and
a display image setting function of causing the image display unit to display a corresponding image associated with the specified sound source and the converted character image in association with each other,
wherein the sound source specifying function determines whether or not the mouth of a speaker serving as the sound source is moving, the conversion function converts the sound into the character image when the speaker's mouth is moving, and the conversion function does not convert the sound into the character image when the speaker's mouth is not moving.
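
As an informal illustration only, and not part of the claims, the gating recited in the independent claims (convert speech to a character image only while the speaker's mouth is moving) can be sketched together with the volume-proportional character size of the dependent claim. All names here (`Speaker`, `Caption`, `make_caption`, `size_per_volume`) are hypothetical, the mouth-movement flag is assumed to come from some camera-based check, and `recognized_text` stands in for the output of a speech recognizer that is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Speaker:
    """A sound source specified from its identification information (hypothetical type)."""
    name: str            # text used as the corresponding image, e.g. a name label
    mouth_moving: bool   # assumed result of a mouth-movement check on the camera image


@dataclass
class Caption:
    """What a display image setting unit might show (hypothetical type)."""
    label: str           # corresponding image associated with the sound source
    text: str            # character image converted from the speech
    font_size: float     # size of the character image


def make_caption(speaker: Speaker, recognized_text: str, volume: float,
                 size_per_volume: float = 0.5) -> Optional[Caption]:
    """Return a caption only while the speaker's mouth is moving.

    size_per_volume is an assumed proportionality constant, not a value
    taken from the patent.
    """
    if not speaker.mouth_moving:
        # Mouth not moving: the speech is not converted, so nothing is displayed.
        return None
    # Character size proportional to the acquired volume.
    return Caption(label=speaker.name, text=recognized_text,
                   font_size=size_per_volume * volume)
```

Under these assumptions, a call such as `make_caption(Speaker("TE", True), "hello", 40.0)` yields a caption labelled with the speaker's name, while the same call with `mouth_moving=False` yields `None`, so background sound picked up while the speaker is silent is never shown as text.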