JP2020035405A - Voice output device

Info

Publication number: JP2020035405A
Application number: JP2018163874A
Authority: JP
Inventors: 島影圭佑; Keisuke SHIMAKAGE; 鶴岡秀樹; Hideki Tsuruoka; 宮下恵太; Keita Miyashita; 佐野友優; Yuyu Sano
Original assignee: Oton Glass Inc
Current assignee: Oton Glass Inc
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2020-03-05
Anticipated expiration: 2038-08-31
Also published as: JP7207694B2

Abstract

PROBLEM TO BE SOLVED: To provide a voice recognition output device that has high operability and portability and is inexpensive.

SOLUTION: A voice output device comprises: a carrying member that has a base part and a first arm part and a second arm part extending in the same direction from the base part; voice data generation means that generates voice data on the basis of a picked-up image; output means that outputs a voice on the basis of the voice data; and operation receiving means that is mounted on at least one of the first arm part and the second arm part of the carrying member and receives an operation related to a mode of output of the voice performed by the output means.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an audio output device that outputs audio.

Conventionally, there are audio output devices that assist the daily life of visually impaired people by outputting, as audio, characters and the like contained in an image. As one such device, Patent Document 1 discloses an audio output device having a camera attached to glasses and a computer connected to the camera by wire (see Patent Document 1).

International Publication No. WO 2014/140800

In Patent Document 1, the computer converts images and characters captured by the camera into audio data and outputs audio from a speaker based on the converted audio data. When using the audio output device, the user wears the glasses to which the camera is attached and fixes the box-shaped computer to a part of the body such as the waist.

A device of this type, in which the voice recognition device is worn at the waist, has the problem of poor operability. In addition, when a camera is worn on the head and connected to the computer by wire, one example of a problem is that the cord is bothersome.

Further, when a box-shaped computer is attached to the waist, the weight of the computer is applied to one side of the body, so a heavy computer makes it difficult for the user to move. One example of a problem is therefore that the computer must be made lighter to improve portability. Another is that making the computer lighter is very costly, which makes the device expensive.

The present invention has been made in view of the above conventional problems, and its object is to provide an inexpensive voice recognition output device with high operability and portability.

To solve this problem, an audio output device according to the present invention comprises: a carrying member having a base and a first arm and a second arm extending in the same direction from the base; audio data generating means for generating audio data based on a captured image; output means for outputting audio based on the audio data; and operation receiving means mounted on at least one of the first arm and the second arm of the carrying member for receiving an operation relating to the manner in which the output means outputs the audio.

According to the audio output device of the present invention, a part of the user's body can, for example, be held between the first arm and the second arm of the carrying member. That is, the user can wear the carrying member. For example, the user can wear the audio output device by hanging the carrying member around the neck. The audio output device can therefore be carried about without carrying a box-shaped computer, which improves the portability of the audio output device.

In addition, because the audio output device can be worn with the carrying member hung around the user's neck, for example, the weight can be distributed between the first arm and the second arm while the carrying member keeps a certain size. That is, the carrying member does not need to be miniaturized, so an inexpensive voice recognition output device can be provided.

Further, because the operation receiving means is mounted on at least one of the first arm and the second arm, the user can operate the operation receiving means close at hand. This improves the operability of the audio output device.

The first arm and the second arm are preferably formed so as to approach each other with increasing distance from the base.

Because the first arm and the second arm approach each other with increasing distance from the base, the carrying member can be fitted to the user's body, which prevents the carrying member from coming off the user. In addition, the improved fit distributes the load that the carrying member places on the user and reduces user fatigue.

The carrying member is preferably U-shaped when viewed from a direction perpendicular to the plane defined by a straight line extending in the extension direction of the first arm and a straight line extending along the extension direction of the second arm.

With the carrying member formed in this way, it can be worn along the circumference of the user's neck. This improves the portability of the audio output device.

The operation receiving means is preferably mounted on both the first arm and the second arm, and the mounting surface of the first arm on which the operation receiving means is mounted preferably intersects, when extended, the mounting surface of the second arm on which the operation receiving means is mounted.

With the carrying member formed in this way, the user's hands can easily reach the operation receiving means. This improves the operability of the audio output device.

The output means is preferably mounted on at least one of the first arm and the second arm, and is preferably located closer to the base than the operation receiving means.

With the output means arranged in this way, it can be placed close to the user's ear. This makes the audio output from the audio output device easier to hear.

The device preferably further comprises: holding means wearable on the user's head; imaging means held by the holding means for capturing the image; and imaging instruction input means held by the holding means so as to be located on the right side or left side of the user's head and receiving input of an imaging instruction for the imaging means, wherein the operation receiving means is provided on whichever of the first arm and the second arm is closer to the imaging instruction input means.

With the operation receiving means provided in this way, the user can operate the imaging instruction input means with one hand while operating the operation receiving means with the other hand. This improves the operability of the audio output device.

The carrying member preferably has imaging instruction input means for instructing the imaging device to capture an image.

With the imaging instruction input means provided in this way, the user can operate the imaging instruction input means on the carrying member. This improves the operability of the audio output device.

The operation receiving means preferably has an exposed portion exposed at the mounting surface, and the surface of the exposed portion preferably has raised or recessed features that differ according to the operation the operation receiving means accepts.

Because the surface of the exposed portion has raised or recessed features that differ according to the accepted operation, the user can tell which function each operation receiving means accepts simply by touching those features. This improves the operability of the audio output device.

The device preferably includes a control unit that is provided in the carrying member and functions as the audio data generating means.

Further, a carrying member according to the present invention comprises: a housing having a base and a first arm and a second arm extending in the same direction from the base; acquisition means for acquiring an image captured of the surroundings of the housing; transmitting means for transmitting the image to the outside; receiving means for receiving audio data generated based on the image; and output means for outputting audio based on the audio data.
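The data flow of this carrying member (acquire an image, transmit it, receive audio data, output audio) can be sketched as follows. This is a minimal illustration only: the class name, the callable standing in for the transmit/receive pair, and the placeholder external generator are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CarryingMemberSketch:
    """Hypothetical model of the carrying member's data flow."""

    # Stands in for the transmitting and receiving means together:
    # an image goes out, audio data generated from it comes back.
    exchange: Callable[[bytes], bytes]
    # Stands in for the output means: records what would be played.
    played: List[bytes] = field(default_factory=list)

    def handle_image(self, image: bytes) -> None:
        audio_data = self.exchange(image)  # transmit image, receive audio data
        self.played.append(audio_data)     # output audio based on that data


def placeholder_generator(image: bytes) -> bytes:
    # Assumption: the external side turns an image into audio data.
    # A real system would recognize text and synthesize speech here.
    return b"audio-for:" + image


member = CarryingMemberSketch(exchange=placeholder_generator)
member.handle_image(b"image-1")
```

Keeping the exchange as an injected callable mirrors the fact that the text leaves open where the audio data is generated.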

An explanatory diagram showing how a user uses the audio output device according to the first embodiment.
A perspective view of the carrying member of FIG. 1.
A perspective view of the carrying member of FIG. 1.
A perspective view of the carrying member of FIG. 1.
A perspective view of the carrying member of FIG. 1.
An enlarged view of an operation button arranged on the carrying member of FIG. 2A.
An enlarged view of an operation button arranged on the carrying member of FIG. 2A.
An enlarged view of an operation button arranged on the carrying member of FIG. 2A.
An enlarged view of an operation button arranged on the carrying member of FIG. 2A.
An explanatory diagram showing how a user uses the audio output device according to the first embodiment.
An explanatory diagram showing how the audio output device according to the first embodiment is operated.
An explanatory diagram showing how the audio output device according to the first embodiment is operated.
A block diagram showing the functional blocks of the control unit of the imaging device of FIG. 1.
A block diagram showing the functional blocks of the control unit of the carrying member of FIG. 1.
A flowchart showing the audio output processing of the audio output device according to the first embodiment.
A flowchart showing the imaging processing performed by the imaging device of FIG. 1.
A flowchart showing the audio data generation processing performed by the carrying member of FIG. 1.
A block diagram showing the functional blocks of a server connected to the audio output device according to the second embodiment.
A flowchart showing the audio data generation processing performed by the server of FIG. 13.

Embodiments of the present invention are described in more detail below with reference to the drawings. The embodiments may be modified and combined as appropriate. In the following description and the accompanying drawings, substantially identical or equivalent parts are given the same reference numerals.

FIG. 1 shows how a user U uses the audio output device 10 according to the first embodiment. As shown in FIG. 1, the audio output device 10 outputs audio based on an image captured by a camera 20, which is an imaging device.

The camera 20 is mounted on glasses EG, which serve as holding means wearable on the user's head. Specifically, the glasses EG include a frame FR that supports a right lens RL covering the right eye of the user U and a left lens LL covering the left eye of the user U. The camera 20 is mounted on the frame FR on the left lens LL side of the glasses EG. In other words, the camera 20 is held by the glasses EG so as to be located on the left side of the head of the user U.

The camera 20 may be mounted via an attachment that is easily attached to and detached from the glasses EG, or may be fastened to the frame FR with fastening members such as bolts or screws. The camera 20 may also be held by the glasses EG so as to be located on the right side of the head of the user U.

The camera 20 images the surroundings of the user U; in other words, it images the surroundings of the glasses EG. The camera 20 includes a shutter button SB that receives input of an imaging instruction for the camera 20. In other words, the shutter button SB functions as imaging instruction input means that receives input of an imaging instruction for the camera 20. The shutter button SB protrudes vertically away from the body of the camera 20. In the present embodiment, it protrudes from the body of the camera 20 vertically downward, that is, toward the ground. This is because the present embodiment assumes that the user U operates the buttons B1 and B2, which serve as an operation unit, with the dominant arm and operates the shutter button SB with the other arm. For example, for a user U whose dominant arm is the right arm, the shutter button SB and the buttons B1 and B2 serving as the operation unit are arranged on the left side of the user U.

When the user operates the shutter button SB arranged on the side of the non-dominant arm, it is natural for the thumb to be below and for the index through little fingers to be above the thumb, as shown in FIG. 6. The user U therefore supports the upper side of the shutter button SB with any of the index through little fingers and presses the shutter button SB with the thumb (in this case, pressing upward). Because this hand position is natural, pressing the shutter button SB with the non-dominant arm is less likely to shake the glasses EG and the attached camera 20, so an image with little camera shake can be captured. This lowers the error rate of character recognition from the image. At the same time, because the user U can operate the buttons B1 and B2 with the dominant arm, finer operation is possible than with the non-dominant arm.

When the shutter button SB is pressed, the camera 20 converts the light entering through the lens LE into an electric signal with an image sensor (not shown). The converted electric signal is recorded as an image on a recording medium (not shown) such as a flash memory built into the camera 20.

The audio output device 10 includes a carrying member 30. The carrying member 30 is formed of, for example, a resin material. The carrying member 30 is provided with an interface (not shown) capable of communicating with the camera 20, and is connected to the camera 20 by a cable such as a USB (Universal Serial Bus) cable. In this sense, the camera 20 can also be said to image the surroundings of the carrying member 30.

The carrying member 30 is equipped with a speaker SP as output means that outputs audio based on audio data. The speaker SP outputs audio based on audio data converted from an image captured by the camera 20. The carrying member 30 has a control unit CU1 that controls the output of audio based on the audio data.

FIG. 2A is a perspective view of the carrying member 30 according to the first embodiment as seen from the front. FIG. 2B is a perspective view of the carrying member 30 according to the first embodiment as seen from above. As shown in FIGS. 2A and 2B, the carrying member 30 has a base 31 and a first arm 33 and a second arm 34 extending from the base 31 in the same direction. That is, the base 31, the first arm 33, and the second arm 34 constitute a housing.

The base 31 is a plate-shaped member that is curved in an arc when viewed from a direction perpendicular to a plane S defined by a straight line L1 extending in the extension direction of the first arm 33 and a straight line L2 extending along the extension direction of the second arm 34.

The first arm 33 and the second arm 34 are each formed separately from the base 31. The first arm 33 has a connecting portion 33a connected to one end of the base 31. The connecting portion 33a is fixed to the base 31 by a fastening member such as a bolt.

The first arm 33 has an extension 33b that is formed integrally with and continues from the connecting portion 33a, and that forms a substantially L shape with respect to the base 31. The extension 33b has a frustum shape.

Specifically, the extension 33b as a whole has the shape of a truncated square pyramid. The tip of the extension 33b in its extension direction is rounded and curved. The corners of the portions that touch the body of the user U are removed. The top surface of the extension 33b and the bottom surface facing it are substantially rectangular, and the top surface has a smaller area than the bottom surface. The side surfaces between the top surface and the bottom surface are trapezoidal.

The second arm 34 has a connecting portion 34a connected to the other end of the base 31. The connecting portion 34a is fixed to the base 31 by a fastening member such as a bolt.

The second arm 34 has an extension 34b that is formed integrally with and continues from the connecting portion 34a, and that forms a substantially L shape with respect to the base 31. The extension 34b has a frustum shape.

Specifically, the extension 34b as a whole has the shape of a truncated square pyramid. The tip of the extension 34b in its extension direction is rounded and curved. The corners of the portions that touch the body of the user U are removed. The top surface of the extension 34b and the bottom surface facing it are substantially rectangular, and the top surface has a smaller area than the bottom surface. The side surfaces between the top surface and the bottom surface are trapezoidal.

The top surface of the extension 33b of the first arm 33 is arranged to face the top surface of the extension 34b of the second arm 34. The extension 33b of the first arm 33 and the extension 34b of the second arm 34 are formed so as to approach each other with increasing distance from the base 31.

That is, the carrying member 30 is U-shaped when viewed from a direction perpendicular to the plane S defined by the straight line L1 extending in the extension direction of the first arm 33 and the straight line L2 extending along the extension direction of the second arm 34. Accordingly, when the user hangs the audio output device 10 around the neck, the carrying member 30 curves so as to follow the user's body from around the neck to around the shoulders.

With the carrying member 30 formed in this way, the area over which it contacts the body of the user U increases. That is, the carrying member 30 can be fitted to the body of the user U, which prevents it from coming off the user. In addition, the improved fit distributes the load that the carrying member 30 places on the user U and reduces the fatigue of the user U.

The first arm 33 and the second arm 34 carry four buttons B1, B2, B3, and B4 as operation receiving means that receive operations relating to the manner in which the speaker SP outputs audio. In other words, the audio output device 10 has operation receiving means that is mounted on at least one of the first arm 33 and the second arm 34 of the carrying member 30 and that receives operations relating to the manner in which the speaker SP outputs audio.

The buttons B1 to B4 are mounted, for example, on mounting surfaces S1 and S2 formed along the extension direction of the first arm 33 or the second arm 34. The length of each button in the direction parallel to the mounting surface and perpendicular to the extension direction, that is, its width, is desirably 15 mm or less. A width of 15 mm or less makes the button easy to press with the pad of a finger and improves the feel of operation. The width is also desirably 1.2 mm or more, which allows the user U to recognize the buttons B1 to B4 by touch.
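The preferred width range stated above (1.2 mm or more and 15 mm or less) can be expressed as a simple check. This is only an illustrative sketch; the constant and function names are hypothetical.

```python
# Preferred button width range taken from the description, in millimetres.
MIN_WIDTH_MM = 1.2   # at or above this, the button can be recognized by touch
MAX_WIDTH_MM = 15.0  # at or below this, the button is easy to press with a finger pad


def width_is_preferred(width_mm: float) -> bool:
    """Return True if a button width lies within the preferred range."""
    return MIN_WIDTH_MM <= width_mm <= MAX_WIDTH_MM
```

Both limits are treated as inclusive because the text says "or more" and "or less".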

The buttons B1 and B2 are mounted on the first arm 33 and are arranged in a row along the extension direction of the first arm 33. Specifically, the button B1 is located closer to the base 31 than the button B2. For example, the button B1 is preferably located where the thumb rests when the user U grips the first arm 33 with the right hand.

The button B1 is recessed in a rectangular shape from the mounting surface S1 of the extension 33b of the first arm 33. The button B1 is, for example, an operation button for rewinding the output audio. The button B2 protrudes in a rectangular shape from the mounting surface S1 of the extension 33b of the first arm 33. The button B2 is, for example, a button for adjusting the volume of the speaker SP, and has two regions Ra and Rb. The region Ra functions as an operation button that raises the volume of the speaker SP, and the region Rb functions as an operation button that lowers it.

On the surface of the button B2, the region Ra is located on the base 31 side and the region Rb is located on the tip side of the first arm 33.

When the audio cannot be heard because of ambient noise or the like, the user operates the volume-up button first more often than the volume-down button. Making the region Ra on the base 31 side of the button B2 function as the volume-up button can therefore improve operability.

The button B2 may also be configured so that pressing the region Ra raises the region Rb, and pressing the region Rb raises the region Ra.

The buttons B3 and B4 are mounted on the second arm 34 and are arranged in a row along the extension direction of the second arm 34. Specifically, the button B3 is located closer to the base 31 than the button B4. For example, the button B3 is preferably located where the thumb rests when the user U grips the second arm 34 with the left hand.

The button B3 protrudes in a rectangular shape from the mounting surface S2 of the extension 34b of the second arm 34. The button B3 is, for example, a button for adjusting the speed of the audio played from the speaker SP, and has two regions Rc and Rd. The region Rc functions as an operation button that speeds up the audio played from the speaker SP, and the region Rd functions as an operation button that slows it down.

ボタンＢ３の表面の一方の領域Ｒｃが基部３１側、ボタンＢ３の表面の他方の領域Ｒｄが腕部３４の先端側に配置されている。 One region Rc on the surface of button B3 is located on the base 31 side, and the other region Rd is located on the tip side of the second arm 34.

再生速度を調整するための操作ボタンは、まず、必要性の低い情報の再生速度を早くするために使用され、重要な情報を聞く際に、再生速度を遅くすることが多い。このようなことに鑑みると、再生速度を早くする操作ボタンの方が、再生速度を遅くする操作ボタンよりも先に操作されることが多いため、基部３１側に配置されたボタンＢ３の表面の一方の領域Ｒｃを再生速度を早くする操作ボタンとして機能させることで、操作性の向上が図られうる。 The playback speed buttons are typically used first to speed up playback of less important information, and playback is then slowed down when the user listens to important information. Because the speed-up button therefore tends to be operated before the slow-down button, operability can be improved by making region Rc, the region on the surface of button B3 located on the base 31 side, function as the speed-up button.

また、ボタンＢ３の表面の領域Ｒｃを押圧すると、ボタンＢ３の表面の領域Ｒｄがせりあがり、ボタンＢ３の表面の領域Ｒｄを押圧すると、ボタンＢ３の表面の領域Ｒｃがせりあがるように構成されてもよい。 Button B3 may also be configured so that pressing region Rc on its surface causes region Rd to rise, and pressing region Rd causes region Rc to rise.
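The two-region behavior of button B3 can be illustrated with a minimal sketch. The step size and the speed limits below are assumptions made for illustration only; the description does not specify them, and the class name `RockerSetting` is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RockerSetting:
    """A value adjusted by a two-region button (here: playback speed on B3)."""
    value: float = 1.0
    step: float = 0.25     # assumed step size; not given in the description
    minimum: float = 0.5   # assumed lower limit
    maximum: float = 2.0   # assumed upper limit

    def press(self, region: str) -> float:
        """Region 'Rc' (base side) speeds up; region 'Rd' (tip side) slows down."""
        if region == "Rc":
            self.value = min(self.maximum, self.value + self.step)
        elif region == "Rd":
            self.value = max(self.minimum, self.value - self.step)
        else:
            raise ValueError(f"unknown region: {region}")
        return self.value
```

The same structure would apply to the volume regions Ra and Rb of button B2, with different limits.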

ボタンＢ４は、第２の腕部３４の延長部３４ｂの搭載面Ｓ２から矩形状に窪んで形成されている。ボタンＢ４は、例えば、カメラ２０で撮像された画像から音声データを生成する態様を変更する、すなわち、音声出力装置１０の音声データを生成する制御モードを変更する操作ボタンである。制御モードの一例としては、例えば、ユーザＵに伝えるべき情報がユーザＵの周囲に多く存在する場合、ユーザＵがシャッターボタンＳＢを押す頻度は高くなる。このような場合、音声出力装置１０は、ユーザＵがシャッターボタンＳＢを押さずとも、カメラ２０が撮像した画像に含まれている情報に基づいて音声データを生成する制御（以下、街歩きモードという）を行ってもよい。ボタンＢ４は、街歩きモードと通常の制御モードを切り替える操作ボタンである。 Button B4 is formed as a rectangular recess in the mounting surface S2 of the extension 34b of the second arm 34. Button B4 is, for example, an operation button that changes how audio data is generated from an image captured by the camera 20, that is, the control mode in which the audio output device 10 generates audio data. As an example, when much information that should be conveyed to the user U is present around the user U, the user U presses the shutter button SB more often. In such a case, the audio output device 10 may perform control that generates audio data from the information contained in images captured by the camera 20 even if the user U does not press the shutter button SB (hereinafter referred to as the city walking mode). Button B4 is an operation button that switches between the city walking mode and the normal control mode.
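The mode switching performed by button B4 can be sketched as a two-state toggle. This is a minimal sketch under stated assumptions: the names `ControlMode` and `ModeController` are hypothetical, and the description does not say how often images are processed in the city walking mode.

```python
from enum import Enum


class ControlMode(Enum):
    NORMAL = "normal"              # audio is generated only after a shutter press
    CITY_WALKING = "city_walking"  # audio is generated without a shutter press


class ModeController:
    """Tracks the control mode and decides whether audio data should be generated."""

    def __init__(self) -> None:
        self.mode = ControlMode.NORMAL

    def press_b4(self) -> ControlMode:
        """Each press of button B4 switches between the two modes."""
        if self.mode is ControlMode.NORMAL:
            self.mode = ControlMode.CITY_WALKING
        else:
            self.mode = ControlMode.NORMAL
        return self.mode

    def should_generate_audio(self, shutter_pressed: bool) -> bool:
        """In the city walking mode a shutter press is not required."""
        return self.mode is ControlMode.CITY_WALKING or shutter_pressed
```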

このように、ボタンＢ１及びＢ２は、第１の腕部３３の搭載面Ｓ１から露出して形成されている。また、ボタンＢ３及びＢ４は、第２の腕部３４の搭載面Ｓ２から露出して形成されている。すなわち、ボタンＢ１〜ボタンＢ４は、第１の腕部３３及び第２の腕部３４の搭載面に露出する露出部として機能する。 As described above, the buttons B1 and B2 are formed so as to be exposed from the mounting surface S1 of the first arm 33. The buttons B3 and B4 are formed so as to be exposed from the mounting surface S2 of the second arm 34. That is, the buttons B1 to B4 function as exposed portions that are exposed on the mounting surfaces of the first arm 33 and the second arm 34.

本実施例における、ボタンＢ１〜ボタンＢ４の機能の振り分け方について説明する。本実施例においては、ユーザＵが右利きであり、利き腕とは逆の左手でシャッターボタンＳＢを操作する様態となっている。このとき、ボタンＢ１及びＢ２は、ユーザＵの利き腕とは逆の腕である左側の第１の腕部３３に配置されており、ボタンＢ３及びＢ４はユーザＵの利き腕である右側の第２の腕部３４に配置されている。そのため、ユーザＵは、利き腕である右腕でボタンＢ１及びＢ２を操作し、利き腕とは逆の左手でボタンＢ３及びＢ４を操作する。 The assignment of functions to buttons B1 to B4 in this embodiment will now be described. In this embodiment, the user U is right-handed and operates the shutter button SB with the left hand, which is the non-dominant hand. Buttons B1 and B2 are disposed on the first arm 33 on the left side, opposite the user U's dominant arm, and buttons B3 and B4 are disposed on the second arm 34 on the right side, the side of the user U's dominant arm. The user U therefore operates buttons B1 and B2 with the dominant right hand and buttons B3 and B4 with the non-dominant left hand.

このとき、ボタンＢ１及びＢ２には、ユーザＵがシャッターボタンＳＢを操作すると同時またはシャッターボタンＳＢを操作した直前や直後に使用する可能性の高い機能を配置するとよい。具体的にはボタンＢ１及びＢ２には、早戻しボタンや音量調整ボタンを配置するとよい。これらの早戻しボタンや音量調整ボタンは、ユーザＵが音声出力装置１０から出力された音声をよく聞き取れないときに即時に押下される必要がある。そのため、ユーザＵがシャッターボタンＳＢを操作する際にはシャッターボタンＳＢを操作する腕（左手）とは逆の腕（右手）でボタンＢ１またはボタンＢ２を触れながら押下の準備をできる方が、操作性が高まる。一方で、主に撮影の準備段階や撮影の合間に使用される機能は、ユーザＵがシャッターボタンＳＢを操作する腕と同じ腕（左手）で操作してもよい。よって再生速度の調整やメニューの設定は、ユーザＵがシャッターボタンＳＢを操作する腕と同じ腕側のボタンＢ３及びＢ４に配置されているとよい。 In this case, functions that the user U is likely to use at the same time as, or immediately before or after, operating the shutter button SB are preferably assigned to buttons B1 and B2. Specifically, a rewind button and a volume adjustment button are preferably assigned to buttons B1 and B2. These buttons must be pressed immediately when the user U cannot hear the audio output from the audio output device 10 well. Operability therefore improves if, while operating the shutter button SB, the user U can rest the hand (right hand) opposite the hand operating the shutter button SB (left hand) on button B1 or button B2 and be ready to press it. On the other hand, functions used mainly while preparing to capture an image or between captures may be operated with the same hand (left hand) that operates the shutter button SB. Playback speed adjustment and menu settings are therefore preferably assigned to buttons B3 and B4, which are on the same side as the hand that operates the shutter button SB.

また、ユーザＵがシャッターボタンＳＢを操作する腕と同じ側のボタンは、基部３１に対してより近位の操作部（ボタン）が操作する頻度が高い機能を配置するとよい。前提として、ユーザＵは視覚障碍者であることもあり、ユーザＵがボタンＢ１〜ボタンＢ４を操作する際には、ボタンＢ１〜ボタンＢ４が配置されている位置を見ることなくボタンＢ１〜ボタンＢ４を操作する。そのため、ユーザＵが第１の腕部３３または第２の腕部３４を自然に把持した場合に、親指が配される場所の近傍に最も使用頻度が高い機能のボタンが配置されるのがよい。このとき、たとえば、ユーザＵがボタンＢ１またはＢ２を右手で操作する場合、図７Ａのように、親指の先側にボタンＢ１（基部３１に対してボタンＢ２より近位のボタン）が配置され、親指の根元側にボタンＢ２（基部３１に対してボタンＢ１より遠位のボタン）が配置されることになる。人の手の構造上、ユーザＵが第１の腕部３３または第２の腕部３４を自然に把持した場合、親指は自然に伸びていることが多い。それ故、ユーザＵは、親指の先端側のボタンＢ１の操作から根元側のボタンＢ２を操作する際には、親指を曲げるだけで対応できる。一方で、ユーザＵが第１の腕部３３または第２の腕部３４を把持した状態からさらに親指の先端側のボタンＢ１を操作する場合は、ユーザＵは、親指を伸ばしても対応できないため、腕全体を基部３１の方向に動かしてボタンＢ１を操作する必要がある。つまり、ユーザＵは第１の腕部３３または第２の腕部３４を一回把持すると、把持した状態から第１の腕部３３または第２の腕部３４の伸張方向（基部３１に対してボタンＢ１より遠位）のボタンＢ２を操作するほうが、基部３１方向に配置されているボタンＢ１を操作するよりも、簡単に対応できる。よって使用頻度の高い機能が基部３１に対して近位のボタン（ボタンＢ１、Ｂ３）に配置され、それよりは使用頻度の低い機能が腕部の伸張方向のボタン（ボタンＢ２、Ｂ４）に配置されるとよい。 For the buttons on the same side as the arm with which the user U operates the shutter button SB, frequently used functions are preferably assigned to the operation part (button) more proximal to the base 31. As a premise, the user U may be visually impaired, and the user U operates buttons B1 to B4 without looking at where they are located. The button with the most frequently used function is therefore preferably located near where the thumb rests when the user U naturally grips the first arm 33 or the second arm 34. For example, when the user U operates button B1 or B2 with the right hand, as shown in FIG. 7A, button B1 (the button more proximal to the base 31 than button B2) is located on the tip side of the thumb, and button B2 (the button more distal from the base 31 than button B1) is located on the base side of the thumb. Because of the structure of the human hand, the thumb is usually extended when the user U naturally grips the first arm 33 or the second arm 34. The user U can therefore move from operating button B1 on the tip side of the thumb to operating button B2 on the base side simply by bending the thumb. In contrast, to operate button B1, located further toward the thumb tip, from a state of gripping the first arm 33 or the second arm 34, the user U cannot reach it merely by extending the thumb and must move the whole arm toward the base 31. In other words, once the user U has gripped the first arm 33 or the second arm 34, operating button B2, located in the extending direction of the arm (distal to button B1 with respect to the base 31), is easier than operating button B1, located toward the base 31. Frequently used functions are therefore preferably assigned to the buttons proximal to the base 31 (buttons B1 and B3), and less frequently used functions to the buttons in the extending direction of the arms (buttons B2 and B4).
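The button assignment described for the right-handed embodiment can be summarized as a small lookup table. This is only an illustrative data structure; the names `Button`, `LAYOUT`, and `buttons_on` are hypothetical and do not appear in the description.

```python
from typing import Dict, List, NamedTuple


class Button(NamedTuple):
    arm: str        # "first arm 33" or "second arm 34"
    proximal: bool  # True if this is the button closer to the base 31
    function: str


# Assignment described for the right-handed embodiment
# (shutter button SB operated with the left hand).
LAYOUT: Dict[str, Button] = {
    "B1": Button("first arm 33", True, "rewind"),
    "B2": Button("first arm 33", False, "volume"),
    "B3": Button("second arm 34", True, "playback speed"),
    "B4": Button("second arm 34", False, "control mode"),
}


def buttons_on(arm: str) -> List[str]:
    """Return the button names on one arm, proximal (base side) first."""
    names = [name for name, button in LAYOUT.items() if button.arm == arm]
    return sorted(names, key=lambda name: not LAYOUT[name].proximal)
```

Because the description notes that these functions are only examples, a table of this kind would make it straightforward to swap the assignment for a left-handed user.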

スピーカＳＰは、第１の腕部３３の搭載面Ｓ１及び第２の腕部３４の搭載面Ｓ２に搭載されている。第１の腕部３３に搭載されているスピーカＳＰは、基部３１に対してボタンＢ１，Ｂ２よりも近位に配されている。また、第２の腕部３４に搭載されているスピーカＳＰは、基部３１に対してボタンＢ３，Ｂ４よりも近位に配されている。 The speaker SP is mounted on the mounting surface S1 of the first arm 33 and the mounting surface S2 of the second arm 34. The speaker SP mounted on the first arm 33 is located closer to the base 31 than the buttons B1 and B2. Further, the speaker SP mounted on the second arm portion 34 is disposed closer to the base 31 than the buttons B3 and B4.

図３は、実施例１に係る携行部材３０を正面から見た斜視図を示している。図３に示すように、第１の腕部３３の延長部３３ｂのボタンＢ１，Ｂ２が搭載されている搭載面Ｓ１を含む面Ｓ３は、第２の腕部３４の延長部３４ｂのボタンＢ３，Ｂ４が搭載されている搭載面Ｓ２を含む面Ｓ４と交差する。言い換えれば、搭載面Ｓ１は、搭載面Ｓ２から所定の角度を成すように形成されている。すなわち、搭載面Ｓ１は、搭載面Ｓ２とは平行ではない。 FIG. 3 is a perspective view of the carrying member 30 according to the first embodiment as viewed from the front. As shown in FIG. 3, the surface S3, which contains the mounting surface S1 of the extension 33b of the first arm 33 on which buttons B1 and B2 are mounted, intersects the surface S4, which contains the mounting surface S2 of the extension 34b of the second arm 34 on which buttons B3 and B4 are mounted. In other words, the mounting surface S1 is formed at a predetermined angle to the mounting surface S2. That is, the mounting surface S1 is not parallel to the mounting surface S2.

図４は、実施例１に係る携行部材３０を正面から見た斜視図を示している。図４には、第１の腕部３３の延長部３３ｂのボタンＢ１，Ｂ２が形成されている搭載面Ｓ１は、平面Ｓから角度αを成す。第２の腕部３４の延長部３４ｂのボタンＢ３，Ｂ４が形成されている搭載面Ｓ２は、平面Ｓから角度-αを成す。すなわち、搭載面Ｓ１及び搭載面Ｓ２は、平面Ｓに対して対称な形状を有している。 FIG. 4 is a perspective view of the carrying member 30 according to the first embodiment as viewed from the front. In FIG. 4, the mounting surface S1 of the extension 33b of the first arm 33 on which the buttons B1 and B2 are formed forms an angle α from the plane S. The mounting surface S2 of the extension 34b of the second arm 34 on which the buttons B3, B4 are formed forms an angle -α from the plane S. That is, the mounting surface S1 and the mounting surface S2 have shapes that are symmetric with respect to the plane S.

言い換えれば、第１の腕部３３の延長部３３ｂのボタンＢ１，Ｂ２が形成されている搭載面Ｓ１及び第２の腕部３４の延長部３４ｂのボタンＢ３，Ｂ４が形成されている搭載面Ｓ２は、第１の腕部３３の伸張方向に伸張する直線Ｌ１及び第２の腕部３４の伸張方向に沿って伸張する直線Ｌ２によって規定される平面Ｓと角度を持って形成されている。 In other words, the mounting surface S1 of the extension 33b of the first arm 33 on which the buttons B1 and B2 are formed and the mounting surface S2 of the extension 34b of the second arm 34 on which the buttons B3 and B4 are formed. Is formed at an angle to a plane S defined by a straight line L1 extending in the direction in which the first arm 33 extends and a straight line L2 extending in the direction in which the second arm 34 extends.

図５Ａは、ボタンＢ１の平面を示している。図５Ａに示すように、ボタンＢ１は、出力された音声を早戻しする操作ボタンである。このボタンＢ１の表面には、音声を早戻しする操作ボタンとしての記号が凸状に形成されている。早戻しする操作ボタンとしての記号は、例えば、円弧の終点が矢印で表されている記号が挙げられる。 FIG. 5A shows the plane of the button B1. As shown in FIG. 5A, the button B1 is an operation button for quickly returning the output sound. On the surface of the button B1, a symbol as an operation button for quickly returning the sound is formed in a convex shape. The symbol as the operation button for rewinding quickly includes, for example, a symbol in which the end point of an arc is represented by an arrow.

図５ＢはボタンＢ２の平面を示している。図５Ｂに示すように、ボタンＢ２は、スピーカの音量を調整するボタンである。ボタンＢ２は、ボタンＢ１よりも載置面Ｓ１における面積が広く形成されているとよい。このようにボタンＢ２を形成することで、ユーザＵは、目視によらなくてもボタンＢ１又はボタンＢ２を触った感触でいずれのボタンであるかを識別することができる。 FIG. 5B shows a plan view of button B2. As shown in FIG. 5B, button B2 is a button for adjusting the speaker volume. Button B2 is preferably formed with a larger area on the mounting surface S1 than button B1. With button B2 formed in this way, the user U can tell button B1 and button B2 apart by touch, without looking at them.

ボタンＢ２の表面には、２つの領域が設けられている。このボタンＢ２の表面の一方の領域Ｒａには、音量を大きくする操作ボタンとしての記号が凸状に形成されている。音量を大きくする操作ボタンとしての記号は、例えば、互いに半径が異なる３つの同心円の円弧で表されるものが挙げられる。各々の円弧は、円弧の半径の長さに応じて配されている。例えば、３つの円弧のうち半径の最も短い円弧と半径の最も長い円弧の間に半径の長さが中間の円弧が配される。 Two areas are provided on the surface of the button B2. In one area Ra on the surface of the button B2, a symbol as an operation button for increasing the volume is formed in a convex shape. The symbol as the operation button for increasing the volume may be, for example, one represented by three concentric arcs having different radii. Each arc is arranged according to the length of the radius of the arc. For example, an arc having an intermediate radius is disposed between an arc having the shortest radius and an arc having the longest radius among the three arcs.

また、ボタンＢ２の表面の他方の領域Ｒｂには、音量を小さくする操作ボタンとしての記号が凸状に形成されている。音量を小さくする操作ボタンとしての記号は、例えば、１つの円弧で表されるものが挙げられる。 In the other region Rb on the surface of the button B2, a symbol as an operation button for reducing the volume is formed in a convex shape. The symbol as the operation button for decreasing the volume may be, for example, one represented by a single arc.

図５ＣはボタンＢ３の平面を示している。図５Ｃに示すように、ボタンＢ３は、スピーカの音声を再生する再生スピードを調整するボタンである。ボタンＢ３は、ボタンＢ４よりも載置面Ｓ２における面積が広く形成されているとよい。このようにボタンＢ３を形成することで、ユーザＵは、目視によらなくてもボタンＢ４又はボタンＢ３を触った感触でいずれのボタンであるかを識別することができる。 FIG. 5C shows a plan view of button B3. As shown in FIG. 5C, button B3 is a button for adjusting the playback speed of the audio from the speaker. Button B3 is preferably formed with a larger area on the mounting surface S2 than button B4. With button B3 formed in this way, the user U can tell button B3 and button B4 apart by touch, without looking at them.

ボタンＢ３の表面には、２つの領域が設けられている。このボタンＢ３の表面の一方の領域Ｒｃには、音声の再生スピードを早くする記号が凸状に形成されている。再生スピードを速くする操作ボタンとしての記号は、例えば、互いに半径が同じである３つの円で表されるものが挙げられる。各々の円は、ボタンＢ３の表面において列状に配されている。 Two areas are provided on the surface of the button B3. In one region Rc on the surface of the button B3, a symbol for increasing the sound reproduction speed is formed in a convex shape. Symbols used as operation buttons for increasing the reproduction speed include, for example, those represented by three circles having the same radius. Each circle is arranged in a row on the surface of the button B3.

また、ボタンＢ３の表面の他方の領域Ｒｄには、再生スピードを遅くする操作ボタンとしての記号が凸状に形成されている。再生スピードを遅くする操作ボタンとしての記号は、例えば、１つの円で表されるものが挙げられる。 In the other region Rd on the surface of button B3, a raised symbol marks the button for slowing down playback. This symbol is, for example, a single circle.

図５Ｄは、ボタンＢ４の平面を示している。図５Ｄに示すように、ボタンＢ４は、音声出力装置１０の音声を生成する制御モードを操作する操作ボタンである。このボタンＢ４の表面には、当該制御モードを操作する操作ボタンとしての記号が凸状に形成されている。当該制御モードを操作する操作ボタンとしての記号は、例えば、人間が歩行する際の下半身をモチーフとした記号が挙げられる。 FIG. 5D shows the plane of the button B4. As illustrated in FIG. 5D, the button B4 is an operation button for operating a control mode for generating a sound of the sound output device 10. On the surface of the button B4, a symbol as an operation button for operating the control mode is formed in a convex shape. The symbol as the operation button for operating the control mode includes, for example, a symbol with a motif of a lower body when a human walks.

このように、ボタンＢ１〜Ｂ４は、受け付ける操作に応じて互いに異なる凹凸が表面に形成されている。尚、ボタンＢ１〜Ｂ４が受け付ける機能は一例であり、適宜変更して実施してもよい。例えば、ボタンＢ１〜Ｂ４のいずれかにカメラ２０のシャッターボタンの機能を有するようにしてもよい。 As described above, the buttons B1 to B4 have different irregularities on the surface in accordance with the operation to be received. Note that the functions accepted by the buttons B1 to B4 are merely examples, and may be implemented by appropriately changing them. For example, any of the buttons B1 to B4 may have a function of a shutter button of the camera 20.

図６は、ユーザが音声出力装置を操作する際の態様を示している。図６に示すように、携行部材３０の第１の腕部３３は、メガネＥＧのフレームＦＲの左レンズＬＬ側に配されている。携行部材３０の第２の腕部３４は、メガネＥＧのフレームＦＲの右レンズＲＬ側に配されている。 FIG. 6 shows an aspect when the user operates the audio output device. As shown in FIG. 6, the first arm 33 of the carrying member 30 is disposed on the left lens LL side of the frame FR of the glasses EG. The second arm 34 of the carrying member 30 is arranged on the right lens RL side of the frame FR of the glasses EG.

ボタンＢ１，Ｂ２は、シャッターボタンＳＢからみて近位にある第１の腕部３３に設けられている。すなわち、ボタンＢ１，Ｂ２は、第１の腕部３３及び第２の腕部３４のうちシャッターボタンＳＢからみて近位にあるいずれか一方に設けられている。 The buttons B1 and B2 are provided on the first arm portion 33 that is proximal to the shutter button SB. That is, the buttons B1 and B2 are provided on one of the first arm 33 and the second arm 34 that is proximal to the shutter button SB.

図７Ａは、右利きの人が、左手でシャッターボタンＳＢを操作する際の右手によるボタンＢ１，Ｂ２の操作態様を示している。図７Ａに示すように、ユーザＵは、左手でシャッターボタンＳＢを操作する際に、右手でボタンＢ１又はボタンＢ２を操作することができる。これは、通常利き腕の方が、利き腕でない方の腕よりも繊細な操作ができるためである。シャッターボタンＳＢの押下とボタンＢ１の操作（早戻し）又はボタンＢ２の操作（音量の調整）を比較した場合に、ボタンＢ１、Ｂ２の操作の方が複雑な操作を要求される。したがって、利き腕に応じて右利きの人が操作しやすい第１の腕部３３にボタンＢ１、Ｂ２を配置することによって、シャッターボタンＳＢを操作すると共に、ボタンＢ１，Ｂ２の操作を行うことが可能となる。すなわち、ユーザＵが望む優先度が高い機能をボタンＢ１及びＢ２を配することによって、音声出力装置１０の操作性の向上を図ることが可能となる。 FIG. 7A shows how a right-handed person operates buttons B1 and B2 with the right hand while operating the shutter button SB with the left hand. As shown in FIG. 7A, the user U can operate button B1 or button B2 with the right hand while operating the shutter button SB with the left hand. This arrangement is used because the dominant arm can usually perform finer operations than the non-dominant arm. Compared with pressing the shutter button SB, operating button B1 (rewind) or button B2 (volume adjustment) requires more complex operation. Therefore, by placing buttons B1 and B2 on the first arm 33, where a right-handed person can easily operate them with the dominant hand, the user can operate buttons B1 and B2 while also operating the shutter button SB. That is, assigning the functions to which the user U gives high priority to buttons B1 and B2 improves the operability of the audio output device 10.

尚、ユーザＵの利き腕に応じてシャッターボタンＳＢとボタンＢ１乃至Ｂ４を配置してもよい。例えば、左手が利き腕の場合、カメラ２０のシャッターボタンＳＢをユーザの頭部の右側に搭載し、右手でシャッターボタンＳＢを操作する際に、左手でボタンＢ３又はボタンＢ４を操作するようにしてもよい。また、カメラ２０と通信可能な携行部材３０のインタフェース（図示せず）は、カメラ２０がメガネＥＧに搭載される位置に応じて設けるとよい。例えば、カメラ２０のシャッターボタンＳＢがユーザＵの頭部の左側に搭載される場合には、第１の腕部３３にインターフェースを設けるとよい。一方で、カメラ２０のシャッターボタンＳＢがユーザＵの頭部の右側に搭載される場合には、第２の腕部３４にインターフェースを設けるとよい。 The shutter button SB and buttons B1 to B4 may be arranged according to the dominant arm of the user U. For example, for a left-handed user, the shutter button SB of the camera 20 may be mounted on the right side of the user's head so that the user operates button B3 or button B4 with the left hand while operating the shutter button SB with the right hand. The interface (not shown) of the carrying member 30 that communicates with the camera 20 is preferably provided according to where the camera 20 is mounted on the glasses EG. For example, when the shutter button SB of the camera 20 is mounted on the left side of the head of the user U, the interface is preferably provided on the first arm 33. When the shutter button SB of the camera 20 is mounted on the right side of the head of the user U, the interface is preferably provided on the second arm 34.

図７Ｂは、左手によるボタンＢ１，Ｂ２の操作態様を示している。図７Ｂに示すように、ユーザＵは、左手でボタンＢ１又はボタンＢ２を操作することができる。 FIG. 7B shows how buttons B1 and B2 are operated with the left hand. As shown in FIG. 7B, the user U can operate button B1 or button B2 with the left hand.

すなわち、図４において説明したように、第１の腕部３３の延長部３３ｂのボタンＢ１，Ｂ２が形成されている搭載面Ｓ１は、平面Ｓから角度αを成す。例えば、音声出力装置１０の制御モードを街歩きモードに設定した場合、ユーザＵは、左手でボタンＢ１又はボタンＢ２を操作することも考えられる。その際、搭載面Ｓ１が平面Ｓに対して角度αを有することにより、ボタンＢ１及びボタンＢ２の操作性を高めることができる。さらに、搭載面Ｓ１の角が落とされて形成されているため、ボタンＢ１及びボタンＢ２の操作時にユーザＵに与えるストレスを軽減することができる。 That is, as described in FIG. 4, the mounting surface S1 of the extension 33b of the first arm 33 on which the buttons B1 and B2 are formed forms an angle α with the plane S. For example, when the control mode of the audio output device 10 is set to the city walking mode, the user U may operate the button B1 or the button B2 with the left hand. At this time, the operability of the button B1 and the button B2 can be enhanced by the mounting surface S1 having the angle α with respect to the plane S. Further, since the mounting surface S1 is formed with the corners dropped, the stress applied to the user U when operating the buttons B1 and B2 can be reduced.

図８は、カメラ２０のコントロールユニットＣＵ２の機能ブロックを示している。図８に示すように、入力部２１は、シャッターボタンＳＢ及び撮像ユニットＩＵに接続されているインターフェース部である。カメラ２０は、入力部２１を介してシャッターボタンＳＢからの撮像指示を取得可能である。カメラ２０は、入力部２１を介して撮像ユニットＩＵが生成した画像データを取得可能である。
FIG. 8 shows functional blocks of the control unit CU2 of the camera 20. As shown in FIG. 8, the input unit 21 is an interface unit connected to the shutter button SB and the imaging unit IU. The camera 20 can acquire an imaging instruction from the shutter button SB via the input unit 21. The camera 20 can acquire image data generated by the imaging unit IU via the input unit 21.

撮像ユニットＩＵは、レンズＬＥから入光した光を電気信号に変換することによって画像データを生成する撮像素子を含む。撮像素子は、たとえば、ＣＭＯＳイメージセンサである。撮像ユニットＩＵは、例えば、シャッター等の撮像機構を含む。 The imaging unit IU includes an imaging device that generates image data by converting light that has entered from the lens LE into an electric signal. The image sensor is, for example, a CMOS image sensor. The imaging unit IU includes, for example, an imaging mechanism such as a shutter.

記憶装置２２は、例えばフラッシュメモリなどにより構成されている。記憶装置２２は、ＢＩＯＳ（Basic Input Output System）、ソフトウェア等の各種プログラムを記憶する。また、記憶装置２２は、カメラ２０が撮像した画像データＩＭを格納可能である。 The storage device 22 is configured by, for example, a flash memory or the like. The storage device 22 stores various programs such as a basic input output system (BIOS) and software. Further, the storage device 22 can store image data IM captured by the camera 20.

通信部２３は、携行部材３０と通信を行うインターフェース部である。カメラ２０は、通信部２３を介して記憶装置２２に格納されている画像データを携行部材３０に送信可能である。 The communication unit 23 is an interface unit that communicates with the carrying member 30. The camera 20 can transmit image data stored in the storage device 22 to the carrying member 30 via the communication unit 23.

出力部２４は、撮像ユニットＩＵに接続されているインターフェース部である。カメラ２０は、シャッターボタンＳＢから入力された撮像指示を撮像ユニットＩＵに出力可能である。 The output unit 24 is an interface unit connected to the imaging unit IU. The camera 20 can output an imaging instruction input from the shutter button SB to the imaging unit IU.

制御部２５は、演算処理装置としてのＣＰＵ（Central Processing Unit）と、主記憶装置としてのＲＯＭ（Read Only Memory）と、ＲＡＭ（Random Access Memory）と、を有するコンピュータによって実現される。ＣＰＵは、ＲＯＭや記憶装置２２から処理内容に応じたプログラムを読み出してＲＡＭに展開し、展開したプログラムと協働して、各種機能を実現する。 The control unit 25 is realized by a computer having a CPU (Central Processing Unit) as an arithmetic processing device, a ROM (Read Only Memory) as a main storage device, and a RAM (Random Access Memory). The CPU reads out a program corresponding to the processing content from the ROM or the storage device 22, expands the program into the RAM, and realizes various functions in cooperation with the expanded program.

動作制御部２５ａは、制御部２５の機能ブロックの１つである。動作制御部２５ａは、カメラ２０の撮像動作の制御を行うことが可能である。 The operation control unit 25a is one of the functional blocks of the control unit 25. The operation control unit 25a can control the imaging operation of the camera 20.

入力部２１、記憶装置２２、通信部２３、出力部２４及び制御部２５の各々は、システムバスＢ１を介して互いに接続されている。 The input unit 21, the storage device 22, the communication unit 23, the output unit 24, and the control unit 25 are connected to each other via a system bus B1.

図９は、携行部材３０のコントロールユニットＣＵ１の機能ブロックを示している。図９に示すように、入力部３５は、ボタンＢ１〜ボタンＢ４に接続されているインターフェース部である。 FIG. 9 shows functional blocks of the control unit CU1 of the carrying member 30. As shown in FIG. 9, the input unit 35 is an interface unit connected to the buttons B1 to B4.

記憶装置３６は、例えばフラッシュメモリなどにより構成されている。記憶装置３６は、ＢＩＯＳ（Basic Input Output System）、ソフトウェア等の各種プログラムを記憶する。また、記憶装置３６は、カメラ２０から送信された画像データを格納可能である。 The storage device 36 is configured by, for example, a flash memory or the like. The storage device 36 stores various programs such as a basic input output system (BIOS) and software. Further, the storage device 36 can store image data transmitted from the camera 20.

記憶装置３６は、画像音声変換データベース（以下、データベースをＤＢと表記する）を含む。画像音声変換ＤＢは、画像に含まれる情報と当該情報に紐づいた音声データが格納されている。たとえば、画像音声変換ＤＢは、文字と音声データが紐づいたデータ構造を有している。画像音声変換ＤＢは、単語と音声データが紐づいたデータ構造を有している。画像音声変換ＤＢは、物体と音声データが紐づいたデータ構造を有している。尚、画像音声変換ＤＢは、カメラ２０で撮像した画像に基づいてディープラーニングによって構築されるようにしてもよい。 The storage device 36 includes an image-audio conversion database (hereinafter, database is abbreviated as DB). The image-audio conversion DB stores information contained in images together with the audio data associated with that information. For example, the image-audio conversion DB has a data structure in which characters are associated with audio data, a data structure in which words are associated with audio data, and a data structure in which objects are associated with audio data. The image-audio conversion DB may be constructed by deep learning based on images captured by the camera 20.

通信部３７は、カメラ２０と通信を行うインターフェース部である。携行部材３０は、通信部３７を介してカメラ２０と通信可能である。 The communication unit 37 is an interface unit that communicates with the camera 20. The carrying member 30 can communicate with the camera 20 via the communication unit 37.

出力部３８は、スピーカＳＰに接続されているインターフェース部である。携行部材３０は、出力部３８を介してスピーカＳＰから音声を出力可能である。 The output unit 38 is an interface unit connected to the speaker SP. The carrying member 30 can output sound from the speaker SP via the output unit 38.

制御部３９は、演算処理装置としてのＣＰＵ（Central Processing Unit）と、主記憶装置としてのＲＯＭ（Read Only Memory）と、ＲＡＭ（Random Access Memory）と、を有するコンピュータによって実現される。ＣＰＵは、ＲＯＭや記憶装置３６から処理内容に応じたプログラムを読み出してＲＡＭに展開し、展開したプログラムと協働して、各種機能を実現する。 The control unit 39 is realized by a computer having a CPU (Central Processing Unit) as an arithmetic processing device, a ROM (Read Only Memory) as a main storage device, and a RAM (Random Access Memory). The CPU reads out a program corresponding to the processing content from the ROM or the storage device 36 and develops the program on the RAM, and realizes various functions in cooperation with the developed program.

入力部３５、記憶装置３６、通信部３７、出力部３８及び制御部３９の各々は、システムバスＢ２を介して互いに接続されている。 The input unit 35, the storage device 36, the communication unit 37, the output unit 38, and the control unit 39 are connected to each other via a system bus B2.

音声データ生成部３９ａは、制御部３９の機能ブロックの１つである。音声データ生成部３９ａは、カメラ２０から送信された画像データを受信すると受信した画像データＩＭに含まれている文字に基づいて音声データを生成することが可能である。したがって、制御部３９は、ユーザＵの周囲を撮像した画像に含まれている文字に基づいて音声データを生成する音声データ生成手段として機能する。 The audio data generation unit 39a is one of the functional blocks of the control unit 39. When receiving the image data transmitted from the camera 20, the audio data generation unit 39a can generate audio data based on characters included in the received image data IM. Therefore, the control unit 39 functions as an audio data generation unit that generates audio data based on characters included in an image of the area around the user U.

音声データ生成部３９ａは、例えば、ＯＣＲ（Optical Character Recognition）によって画像中の文字を認識する。音声データ生成部３９ａは、認識した文字を画像音声変換ＤＢを参照して音声データを生成する。 The audio data generation unit 39a recognizes characters in the image by, for example, OCR (Optical Character Recognition). The audio data generation unit 39a then generates audio data for the recognized characters by referring to the image-audio conversion DB.
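The recognize-then-look-up flow described above can be sketched as follows. This is a minimal sketch, not the device's implementation: the dictionary `IMAGE_AUDIO_DB` is a hypothetical stand-in for the image-audio conversion DB, the byte strings are placeholders for real audio data, and `fake_ocr` is a stub used in place of an actual OCR engine.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the image-audio conversion DB: word -> audio data.
IMAGE_AUDIO_DB: Dict[str, bytes] = {
    "exit": b"<audio:exit>",
    "north": b"<audio:north>",
}


def generate_audio_data(
    image: bytes,
    recognize: Callable[[bytes], List[str]],
    db: Dict[str, bytes],
) -> List[bytes]:
    """Recognize words in the image, then look up audio data for each word.

    Words with no DB entry are skipped in this sketch; the description does
    not say how unknown words are handled.
    """
    words = recognize(image)
    return [db[word] for word in words if word in db]


def fake_ocr(image: bytes) -> List[str]:
    """Stub recognizer: treats the image bytes as text (assumption for the demo)."""
    return image.decode("utf-8").lower().split()
```

For example, `generate_audio_data(b"North Exit 7", fake_ocr, IMAGE_AUDIO_DB)` returns the audio entries for "north" and "exit" in reading order and skips "7", which has no entry.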

以上で説明した音声出力装置１０の音声の出力処理について説明する。紙に印刷された文字からなる文章を音声出力装置１０が音声データに変換して出力する場合を説明する。
The sound output processing of the sound output device 10 described above will be described. The case where the speech output device 10 converts a sentence composed of characters printed on paper into speech data and outputs the speech data will be described.

図１０は、音声出力装置１０の音声出力処理を示している。図１０に示すように、音声出力装置１０は、カメラ２０で音声に変換する対象となる紙を撮像する（ステップＳ１１）。音声出力装置１０は、ステップＳ１１において撮像された画像データＩＭに基づいて音声データを生成する（ステップＳ１２）。音声出力装置１０は、ステップＳ１２において生成された音声データに基づいて音声をスピーカＳＰから出力する（ステップＳ１３）。 FIG. 10 shows a sound output process of the sound output device 10. As shown in FIG. 10, the audio output device 10 captures an image of a paper to be converted into audio by the camera 20 (step S11). The audio output device 10 generates audio data based on the image data IM captured in step S11 (step S12). The sound output device 10 outputs a sound from the speaker SP based on the sound data generated in step S12 (step S13).
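The three steps of FIG. 10 form a simple pipeline. The sketch below only illustrates the order of steps S11 to S13; the function name `audio_output_process` and the stub callables are assumptions for illustration and stand in for the camera 20, the audio data generation unit 39a, and the speaker SP.

```python
from typing import Callable, List


def audio_output_process(
    capture: Callable[[], bytes],
    generate: Callable[[bytes], bytes],
    play: Callable[[bytes], None],
) -> None:
    """Run one pass of the audio output process."""
    image_data = capture()             # step S11: capture the target
    audio_data = generate(image_data)  # step S12: generate audio data
    play(audio_data)                   # step S13: output audio


# Demo with stubs: the "speaker" is a list that records what was played.
played: List[bytes] = []
audio_output_process(
    capture=lambda: b"printed text",
    generate=lambda image: b"audio for " + image,
    play=played.append,
)
```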

図１１は、図１０のステップＳ１１の撮像処理のサブルーチンを示している。図１１に示すように、カメラ２０の制御部２５は、シャッターボタンが押されたか否かを判断する（ステップＳ２１）。カメラ２０のシャッターボタンＳＢがユーザＵによって押されると（ステップＳ２１：Ｙ）、カメラ２０の撮像ユニットＩＵによって撮像対象を撮像する（ステップＳ２２）。ステップＳ２２において撮像された画像データＩＭは記憶装置２２に記録されてもよい。カメラ２０は、ステップＳ２２において撮像された画像データＩＭを携行部材３０のコントロールユニットＣＵ１に送信する（ステップＳ２３）。 FIG. 11 shows the subroutine of the imaging process of step S11 in FIG. 10. As shown in FIG. 11, the control unit 25 of the camera 20 determines whether the shutter button SB has been pressed (step S21). When the user U presses the shutter button SB of the camera 20 (step S21: Y), the imaging unit IU of the camera 20 captures an image of the target (step S22). The image data IM captured in step S22 may be recorded in the storage device 22. The camera 20 transmits the image data IM captured in step S22 to the control unit CU1 of the carrying member 30 (step S23).
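The imaging subroutine of FIG. 11 can be sketched as a single function. This is an illustrative sketch only: `imaging_subroutine` and its parameters are hypothetical names, the optional `store` list stands in for the storage device 22, and `send` stands in for transmission to the control unit of the carrying member 30.

```python
from typing import Callable, List, Optional


def imaging_subroutine(
    shutter_pressed: Callable[[], bool],
    capture: Callable[[], bytes],
    send: Callable[[bytes], None],
    store: Optional[List[bytes]] = None,
) -> bool:
    """Return True if an image was captured and sent, False otherwise."""
    if not shutter_pressed():   # step S21: has the shutter button been pressed?
        return False
    image_data = capture()      # step S22: capture the target
    if store is not None:       # recording the image is optional
        store.append(image_data)
    send(image_data)            # step S23: send to the carrying member
    return True
```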

図１２は、図１０のステップＳ１２の音声データ生成処理のサブルーチンを示している。図１２に示すように、携行部材３０の制御部３９は、画像データＩＭを受信したかを判断する（ステップＳ３１）。携行部材３０の制御部３９は、画像データＩＭを受信したと判断すると（ステップＳ３１：Ｙ）、受信した画像データＩＭに含まれている文章の文字に基づいて音声データを生成する（ステップＳ３２）。従って、携行部材３０の制御部３９は、音声データ生成部３９ａとして機能する。尚、画像データＩＭに含まれる文章が長文に亘る場合、音声データ生成部３９ａは、文章の内容を要約して音声データを生成してもよい。 FIG. 12 shows the subroutine of the audio data generation process of step S12 in FIG. 10. As shown in FIG. 12, the control unit 39 of the carrying member 30 determines whether the image data IM has been received (step S31). On determining that the image data IM has been received (step S31: Y), the control unit 39 generates audio data based on the characters of the text contained in the received image data IM (step S32). The control unit 39 of the carrying member 30 thus functions as the audio data generation unit 39a. When the text contained in the image data IM is long, the audio data generation unit 39a may summarize the text and generate audio data from the summary.

尚、画像データＩＭに含まれる音声に変換する対象は、文章だけでなく例えば、時刻表のように文字と表が組み合わさったものであってもよい。このような音声に変換する対象の場合、たとえば、バスの行先及びバス停を出発する時刻を含む音声データを生成するとよい。例えば、音声データは、「Ａ（行先）行きのバスは、８時にＢバス停を出発する時刻は、Ｃ分、Ｄ分、Ｅ分です。」と音声が出力されるように生成されてもよい。 The target to be converted into audio in the image data IM is not limited to plain text and may be, for example, a combination of characters and a table, such as a timetable. For such a target, audio data that includes, for example, the bus destination and the departure times from the bus stop may be generated. For example, the audio data may be generated so that the following is spoken: "For buses bound for A (destination), the departure times from bus stop B in the 8 o'clock hour are C, D, and E minutes past the hour."
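Turning one timetable row into a spoken sentence can be sketched as below. This is a hedged illustration: the function `timetable_sentence`, its parameters, and the exact wording are assumptions, and the sketch presumes the table has already been recognized and parsed into a destination, a stop name, an hour, and a list of minutes.

```python
from typing import List


def timetable_sentence(destination: str, stop: str, hour: int, minutes: List[int]) -> str:
    """Build the sentence to be spoken for one hour row of a bus timetable."""
    if not minutes:
        return (
            f"There are no buses bound for {destination} "
            f"from {stop} in the {hour} o'clock hour."
        )
    # Read the departures in chronological order, e.g. "8:05, 8:30".
    times = ", ".join(f"{hour}:{minute:02d}" for minute in sorted(minutes))
    return f"Buses bound for {destination} depart from {stop} at {times}."
```

The resulting string would then be passed to whatever speech synthesis step produces the audio data.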

また、音声に変換する対象は、固有名詞に紐づいた情報を含むようにしてもよい。例えば、商店街の名称を含む画像データの場合、たとえば、音声データは、「この商店街には、Ａ，Ｂ，Ｃなどのお店があり、ＡでのランチはＤ，Ｅなどのメニューがあります。Ａ店ランチの平均額はＦ円です。」と音声が出力されるように生成されてもよい。 The target to be converted into audio may also include information associated with a proper noun. For example, for image data containing the name of a shopping street, the audio data may be generated so that the following is spoken: "This shopping street has shops such as A, B, and C. Lunch at A includes menu items such as D and E. The average price of lunch at A is F yen."

In the above description, the audio data generation unit 39a generates audio data from an image captured by the camera 20. However, the target from which the audio data generation unit 39a generates audio data is not limited to a still image and may be, for example, video captured by the camera 20.

As described above, according to the audio output device 10 of the present invention, the user U can wear the audio output device 10 by hanging the carrying member 30 around the neck. The audio output device 10 can therefore be carried about without carrying a box-shaped computer, which improves the portability of the audio output device 10.

In addition, since the carrying member 30 of the audio output device 10 can be worn by hanging it around the neck of the user U, its weight can be distributed between the first arm 33 and the second arm 34.

In this embodiment, audio is output from the speaker SP. However, the audio output is not limited to the speaker SP and may be performed by, for example, earphones or headphones. When audio is output from earphones or headphones, an earphone jack may be provided, for example, on the base 31 of the carrying member 30.

In this embodiment, the speaker SP is mounted on both the first arm 33 and the second arm 34. However, the speaker SP only needs to be mounted on at least one of the first arm 33 and the second arm 34, and need not be mounted on both.

(Generating audio data on a server)
The audio output device 10 according to the second embodiment will be described. The audio output device 10 according to the second embodiment differs from the audio output device 10 of the first embodiment in that it is communicably connected to a server. Specifically, the server generates audio data based on the characters included in the image data IM. Components identical to those of the audio output device 10 according to the first embodiment are denoted by the same reference signs, and their description is omitted.

FIG. 13 shows the configuration of the server 40 to which the audio output device 10 according to the second embodiment connects. As shown in FIG. 13, the storage device 41 of the server 40 is configured by, for example, a flash memory. The storage device 41 stores various programs such as a BIOS (Basic Input Output System) and software. The storage device 41 can also store image data transmitted from the carrying member 30.

The storage device 41 includes an image-audio conversion DB. The image-audio conversion DB stores information included in images together with audio data associated with that information. For example, the image-audio conversion DB has a data structure in which characters are linked to audio data, a data structure in which words are linked to audio data, and a data structure in which objects are linked to audio data. The image-audio conversion DB may be constructed by deep learning based on images captured by the camera 20.
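
The three linked data structures can be pictured, purely as an illustrative sketch, as three lookup tables keyed by character, word, and object label. The class and method names below are hypothetical, and the lookup policy (prefer a whole-word entry, otherwise concatenate per-character audio) is an assumption; the embodiment does not state how the tables are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ImageAudioConversionDB:
    """Hypothetical in-memory model of the image-audio conversion DB."""
    characters: Dict[str, bytes] = field(default_factory=dict)  # character -> audio
    words: Dict[str, bytes] = field(default_factory=dict)       # word -> audio
    objects: Dict[str, bytes] = field(default_factory=dict)     # object label -> audio

    def audio_for_text(self, text: str) -> Optional[bytes]:
        """Prefer a whole-word entry; otherwise join per-character audio.

        Returns None if any character has no linked audio data.
        """
        if text in self.words:
            return self.words[text]
        parts = []
        for ch in text:
            clip = self.characters.get(ch)
            if clip is None:
                return None
            parts.append(clip)
        return b"".join(parts)

    def audio_for_object(self, label: str) -> Optional[bytes]:
        """Return the audio data linked to a recognized object, if any."""
        return self.objects.get(label)
```

Keeping the three tables separate mirrors the description above and lets each table be populated or retrained independently.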

The communication unit 42 is an interface unit that communicates with the carrying member 30. The server 40 can communicate with the carrying member 30 via the communication unit 42.

The control unit 43 is realized by a computer having a CPU (Central Processing Unit) as an arithmetic processing device, and a ROM (Read Only Memory) and a RAM (Random Access Memory) as main storage devices. The CPU reads a program corresponding to the processing content from the storage device 41, loads it into the RAM, and realizes various functions in cooperation with the loaded program.

The storage device 41, the communication unit 42, and the control unit 43 are connected to one another via a system bus B3.

On receiving the image data transmitted from the carrying member 30, the control unit 43 can generate audio data based on the characters included in the received image data IM. The control unit 43 therefore functions as audio data generating means that generates audio data based on characters included in an image captured of the surroundings of the user U.

The audio data generation unit 43a is one of the functional blocks of the control unit 43. On receiving the image data transmitted from the carrying member 30, the audio data generation unit 43a can generate audio data based on the characters included in the received image data IM. The control unit 43 therefore functions as audio data generating means that generates audio data based on characters included in an image captured of the surroundings of the user U.

The audio data generation unit 43a recognizes characters in an image by, for example, OCR (Optical Character Recognition). The audio data generation unit 43a then generates audio data for the recognized characters by referring to the image-audio conversion DB.

The audio output mode of the audio output device 10 described above will now be described. Since the processing is the same as in the first embodiment except for the audio data generation processing of step S12 in FIG. 10, description of the other processing is omitted.

On receiving the image data IM, the carrying member 30 transmits the image data IM to the server 40.

FIG. 14 shows a subroutine of the audio data generation processing performed by the server 40. As shown in FIG. 14, the control unit 43 of the server 40 determines whether the image data IM has been received (step S41). When the control unit 43 of the server 40 determines that the image data IM has been received (step S41: Y), it generates audio data based on the characters of the text included in the received image data IM (step S42). The control unit 43 of the server 40 thus functions as the audio data generation unit 43a. The control unit 43 of the server 40 then transmits the audio data generated in step S42 to the control unit U1 of the carrying member 30 (step S43).
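
The server-side flow of steps S41 to S43 might be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the names `handle_image_data`, `recognize_text`, `lookup_audio`, and `send_to_carrying_member` are hypothetical, the recognized text is assumed to be split on whitespace into words, and words with no linked audio data are simply skipped.

```python
from typing import Callable, Optional


def handle_image_data(
    image_data: Optional[bytes],
    recognize_text: Callable[[bytes], str],
    lookup_audio: Callable[[str], Optional[bytes]],
    send_to_carrying_member: Callable[[bytes], None],
) -> bool:
    """Sketch of the FIG. 14 subroutine on the server 40."""
    # Step S41: has the image data IM been received?
    if not image_data:
        return False

    # Step S42: recognize the characters (for example by OCR) and look up
    # the audio data linked to each recognized word.
    text = recognize_text(image_data)
    clips = (lookup_audio(word) for word in text.split())
    audio = b"".join(clip for clip in clips if clip is not None)

    # Step S43: transmit the generated audio data to the carrying member 30.
    send_to_carrying_member(audio)
    return True
```

Passing the transport as a callable keeps the sketch independent of how the communication unit 42 is actually realized.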

Reference Signs List
10 audio output device
20 camera
30 carrying member
31 base
33 first arm
34 second arm
39 control unit
40 server
B1 to B4 buttons
EG glasses
FR frame
SB shutter button
SP speaker

Claims

An audio output device comprising:
a carrying member having a base and a first arm and a second arm extending in the same direction from the base;
audio data generating means for generating audio data based on a captured image;
output means for outputting audio based on the audio data; and
operation receiving means mounted on at least one of the first arm and the second arm of the carrying member and receiving an operation relating to the output mode of the audio by the output means.

The audio output device according to claim 1, wherein the first arm and the second arm are formed so as to approach each other as the distance from the base increases.

The audio output device according to claim 1, wherein the carrying member is formed in a U-shape when viewed from a direction perpendicular to a plane defined by a straight line extending in the direction in which the first arm extends and a straight line extending in the direction in which the second arm extends.

The audio output device according to claim 1, wherein the operation receiving means is mounted on the first arm and the second arm, and
a plane including the mounting surface of the first arm on which the operation receiving means is mounted intersects a plane including the mounting surface of the second arm on which the operation receiving means is mounted.

The audio output device according to claim 1, wherein the output means is mounted on at least one of the first arm and the second arm, and
the output means is disposed closer to the base than the operation receiving means.

The audio output device according to any one of the above claims, further comprising:
holding means attachable to the user's head;
imaging means held by the holding means for capturing the image; and
imaging instruction input means that is held by the holding means so as to be located on the right side or the left side of the user's head and that receives input of an imaging instruction to the imaging means,
wherein the operation receiving means is provided on whichever of the first arm and the second arm is closer to the imaging instruction input means.

The audio output device according to any one of claims 1 to 6, wherein the carrying member includes an imaging instruction input unit that instructs the imaging device to perform imaging.

The audio output device according to any one of claims 1 to 7, wherein the operation receiving means has an exposed portion exposed on the mounting surface, and
the exposed portion has surface irregularities that differ according to the operation received by the operation receiving means.

The audio output device according to any one of claims 1 to 8, further comprising a control unit provided in the carrying member and functioning as the audio data generating means.

A carrying member comprising:
a housing having a base and a first arm and a second arm extending in the same direction from the base;
acquisition means for acquiring an image of the surroundings of the housing;
transmitting means for transmitting the image to the outside;
receiving means for receiving audio data generated based on the image; and
output means for outputting audio based on the audio data.