JP2012252524A

JP2012252524A - Display device, display method and program

Info

Publication number: JP2012252524A
Application number: JP2011124715A
Authority: JP
Inventors: Tetsuya Handa; 哲也半田
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2011-06-03
Filing date: 2011-06-03
Publication date: 2012-12-20

Abstract

PROBLEM TO BE SOLVED: To easily display a desired image at a desired position in an image. SOLUTION: The display device includes: an image acquisition unit 6a for acquiring an image to be displayed in a display area of a display unit 8; a sound collection unit 3a for collecting sound emitted from a sound source; a sound source direction identification unit 3d for identifying the sound source direction of the sound collected by the sound collection unit 3a; a position determination unit 6f for determining, on the basis of the sound source direction identified by the sound source direction identification unit 3d, a display position in the display area of a sound-related image associated with the sound; and a display control unit 7 for displaying the sound-related image at the display position determined by the position determination unit 6f so that it is superimposed on a predetermined image displayed in the display area.

Description

The present invention relates to a display device, a display method, and a program.

Conventionally, a photo creation apparatus that combines a stamp image with a display image is known (see, for example, Patent Document 1).

Patent Document 1: JP 2004-159158 A

However, in the case of Patent Document 1, the user has to operate a key on which the stamp image is displayed and specify the position at which the stamp image is to be combined, and these operations are troublesome.

An object of the present invention is to provide a display device, a display method, and a program that can easily display a desired image at a desired position in an image.

In order to solve the above problem, the display device of the present invention is
a display device comprising display means, characterized by comprising:
acquisition means for acquiring an image to be displayed in a display area of the display means;
sound collecting means for collecting sound emitted from a sound source;
direction specifying means for specifying, with reference to the position of the device body, the sound source direction of the sound collected by the sound collecting means;
position determining means for determining, based on the sound source direction specified by the direction specifying means, a display position in the display area of a sound-related image related to the sound; and
display control means for displaying the sound-related image at the display position determined by the position determining means so as to overlap the image displayed in the display area.

The display method of the present invention is
a display method using a display device comprising display means and sound collecting means for collecting sound emitted from a sound source, characterized by performing:
a process of displaying an image in a display area of the display means;
a process of specifying, with reference to the position of the display device body, the sound source direction of the sound collected by the sound collecting means;
a process of determining, based on the specified sound source direction, a display position in the display area of a sound-related image related to the sound; and
a process of displaying the sound-related image at the determined display position so as to overlap the image displayed in the display area.

The program of the present invention is characterized by causing
a computer of a display device comprising display means and sound collecting means for collecting sound emitted from a sound source to function as:
acquisition means for acquiring, with reference to the position of the display device body, an image to be displayed in a display area of the display means;
direction specifying means for specifying the sound source direction of the sound collected by the sound collecting means;
position determining means for determining, based on the sound source direction specified by the direction specifying means, a display position in the display area of a sound-related image related to the sound; and
display control means for displaying the sound-related image at the display position determined by the position determining means so as to overlap the image displayed in the display area.

According to the present invention, a desired image can be easily displayed at a desired position in an image.

FIG. 1 is a block diagram showing a schematic configuration of a display device according to an embodiment to which the present invention is applied.
FIG. 2 is a flowchart showing an example of an operation related to sound-related image display processing by the display device of FIG. 1.
FIGS. 3 to 6 are diagrams each schematically showing an example of an image related to the sound-related image display processing of FIG. 2.

Hereinafter, specific embodiments of the present invention will be described with reference to the drawings. However, the scope of the invention is not limited to the illustrated examples.
FIG. 1 is a block diagram showing a schematic configuration of a display device 100 according to an embodiment to which the present invention is applied.

The display device 100 according to the present embodiment displays a sound-related image, related to the sound collected by the sound collection unit 3a, so as to overlap a predetermined image displayed in the display area R of the display unit 8. In doing so, it specifies the sound source direction of the sound collected by the sound collection unit 3a, determines the display position of the sound-related image in the display area R based on the specified sound source direction, and displays the sound-related image at the determined display position.
Specifically, the display device 100 is, for example, a digital photo frame installed on a tabletop and, as shown in FIG. 1, includes a central control unit 1, an operation input unit 2, a sound processing unit 3, a memory 4, a recording medium control unit 5, an image processing unit 6, a display control unit 7, and a display unit 8.

The central control unit 1 controls each unit of the display device 100. Specifically, although not shown, the central control unit 1 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), and a ROM (Read Only Memory), and performs various control operations in accordance with various processing programs (not shown) for the display device 100, operation signals input from the operation input unit 2, and the like.

The operation input unit 2 is for inputting various instructions to the display device 100. Specifically, it includes an operation unit made up of data input keys for inputting characters and the like, up/down/left/right movement keys for selecting or specifying data, various function keys, and the like. The central control unit 1 causes each unit to execute a predetermined operation in accordance with the operation signal output from the operation input unit 2. The operation input unit 2 may also include other input devices such as a remote controller or a touch panel.

The sound processing unit 3 includes a sound collection unit 3a, a recording unit 3b, a volume specifying unit 3c, a sound source direction specifying unit 3d, a sound recognition unit 3e, an individual identification unit 3f, and a voice information table T1.

The sound collection unit 3a is a microphone or the like and collects sound emitted from a sound source S such as a human or an animal.
Specifically, the sound collection unit 3a has, for example, input units a1, a1, to which the vibration of the sound emitted from the sound source S is input, at two locations: the upper right side and the lower left side of the display area R of the display unit 8 when the display device 100 is viewed from the front (see FIG. 3 and others). It generates sound data by applying A/D conversion and the like to the sound vibration input to these input units a1.
Note that the number and arrangement of the input units a1 of the sound collection unit 3a can be changed as appropriate, as long as the sound source direction of the sound collected by the sound collection unit 3a can be specified.

The recording unit 3b starts recording the sound data of the sound collected by the sound collection unit 3a at a predetermined recording start timing and ends the recording at a predetermined recording end timing.
Here, the recording start timing and the recording end timing may be any timing that follows an instruction from the user.
Specifically, the recording unit 3b specifies the recording start timing and the recording end timing based on the input of a predetermined recording instruction signal output from the central control unit 1 in response to a predetermined operation by the user of a recording instruction key (not shown) of the operation input unit 2. For example, the recording unit 3b may determine that the recording start timing has come when the user performs a predetermined operation (for example, pressing) on the recording instruction key, and that the recording end timing has come when that operation is released. Alternatively, the recording unit 3b may determine that the recording start timing has come when the user operates the recording instruction key, and that the recording end timing has come when the recording instruction key is operated again.
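The two key-driven variants described above (record while the key is held, or start and stop on successive key operations) can be sketched as a small state machine. This is only an illustrative sketch: the class name `RecordingTimer`, the mode names `"hold"` and `"toggle"`, and the returned event strings are hypothetical and do not appear in the specification.

```python
class RecordingTimer:
    """Decides recording start/end timing from recording-key events.

    mode="hold":   start on key press, end when the press is released.
    mode="toggle": start on one key press, end on the next key press.
    (Both mode names are hypothetical labels for the two variants.)
    """

    def __init__(self, mode):
        if mode not in ("hold", "toggle"):
            raise ValueError("mode must be 'hold' or 'toggle'")
        self.mode = mode
        self.recording = False

    def on_key(self, pressed):
        """Handle a key event; return 'start', 'stop', or None."""
        if self.mode == "hold":
            if pressed and not self.recording:
                self.recording = True
                return "start"
            if not pressed and self.recording:
                self.recording = False
                return "stop"
            return None
        # Toggle mode: only presses matter, releases are ignored.
        if not pressed:
            return None
        self.recording = not self.recording
        return "start" if self.recording else "stop"
```

In hold mode a press followed by a release yields `"start"` then `"stop"`; in toggle mode the release is ignored and the second press ends the recording.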

The recording unit 3b may also specify the recording start timing and the recording end timing based on the input of a predetermined recording instruction signal output from the central control unit 1 in response to a predetermined gesture made by the user toward an imaging unit (not shown). That is, for example, the recording unit 3b may determine that the recording start timing has come when the imaging unit captures a predetermined gesture by the user, and that the recording end timing has come after a predetermined time has elapsed from the recording start timing. Alternatively, the recording unit 3b may determine that the recording start timing has come when the imaging unit captures a predetermined recording start instruction gesture by the user, and that the recording end timing has come when the imaging unit captures a predetermined recording end instruction gesture by the user.

The volume specifying unit 3c, as volume specifying means, specifies the volume of the sound collected by the sound collection unit 3a.
Specifically, the volume specifying unit 3c specifies the volume of the sound (for example, a voice) based on, for example, the sound data of the sound collected by the sound collection unit 3a and recorded by the recording unit 3b between the recording start timing and the recording end timing.
Note that the volume of the sound collected by the sound collection unit 3a can be specified using a known method, and a detailed description is therefore omitted here.
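The specification leaves the volume measurement to known methods. One common choice, shown here purely as an assumption and not as the method used by the volume specifying unit 3c, is the root-mean-square (RMS) level of the recorded samples expressed in decibels relative to full scale. The function name `rms_volume_db` and the `full_scale` parameter are hypothetical.

```python
import math


def rms_volume_db(samples, full_scale=1.0):
    """Return the RMS level of `samples` in dB relative to `full_scale`.

    `samples` is a sequence of PCM sample values (assumed normalised so
    that `full_scale` is the largest representable amplitude).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    rms = math.sqrt(mean_square)
    if rms == 0.0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(rms / full_scale)
```

A single figure of this kind could then be compared against the volume thresholds used when sizing the character image.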

The sound source direction specifying unit 3d, as direction specifying means, specifies the sound source direction of the sound collected by the sound collection unit 3a.
Specifically, the sound source direction specifying unit 3d specifies the position of the sound source S (an individual such as a human or an animal) relative to the display device 100 based on, for example, the sound data of the sound collected by the sound collection unit 3a and recorded by the recording unit 3b between the recording start timing and the recording end timing. That is, the sound source direction specifying unit 3d specifies the position of the sound source S relative to the display device 100 (for example, the left side of the display device 100 in FIG. 3(b), or the right side in FIG. 4(b)) based on, for example, the difference between the time at which the sound from the sound source S reaches one input unit a1 of the sound collection unit 3a and the time at which it reaches the other input unit a1. The sound source direction specifying unit 3d then takes the direction toward the position of the sound source S relative to the display device 100 (for example, the left direction in FIG. 3(b), or the right direction in FIG. 4(b)) as the sound source direction.
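The arrival-time difference between the two input units can be estimated in several ways. The sketch below uses a brute-force cross-correlation over integer sample lags, which is an assumed technique and not one named in the specification. For simplicity it treats the two inputs as a left microphone and a right microphone; the names `estimate_delay`, `source_side`, `left_mic`, `right_mic`, and `max_lag` are all hypothetical.

```python
def estimate_delay(left_mic, right_mic, max_lag):
    """Return the lag (in samples) by which right_mic trails left_mic.

    A positive result means the sound reached the left input first.
    The lag maximising the cross-correlation is chosen.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(left_mic):
            j = i + lag
            if 0 <= j < len(right_mic):
                score += value * right_mic[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def source_side(left_mic, right_mic, max_lag=8):
    """Classify the source as 'left', 'right', or 'center' of the device."""
    lag = estimate_delay(left_mic, right_mic, max_lag)
    if lag > 0:
        return "left"   # left input heard the sound earlier
    if lag < 0:
        return "right"  # right input heard the sound earlier
    return "center"
```

With a short pulse that arrives two samples later at the right input, `estimate_delay` returns 2 and `source_side` reports `"left"`. A real device would also have to account for the diagonal placement of the input units a1 and for noise, which this sketch ignores.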

Note that the position of the sound source S relative to the display device 100 and the sound source direction may be specified with reference to a two-dimensional space substantially parallel to the plane containing the display area R of the display device 100, or with reference to a three-dimensional space that additionally includes the front-rear direction substantially orthogonal to the display area R of the display device 100.
The sound source direction specifying unit 3d may also take as the sound source direction, from among a plurality of predefined directions (for example, the four directions up, down, left, and right), the direction closest to the direction toward the position of the sound source S relative to the display device 100. Here, the direction closest to the direction toward the position of the sound source S relative to the display device 100 means, among the predefined directions, the one for which the angle between a straight line extending in that direction and a straight line extending toward the position of the sound source S relative to the display device 100 is smallest.
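The minimum-angle selection among predefined directions can be sketched as follows, assuming the two-dimensional case and a coordinate system with x pointing right and y pointing up as seen from the front of the device. The names `DIRECTIONS` and `nearest_direction` are hypothetical, and the sketch compares directed vectors rather than undirected lines so that opposite directions are distinguished.

```python
import math

# Predefined candidate directions as unit vectors (x to the right, y up).
DIRECTIONS = {
    "up": (0.0, 1.0),
    "down": (0.0, -1.0),
    "left": (-1.0, 0.0),
    "right": (1.0, 0.0),
}


def nearest_direction(dx, dy, directions=DIRECTIONS):
    """Return the predefined direction with the smallest angle to (dx, dy).

    (dx, dy) is the vector from the display device toward the source.
    """
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("source position coincides with the device")

    def angle_to(unit):
        cos_angle = (dx * unit[0] + dy * unit[1]) / norm
        # Clamp to guard against rounding just outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, cos_angle)))

    return min(directions, key=lambda name: angle_to(directions[name]))
```

For a source up and slightly to the left, for example `nearest_direction(-3.0, 1.0)`, the result is `"left"` because the angle to the left vector is smaller than the angle to the up vector.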

The sound recognition unit 3e, as sound recognition means, recognizes the sound collected by the sound collection unit 3a.
Specifically, the sound recognition unit 3e recognizes the sound (in particular, a voice) based on, for example, the sound data of the sound collected by the sound collection unit 3a and recorded by the recording unit 3b between the recording start timing and the recording end timing, and generates character data for representing the sound with corresponding characters.
Note that the sound collected by the sound collection unit 3a can be recognized using a known method, such as recognition with a predetermined speech recognition dictionary, and a detailed description is therefore omitted here.

The sound recognition unit 3e may also recognize the semantic content of the sound collected by the sound collection unit 3a.
The semantic content of the sound collected by the sound collection unit 3a can be recognized using a known method, such as recognition with a predetermined semantic analysis dictionary, and a detailed description is therefore omitted here.

The individual identification unit 3f identifies the individual that emitted the sound collected by the sound collection unit 3a.
Specifically, the individual identification unit 3f determines whether voice information matching the voice is stored in the voice information table T1, based on, for example, the sound data of the voice collected by the sound collection unit 3a and recorded by the recording unit 3b between the recording start timing and the recording end timing. When it determines that matching voice information is stored in the voice information table T1, the individual identification unit 3f identifies the individual that emitted the voice collected by the sound collection unit 3a by acquiring, from the voice information table T1, the identification information corresponding to the matching voice information.

Note that the voice information table T1, as second storage means, stores voice information used to identify individuals of at least one of humans and animals.
Specifically, the voice information table T1 stores, for example, voice information for identifying the voice of an individual such as a human or an animal (for example, voice feature information) in association with identification information for identifying that individual (for example, a name).
Here, examples of the voice feature information include information such as a voiceprint, but it is not limited to this and may be any information that can be used to identify the voice of an individual such as a human or an animal.
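A lookup against a table such as T1 can be sketched as below. The specification does not say how a "match" is decided, so this sketch makes several assumptions: voice feature information is represented as a numeric vector, matching uses cosine similarity, and a fixed threshold decides whether any entry matches. The function names, the threshold value, and the sample names in the usage note are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_individual(features, voice_table, threshold=0.9):
    """Return the identification info of the best match, or None.

    `voice_table` maps identification information (e.g. a name) to the
    stored voice feature vector for that individual.
    """
    best_name, best_score = None, threshold
    for name, stored in voice_table.items():
        score = cosine_similarity(features, stored)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

For example, with a table `{"Alice": [1.0, 0.0, 0.2], "Pochi": [0.0, 1.0, 0.1]}`, the features `[0.9, 0.05, 0.2]` resolve to `"Alice"`, while `[1.0, 1.0, 0.0]` is close to neither entry and yields `None`.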

The memory 4 is composed of, for example, a DRAM (Dynamic Random Access Memory), a ROM such as a NAND flash memory, or the like, and temporarily stores data and the like processed by the central control unit 1, the sound processing unit 3, the image processing unit 6, and so on.

The recording medium control unit 5 is configured so that a recording medium 51 can be attached and detached, and controls the reading of data from the attached recording medium 51 and the writing of data to the recording medium 51.
That is, the recording medium control unit 5 controls, for example, the reading of predetermined image data to be displayed from a recording medium 51 that has been removed from an external device such as an imaging device and then attached.
The recording medium 51 is composed of, for example, a non-volatile memory (flash memory) such as an SD card or a USB memory, but this is only an example; it is not limited to this and can be changed as appropriate.

The image processing unit 6 includes an image acquisition unit 6a, a face detection unit 6b, a dimension determination unit 6c, an orientation determination unit 6d, a sound-related image generation unit 6e, a position determination unit 6f, and a face identification information table T2.

The image acquisition unit 6a acquires the image data of the image to be displayed in the display area R of the display unit 8, that is, the image to be displayed.
Specifically, the image acquisition unit 6a acquires, as the image data to be displayed, image data read from the recording medium 51 based on a predetermined operation of the operation input unit 2 by the user. The image acquisition unit 6a also acquires, as the image data to be displayed, image data designated from among the image data stored in the recording medium 51 based on a predetermined operation of the operation input unit 2 by the user.

The face detection unit 6b detects the face of a human, an animal, or the like from the image to be displayed acquired by the image acquisition unit 6a.
Specifically, the face detection unit 6b performs predetermined face detection processing on the image data of the image to be displayed acquired by the image acquisition unit 6a and detects the face regions of all faces included in the image.
Since face detection processing is a known technique, a detailed description is omitted here.

The dimension determination unit 6c, as dimension determining means, determines the dimensions of the sound-related image that is generated by the sound-related image generation unit 6e (described later) and displayed in the display area R, based on the volume specified by the volume specifying unit 3c.
Here, the sound-related image is a character image M in which the sound recognized by the sound recognition unit 3e is represented by corresponding characters.
The character image M may include a speech balloon image having a leader line L connected to a predetermined position on the frame surrounding the characters. That is, the character image M may consist only of an image based on the character data generated by the sound recognition unit 3e (a character body image), or may consist of that character body image together with a speech balloon image. Whether to include a speech balloon image in the character image M may be selected according to the presence or absence of a predetermined instruction from the user, the volume specified by the volume specifying unit 3c, whether there is a face in the image to be displayed, the semantic content of the sound recognized by the sound recognition unit 3e, and the like.

Specifically, the dimension determination unit 6c determines the dimensions of the character image M displayed in the display area R so that the character image M becomes relatively larger as the volume specified by the volume specifying unit 3c becomes larger. That is, for example, when the volume specified by the volume specifying unit 3c is less than a predetermined first volume threshold, the dimension determination unit 6c determines a predetermined "first dimension" as the dimension of the character image M. When the specified volume is equal to or greater than the first volume threshold and less than a predetermined second volume threshold, the dimension determination unit 6c determines a predetermined "second dimension", larger than the first dimension, as the dimension of the character image M. When the specified volume is equal to or greater than the second volume threshold, the dimension determination unit 6c determines a predetermined "third dimension", larger than the second dimension, as the dimension of the character image M.
Note that the thresholds for determining the dimensions of the character image M are not limited to the two thresholds (the first volume threshold and the second volume threshold) and can be changed as appropriate.
In addition, as long as the dimensions of the sound-related image (character image M) displayed in the display area R can be determined based on the volume, the dimension determination unit 6c may instead determine the dimensions so that, for example, the character image M displayed in the display area R becomes relatively larger as the volume becomes smaller.
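The threshold rule above maps a volume onto one of several predetermined dimensions, and the number of thresholds may vary. A minimal sketch follows, assuming the thresholds are held in ascending order and the dimensions are given as one more entry than there are thresholds. The function name `character_image_size` and the numeric values in the usage note are hypothetical.

```python
from bisect import bisect_right


def character_image_size(volume, thresholds, sizes):
    """Pick the dimension for character image M from the volume.

    `thresholds` are ascending volume thresholds; `sizes` holds one
    dimension per band, so len(sizes) == len(thresholds) + 1.
    A volume equal to a threshold falls into the higher band
    ("equal to or greater than").
    """
    if len(sizes) != len(thresholds) + 1:
        raise ValueError("need exactly one more size than thresholds")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # bisect_right counts the thresholds that are <= volume.
    return sizes[bisect_right(thresholds, volume)]
```

With thresholds `[50.0, 70.0]` and sizes `[24, 36, 48]`, a volume of 40.0 gives the first dimension (24), exactly 50.0 gives the second (36), and 70.0 or more gives the third (48). Passing the sizes in descending order would give the inverse behaviour mentioned above, where quieter sounds produce larger images.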

The orientation determination unit 6d determines the orientation of the leader line L of the speech balloon image when the character image M includes a speech balloon image.
Specifically, the orientation determination unit 6d, as orientation determining means, determines the orientation of the leader line L of the speech balloon image based on the sound source direction specified by the sound source direction specifying unit 3d.
Here, the orientation determination unit 6d determines the orientation of the leader line L of the speech balloon image so that, for example, it extends in substantially the same direction as the sound source direction specified by the sound source direction specifying unit 3d. That is, as shown in FIG. 3(b) for example, when the sound source direction is the left direction, the orientation determination unit 6d determines "leftward", in which the tip of the leader line L points to the left side of the display device 100, as the orientation of the leader line L. Likewise, as shown in FIG. 4(b) for example, when the sound source direction is the right direction, the orientation determination unit 6d determines "rightward", in which the tip of the leader line L points to the right side of the display device 100, as the orientation of the leader line L.

The sound-related image generation unit 6e generates a sound-related image related to the sound collected by the sound collection unit 3a.
Specifically, the sound-related image generation unit 6e generates the character image M by, for example, generating an image based on the character data generated by the sound recognition unit 3e (a character body image) and adjusting the dimensions of the generated image based on the dimensions determined by the dimension determination unit 6c.

When the character image M includes a speech balloon image, the sound-related image generation unit 6e generates the character image M by, for example, adding a speech balloon image of a predetermined shape to the image based on the character data generated by the sound recognition unit 3e (the character body image).
That is, the sound-related image generation unit 6e acquires speech balloon image data recorded in predetermined recording means (not shown) housed in the image processing unit 6 or the like, and adjusts the size of the speech balloon image based on the acquired data so that the character body image based on the character data generated by the sound recognition unit 3e fits within the frame of the speech balloon image. The sound-related image generation unit 6e also adjusts the orientation of the leader line L of the speech balloon image based on the orientation determined by the orientation determination unit 6d. The sound-related image generation unit 6e then combines the character body image with the speech balloon image whose size and orientation have been adjusted to generate a composite image, and generates the character image M by adjusting the dimensions of the composite image based on the dimensions determined by the dimension determination unit 6c.
Note that the sound-related image generation unit 6e may change the speech balloon image data acquired from the predetermined recording means based on the number of characters in the character body image based on the character data generated by the sound recognition unit 3e.

When the sound recognition unit 3e recognizes the semantic content of the sound, the sound-related image generation unit 6e may change the speech balloon image data acquired from the predetermined recording means based on that semantic content. Specifically, for example, when a word used for shouting, such as "Yahho", is collected as voice by the sound collection unit 3a as shown in FIG. 3(b), the sound-related image generation unit 6e may acquire data of a speech balloon image whose frame surrounding the characters has a jagged shape. When a word used for greeting, such as "Konnichiwa" (hello), is collected as voice by the sound collection unit 3a as shown in FIG. 4(b), it may acquire data of a speech balloon image whose frame surrounding the characters is substantially elliptical. The display device 100 can thereby change the shape of the speech balloon image included in the character image M displayed in the display region R of the display unit 8 according to the semantic content of the sound collected by the sound collection unit 3a.
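
The selection of speech balloon data by semantic content can be sketched as a simple lookup. This is only an illustration under stated assumptions: the word sets, the shape names, and the `"default"` fallback are hypothetical, since the description gives just one shout and one greeting as examples.

```python
# Hypothetical vocabularies; the description names only one shout and one greeting.
SHOUT_WORDS = {"yahho"}
GREETING_WORDS = {"konnichiwa", "hello"}


def choose_balloon_shape(recognized_text: str) -> str:
    """Return a balloon frame shape name for the recognized words."""
    word = recognized_text.strip().lower()
    if word in SHOUT_WORDS:
        return "jagged"    # frame with a jagged outline, as in FIG. 3(b)
    if word in GREETING_WORDS:
        return "ellipse"   # substantially elliptical frame, as in FIG. 4(b)
    return "default"       # assumption: the fallback shape is not specified
```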

The position determination unit 6f determines the display position, in the display region R, of the sound-related image (character image M) related to the sound collected by the sound collection unit 3a.
Specifically, for example, when a face is detected by the face detection unit 6b, the position determination unit 6f determines the display position of the character image M in the display region R based on the sound source direction specified by the sound source direction specifying unit 3d and the position of the face detected by the face detection unit 6b from the predetermined image (the display target image).
Here, for example, when the number of faces detected by the face detection unit 6b is one, the position determination unit 6f specifies the face region of that face as the face region to which the character image M is to be attached. The position determination unit 6f then determines, as the display position of the character image M, the coordinates of a predetermined position (for example, the center coordinates of the predetermined position) within the part of the peripheral region of the specified face region that lies on the side opposite to the sound source direction specified by the sound source direction specifying unit 3d. That is, as shown in FIG. 4(b), for example, when the sound source direction is rightward, the position determination unit 6f determines the coordinates of a predetermined position in the left part of the peripheral region of the specified face region as the display position of the character image M.

When the number of faces detected by the face detection unit 6b is more than one, for example, the position determination unit 6f acquires, from the face identification information table T2, the face identification information corresponding to the identification information acquired by the individual identification unit 3f (the identification information of the individual identified by the individual identification unit 3f), and determines whether any of the faces detected by the face detection unit 6b matches the acquired face identification information.
When it determines that one of the detected faces matches the acquired face identification information, the position determination unit 6f determines the display position of the character image M in the display region R based on the position of the face that is detected by the face detection unit 6b from the predetermined image (the display target image) and identified using the face identification information of the individual identified by the individual identification unit 3f. That is, the position determination unit 6f specifies the face region of the matching face as the face region to which the character image M is to be attached. It then determines, as the display position of the character image M, the coordinates of a predetermined position (for example, the center coordinates of the predetermined position) within the part of the peripheral region of the specified face region that lies on the side opposite to the sound source direction specified by the sound source direction specifying unit 3d.
On the other hand, when it determines that none of the detected faces matches the acquired face identification information, the position determination unit 6f specifies, among the face regions of the faces detected by the face detection unit 6b, a face region that satisfies a predetermined condition (for example, the face region with the largest dimensions) as the face region to which the character image M is to be attached. It then determines, as the display position of the character image M, the coordinates of a predetermined position (for example, the center coordinates of the predetermined position) within the part of the peripheral region of the specified face region that lies on the side opposite to the sound source direction specified by the sound source direction specifying unit 3d.
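
The choice of the face region to which the character image M is attached can be sketched as follows. This is a minimal illustration, not the implementation of the position determination unit 6f: the `FaceRegion` type and its `name` field (standing in for a match against the face identification information table T2) are assumptions, and "largest area" is used as the predetermined condition because the description gives it as an example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FaceRegion:
    left: int
    top: int
    width: int
    height: int
    name: Optional[str] = None  # hypothetical: identity matched via table T2, if any


def select_face_region(faces: Sequence[FaceRegion],
                       speaker_name: Optional[str]) -> Optional[FaceRegion]:
    """Pick the face region that should receive the character image M."""
    if not faces:
        return None                      # no face: handled by a separate rule
    if len(faces) == 1:
        return faces[0]                  # a single face is used as-is
    if speaker_name is not None:
        for face in faces:
            if face.name == speaker_name:
                return face              # face of the identified speaker
    # No match: fall back to the example condition (largest face region).
    return max(faces, key=lambda f: f.width * f.height)
```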

At this time, the position determination unit 6f selects, as the predetermined position within the part of the peripheral region of the specified face region on the side opposite to the sound source direction, a position at which the specified face region and the character image M do not overlap, based on, for example, the dimensions determined by the dimension determination unit 6c. When no non-overlapping position exists, the position determination unit 6f selects the position at which the degree of overlap between the specified face region and the character image M is smallest.
Furthermore, as that predetermined position, the position determination unit 6f may select a position at which the tip of the leader line L of the speech balloon image points toward, for example, the center of the specified face region, or a position at which the tip of the leader line L points toward, for example, the center of a specific region (for example, the mouth region) within the specified face region.
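
One way to realize the placement rule above is to aim for a spot just beside the face on the side opposite the sound source and clamp it into the display. The sketch below makes several assumptions that are not in the description: rectangles are `(left, top, width, height)` tuples, only "left" and "right" directions are handled, the image is vertically centered on the face, and `gap` is a hypothetical margin. For that fixed vertical position, clamping keeps the image as far from the face as the display allows, which gives the smallest overlap on that side.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height)


def overlap_area(a: Rect, b: Rect) -> int:
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    dx = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    dy = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(dx, 0) * max(dy, 0)


def place_beside_face(face: Rect, image_size: Tuple[int, int],
                      display_size: Tuple[int, int], source_direction: str,
                      gap: int = 10) -> Tuple[int, int]:
    """Return center coordinates for the character image M.

    Assumes the image fits inside the display and that source_direction
    is either "left" or "right".
    """
    fx, fy, fw, fh = face
    iw, ih = image_size
    dw, dh = display_size
    # Ideal left edge: just beside the face, on the side opposite the source.
    if source_direction == "right":
        left = fx - gap - iw
    else:
        left = fx + fw + gap
    # Clamp into the display; if the ideal spot does not fit, this keeps the
    # image as far from the face as the display allows on that side.
    left = max(0, min(left, dw - iw))
    top = max(0, min(fy + fh // 2 - ih // 2, dh - ih))
    return left + iw // 2, top + ih // 2
```

The returned coordinates are center coordinates, matching the example in which the display position is given as the center of the predetermined position.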

When no face is detected by the face detection unit 6b, for example, the position determination unit 6f determines the display position of the character image M in the display region R based on the sound source direction specified by the sound source direction specifying unit 3d.
Here, the position determination unit 6f determines, as the display position of the character image M, the coordinates of a predetermined position (for example, the center coordinates of the predetermined position) within the part of the display region R of the display unit 8 that lies on the side of the sound source direction specified by the sound source direction specifying unit 3d. That is, as shown in FIG. 3(b), for example, when the sound source direction is leftward, the position determination unit 6f determines the coordinates of a predetermined position in the left part of the display region R as the display position of the character image M.
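
The face-free case reduces to picking a point on the sound source side of the display region R. In the sketch below, the choice of the center of the left or right half as the "predetermined position" is an assumption; the description does not fix where in that part of the region the position lies.

```python
from typing import Tuple


def position_without_face(display_size: Tuple[int, int],
                          source_direction: str) -> Tuple[int, int]:
    """Center coordinates for the character image M when no face is detected."""
    width, height = display_size
    if source_direction == "left":
        return width // 4, height // 2        # center of the left half
    if source_direction == "right":
        return 3 * width // 4, height // 2    # center of the right half
    raise ValueError("source_direction must be 'left' or 'right'")
```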

The face identification information table T2 stores, as first storage means, face identification information for identifying the face of each individual of at least one of humans and animals.
Specifically, the face identification information table T2 stores in advance, in association with each other, face identification information for identifying the face of an individual such as a human or an animal (for example, facial feature information) and identification information for identifying that individual (for example, a name).
Here, examples of the facial feature information include information on facial parts corresponding to the eyes, nose, mouth, and the like, and face images of humans, animals, or the like photographed at predetermined angular intervals. The facial feature information is not limited to these, however, and may be any information that allows the face of an individual such as a human or an animal to be identified.

The display control unit 7 performs control to display a predetermined image in the display region R of the display unit 8 based on the image data of the display target image acquired by the image acquisition unit 6a.
The display control unit 7 also serves as display control means that displays the sound-related image (character image M) at the display position determined by the position determination unit 6f so that it overlaps the predetermined image (the display target image) displayed in the display region R. Specifically, the display control unit 7 superimposes the character image M generated by the sound-related image generation unit 6e, as an OSD image, on the display target image displayed in the display region R. At this time, for example, when the coordinates determined by the position determination unit 6f as the display position of the character image M are the center coordinates of the predetermined position, the display control unit 7 displays the character image M so that those center coordinates coincide with the center coordinates of the character image M.
Note that the display control unit 7 may include an external connection I/F or the like for connecting to display means other than the display unit 8. This allows the display control unit 7 to display the display target image, the sound-related image (character image M), and the like on an external display device connected to the display device 100.
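
Aligning the center of the character image M with the determined center coordinates amounts to a small offset calculation when the overlay is drawn from its top-left corner. The function name below is hypothetical, and integer pixel coordinates with a top-left origin are assumed.

```python
from typing import Tuple


def top_left_for_center(center: Tuple[int, int],
                        image_size: Tuple[int, int]) -> Tuple[int, int]:
    """Top-left drawing origin that centers an image of image_size on center."""
    cx, cy = center
    width, height = image_size
    # With odd dimensions the centers agree to within one pixel.
    return cx - width // 2, cy - height // 2
```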

The display unit 8 serves as display means and, under the control of the display control unit 7, displays the display target image in the display region R and displays the character image M superimposed on the display target image. Examples of the display unit 8 include, but are not limited to, a liquid crystal display panel and an organic EL display panel.

Next, the sound-related image display processing performed by the display device 100 will be described with reference to FIG. 2.
FIG. 2 is a flowchart illustrating an example of the operation of the sound-related image display processing.

As shown in FIG. 2, the display control unit 7 first displays a predetermined image in the display region R of the display unit 8 based on the image data of the display target image acquired by the image acquisition unit 6a (step S1).
Next, the sound collection unit 3a collects the voice (sound) emitted from a sound source S such as a human or an animal, and the recording unit 3b records the voice collected by the sound collection unit 3a (step S2).

Next, the volume specifying unit 3c specifies the volume of the voice collected and recorded in step S2, and the sound source direction specifying unit 3d specifies the sound source direction of the voice collected and recorded in step S2 (step S3).
Specifically, the sound source direction specifying unit 3d specifies the position of the sound source S (an individual such as a human or an animal) relative to the display device 100 based on, for example, the difference between the time at which the sound from the sound source S reaches one input unit a1 of the sound collection unit 3a and the time at which it reaches the other input unit a1, and takes the direction toward that position as the sound source direction.
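
The description does not say how the arrival-time difference is converted into a direction. A common far-field approximation, used here purely as an assumption, relates the delay $\Delta t$ between two inputs spaced $d$ apart to the arrival angle $\theta$ by $\sin\theta = c\,\Delta t / d$, where $c$ is the speed of sound. The function names, the 0.1 m default spacing, and the "left"/"right"/"center" labels below are hypothetical.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value in air at about 20 degrees C


def arrival_angle_deg(delay_s: float, mic_spacing_m: float) -> float:
    """Far-field angle from the array's broadside, in degrees.

    delay_s is (arrival time at right input) - (arrival time at left input);
    a positive value means the sound reached the left input first.
    """
    ratio = SPEED_OF_SOUND_M_PER_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return math.degrees(math.asin(ratio))


def sound_source_direction(t_left_s: float, t_right_s: float,
                           mic_spacing_m: float = 0.1) -> str:
    """Classify the source as left, right, or center of the device."""
    angle = arrival_angle_deg(t_right_s - t_left_s, mic_spacing_m)
    if angle > 0:
        return "left"    # the left input heard the sound first
    if angle < 0:
        return "right"   # the right input heard the sound first
    return "center"
```

This sketch yields only a direction; estimating the position of the sound source S, as the description states, would need more than a single delay from two inputs.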

Next, the sound recognition unit 3e recognizes the voice collected and recorded in step S2 and generates character data for representing that voice with corresponding characters (step S4).
Next, based on the voice information stored in the voice information table T1, the individual identification unit 3f identifies the individual (a human, an animal, or the like) that emitted the voice collected by the sound collection unit 3a and recorded by the recording unit 3b (step S5).

Next, the dimension determination unit 6c determines the dimensions of the sound-related image (character image M) to be displayed in the display region R based on the volume specified in step S3 (step S6).
Specifically, for example, as shown in FIG. 4(b), when the volume of the collected and recorded voice is relatively low (for example, below a first volume threshold), the dimension determination unit 6c determines relatively small dimensions (for example, first dimensions) as the dimensions of the character image M to be displayed in the display region R. On the other hand, for example, as shown in FIG. 5, when the volume of the collected and recorded voice is relatively high (for example, at or above the first volume threshold and below a second volume threshold), the dimension determination unit 6c determines relatively large dimensions (for example, second dimensions larger than the first dimensions) as the dimensions of the character image M to be displayed in the display region R.
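
The threshold rule in step S6 maps naturally onto a sorted-threshold lookup. The sketch below is illustrative only: the threshold values, the decibel unit, the pixel dimensions, and the third size used at or above the second threshold are all assumptions; the passage above defines only the first two bands.

```python
import bisect
from typing import Tuple

# Hypothetical values: first and second volume thresholds.
VOLUME_THRESHOLDS_DB = [50.0, 70.0]
# Hypothetical (width, height) per band; the third entry is an assumption.
IMAGE_SIZES = [(120, 60), (240, 120), (360, 180)]


def image_size_for_volume(volume_db: float) -> Tuple[int, int]:
    """Dimensions of the character image M for a given volume."""
    # bisect_right puts a volume equal to a threshold into the upper band,
    # matching "at or above the first threshold and below the second".
    band = bisect.bisect_right(VOLUME_THRESHOLDS_DB, volume_db)
    return IMAGE_SIZES[band]
```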

Next, the orientation determination unit 6d determines the orientation of the leader line L of the speech balloon image included in the character image M based on the sound source direction specified in step S3 (step S7).
Specifically, for example, as shown in FIG. 3(b), when the sound source direction is leftward, the orientation determination unit 6d determines "leftward", in which the tip of the leader line L points to the left side of the display device 100, as the orientation of the leader line L of the speech balloon image. On the other hand, for example, as shown in FIG. 4(b), when the sound source direction is rightward, the orientation determination unit 6d determines "rightward", in which the tip of the leader line L points to the right side of the display device 100, as the orientation of the leader line L of the speech balloon image.

Next, the sound-related image generation unit 6e generates a sound-related image related to the voice collected and recorded in step S2, based on the recognition result of step S4 (specifically, the character data generated in step S4), the dimensions determined in step S6, the orientation of the leader line L determined in step S7, and the like (step S8).
Specifically, the sound-related image generation unit 6e acquires speech balloon image data recorded in predetermined recording means (not shown) provided in the image processing unit 6 or the like, and adjusts the size of the speech balloon image based on the acquired data so that the image based on the character data generated in step S4 (the character body image) fits within the frame of the speech balloon image. The sound-related image generation unit 6e also adjusts the orientation of the leader line L of that speech balloon image based on the orientation determined in step S7. It then combines the character body image with the speech balloon image whose size and orientation have been adjusted to generate a composite image, and generates the character image M by adjusting the dimensions of the composite image based on the dimensions determined in step S6.

Next, the face detection unit 6b detects the face of a human, an animal, or the like from the display target image acquired by the image acquisition unit 6a and displayed in the display region R of the display unit 8 (step S9).
Next, the position determination unit 6f determines whether a face was detected in step S9 (step S10).

When it is determined in step S10 that no face was detected in step S9 (step S10; NO), that is, when the display target image displayed in the display region R of the display unit 8 is, for example, a landscape picture as shown in FIG. 3(a), the position determination unit 6f determines the display position of the sound-related image in the display region R based on the sound source direction specified in step S3 (step S11).
Specifically, the position determination unit 6f determines, as the display position of the sound-related image (character image M), the coordinates of a predetermined position within the part of the display region R of the display unit 8 that lies on the side of the sound source direction specified in step S3. For example, as shown in FIG. 3(b), when the sound source direction is leftward, the position determination unit 6f determines the coordinates of a predetermined position in the left part of the display region R as the display position of the character image M.

Next, the display control unit 7 displays the sound-related image generated in step S8 at the display position determined in step S11 so that it overlaps the display target image displayed in the display region R of the display unit 8 (step S16; see FIG. 3(b)).

When it is determined in step S10 that a face was detected in step S9 (step S10; YES), that is, when the display target image displayed in the display region R of the display unit 8 is, for example, an image with a human subject as shown in FIG. 4(a) or FIG. 6(a), the position determination unit 6f determines whether the number of faces detected in step S9 is more than one (step S12).

When it is determined in step S12 that the number of faces detected in step S9 is not more than one (step S12; NO), that is, when the display target image displayed in the display region R of the display unit 8 is, for example, an image as shown in FIG. 4(a), the position determination unit 6f determines the display position of the sound-related image in the display region R based on the sound source direction specified in step S3 and the position of the face detected in step S9 (step S13).
Specifically, the position determination unit 6f specifies the face region, in the display region R of the display unit 8, of the face detected in step S9 as the face region to which the sound-related image (character image M) is to be attached, and determines, as the display position of the character image M, the coordinates of a predetermined position within the part of the peripheral region of the specified face region that lies on the side opposite to the sound source direction specified in step S3. For example, as shown in FIG. 4(b) and FIG. 5, when the sound source direction is rightward, the position determination unit 6f determines the coordinates of a predetermined position in the left part of the peripheral region of the face region in the display region R as the display position of the character image M.

Next, the display control unit 7 displays the sound-related image generated in step S8 at the display position determined in step S13 so that it overlaps the display target image displayed in the display region R of the display unit 8 (step S16; see FIG. 4(b) and FIG. 5).

When it is determined in step S12 that the number of faces detected in step S9 is more than one (step S12; YES), that is, when the display target image displayed in the display region R of the display unit 8 is, for example, an image as shown in FIG. 6(a), the position determination unit 6f determines, based on the face identification information stored in the face identification information table T2, whether the faces detected in step S9 include the face of the individual identified in step S5 (that is, the individual that emitted the collected and recorded voice) (step S14).

When it is determined in step S14 that the faces detected in step S9 do not include the face of the individual identified in step S5 (step S14; NO), the position determination unit 6f determines the display position of the sound-related image in the display region R based on the sound source direction specified in step S3 and the positions of the faces detected in step S9 (step S13).
Specifically, the position determination unit 6f specifies, among the face regions in the display region R of the display unit 8 of the faces detected in step S9, a face region that satisfies a predetermined condition as the face region to which the sound-related image (character image M) is to be attached, and determines, as the display position of the character image M, the coordinates of a predetermined position within the part of the peripheral region of the specified face region that lies on the side opposite to the sound source direction specified in step S3.
Note that the position determination unit 6f also determines in step S14 that the faces detected in step S9 do not include the face of the individual identified in step S5 when the voice information table T1 stores no voice information matching the collected and recorded voice, or when the face identification information table T2 stores no face identification information for the individual that emitted the collected and recorded voice.

Next, the display control unit 7 displays the sound-related image generated in step S8 at the display position determined in step S13 so that it overlaps the display target image displayed in the display region R of the display unit 8 (step S16).

When it is determined in step S14 that the faces detected in step S9 include the face of the individual identified in step S5 (step S14; YES), the position determination unit 6f determines the display position of the sound-related image in the display region R based on the sound source direction specified in step S3 and the position of the face that was detected in step S9 and belongs to the individual identified in step S5 (step S15).
Specifically, the position determination unit 6f specifies the face region, in the display region R of the display unit 8, of the face that was detected in step S9 and belongs to the individual identified in step S5 as the face region to which the sound-related image (character image M) is to be attached, and determines, as the display position of the character image M, the coordinates of a predetermined position within the part of the peripheral region of the specified face region that lies on the side opposite to the sound source direction specified in step S3. For example, as shown in FIG. 6(b), when the sound source direction is rightward, the position determination unit 6f determines the coordinates of a predetermined position in the left part of the peripheral region of the face region in the display region R (specifically, the face region of the individual that emitted the collected and recorded voice) as the display position of the character image M.

Next, the display control unit 7 displays the sound-related image generated in step S8 at the display position determined in step S15 so that it overlaps the display target image displayed in the display region R of the display unit 8 (step S16; see FIG. 6(b)).
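
The branching of steps S10, S12, and S14 can be summarized as a small decision function that reports which step determines the display position before the display in step S16. The function name and its two inputs are hypothetical simplifications of the determinations described above.

```python
def position_step(num_faces: int, speaker_face_found: bool) -> str:
    """Return the flowchart step that determines the display position."""
    if num_faces == 0:
        return "S11"   # step S10; NO: use the sound source side of region R
    if num_faces == 1:
        return "S13"   # step S12; NO: place beside the only detected face
    if speaker_face_found:
        return "S15"   # step S14; YES: place beside the speaker's face
    return "S13"       # step S14; NO: place beside a face meeting the condition
```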

As described above, the display device 100 of the present embodiment displays a sound-related image related to the sound collected by the sound collection unit 3a. Designating a desired image therefore requires no troublesome operation of the operation input unit 2 by the user; simply having the sound collection unit 3a collect the sound corresponding to the desired image causes that image to be displayed over the predetermined image displayed in the display region R.
Furthermore, the display device 100 determines the display position, in the display region R, of the sound-related image related to a sound based on the sound source direction of the sound collected by the sound collection unit 3a. Designating a desired position as the display position of the desired image therefore requires no troublesome operation of the operation input unit 2 by the user; simply having the sound collection unit 3a collect sound from the direction corresponding to the desired position causes that position to be determined as the display position of the sound-related image. In other words, merely by changing the position of the sound source S (for example, the user himself or herself) relative to the display device 100 according to where the sound-related image should appear, the user can have the desired image displayed at the desired position within the predetermined image displayed in the display region R.
Accordingly, the display device 100 can easily display the sound-related image desired by the user at the position desired by the user in the predetermined image displayed in the display region R of the display unit 8.

Moreover, according to the display device 100 of the present embodiment, a character image M that represents the sound collected by the sound collection unit 3a in corresponding characters can be displayed as the sound-related image. Cumbersome operations such as entering characters through the operation input unit 2 are therefore unnecessary: simply having the sound collection unit 3a collect the sound corresponding to the character image M desired by the user causes that character image M to be displayed superimposed on the predetermined image displayed in the display region R.

Further, according to the display device 100 of the present embodiment, the display position of the character image M in the display region R is determined based on the sound source direction of the sound collected by the sound collection unit 3a and the position of a face detected from the predetermined image (display target image) displayed in the display region R of the display unit 8. Therefore, simply by having the sound collection unit 3a collect the sound corresponding to the user-desired character image M from the direction corresponding to the user-desired position, the display position of that character image M can be determined based on the user-desired position and the face portion in the predetermined image displayed in the display region R.

Further, according to the display device 100 of the present embodiment, the display position of the character image M in the display region R is determined based on the sound source direction of the voice collected by the sound collection unit 3a and the position of the face of the individual that uttered the voice, among the faces detected from the predetermined image (display target image) displayed in the display region R of the display unit 8. Therefore, simply by having the sound collection unit 3a collect, from the direction corresponding to the user-desired position, the voice uttered by the individual and corresponding to the user-desired character image M, the display position of that character image M can be determined based on the user-desired position and the face portion of that individual in the predetermined image displayed in the display region R.

Further, according to the display device 100 of the present embodiment, the direction of the leader line L of the speech balloon image included in the character image M is determined based on the sound source direction of the sound collected by the sound collection unit 3a. Therefore, an appropriate direction can be determined for the leader line L of the speech balloon image without cumbersome operations such as the user entering the direction of the leader line L through the operation input unit 2.
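As an illustration of deriving the leader line from the sound source direction, the sketch below attaches the line to the balloon edge that faces the sound source and extends it outward. The text only states that the direction of the leader line L is determined from the sound source direction; the specific rule used here (the line points toward the source), the `leader_line` function, the direction labels, and the `length` parameter are assumptions for the example.

```python
from typing import Tuple

Point = Tuple[int, int]

# Unit vectors in screen coordinates (y grows downward); labels are hypothetical.
_DIRECTION_VECTOR = {
    "right": (1, 0),
    "left": (-1, 0),
    "up": (0, -1),
    "down": (0, 1),
}


def leader_line(balloon: Tuple[int, int, int, int], direction: str,
                length: int = 20) -> Tuple[Point, Point]:
    """Return (start, end) of a leader line for a balloon given as
    (left, top, width, height). Assumed rule: the line starts at the middle
    of the balloon edge facing the sound source and extends toward it."""
    left, top, w, h = balloon
    dx, dy = _DIRECTION_VECTOR[direction]
    cx, cy = left + w // 2, top + h // 2
    start = (cx + dx * (w // 2), cy + dy * (h // 2))
    end = (start[0] + dx * length, start[1] + dy * length)
    return start, end
```

Combined with a balloon placed on the side of the face opposite to the sound source, this rule makes the leader line point back toward the face of the speaker.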

Further, according to the display device 100 of the present embodiment, the dimensions of the sound-related image (character image M) displayed in the display region R of the display unit 8 can be determined based on the volume of the sound collected by the sound collection unit 3a. Cumbersome operations such as entering the desired dimensions through the operation input unit 2 are therefore unnecessary: simply having the sound collection unit 3a collect a sound at the volume corresponding to the desired dimensions causes a sound-related image (character image M) of those dimensions to be displayed superimposed on the predetermined image displayed in the display region R.
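One way to map volume to image size is a clamped linear interpolation, sketched below. The text does not specify the mapping, so the function name `font_size_for_volume`, the decibel range, and the point sizes are all illustrative assumptions.

```python
def font_size_for_volume(volume_db: float,
                         min_db: float = 40.0, max_db: float = 90.0,
                         min_pt: int = 12, max_pt: int = 48) -> int:
    """Map a measured volume to a character size by linear interpolation.

    Volumes below min_db or above max_db are clamped, so the character
    image never becomes unreadably small or larger than max_pt.
    All thresholds are hypothetical values chosen for the example.
    """
    t = (volume_db - min_db) / (max_db - min_db)
    t = max(0.0, min(1.0, t))
    return round(min_pt + t * (max_pt - min_pt))
```

A louder sound thus yields a larger character image M, and a quieter one a smaller image, without any size input from the user.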

The present invention is not limited to the above embodiment, and various improvements and design changes may be made without departing from the spirit of the invention.

For example, instead of the voice information table T1 and the face identification information table T2, the display device 100 may store a single table in which the voice information table T1 and the face identification information table T2 are integrated. That is, the display device 100 may store a table that associates voice information for identifying the voice of an individual such as a human or an animal, face identification information for identifying the face of that individual, and identification information for identifying that individual. In this case, the table constitutes both the first storage means and the second storage means.
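The integrated table can be pictured as one record per individual holding all three pieces of information. The sketch below is only an illustration: the `IndividualRecord` type, the representation of voice and face information as numeric feature tuples, and the nearest-neighbour matching with a `max_distance` threshold are assumptions, since the text does not describe how the stored information is encoded or compared.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class IndividualRecord:
    """One row of the integrated table (replaces separate tables T1 and T2)."""
    individual_id: str                 # identification information
    voice_features: Tuple[float, ...]  # voice information (assumed encoding)
    face_features: Tuple[float, ...]   # face identification information (assumed encoding)


def identify_by_voice(table: Sequence[IndividualRecord],
                      voice: Tuple[float, ...],
                      max_distance: float = 1.0) -> Optional[IndividualRecord]:
    """Return the record whose voice features are closest to `voice`,
    or None if the table is empty or no record is within max_distance."""
    best = min(table, key=lambda r: math.dist(r.voice_features, voice), default=None)
    if best is None or math.dist(best.voice_features, voice) > max_distance:
        return None
    return best
```

Because the matched record already carries the face identification information, a single lookup by voice yields what is needed to find the speaker's face in the displayed image.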

The image processing unit 6 may also determine the orientation of the sound-related image itself based on the sound source direction specified by the sound source direction specifying unit 3d, that is, the sound source direction of the sound collected by the sound collection unit 3a. Specifically, for example, when the sound source direction is rightward or leftward, the image processing unit 6 may orient the character image M (sound-related image) so that its characters are written horizontally, side by side, and when the sound source direction is upward or downward, it may orient the character image M (sound-related image) so that its characters are written vertically, one below another.
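The horizontal or vertical writing rule can be sketched as a small layout function. The function name `layout_characters`, the direction labels, the `origin` parameter, and the fixed character `pitch` are assumptions made for the example.

```python
from typing import List, Tuple


def layout_characters(text: str, direction: str,
                      origin: Tuple[int, int] = (0, 0),
                      pitch: int = 16) -> List[Tuple[str, int, int]]:
    """Return (character, x, y) placements for the character image.

    Left/right sound source directions give horizontal writing,
    up/down directions give vertical writing (y grows downward).
    """
    if direction in ("left", "right"):
        step = (pitch, 0)   # characters side by side
    elif direction in ("up", "down"):
        step = (0, pitch)   # characters stacked top to bottom
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    x0, y0 = origin
    return [(ch, x0 + i * step[0], y0 + i * step[1]) for i, ch in enumerate(text)]
```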

When the display device 100 includes an imaging unit, the individual identification unit 3f may identify the individual, such as a human or an animal, that uttered the voice collected by the sound collection unit 3a based on an image captured by the imaging unit instead of the voice information. That is, whether voice information is used to identify each individual of at least one of humans and animals can be changed as appropriate.

In the above embodiment, the display control unit 7 displays the character image M superimposed on the display target image already displayed in the display region R. Alternatively, for example, the display position of the character image M may be determined in the display target image before that image is displayed in the display region R, and the display target image and the character image M may then be displayed together in the display region R.
In addition, although the display control unit 7 superimposes the character image M as an OSD image so that it overlaps the display target image, the image processing unit 6 may, for example, generate an image in which the display target image and the character image M are combined, and the display control unit 7 may display the combined image in the display region R.

The sound source S does not have to be an individual such as a human or an animal that utters a voice; it may be any object that emits sound.
When the sound source S is an object that emits a sound other than a voice, such as a car, the image processing unit 6 may detect, from the display target image (for example, an image with a car as the subject), the region of the sound source S (for example, the car) that emitted the sound collected by the sound collection unit 3a, based on information for identifying the sound emitted by the sound source S, information for identifying the outer shape of the sound source S or the shape of its characteristic portions, and the like, and may determine the display position of the sound-related image based on the position of the detected region of the sound source S.
When the display device 100 includes an imaging unit, the region of the sound source S (for example, a car) that emitted the sound collected by the sound collection unit 3a may be detected from the display target image (for example, an image with a car as the subject) based on information for identifying the outer shape of the sound source S or the shape of its characteristic portions, an image captured by the imaging unit, and the like, and the display position of the sound-related image may be determined based on the position of the detected region of the sound source S.

In the above embodiment, the sound-related image is a character image M that represents the voice collected by the sound collection unit 3a in corresponding characters. However, the sound-related image may be, for example, a character image M that represents a sound other than a voice (for example, noise) collected by the sound collection unit 3a in corresponding characters. Specifically, when the sound of a car horn, "プップー" (a honking sound), is collected by the sound collection unit 3a, the display device 100 may display, as the sound-related image, a character image M representing that sound in the corresponding characters ("プップー").
The sound-related image is not limited to the character image M and can be changed as appropriate as long as it is an image related to the sound collected by the sound collection unit 3a. Specifically, when the word "たいよう" (taiyō, "sun") is collected as a voice by the sound collection unit 3a, the display device 100 may display an image of the sun as the sound-related image. Likewise, when a word that evokes the morning, such as "おはよう" (ohayō, "good morning"), is collected as a voice by the sound collection unit 3a, the display device 100 may display an image that evokes the morning, such as an image of the sun. When the sound of a car horn, "プップー", is collected by the sound collection unit 3a, the display device 100 may display, as the sound-related image, an image of something that can be associated with that sound, such as an image of a car.
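The choice between a character image and an associated picture can be sketched as a lookup with a fallback. The three keys below are the examples given in the text; the file names, the `ASSOCIATED_IMAGES` table, and the `select_sound_related_image` function are hypothetical.

```python
from typing import Dict, Tuple

# Recognized sound -> associated picture. Keys are the examples from the text;
# the file names are placeholders invented for this sketch.
ASSOCIATED_IMAGES: Dict[str, str] = {
    "たいよう": "sun.png",   # the word "sun"
    "おはよう": "sun.png",   # a greeting that evokes the morning
    "プップー": "car.png",   # a car horn
}


def select_sound_related_image(recognized: str) -> Tuple[str, str]:
    """Return ("picture", file) when an associated picture is registered,
    otherwise ("text", recognized) so the sound is shown as a character image."""
    image = ASSOCIATED_IMAGES.get(recognized)
    if image is not None:
        return ("picture", image)
    return ("text", recognized)
```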

The display position of the sound-related image in the display region R is not limited to that of the above embodiment and can be changed as appropriate, as long as it can be determined based on at least the specified sound source direction.
Similarly, the direction of the leader line L of the speech balloon image is not limited to that of the above embodiment and can be changed as appropriate, as long as it can be determined based on the specified sound source direction. The direction of the leader line L of the speech balloon image may also be determined based on information other than the sound source direction.
The dimensions of the sound-related image displayed in the display region R are likewise not limited to those of the above embodiment and can be changed as appropriate, as long as they can be determined based on the specified volume. The dimensions of the sound-related image displayed in the display region R may also be determined based on information other than the volume.

In the above embodiment, when a plurality of faces are detected by the face detection unit 6b, the position determination unit 6f determines the display position of the character image M in the display region R based on the position of the face that is detected from the predetermined image (display target image) by the face detection unit 6b and identified using the face identification information of the individual identified by the individual identification unit 3f. However, the invention is not limited to this. For example, the position determination unit 6f may determine the display position of the character image M in the display region R based on the position of that face regardless of the number of faces detected by the face detection unit 6b. In this case, when only one face is detected by the face detection unit 6b and that face is not the face of the individual that uttered the collected and recorded voice, the position determination unit 6f determines the display position of the character image M in the display region R based only on the sound source direction specified by the sound source direction specifying unit 3d, as in the case where no face is detected by the face detection unit 6b.
Also, although the position determination unit 6f identifies the face of the individual that uttered the collected and recorded voice using the face identification image information, the face detection unit 6b may, for example, identify that face using the face identification image information at the same time as it detects faces.
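The fallback behaviour in this variation can be sketched as a single decision function. The representation of detected faces as a dictionary keyed by individual ID, the function name `choose_positioning`, and the two strategy labels are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

FaceRegion = Tuple[int, int, int, int]  # (left, top, width, height)


def choose_positioning(identified_faces: Dict[str, FaceRegion],
                       speaker_id: Optional[str]
                       ) -> Tuple[str, Optional[FaceRegion]]:
    """Decide which information the display position is based on.

    identified_faces maps an individual ID to the face region found in the
    displayed image (hypothetical structure). If the speaker's face is among
    them, use sound source direction plus face position; otherwise, including
    when no face was detected at all, use the sound source direction only.
    The number of detected faces does not affect the decision.
    """
    face = identified_faces.get(speaker_id) if speaker_id is not None else None
    if face is not None:
        return ("direction_and_face", face)
    return ("direction_only", None)
```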

In addition, in the above embodiment, the functions of the acquisition means, the direction specifying means, the position determination means, and the display control means are realized by driving the image acquisition unit 6a, the sound source direction specifying unit 3d, the position determination unit 6f, and the display control unit 7 under the control of the central control unit 1. However, the invention is not limited to this, and these functions may be realized by the CPU of the central control unit 1 executing a predetermined program or the like.
That is, a program including an acquisition processing routine, a direction specifying processing routine, a position determination processing routine, and a display control processing routine is stored in a program memory (not shown). The acquisition processing routine may cause the CPU of the central control unit 1 to function as acquisition means for acquiring the predetermined image displayed in the display area of the display means. The direction specifying processing routine may cause the CPU of the central control unit 1 to function as direction specifying means for specifying the sound source direction of the sound collected by the sound collection means. The position determination processing routine may cause the CPU of the central control unit 1 to function as position determination means for determining, based on the sound source direction specified by the direction specifying means, the display position in the display area of the sound-related image related to the sound. The display control processing routine may cause the CPU of the central control unit 1 to function as display control means for displaying the sound-related image at the display position determined by the position determination means so as to overlap the predetermined image displayed in the display area.
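The four routines can be pictured as callables invoked in sequence by one program. The sketch below is schematic only: the function `run_display_program`, its parameter names, and the use of plain Python callables in place of the routines are assumptions, and the text does not prescribe this structure.

```python
from typing import Any, Callable, Tuple

Position = Tuple[int, int]


def run_display_program(acquire: Callable[[], Any],
                        specify_direction: Callable[[], str],
                        determine_position: Callable[[str], Position],
                        display_overlay: Callable[[Any, Position], None]) -> Position:
    """Run the four routines in order and return the chosen display position."""
    image = acquire()                          # acquisition processing routine
    direction = specify_direction()            # direction specifying processing routine
    position = determine_position(direction)   # position determination processing routine
    display_overlay(image, position)           # display control processing routine
    return position
```

Passing the routines in as parameters keeps the sketch independent of any particular hardware; in the variation described above they would all be executed by the CPU of the central control unit 1.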

Similarly, other functions such as those of the sound recognition means, detection means, identification means, direction determination means, volume specifying means, and dimension determination means may also be realized by the CPU of the central control unit 1 executing a predetermined program or the like.

Further, as a computer-readable medium storing the program for executing each of the above processes, a non-volatile memory such as a flash memory or a portable image recording medium such as a CD-ROM can be used in addition to a ROM, a hard disk, or the like. A carrier wave can also be used as a medium for providing the program data via a predetermined communication line.

Although embodiments of the present invention have been described, the scope of the present invention is not limited to the above embodiments and includes the scope of the invention described in the claims and its equivalents.
The inventions described in the claims originally attached to the request of this application are appended below. The claim numbers given in the appendix are the same as in the claims originally attached to the request of this application.
[Appendix]
<Claim 1>
A display device comprising display means, the display device comprising:
acquisition means for acquiring an image displayed in a display area of the display means;
sound collection means for collecting sound emitted from a sound source;
direction specifying means for specifying, with reference to the position of the device body, the sound source direction of the sound collected by the sound collection means;
position determination means for determining, based on the sound source direction of the sound specified by the direction specifying means, a display position in the display area of a sound-related image related to the sound; and
display control means for displaying the sound-related image at the display position determined by the position determination means so as to overlap the image displayed in the display area.
<Claim 2>
The display device according to claim 1, further comprising detection means for detecting a face from the image acquired by the acquisition means,
wherein the position determination means determines the display position of the sound-related image in the display area further based on the position of the face detected from the image by the detection means.
<Claim 3>
The display device according to claim 2, further comprising:
first storage means for storing face identification information for identifying the face of each individual of at least one of humans and animals;
second storage means for storing voice information used for identifying each individual of at least one of humans and animals; and
identification means for identifying, based on the voice information stored in the second storage means, the individual that uttered the voice collected by the sound collection means,
wherein the position determination means determines the display position of the sound-related image in the display area based on the position of a face that is detected from the image by the detection means and identified using the face identification information of the individual identified by the identification means.
<Claim 4>
The display device according to any one of claims 1 to 3, further comprising sound recognition means for recognizing the sound collected by the sound collection means,
wherein the display control means displays, as the sound-related image, a character image representing the sound recognized by the sound recognition means in corresponding characters.
<Claim 5>
The display device according to claim 4, wherein the character image includes a speech balloon image having a leader line connected to a predetermined position of a frame portion surrounding the characters,
the display device further comprising direction determination means for determining the direction of the leader line of the speech balloon image based on the sound source direction specified by the direction specifying means.
<Claim 6>
The display device according to any one of claims 1 to 5, further comprising:
volume specifying means for specifying the volume of the sound collected by the sound collection means; and
dimension determination means for determining, based on the volume specified by the volume specifying means, the dimensions of the sound-related image displayed in the display area.
<Claim 7>
A display method using a display device comprising display means and sound collection means for collecting sound emitted from a sound source, the method comprising:
a process of displaying an image in a display area of the display means;
a process of specifying, with reference to the position of the display device body, the sound source direction of the sound collected by the sound collection means;
a process of determining, based on the specified sound source direction of the sound, a display position in the display area of a sound-related image related to the sound; and
a process of displaying the sound-related image at the determined display position so as to overlap the image displayed in the display area.
<Claim 8>
A program causing a computer of a display device comprising display means and sound collection means for collecting sound emitted from a sound source to function as:
acquisition means for acquiring, with reference to the position of the display device body, an image displayed in a display area of the display means;
direction specifying means for specifying the sound source direction of the sound collected by the sound collection means;
position determination means for determining, based on the sound source direction of the sound specified by the direction specifying means, a display position in the display area of a sound-related image related to the sound; and
display control means for displaying the sound-related image at the display position determined by the position determination means so as to overlap the image displayed in the display area.

DESCRIPTION OF SYMBOLS
1 Central control unit
3a Sound collection unit
3c Volume specifying unit
3d Sound source direction specifying unit
3e Sound recognition unit
3f Individual identification unit
6a Image acquisition unit
6b Face detection unit
6c Dimension determination unit
6d Direction determination unit
6f Position determination unit
7 Display control unit
8 Display unit
100 Display device
L Leader line
M Character image
R Display region
S Sound source
T1 Voice information table
T2 Face identification information table

Claims

1. A display device comprising display means, the display device comprising:
acquisition means for acquiring an image displayed in a display area of the display means;
sound collection means for collecting sound emitted from a sound source;
direction specifying means for specifying, with reference to the position of the device body, the sound source direction of the sound collected by the sound collection means;
position determination means for determining, based on the sound source direction of the sound specified by the direction specifying means, a display position in the display area of a sound-related image related to the sound; and
display control means for displaying the sound-related image at the display position determined by the position determination means so as to overlap the image displayed in the display area.

2. The display device according to claim 1, further comprising detection means for detecting a face from the image acquired by the acquisition means,
wherein the position determination means determines the display position of the sound-related image in the display area further based on the position of the face detected from the image by the detection means.

3. The display device according to claim 2, further comprising:
first storage means for storing face identification information for identifying the face of each individual of at least one of humans and animals;
second storage means for storing voice information used for identifying each individual of at least one of humans and animals; and
identification means for identifying, based on the voice information stored in the second storage means, the individual that uttered the voice collected by the sound collection means,
wherein the position determination means determines the display position of the sound-related image in the display area based on the position of a face that is detected from the image by the detection means and identified using the face identification information of the individual identified by the identification means.

4. The display device according to any one of claims 1 to 3, further comprising sound recognition means for recognizing the sound collected by the sound collection means,
wherein the display control means displays, as the sound-related image, a character image representing the sound recognized by the sound recognition means in corresponding characters.

5. The display device according to claim 4, wherein the character image includes a speech balloon image having a leader line connected to a predetermined position of a frame portion surrounding the characters,
the display device further comprising direction determination means for determining the direction of the leader line of the speech balloon image based on the sound source direction specified by the direction specifying means.

6. The display device according to any one of claims 1 to 5, further comprising:
volume specifying means for specifying the volume of the sound collected by the sound collection means; and
dimension determination means for determining, based on the volume specified by the volume specifying means, the dimensions of the sound-related image displayed in the display area.

7. A display method using a display device comprising display means and sound collection means for collecting sound emitted from a sound source, the method comprising:
a process of displaying an image in a display area of the display means;
a process of specifying, with reference to the position of the display device body, the sound source direction of the sound collected by the sound collection means;
a process of determining, based on the specified sound source direction of the sound, a display position in the display area of a sound-related image related to the sound; and
a process of displaying the sound-related image at the determined display position so as to overlap the image displayed in the display area.

8. A program causing a computer of a display device comprising display means and sound collection means for collecting sound emitted from a sound source to function as:
acquisition means for acquiring, with reference to the position of the display device body, an image displayed in a display area of the display means;
direction specifying means for specifying the sound source direction of the sound collected by the sound collection means;
position determination means for determining, based on the sound source direction of the sound specified by the direction specifying means, a display position in the display area of a sound-related image related to the sound; and
display control means for displaying the sound-related image at the display position determined by the position determination means so as to overlap the image displayed in the display area.