JP2013041012A

JP2013041012A - Imaging apparatus and imaging method

Info

Publication number: JP2013041012A
Application number: JP2011176514A
Authority: JP
Inventors: Noriyuki Aoki; 規至青木
Original assignee: NEC Casio Mobile Communications Ltd
Current assignee: NEC Casio Mobile Communications Ltd
Priority date: 2011-08-12
Filing date: 2011-08-12
Publication date: 2013-02-28

Abstract

PROBLEM TO BE SOLVED: To enable an imaging apparatus to give a notice, such as an instruction to move, more accurately to a person who is a subject. SOLUTION: Each of voice output parts 132-1 to 132-12 includes a parametric speaker and outputs a voice toward a part of the imaging region of an imaging part 150. Thus, a notice can be given to only some of the persons included in the imaging region, or only when a person is located in a specific region. Therefore, a personal digital assistant 100 can give the notice more accurately; that is, it can notify only the intended person with a short message.

Description

The present invention relates to an imaging device, such as a camera-equipped mobile terminal device or a digital camera, and to an imaging method for the imaging device.

When capturing an image with an imaging device such as a camera-equipped mobile terminal device or a digital camera, the user generally checks the imaging region in a finder while shooting. Checking the imaging region in the finder is troublesome for the user and may delay the shutter timing.
In the camera described in Patent Document 1, the speaker is therefore controlled by a speaker control unit to generate a notification sound that can be heard in substantially the same range as the shooting range. When the shutter button is operated in the self-timer shooting mode, the speaker control unit generates the notification sound at a constant cycle until shooting is executed, and generates a notification sound indicating execution of shooting when the shot is taken.
This notification sound is said to let the person being photographed know whether he or she is within the shooting range, and when the shot will be taken.

Patent Document 1: JP 2007-264457 A

A method that notifies a person who is a subject by sound, such as the notification sound used in the camera described in Patent Document 1, may be unable to convey detailed information accurately. For example, suppose that in a group photo the two people at the right end, as seen facing the imaging device, are standing too far to the right and should move to the left. If the imaging device outputs "Please move to the left," the other people may move to the left as well. On the other hand, if the imaging device outputs a message that spells out the details, such as "The two people at the right end, please move to the left," the message may be too long to get across to the subjects (they may not understand it correctly). In addition, preparing such long messages in advance for every scene that can be anticipated is a heavy burden for the person who prepares them.

An object of the present invention is to provide an imaging device and an imaging method that can solve the problem described above.

The present invention has been made to solve the problem described above. An imaging device according to one aspect of the present invention comprises an imaging unit and an audio output unit that outputs sound audible in a part of the imaging region of the imaging unit.

An imaging method according to one aspect of the present invention is an imaging method for an imaging device and comprises an audio output step of outputting sound audible in a part of an imaging region, and an imaging step of imaging the imaging region.

According to the present invention, a person who is a subject can be notified more accurately.

A schematic block diagram showing the functional configuration of a mobile terminal device according to an embodiment of the present invention.
A perspective view showing an outline of the external shape of the mobile terminal device in the embodiment.
An explanatory diagram showing the region to which each of the audio output units outputs sound in the embodiment.
An explanatory diagram showing an example of a subject position instruction given by the mobile terminal device in the embodiment.
An explanatory diagram showing an example of a focus area detected by the mobile terminal device (focus area detection unit) in the embodiment.
An explanatory diagram showing an example of a smile area detected by the mobile terminal device (smile detection unit) in the embodiment.
A flowchart showing the processing procedure when the mobile terminal device executes the imaging function for capturing a still image in the embodiment.
A flowchart showing the procedure by which the mobile terminal device performs processing related to the subject position instruction function in the embodiment.
A flowchart showing the procedure by which the mobile terminal device performs processing related to focus area notification in the embodiment.
A flowchart showing the procedure by which the mobile terminal device performs processing related to smile detection notification in the embodiment.

Embodiments of the present invention are described below with reference to the drawings. The following describes an example in which the present invention is applied to a mobile terminal device, but the scope of application of the present invention is not limited to mobile terminal devices. The present invention can be applied to various devices that perform imaging, such as digital cameras.

FIG. 1 is a schematic block diagram showing the functional configuration of a mobile terminal device according to an embodiment of the present invention. As shown in the figure, the mobile terminal device 100 includes a display unit 110, an operation input unit 120, an audio input unit 131, audio output units 132-1 to 132-12, a wireless communication unit 140, an imaging unit 150, a control unit 180, and a storage unit 190. The control unit 180 includes a display control unit 210, an input processing unit 220, an audio processing unit 230, a communication control unit 240, an imaging control unit 250, and an application execution unit 260. The application execution unit 260 includes a subject detection unit 261, a distance measurement unit 262, a focus area detection unit 263, and a smile detection unit 264.

The mobile terminal device 100 is, for example, a personal digital assistant (PDA), and executes various functions such as an imaging function and an e-mail function in response to user operations. The imaging function here is a function that adjusts the focus (the distance between the imaging unit 150 and the focal point) and the zoom of the imaging unit 150, displays a finder image, saves a captured image, and so on.

The display unit 110 has a display screen such as a liquid crystal display or an organic EL (Organic Electro-Luminescence) display, and displays various images such as moving images, still images, and text (characters). In particular, the display unit 110 displays the finder image output from the imaging unit 150 when the mobile terminal device 100 executes the imaging function.

The operation input unit 120 has input devices such as a touch sensor (touch panel) provided on the display screen of the display unit 110 and push buttons, and accepts user operations. In particular, the operation input unit 120 accepts operations that instruct the start or end of various functions such as the imaging function, and an operation that instructs saving of a captured image (a shutter button press).

Each of the audio output units 132-1 to 132-12 outputs sound that is audible in a part of the imaging region of the imaging unit 150 (the region that is imaged in the image acquired by the imaging unit 150). The audio output units 132-1 to 132-12 output sound that is audible in mutually different regions.
Each of the audio output units 132-1 to 132-12 is realized using, for example, a parametric speaker (a super-directional speaker). Using the parametric speaker, each audio output unit, for example, amplitude-modulates an ultrasonic carrier wave with the waveform of the sound to be output and radiates it at an amplitude large enough to produce nonlinear effects as the ultrasonic wave propagates through the air, thereby outputting highly directional sound.
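The amplitude modulation step described above can be sketched as follows. This is only a minimal illustration: the 40 kHz carrier, the 192 kHz sample rate, the modulation depth of 0.8, and the function name `am_modulate` are all assumptions that do not appear in the source. The demodulation that makes the sound audible happens physically in the air and is not modeled here.

```python
import math

def am_modulate(audio, sample_rate_hz=192_000, carrier_hz=40_000.0, depth=0.8):
    """Amplitude-modulate an ultrasonic carrier with an audio waveform.

    audio: samples in the range [-1.0, 1.0].
    All default values are illustrative assumptions, not from the patent.
    """
    out = []
    for n, sample in enumerate(audio):
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate_hz)
        # The audio waveform scales the carrier's envelope around 1.0.
        out.append((1.0 + depth * sample) * carrier)
    return out

# Example: modulate 10 ms of a 1 kHz tone.
rate = 192_000
tone = [math.sin(2.0 * math.pi * 1_000.0 * n / rate) for n in range(rate // 100)]
signal = am_modulate(tone, sample_rate_hz=rate)
```

The sample rate has to exceed twice the highest frequency in the modulated signal, which is why a rate well above ordinary audio rates is assumed here.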

The number of audio output units included in the mobile terminal device 100 is not limited to the twelve shown in FIG. 1; one or more is sufficient.
In the present embodiment, the direction in which each of the audio output units 132-1 to 132-12 outputs sound (that is, the region to which each of them outputs sound) is fixed relative to the attitude (orientation) of the main body of the mobile terminal device 100. In implementing the present invention, however, the direction of a parametric speaker included in the mobile terminal device (the direction in which it outputs sound) may be made variable relative to the attitude of the main body. The mobile terminal device can then select any one of a plurality of regions and output sound to that region using a single parametric speaker.

When the mobile terminal device 100 executes the subject position instruction function, one of the audio output units 132-1 to 132-12 outputs sound that is audible in a region including a part of the outer edge of the imaging region and that prompts the listener to move toward the center of the imaging region.
The subject position instruction function here is a function that instructs a person who is a subject to change his or her standing position (the position at which he or she is imaged). By instructing a person located near the edge of the imaging region to move toward its center, the audio output units 132-1 to 132-12 prevent part of that person from falling outside the imaging region and being cut off in the captured image, and prevent the captured image from becoming unbalanced because some of the people are crowded toward one side.
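One way the selection of a speaker for this function could work is sketched below. It is an illustration under stated assumptions, not the procedure of the embodiment: the 10% edge margin, the 3-row by 4-column grid orientation, the row-major region numbering, and the name `regions_to_prompt` are all hypothetical.

```python
def regions_to_prompt(faces, width, height, rows=3, cols=4, margin=0.1):
    """Return the sorted indices of regions whose speaker should ask
    the listener to move toward the center of the imaging region.

    faces: (x, y) face-center positions in camera-image pixels.
    margin: fraction of the image treated as "near the edge" (assumed).
    """
    targets = set()
    for x, y in faces:
        near_edge = (
            x < margin * width or x > (1.0 - margin) * width
            or y < margin * height or y > (1.0 - margin) * height
        )
        if near_edge:
            # Map the face position to the grid cell (region) containing it.
            col = min(int(x * cols / width), cols - 1)
            row = min(int(y * rows / height), rows - 1)
            targets.add(row * cols + col)
    return sorted(targets)

# Three faces in a 640x480 camera image: left edge, center, right edge.
prompted = regions_to_prompt([(10, 240), (320, 240), (630, 240)], 640, 480)
```

Because only the speakers covering the returned regions play the prompt, the person in the middle of the frame does not hear it.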

When the mobile terminal device 100 executes the focus area notification function, one of the audio output units 132-1 to 132-12 outputs sound indicating the focus area detected by the focus area detection unit 263.
The focus area notification function here is a function that notifies a person who is a subject whether or not he or she is located in the focus area. The focus area is the area that is in focus in the image captured by the imaging unit 150. More specifically, the focus area is, among the areas into which the imaging region is divided, an area that includes a subject imaged in focus. The areas into which the imaging region is divided are the areas obtained by dividing the imaging region of the imaging unit 150 into the regions to which the respective audio output units 132-1 to 132-12 output sound.

Because the audio output units 132-1 to 132-12 notify a subject whether or not he or she is located in the focus area, the person who is the subject can know whether or not he or she is in focus. For example, a person who receives a notification that he or she is not located in the focus area learns that he or she is out of focus. That person can then move to change his or her position relative to the mobile terminal device 100 and thereby come into focus (that is, obtain a captured image in which he or she is not blurred).

When the mobile terminal device 100 executes the smile notification function, one of the audio output units 132-1 to 132-12 outputs sound indicating the area that includes a smile detected by the smile detection unit 264.
The smile notification function here is a function that notifies a person who is a subject whether or not he or she is located in a smile detection area. The smile detection area is, among the areas into which the imaging region is divided, an area in which the smile detection unit 264 has detected a smile.
Because the audio output units 132-1 to 132-12 notify a subject whether or not he or she is located in a smile detection area, the person who is the subject can confirm whether or not his or her smile has been detected. The audio output units 132-1 to 132-12 can also output sound that prompts a subject located outside the smile detection areas to smile.
In the following, the audio output units 132-1 to 132-12 are collectively referred to as the "audio output unit 132".
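The per-region behavior of the smile notification function might be expressed as in the following sketch. The message texts, the use of integer region indices, and the name `smile_messages` are assumptions made for illustration; the source only states that subjects can be told whether a smile was detected and can be prompted to smile.

```python
def smile_messages(face_regions, smile_regions):
    """Choose a message for each region that contains a detected person.

    face_regions: indices of regions in which a person was detected.
    smile_regions: indices of regions in which a smile was detected.
    The message texts are hypothetical placeholders.
    """
    messages = {}
    for region in sorted(set(face_regions)):
        if region in smile_regions:
            messages[region] = "Smile detected."
        else:
            messages[region] = "Please smile."
    return messages

# People stand in regions 1, 5 and 6; only the person in region 5 is smiling.
plan = smile_messages({1, 5, 6}, {5})
```

Each message would be played only by the audio output unit that covers the corresponding region, so a person who is already smiling is not asked to smile.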

The audio input unit 131 has a microphone; it picks up ambient sound, converts it into an audio signal, and outputs the signal to the audio processing unit 230. In particular, the audio input unit 131 picks up sound that was output from the audio output unit 132 and reflected by the subject (reflected sound).
The wireless communication unit 140 communicates with a wireless base station. Specifically, the wireless communication unit 140 modulates the signal output from the communication control unit 240 and transmits it as a wireless signal, and demodulates a received wireless signal and outputs the result to the communication control unit 240. For example, the wireless communication unit 140 transmits and receives e-mail data as wireless signals.

The imaging unit 150 has a camera that includes imaging lenses (including a focus lens and a zoom lens) and an image sensor, and performs imaging. Specifically, the imaging unit 150 receives light from the subject and forms a subject image on the image sensor with a focus determined by the position of the focus lens and a zoom determined by the position of the zoom lens. The image sensor converts the formed subject image into an image signal, which the imaging unit 150 outputs to the imaging control unit 250. While the mobile terminal device 100 is executing the imaging function, the imaging unit 150 outputs the image signal continuously.
Under the control of the imaging control unit 250, the imaging unit 150 also focuses at the distance to the subject detected by the distance measurement unit 262 (by moving the focus lens).
In the following, the image represented by the signal output from the imaging unit 150 is called the "camera image". The camera image displayed on the display unit 110 is called the "finder image", and the camera image saved in the storage unit 190 is called the "captured image". That is, the camera image acquired by the imaging unit 150 is displayed on the display unit 110 as the finder image and, at the save timing instructed by an operation accepted by the operation input unit 120, is saved in the storage unit 190 as the captured image.

The control unit 180 controls each unit of the mobile terminal device 100 to execute various functions. The control unit 180 is realized, for example, by a CPU (Central Processing Unit) of the mobile terminal device 100 reading a program from a memory of the mobile terminal device 100 and executing it.
The display control unit 210 controls the display unit 110 to display various images. Specifically, the display control unit 210 generates a screen display signal based on moving image data, still image data, text data, and the like output from the application execution unit 260, and outputs the signal to the display unit 110, thereby causing the display unit 110 to display an image. In particular, while the mobile terminal device 100 is executing the imaging function, the display control unit 210 causes the display unit 110 to display the finder image that the application execution unit 260 outputs as image data.

The input processing unit 220 outputs a signal corresponding to the operation accepted by the operation input unit 120 to the application execution unit 260.
In particular, when the operation input unit 120 accepts an operation instructing the start of the imaging function, the input processing unit 220 outputs a signal instructing the start of the imaging function. For example, when the imaging function item is touched while the display unit 110 is displaying a menu listing the functions that the mobile terminal device 100 can execute, the operation input unit 120 outputs a signal indicating the touch position (the position touched on the display screen) to the input processing unit 220. Based on the touch position, the input processing unit 220 detects that the imaging function item was touched and outputs a signal instructing the start of the imaging function.
Likewise, when the operation input unit 120 accepts an operation instructing saving of a captured image, the input processing unit 220 outputs a signal instructing saving of the captured image. For example, when the shutter button image (icon) is touched (pressed) while the display unit 110 is displaying it, the operation input unit 120 outputs a signal indicating the touch position to the input processing unit 220. Based on the touch position, the input processing unit 220 detects that the shutter button image was touched and outputs a signal instructing saving of the captured image.
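The conversion from a touch position to an instruction signal, as described for the input processing unit 220, amounts to a hit test against the displayed items. The sketch below assumes rectangular items and uses hypothetical names (`Button`, `signal_for_touch`, and the signal strings); none of them comes from the source.

```python
from typing import NamedTuple, Optional, Sequence

class Button(NamedTuple):
    """A rectangular on-screen item and the signal it triggers (hypothetical)."""
    signal: str
    x: int
    y: int
    width: int
    height: int

def signal_for_touch(buttons: Sequence[Button], touch_x: int, touch_y: int) -> Optional[str]:
    """Return the signal of the first item containing the touch position, or None."""
    for b in buttons:
        if b.x <= touch_x < b.x + b.width and b.y <= touch_y < b.y + b.height:
            return b.signal
    return None

# Screen layout assumed for the example: one shutter button icon.
screen = [Button("save_captured_image", 260, 400, 120, 60)]
pressed = signal_for_touch(screen, 300, 430)
missed = signal_for_touch(screen, 10, 10)
```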

The audio processing unit 230 converts audio data output from the application execution unit 260 into an electrical signal and outputs it to the audio output unit 132, thereby causing the audio output unit 132 to output sound. The audio processing unit 230 also converts the electrical signal that the audio input unit 131 outputs after picking up sound into audio data, and outputs the audio data to the application execution unit 260.

The communication control unit 240 performs processing such as encoding on data output from the application execution unit 260 and outputs the result to the wireless communication unit 140, which modulates it and transmits it as a wireless signal. The communication control unit 240 also performs processing such as decoding on a signal received and demodulated by the wireless communication unit 140, extracts the data, and outputs it to the application execution unit 260. For example, the communication control unit 240 performs processing such as encoding on e-mail data output from the application execution unit 260 and outputs the result to the wireless communication unit 140; it also performs processing such as decoding on a signal received and demodulated by the wireless communication unit 140, extracts data such as e-mail data, and outputs it to the application execution unit 260.

The imaging control unit 250 controls the imaging unit 150 and processes the signal output from the imaging unit 150. In particular, the imaging control unit 250 converts the electrical signal output from the imaging unit 150 into image data of a moving image frame or a still image and outputs the image data to the application execution unit 260. The imaging control unit 250 also moves the focus lens of the imaging unit 150 in accordance with a focus instruction output from the application execution unit 260 to adjust the focus of the imaging unit 150, and moves the zoom lens of the imaging unit 150 in accordance with a zoom instruction output from the application execution unit 260 to adjust the zoom of the imaging unit 150.

The application execution unit 260 executes application programs to provide various functions such as the imaging function and the e-mail function. In particular, while executing the imaging function, the application execution unit 260 outputs the image data (camera image data) output from the imaging control unit 250 to the display control unit 210 as finder image data, and, at the timing at which a user operation instructs saving of a captured image, writes (saves) the image data output from the imaging control unit 250 to the storage unit 190.

The subject detection unit 261 detects a person (the image of a person) as a subject in the camera image. Specifically, the subject detection unit 261 performs pattern matching on the camera image to detect a person and obtains the position of the person in the camera image (for example, the position of the center of the person's face).

The distance measurement unit 262 detects the distance to the subject using the sound output by the audio output unit 132. Specifically, the distance measurement unit 262 converts the time from when the audio output unit 132 (at least one of the audio output units 132-1 to 132-12) outputs sound until the audio input unit 131 picks up the reflected sound into a distance, giving the round-trip distance between the mobile terminal device 100 and the subject, and divides this round-trip distance by 2 to calculate (detect) the distance from the mobile terminal device 100 to the subject.
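The round-trip calculation described for the distance measurement unit 262 can be written as a short worked example. The speed of sound used here (343 m/s, roughly dry air at 20 °C) and the function name are assumptions; the source does not state how the time is converted into a distance.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value: dry air at about 20 degrees C

def distance_to_subject(echo_delay_s, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Convert the delay between sound output and echo pickup into a distance.

    The delay covers the path to the subject and back, so the round-trip
    distance is halved to obtain the one-way distance.
    """
    if echo_delay_s < 0:
        raise ValueError("echo delay must be non-negative")
    round_trip_m = speed_m_per_s * echo_delay_s
    return round_trip_m / 2.0

# An echo picked up 20 ms after output corresponds to about 3.43 m.
distance_m = distance_to_subject(0.020)
```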

The focus area detection unit 263 detects the focus area. Specifically, among the areas obtained by dividing the camera image according to the regions to which the respective audio output units 132 output sound, the focus area detection unit 263 detects the area that includes a subject imaged in focus (that is, a subject that is in focus).
Various methods can be used for the focus area detection unit 263 to determine whether or not a subject is in focus. For example, the focus area detection unit 263 may compare the distance from the mobile terminal device 100 to the subject detected by the distance measurement unit 262 with the focus of the imaging unit 150. Alternatively, the imaging control unit 250 may vary the focus of the imaging unit 150, and the focus area detection unit 263 may make the determination based on the amount of change in the contrast of the subject image (the degree of variation in brightness among pixels).
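The two determination methods mentioned above could be sketched as follows. The tolerance of 0.3 m, the use of brightness variance as the contrast measure, the 0.9 ratio against the best contrast seen during a focus sweep, and all function names are assumptions for illustration; the source leaves these details open.

```python
from statistics import pvariance

def in_focus_by_distance(subject_distance_m, focus_distance_m, tolerance_m=0.3):
    """Method 1: compare the measured subject distance with the focus distance."""
    return abs(subject_distance_m - focus_distance_m) <= tolerance_m

def contrast(brightness_values):
    """Contrast measured as the variance of pixel brightness in the subject's image."""
    return pvariance(brightness_values)

def in_focus_by_contrast(current_contrast, sweep_contrasts, ratio=0.9):
    """Method 2: treat the subject as in focus when its contrast at the current
    focus setting is close to the best contrast seen while varying the focus."""
    return current_contrast >= ratio * max(sweep_contrasts)

# A flat patch has no contrast; a sharply alternating patch has high contrast.
flat = contrast([10, 10, 10, 10])
sharp = contrast([0, 255, 0, 255])
```

The contrast method needs no distance measurement, but it requires capturing frames at several focus settings before a decision can be made.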

The smile detection unit 264 detects a smile (the image of a smiling face) of a subject in the imaging region. Specifically, the smile detection unit 264 performs pattern matching on the camera image to detect a smile and obtains the position of the smile in the camera image (for example, the position of the center of the face).

The storage unit 190 is realized, for example, by a storage area of a memory of the mobile terminal device 100, and stores various data. In particular, the storage unit 190 stores the captured image data (the image data of captured images) written by the application execution unit 260. The storage unit 190 also stores in advance the audio data that the audio output unit 132 uses to output sound and the various programs executed by the CPU of the mobile terminal device 100.

FIG. 2 is a perspective view showing an outline of the external shape of the mobile terminal device 100. In the figure, a touch panel display screen, which serves as both the display screen of the display unit 110 and the touch sensor of the operation input unit 120, is provided on the top surface of the housing of the mobile terminal device 100. The imaging lens of the imaging unit 150 is provided on a side surface of the housing. The microphone of the audio input unit 131 and the parametric speakers of the audio output unit 132 are provided on the same side surface as the imaging lens, facing the same direction.

Because the imaging lens of the imaging unit 150 and the parametric speakers of the audio output unit 132 are provided on the same side surface and face the same direction, the audio output unit 132 outputs audio toward the imaging region. Twelve parametric speakers are provided, one for each of the audio output units 132-1 to 132-12. Each of the audio output units 132-1 to 132-12 has one parametric speaker and outputs audio toward one of the regions into which the imaging region is divided.
Likewise, because the parametric speakers of the audio output unit 132 and the microphone of the audio input unit 131 are provided on the same side surface and face the same direction, the audio input unit 131 picks up audio that is output from the audio output unit 132 and reflected by the subject.

Next, the regions toward which the audio output unit 132 outputs audio are described with reference to FIG. 3.
FIG. 3 is an explanatory diagram showing the region toward which each of the audio output units 132-1 to 132-12 outputs audio. In the figure, the region AS represents the imaging region (the region whose subjects appear in the camera image) at the in-focus distance of the imaging unit 150. The entire imaging region of the imaging unit 150 is represented by the quadrangular pyramid whose apex is the (center of the) lens of the imaging unit 150 and whose edges connect that apex to the outer periphery (each side) of the region AS.
Each of the regions A101 to A112 is obtained by dividing the region AS into four vertically and into three horizontally. This division corresponds to the arrangement of the parametric speakers in FIG. 2.

The audio output unit 132-1 outputs audio toward a region that approximately coincides with the quadrangular pyramid whose apex is the (center of the) parametric speaker of the audio output unit 132-1 and whose edges connect that apex to the outer periphery of the region A101. Similarly, each of the audio output units 132-2 to 132-12 outputs audio toward a region that approximately coincides with the quadrangular pyramid whose apex is the (center of the) unit's own parametric speaker and whose edges connect that apex to the outer periphery of the corresponding one of the regions A102 to A112.

In this way, each audio output unit 132 outputs audio to one of the regions into which the imaging region is divided.
The regions to which the individual audio output units 132 output audio may be mutually exclusive (that is, non-overlapping) or may partially overlap.
The region to which the audio output unit 132 outputs audio does not need to coincide exactly with the imaging region of the imaging unit 150. That is, the region to which the audio output unit 132 outputs audio may include an area surrounding the imaging region of the imaging unit 150, and part of the imaging region of the imaging unit 150 may fall outside the region to which the audio output unit 132 outputs audio.

Next, the position instruction that the mobile terminal device 100 gives to a person who is a subject (subject position instruction) is described with reference to FIG. 4.
FIG. 4 is an explanatory diagram showing an example of the subject position instruction given by the mobile terminal device 100. The function of instructing the subject position (subject position instruction function) is used, for example, when taking a group photograph.
FIG. 4(a) shows an example of the camera image before the subjects move. The persons P11 to P14 appear in the camera image of FIG. 4(a). The regions A201 to A212 shown in FIG. 4(a) correspond to the regions to which the audio output units 132-1 to 132-12 respectively output audio (that is, each is the region of the image in which a subject located in the output region of the corresponding audio output unit is imaged).
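The correspondence between image regions and audio output units can be illustrated with a short sketch. It assumes, based on the examples in this description (the regions A201, A205, and A209 lie along one edge of the image), that the twelve regions form a grid of four columns and three rows numbered row by row from the top left. The function names and the pixel-coordinate convention are illustrative and do not come from the source.

```python
COLS, ROWS = 4, 3  # assumed grid: 4 columns x 3 rows, numbered row by row


def region_number(x, y, width, height):
    """Return the 1-based region number (1..12) containing pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("pixel lies outside the camera image")
    col = x * COLS // width
    row = y * ROWS // height
    return row * COLS + col + 1


def regions_for_box(left, top, right, bottom, width, height):
    """Return the sorted region numbers overlapped by a bounding box.

    The box is given in pixels; right and bottom are exclusive.
    """
    first_col = left * COLS // width
    last_col = (right - 1) * COLS // width
    first_row = top * ROWS // height
    last_row = (bottom - 1) * ROWS // height
    return [
        row * COLS + col + 1
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Under these assumptions, a box covering the left half of a 640 by 480 image overlaps regions 1, 2, 5, 6, 9, and 10, which is the same set of region suffixes that the description uses for the persons P11 and P12.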

In the state of FIG. 4(a), the persons P11 and P12 are off to the left (to the right as seen facing the mobile terminal device 100), leaving a gap between the persons P12 and P13 that makes the image poorly balanced. In addition, the person P11 is located at the edge of the imaging region, so the image of the person P11 is cut off.

Suppose these persons P11 and P12 are to be moved toward the center of the image (to the left as seen facing the mobile terminal device 100). If the mobile terminal device 100 output the audio message "Please move to the left" to the entire imaging region, the other persons P13 and P14 might also move to the left. The gap between the persons P12 and P13 would then remain, and the image would still be poorly balanced. Moreover, if the person P14 moved too far to the left, the image of the person P14 might be cut off.
On the other hand, even if the mobile terminal device 100 output audio that spells out the notification in detail, such as "The two people at the right end, please move to the left", the message might be too long to get across to the subjects (that is, they might not understand it correctly).

Therefore, the mobile terminal device 100 outputs audio that prompts a person to move toward the center of the image, such as the message "Please move to the left", from the audio output units 132-1, 132-2, 132-5, 132-6, 132-9, and 132-10, which correspond to the regions A201, A202, A205, A206, A209, and A210 containing the images of the persons P11 and P12.

As a result, only the persons P11 and P12 can hear the audio. The persons P11 and P12 move to the left as instructed, while the persons P13 and P14, who do not hear the audio, stay where they are, so the mobile terminal device 100 can obtain a well-balanced image with a narrow gap between the persons P12 and P13, as shown in FIG. 4(b).
In particular, the audio output units 132-1, 132-5, and 132-9 output audio that is audible in a region containing part of the outer edge of the imaging region (the rightmost region as seen facing the mobile terminal device 100) and that prompts movement toward the center of the imaging region. When the person P11 hears this audio and moves to the left, the cut-off of the image of the person P11 can be resolved.

Note that while the mobile terminal device 100 is executing the imaging function, the audio output unit 132 may constantly output, to a region containing part of the outer edge of the imaging region, a message prompting movement toward the center of the imaging region. For example, the audio output unit 132 includes a parametric speaker that outputs audio toward the region within 50 centimeters (cm) inward of the right edge of the imaging region as seen facing the mobile terminal device 100, and outputs to that region audio instructing movement toward the center of the imaging region (for example, the audio message "Please move to the left").
In this way, the mobile terminal device 100 can instruct a person located at the edge of the imaging region to move toward the center without having to determine whether anyone appears at the edge of the imaging region, and can obtain an image in which that person is not cut off.

Next, the in-focus region notification performed by the mobile terminal device 100 is described with reference to FIG. 5.
FIG. 5 is an explanatory diagram showing an example of the in-focus regions detected by the mobile terminal device 100 (the in-focus region detection unit 263). The persons P21 to P23 appear as subjects in the camera image shown in the figure. Of these, the person P22 is in focus, and the persons P21 and P23 are out of focus. The regions A201 to A212 shown in the figure are the same as in FIG. 4.

In the state of FIG. 5, the in-focus region detection unit 263 detects that the subjects contained in the regions A203, A207, and A211 are in focus and that the subjects contained in the other regions are out of focus. In other words, the in-focus region detection unit 263 detects that the regions to which the audio output units 132-3, 132-7, and 132-11 output audio, which correspond to the regions A203, A207, and A211, are in-focus regions and that the other regions are not. The audio output unit 132 then outputs audio indicating the in-focus regions in accordance with the detection result of the in-focus region detection unit 263.

Various kinds of audio can be used as the audio that the audio output unit 132 outputs to indicate the in-focus regions. For example, the audio output units 132-3, 132-7, and 132-11, which correspond to the in-focus regions, may output audio indicating that the region is in focus, such as the message "This region is in focus".
Alternatively, the audio output units 132-1, 132-2, 132-4 to 132-6, 132-8 to 132-10, and 132-12, which correspond to the out-of-focus regions, may output audio indicating that the region is out of focus, such as the message "This region is not in focus".
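The two notification alternatives can be sketched as a small selection function. This is only an illustration under stated assumptions: the function name, the boolean flag, and the use of plain integers 1 to 12 for the audio output units 132-1 to 132-12 are hypothetical, and the message strings are the example messages from the description.

```python
ALL_UNITS = range(1, 13)  # stands in for audio output units 132-1 to 132-12


def focus_notifications(in_focus, notify_in_focus=True):
    """Map audio output unit numbers to the message each should output.

    in_focus holds the numbers of the regions detected as in focus.
    """
    in_focus = set(in_focus)
    if notify_in_focus:
        # First alternative: only units facing in-focus regions speak.
        return {n: "This region is in focus" for n in sorted(in_focus)}
    # Second alternative: only units facing out-of-focus regions speak.
    return {
        n: "This region is not in focus"
        for n in ALL_UNITS
        if n not in in_focus
    }
```

For the state of FIG. 5, `focus_notifications({3, 7, 11})` selects units 3, 7, and 11, and passing `notify_in_focus=False` selects the remaining nine units.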

Because the audio output unit 132 outputs audio indicating the in-focus regions in this way, the persons P21 and P23, who are located in out-of-focus regions, can recognize that they are out of focus and move to a position that is in focus.
The person P22, who is located in an in-focus region, can recognize that he or she is likely to be in focus and stay in that position. The mobile terminal device 100 can also give the person P22 a sense of reassurance.

Next, the smile detection notification performed by the mobile terminal device 100 is described with reference to FIG. 6.
FIG. 6 is an explanatory diagram showing an example of the smile regions detected by the mobile terminal device 100 (the smile detection unit 264). The persons P31 and P32 appear as subjects in the camera image shown in the figure. Of these, the person P31 is smiling, whereas the person P32 is not. The regions such as A202 and A203 are the same as in FIG. 4.

In the state of FIG. 6, the smile detection unit 264 detects that the person P31 is smiling and that the person P32 is not. The audio output unit 132 then outputs audio indicating the region that contains the smile detected by the smile detection unit 264 (smile detection region).
Various kinds of audio can be used as the audio that the audio output unit 132 outputs to indicate the smile detection region. For example, the audio output unit 132-3, which corresponds to the smile detection region, may output audio indicating that the region is a smile detection region, such as the message "A smile has been detected in this region".

Alternatively, the audio output units 132-1, 132-2, and 132-4 to 132-12, which correspond to the regions in which no smile is detected, may output audio indicating that no smile is detected in the region, such as the message "No smile has been detected in this region". As a further alternative, only the audio output unit 132-2, which corresponds to the region A202 in which an expression other than a smile is detected, may output that audio. This suppresses audio output to regions in which no person is located and thereby reduces power consumption.
Alternatively, instead of audio indicating that no smile is detected in the region, audio that encourages a smile, such as "Please smile", may be output.
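The power-saving variant, in which only regions containing a non-smiling face receive audio, can be sketched as follows. The function name and the representation of detection results as (region number, is_smiling) pairs are assumptions made for illustration; the default message is the example from the description.

```python
def smile_prompts(faces, message="Please smile"):
    """Return {region number: message} for regions with a non-smiling face.

    faces is a list of (region_number, is_smiling) pairs, one per detected
    face. Regions without any detected face never appear in the result, so
    no audio is output toward them.
    """
    return {region: message for region, is_smiling in faces if not is_smiling}
```

For the state of FIG. 6, with a smiling face in region 3 and a non-smiling face in region 2, `smile_prompts([(3, True), (2, False)])` selects only region 2.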

Because the audio output unit 132 outputs audio indicating the smile detection region in this way, the person P32, who is located in a region in which no smile is detected, can be made conscious of smiling. When the person P32 smiles, the mobile terminal device 100 can obtain an image with a relaxed atmosphere.
The person P31, who is located in the smile detection region, can recognize that his or her expression is likely being recognized as a smile, and therefore that he or she will be captured smiling, and can maintain that expression. The mobile terminal device 100 can also give the person P31 a sense of reassurance.

Next, the operation of the mobile terminal device 100 is described with reference to FIGS. 7 to 10.
FIG. 7 is a flowchart showing the processing procedure performed when the mobile terminal device 100 executes the imaging function for capturing a still image. The present invention is not limited to capturing still images, however, and can also be applied to capturing moving images.
The mobile terminal device 100 starts the processing of FIG. 7 when the operation input unit 120 accepts an operation instructing the start of the imaging function.

First, the application execution unit 260 starts executing the application program of the imaging function and activates the imaging unit 150 via the imaging control unit 250 (step S101). The imaging unit 150 thereby starts outputting an image signal representing the camera image, and the imaging control unit 250 starts outputting the image data of the camera image. The display control unit 210 then generates a screen display signal based on the image data output from the imaging control unit 250 and outputs it to the display unit 110, and the display unit 110 displays a viewfinder image in accordance with that signal.

Next, the distance measurement unit 262 of the application execution unit 260 measures the distance to the subject and outputs the measurement result to the imaging control unit 250 (step S102). As described above, the distance measurement unit 262 calculates (detects) the distance from the mobile terminal device 100 to the subject based on the time from when the audio output unit 132 outputs audio until the audio input unit 131 picks up the reflected sound.
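A distance calculation of this kind can be sketched with the usual time-of-flight relation: the sound travels to the subject and back, so the one-way distance is half the round-trip path. The description does not give the formula or a sound speed, so the value of 343 m/s (dry air at roughly 20 degrees Celsius) and the function name are assumptions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value, not stated in the source


def echo_distance(round_trip_seconds, speed=SPEED_OF_SOUND_M_PER_S):
    """Return the one-way distance in meters for a measured echo delay."""
    if round_trip_seconds < 0:
        raise ValueError("echo delay cannot be negative")
    # The sound covers the device-to-subject distance twice.
    return speed * round_trip_seconds / 2.0
```

Under this assumption, an echo delay of 20 milliseconds corresponds to a subject about 3.4 meters away.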

Next, under the control of the imaging control unit 250, the imaging unit 150 focuses at the distance to the subject detected by the distance measurement unit 262 (step S103). If there are multiple subjects, the imaging unit 150 focuses at the distance to the subject nearest the mobile terminal device 100. Alternatively, the operation input unit 120 may accept an operation selecting a subject, and the imaging unit 150 may focus at the distance to the selected subject.
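The choice of focus distance in step S103 can be illustrated as follows. The function name and the use of a list index to represent the user's selection are hypothetical.

```python
def focus_distance(subject_distances, selected=None):
    """Return the distance to focus at.

    subject_distances lists the measured distance to each subject. If the
    user selected a subject (given as an index), that subject's distance is
    used; otherwise the nearest subject is used.
    """
    if not subject_distances:
        raise ValueError("no subject distance available")
    if selected is not None:
        return subject_distances[selected]
    return min(subject_distances)
```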

Next, the application execution unit 260 determines whether the operation input unit 120 has accepted an operation instructing the end of the imaging function (step S104). If it determines that such an operation has been accepted (step S104: YES), it performs termination processing for the imaging function, such as stopping the imaging unit 150 and ending the display of the viewfinder image (step S111), and then ends the processing of the figure.

If, on the other hand, it is determined in step S104 that the operation input unit 120 has not accepted an operation instructing the end of the imaging function (step S104: NO), the application execution unit 260 determines whether the operation input unit 120 has accepted a zoom operation (step S121). If it determines that a zoom operation has been accepted (step S121: YES), the application execution unit 260 instructs the imaging control unit 250 to move the zoom lens in accordance with the zoom operation, and the imaging control unit 250 moves the zoom lens of the imaging unit 150 as instructed by the application execution unit 260 (step S122).

Next, the mobile terminal device 100 performs the processing for the subject position instruction described with reference to FIG. 4 (step S131). The procedure of this processing is described later with reference to FIG. 8.
Next, the application execution unit 260 determines whether the operation input unit 120 has accepted an operation instructing that the captured image be saved (step S141). If it determines that such an operation has been accepted (step S141: YES), the application execution unit 260 writes the image data of the camera image output from the imaging control unit 250 to the storage unit 190 as the image data of a captured image (step S142).
The processing then returns to step S102.

If, on the other hand, it is determined in step S141 that no operation instructing that the captured image be saved has been accepted (step S141: NO), the mobile terminal device 100 executes the processing for the in-focus region notification described with reference to FIG. 5 (step S151) and the processing for the smile detection notification described with reference to FIG. 6 (step S152). The procedures of these processes are described later with reference to FIGS. 9 and 10.
The processing then returns to step S102.

If, on the other hand, it is determined in step S121 that no zoom operation has been accepted (step S121: NO), the processing proceeds to step S131.
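The control flow of FIG. 7 as described above can be summarized in a short sketch that records which steps are visited. The function name and the representation of user operations as dictionaries with the keys "end", "zoom", and "save" are assumptions for illustration; the actual device work at each step is omitted.

```python
def run_imaging(iterations):
    """Return the sequence of step labels visited in the FIG. 7 flow.

    iterations is a list of dicts with optional boolean keys "end", "zoom",
    and "save", one dict per pass through the loop that starts at S102.
    """
    trace = ["S101"]  # start the application and the imaging unit
    for ops in iterations:
        trace += ["S102", "S103", "S104"]  # measure, focus, check for end
        if ops.get("end"):
            trace.append("S111")  # termination processing
            return trace
        trace.append("S121")  # zoom operation accepted?
        if ops.get("zoom"):
            trace.append("S122")  # move the zoom lens
        trace.append("S131")  # subject position instruction (FIG. 8)
        trace.append("S141")  # save operation accepted?
        if ops.get("save"):
            trace.append("S142")  # write the captured image
        else:
            trace += ["S151", "S152"]  # focus and smile notifications
    return trace
```

Note that in this flow the in-focus region and smile notifications run only on passes in which no save operation is accepted.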

Next, the operation of the mobile terminal device 100 in step S131 (FIG. 7) is described with reference to FIG. 8.
FIG. 8 is a flowchart showing the procedure by which the mobile terminal device 100 performs the processing for the subject position instruction function. In the processing of the figure, the application execution unit 260 first determines whether the subject position instruction function is set to ON (to be executed) (step S201). This setting is made in advance (before execution of the imaging function starts), for example, by the operation input unit 120 accepting a user operation.

If it is determined that the subject position instruction function is set to ON (step S201: YES), the subject detection unit 261 of the application execution unit 260 detects the positions of persons in the camera image (step S202). The subject detection unit 261 detects the position of a subject in the camera image by, for example, detecting the image of a person through pattern matching on the camera image and obtaining the position of the detected image (its coordinates in the camera image).

Next, based on the detection result of the subject detection unit 261, the application execution unit 260 determines whether multiple persons appear in the camera image (step S211). If it determines that multiple persons appear (step S211: YES), the application execution unit 260 determines whether there is a place where the spacing between persons is equal to or greater than a predetermined distance (hereinafter referred to as a "gap between persons") (step S212). The predetermined distance can be a distance relative to the size of a person, such as the width of (the image of) a person in the camera image.

If it is determined that there is a gap between persons (step S212: YES), the application execution unit 260 determines which persons are to be moved (step S213). Specifically, the application execution unit 260 obtains the center position of the set (group) of persons on each side, left and right, of the position where the gap was detected, and designates as the movement target the persons in the set whose center position is farther from the center of the camera image.
In the case of the camera image of FIG. 4(a), the application execution unit 260 determines that there is a gap between the persons P12 and P13 and treats the persons P11 and P12 as one group. Then, along the horizontal coordinate of the camera image (hereinafter referred to as the "X coordinate"), it obtains the X coordinate of the center of the range occupied by the persons P11 and P12 (from the X coordinate of the left edge of the camera image to the X coordinate of the tip of the left hand of the person P12). The application execution unit 260 likewise treats the persons P13 and P14 as one group and obtains the X coordinate of the center of the range occupied by the persons P13 and P14 (from the X coordinate of the tip of the right hand of the person P13 to the X coordinate of the tip of the left hand of the person P14). The application execution unit 260 then determines that the center X coordinate of the persons P11 and P12 is farther from the center of the camera image than the center X coordinate of the persons P13 and P14, and designates the persons P11 and P12 as the movement target.

Next, the application execution unit 260 determines the direction in which the persons to be moved should move (step S214). Specifically, it chooses the direction that brings the center obtained in step S213 closer to the center of the camera image. In the example of FIG. 4(a), the center X coordinate of the persons P11 and P12 lies to the left of the center of the camera image. The application execution unit 260 therefore decides to move the persons P11 and P12 to the right (to the left as seen facing the mobile terminal device 100).
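Steps S212 to S214 can be sketched as below. The sketch assumes that each person is represented by the (left, right) X extent of their image, sorted from left to right, and that the first sufficiently wide gap is used; the description does not say how multiple gaps or ties are handled, and the function and parameter names are hypothetical.

```python
def choose_move(persons, image_width, min_gap):
    """Pick the group of persons to move and the direction in image terms.

    persons is a list of (left, right) X extents sorted left to right.
    Returns (indices_of_persons_to_move, "left" or "right"), or None when
    no gap of at least min_gap exists between neighboring persons.
    """
    for i in range(len(persons) - 1):
        if persons[i + 1][0] - persons[i][1] >= min_gap:
            break  # first gap between persons found (step S212)
    else:
        return None  # fewer than two persons, or no gap
    groups = [list(range(0, i + 1)), list(range(i + 1, len(persons)))]
    image_center = image_width / 2

    def group_center(indices):
        # Center of the X range spanned by the whole group.
        return (persons[indices[0]][0] + persons[indices[-1]][1]) / 2

    # Step S213: the group whose center is farther from the image center.
    target = max(groups, key=lambda g: abs(group_center(g) - image_center))
    # Step S214: move so that the group center approaches the image center.
    direction = "right" if group_center(target) < image_center else "left"
    return target, direction
```

The returned direction is expressed in camera-image coordinates. Because the subjects face the device, the spoken instruction is mirrored: a move to the right in the image corresponds to the message "Please move to the left" in the FIG. 4(a) example.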

The application execution unit 260 then reads from the storage unit 190 a message instructing movement in the direction determined in step S214 and outputs it to the audio processing unit 230, and the audio processing unit 230 controls the audio output unit 132 to output the audio of the message to the persons to be moved (step S215).
In the example of FIG. 4(a), the application execution unit 260 reads the message "Please move to the left" from the storage unit 190 based on the direction determined in step S214. The application execution unit 260 (the subject detection unit 261) also detects that the regions containing the images of the persons P11 and P12 are the regions A201, A202, A205, A206, A209, and A210. The application execution unit 260 then outputs the message read from the storage unit 190 to the audio processing unit 230 and instructs it to output the audio of the message from the audio output units 132-1, 132-2, 132-5, 132-6, 132-9, and 132-10, which correspond to the detected regions. The audio processing unit 230 generates an audio signal for the message output from the application execution unit 260 and outputs it to the audio output units 132-1, 132-2, 132-5, 132-6, 132-9, and 132-10, causing them to output the audio of the message.
The processing then returns to step S202.

On the other hand, when it is determined in step S211 that a plurality of persons are not captured (step S211: NO), or when it is determined in step S212 that there is no gap between the persons (step S212: NO), the application execution unit 260 determines whether any person is cut off at an edge of the image (step S221). If it determines that such a person exists (step S221: YES), the application execution unit 260 reads from the storage unit 190 a message instructing that person to move toward the center of the camera image and outputs it to the audio processing unit 230, and the audio processing unit 230 controls the audio output units 132 to output the audio of the message toward the person to be moved (step S222).
For example, when a person at the left edge of the camera image is detected, the application execution unit 260 reads from the storage unit 190 the message “Please come to the left”, which instructs the person to move toward the center of the camera image (to the left as seen facing the mobile terminal device 100).
The application execution unit 260 (subject detection unit 261) also detects that the regions containing the image of that person are the regions A201, A205, and A209. The application execution unit 260 then outputs the message read from the storage unit 190 to the audio processing unit 230 and instructs it to output the audio of the message from the audio output units 132-1, 132-5, and 132-9, which correspond to the detected regions. The audio processing unit 230 generates an audio signal for the message received from the application execution unit 260 and outputs it to the audio output units 132-1, 132-5, and 132-9 so that they output the audio of the message.
Thereafter, the process returns to step S202.
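The correspondence between detected regions and audio output units in the two examples above can be sketched as follows. This is only an illustrative sketch: the 4-column by 3-row grid is inferred from the region numbers quoted in the examples (A201, A205, and A209 forming the left column), and the function names, the pixel-coordinate bounding box, and the right-edge message are assumptions that do not appear in the source.

```python
from typing import List, Optional, Tuple

# Assumption: the imaging region is split into a 4-column by 3-row grid,
# numbered row by row, so that region A2nn pairs with audio output unit 132-n.
COLS, ROWS = 4, 3

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive pixels


def regions_for_box(box: Box, width: int, height: int) -> List[int]:
    """Return the 1-based grid cell numbers that the bounding box overlaps."""
    left, top, right, bottom = box
    first_col, last_col = left * COLS // width, right * COLS // width
    first_row, last_row = top * ROWS // height, bottom * ROWS // height
    return [row * COLS + col + 1
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


def edge_message(box: Box, width: int) -> Optional[str]:
    """Pick a message for a person touching the left or right image edge."""
    left, _, right, _ = box
    if left <= 0:
        return "Please come to the left"
    if right >= width - 1:
        # Assumed mirror case; the source only gives the left-edge message.
        return "Please come to the right"
    return None


def region_label(n: int) -> str:
    return f"A2{n:02d}"


def speaker_label(n: int) -> str:
    return f"132-{n}"


# Hypothetical 400x300 camera image with a person touching the left edge.
person = (0, 20, 90, 290)
cells = regions_for_box(person, 400, 300)
print(edge_message(person, 400),
      [region_label(c) for c in cells],
      [speaker_label(c) for c in cells])
```

With the hypothetical bounding box shown, the person overlaps only the left column, so the sketch selects regions A201, A205, and A209 and the audio output units 132-1, 132-5, and 132-9, matching the example in the text.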

On the other hand, when it is determined in step S201 that the function for instructing the subject position is set to OFF (step S201: NO), or when it is determined in step S221 that no person is cut off at an edge of the image (step S221: NO), the process of this figure ends.

Next, the operation of the mobile terminal device 100 in step S151 (FIG. 7) will be described with reference to FIG. 9.
FIG. 9 is a flowchart illustrating a procedure in which the mobile terminal device 100 performs processing related to in-focus area notification. In the process of FIG. 9, the application execution unit 260 first determines whether the in-focus area notification function is set to ON (to be executed) (step S301). This setting is made in advance (before execution of the imaging function starts), for example by the operation input unit 120 receiving a user operation.

When it is determined that the in-focus area notification function is set to ON (step S301: YES), the in-focus area detection unit 263 of the application execution unit 260 detects the in-focus area in the camera image (step S302). As described above, the in-focus area detection unit 263 can use various methods to determine whether a subject is in focus, such as comparing the distance from the mobile terminal device 100 to the subject with the focus of the imaging unit 150, or using the amount of change in the contrast of the subject's image when the focus of the imaging unit 150 is changed.

Next, the application execution unit 260 reads a message indicating the in-focus area from the storage unit 190 and outputs it to the audio processing unit 230, and the audio processing unit 230 causes the audio output units 132 to output the audio of the message according to the in-focus area (step S303). As described above, the audio output units 132 that correspond to the in-focus area may output audio indicating that the area is in focus, or alternatively the audio output units 132 that correspond to areas that are not in focus may output audio indicating that the area is not in focus.
Thereafter, the process of this figure ends.
The process of this figure also ends when it is determined in step S301 that the in-focus area notification function is set to OFF (step S301: NO).
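The flow of steps S301 to S303 can be sketched as below, using the first detection method mentioned in the text (comparing the subject distance with the focus of the imaging unit). This is a minimal sketch under stated assumptions: the per-region distance table, the tolerance value, the message strings, and the function name are all hypothetical and are not taken from the source.

```python
from typing import Dict


def focus_messages(enabled: bool,
                   region_distance_m: Dict[int, float],
                   focus_distance_m: float,
                   tolerance_m: float = 0.3,
                   announce_in_focus: bool = True) -> Dict[str, str]:
    """Return a mapping from audio output unit label to the message it plays."""
    # Step S301: do nothing when the notification function is OFF.
    if not enabled:
        return {}

    # Step S302: a region counts as in focus when its subject distance is
    # within an assumed tolerance of the distance the lens is focused at.
    in_focus = {region for region, distance in region_distance_m.items()
                if abs(distance - focus_distance_m) <= tolerance_m}

    # Step S303: address either the in-focus regions or the remaining ones,
    # mirroring the two alternatives described in the text.
    if announce_in_focus:
        targets, text = in_focus, "In focus"          # hypothetical wording
    else:
        targets = set(region_distance_m) - in_focus
        text = "Out of focus"                          # hypothetical wording
    return {f"132-{region}": text for region in sorted(targets)}


# Hypothetical subject distances (in meters) for three regions.
distances = {1: 2.0, 2: 2.1, 3: 5.0}
print(focus_messages(True, distances, focus_distance_m=2.0))
print(focus_messages(True, distances, focus_distance_m=2.0,
                     announce_in_focus=False))
```

Keeping both alternatives behind one flag reflects that the text treats them as interchangeable design choices; only the set of addressed audio output units and the message change.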

Next, the operation of the mobile terminal device 100 in step S152 (FIG. 7) will be described with reference to FIG. 10.
FIG. 10 is a flowchart illustrating a procedure in which the mobile terminal device 100 performs processing related to smile detection notification. In the process of FIG. 10, the application execution unit 260 first determines whether the smile detection notification function is set to ON (to be executed) (step S401). This setting is made in advance (before execution of the imaging function starts), for example by the operation input unit 120 receiving a user operation.

When it is determined that the smile detection notification function is set to ON (step S401: YES), the smile detection unit 264 of the application execution unit 260 detects the areas of the camera image that contain a smile (step S402). The smile detection unit 264 can detect a smile using a known method, for example pattern matching against face patterns stored in advance.
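As one illustration of what pattern matching against a stored face pattern could look like, the sketch below scores an image patch against a template with normalized cross-correlation. The source does not say which matching technique is used, so the choice of normalized cross-correlation, the threshold, and all names here are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence


def ncc(patch: Sequence[Sequence[float]],
        template: Sequence[Sequence[float]]) -> float:
    """Normalized cross-correlation of two equally sized grayscale patches."""
    a: List[float] = [v for row in patch for v in row]
    b: List[float] = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("patch and template must have the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A flat patch has no structure to correlate, so treat it as no match.
    return num / den if den else 0.0


def looks_like_smile(patch, smile_template, threshold: float = 0.8) -> bool:
    """Assumed decision rule: match when the score reaches the threshold."""
    return ncc(patch, smile_template) >= threshold


# Tiny hypothetical 2x2 "template" just to exercise the functions.
template = [[0, 10], [20, 30]]
print(looks_like_smile(template, template))
print(looks_like_smile([[30, 20], [10, 0]], template))
```

A real detector would work on face crops of realistic size and would typically compare against several stored patterns; the sketch only shows the scoring and thresholding step.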

Next, the application execution unit 260 reads a message indicating the smile detection area from the storage unit 190 and outputs it to the audio processing unit 230, and the audio processing unit 230 causes the audio output units 132 to output the audio of the message according to the smile detection area (step S403). As described above, the audio output units 132 that correspond to the smile detection area may output audio indicating that a smile has been detected there, or alternatively the audio output units 132 that correspond to areas where no smile is detected may output audio indicating that no smile has been detected there.
Thereafter, the process of this figure ends.
The process of this figure also ends when it is determined in step S401 that the smile detection notification function is set to OFF (step S401: NO).

As described above, the audio output unit 132 outputs audio that can be heard in a part of the imaging region of the imaging unit 150.
As a result, the mobile terminal device 100 can notify only some of the persons in the imaging region, or notify a person only when that person is located in a specific region. The mobile terminal device 100 can therefore give notifications more accurately, for example by notifying only the target person with a short message.
In addition, because the mobile terminal device 100 uses audio for notification, a person who is a subject can recognize the notification more easily, even in bright surroundings, than when light is used, and light used for notification is kept from appearing in the image.

Each of the audio output units 132 outputs audio that can be heard in a region different from those of the others.
As a result, one of the regions into which the imaging region of the imaging unit 150 is divided can be selected and audio can be output to the selected region, without the need for a mechanism that controls the orientation of the parametric speakers.

The audio output unit 132 (for example, the audio output units 132-1, 132-5, and 132-9) outputs audio that can be heard in a region including a part of the outer edge of the imaging region and that prompts movement toward the center of the imaging region.
As a result, when the image of a person is cut off in the camera image, the mobile terminal device 100 can have the person move and acquire a camera image in which the person's image is not cut off.

The in-focus area detection unit 263 detects the in-focus area, and the audio output unit 132 outputs audio indicating the in-focus area.
As a result, a person located in an area that is not in focus can recognize that he or she is out of focus and move to a position that is in focus. A person located in the in-focus area can recognize that he or she is likely to be in focus and stay at that position. The mobile terminal device 100 can also give that person a sense of reassurance.

The smile detection unit 264 detects the smile detection area, and the audio output unit 132 outputs audio indicating the smile detection area.
As a result, the mobile terminal device 100 can make a person located in an area where no smile is detected conscious of smiling. When that person smiles, the mobile terminal device 100 can obtain an image with a relaxed atmosphere. A person located in the smile detection area can recognize that his or her expression is likely being recognized as a smile, and therefore that he or she will be captured smiling, and can maintain that expression. The mobile terminal device 100 can also give that person a sense of reassurance.

The distance measurement unit 262 detects the distance to the subject using the audio output by the audio output unit 132, and the imaging unit 150 focuses at the distance to the subject detected by the distance measurement unit 262.
As a result, the mobile terminal device 100 can focus more accurately even when a change in contrast is hard to obtain by moving the focus lens, for example when the color of the subject is similar to that of the background.
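One way audio could yield a subject distance is echo time-of-flight, in which the distance is half the round-trip travel time multiplied by the speed of sound. The sketch below assumes that approach; this passage does not spell out the measurement method, and the function name, the 343 m/s figure (air at roughly 20 °C), and the sample timing are illustrative assumptions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def distance_from_echo(round_trip_s: float,
                       speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Estimate the one-way distance from the round-trip time of an echo."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The sound travels to the subject and back, so halve the path length.
    return speed_m_per_s * round_trip_s / 2.0


# Hypothetical measurement: the echo returns 20 ms after emission.
subject_distance_m = distance_from_echo(0.020)
print(f"focus at about {subject_distance_m:.2f} m")
```

Because this estimate does not depend on image contrast, it would remain usable in the low-contrast case the text describes, such as a subject whose color is close to that of the background.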

A program for realizing all or part of the functions of the control unit 180 may be recorded on a computer-readable recording medium, and the processing of each unit may be performed by having a computer system read and execute the program recorded on the recording medium. The “computer system” here includes an OS and hardware such as peripheral devices.
The “computer system” also includes a homepage providing environment (or display environment) if a WWW system is used.
The “computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The “computer-readable recording medium” further includes a medium that holds a program dynamically for a short time, such as a communication line used when a program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds a program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client in that case. The program may realize only a part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.

The embodiment of the present invention has been described above in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and design changes and the like that do not depart from the gist of the present invention are also included.

DESCRIPTION OF SYMBOLS
100 Mobile terminal device
110 Display unit
120 Operation input unit
131 Audio input unit
132-1 to 132-12 Audio output unit
140 Wireless communication unit
150 Imaging unit
180 Control unit
190 Storage unit
210 Display control unit
220 Input processing unit
230 Audio processing unit
240 Communication control unit
250 Imaging control unit
260 Application execution unit
261 Subject detection unit
262 Distance measurement unit
263 In-focus area detection unit
264 Smile detection unit

Claims

An imaging apparatus comprising:
an imaging unit; and
an audio output unit that outputs audio that can be heard in a part of the imaging region of the imaging unit.

The imaging apparatus according to claim 1, comprising a plurality of the audio output units,
wherein the audio output units output audio that can be heard in mutually different areas.

The imaging apparatus according to claim 1, wherein the audio output unit outputs audio that can be heard in an area including a part of the outer edge of the imaging region and that prompts movement toward the center of the imaging region.

The imaging apparatus according to any one of claims 1 to 3, further comprising an in-focus area detection unit that detects an in-focus area, which is an area in focus in the image captured by the imaging unit,
wherein the audio output unit outputs audio indicating the in-focus area detected by the in-focus area detection unit.

The imaging apparatus according to claim 1, further comprising a smile detection unit that detects a smile of a subject in the imaging region,
wherein the audio output unit outputs audio indicating the area containing the smile detected by the smile detection unit.

The imaging apparatus according to claim 1, further comprising a distance measurement unit that detects the distance to a subject using the audio output by the audio output unit,
wherein the imaging unit focuses at the distance to the subject detected by the distance measurement unit.

An imaging method for an imaging apparatus, comprising:
an audio output step of outputting audio that can be heard in a part of an imaging region; and
an imaging step of imaging the imaging region.