JP2014033332A

JP2014033332A - Imaging apparatus and imaging method

Info

Publication number: JP2014033332A
Application number: JP2012172530A
Authority: JP
Inventors: Daisuke Fukase; 大介深瀬
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2012-08-03
Filing date: 2012-08-03
Publication date: 2014-02-20

Abstract

PROBLEM TO BE SOLVED: To provide an imaging apparatus and an imaging method which allow for easy remote control. SOLUTION: An imaging apparatus 1 comprises an imaging unit 102, a microphone detection image processing unit 110, a face detection processing unit 111, and a pan/tilt drive unit 107. The imaging unit 102 generates image data. The microphone detection image processing unit 110 detects the position of a microphone device 9 in the image data. The face detection processing unit 111 determines a face detection region in the image data on the basis of the detection result of the microphone detection image processing unit 110, and detects a face region included in the face detection region. The pan/tilt drive unit 107 executes a pan/tilt operation on the basis of the detection result of the face detection processing unit 111.

Description

The present invention relates to an imaging apparatus and an imaging method.

In recent years, programs produced by small crews, typified by programs distributed over the Internet, have become widespread. Such programs use a shooting style in which the moderator (reporter) also operates the camera.

For example, Patent Document 1 discloses a microphone device having operation buttons for remotely operating a camera and a monitor for displaying the captured image. This allows the moderator to check the captured image and remotely operate the camera (pan/tilt, zoom, and so on) while conducting an interview.

Patent Document 1: JP 2009-27626 A

However, with the microphone device described in Patent Document 1, the moderator must check the monitor while conducting the interview and operate the camera with the operation buttons so that the captured image has the desired angle of view. The moderator therefore has to coordinate several tasks at once: interviewing, checking the monitor, and operating the camera. This requires considerable skill, which makes shooting difficult.

The present invention has been made to solve this problem, and an object thereof is to provide an imaging apparatus and an imaging method that can be easily operated remotely.

An imaging apparatus (1) according to one aspect of the present invention includes: imaging means (102) for generating image data by imaging processing; object position detection means (microphone detection image processing unit 110) for detecting the position, in the image data, of an object having a specific shape; face detection means (111) for setting a face detection area in the image data based on the detection result of the object position detection means (microphone detection image processing unit 110) and detecting a face area included in the face detection area; and pan/tilt drive means (107) for executing a pan/tilt operation based on the detection result of the face detection means (111). With this configuration, the imaging apparatus 1 can perform a pan/tilt operation simply by the moderator pointing an object such as a microphone device at a guest. As a result, the camera can be easily operated remotely.
The imaging apparatus (1) may further include object direction detection means (microphone detection image processing unit 110) for detecting the orientation of the object in the image data, and the face detection means (111) may determine the face detection area in the image data based on the detection results of the object position detection means (microphone detection image processing unit 110) and the object direction detection means (microphone detection image processing unit 110).
In the imaging apparatus (1), the face detection means (111) may set an area above the object in the image data as the face detection area.
In the imaging apparatus (1), the face detection means (111) may set, as the face detection area, an area in the image data that is above the object and on the side toward which the object is oriented.
In the imaging apparatus (1), the object position detection means (microphone detection image processing unit 110) may detect movement of the object in the image data, and when the object is moving in the image data, the pan/tilt drive means (107) may execute a pan/tilt operation so that the object is positioned at the center of the image data.
In the imaging apparatus (1), when the object is not moving in the image data, the face detection means (111) may determine the face detection area and detect the face area included in the face detection area, and the pan/tilt drive means (107) may execute a pan/tilt operation so that the detected face area is positioned at the center of the image data.
In the imaging apparatus (1), the face detection means (111) may detect the position of the detected face area in the image data, and the imaging apparatus (1) may further include first angle-of-view changing means (camera drive unit 108) for changing the angle of view of the image data based on the distance from the position of the face area in the image data to the position of the object in the image data.
The imaging apparatus (1) may further include second angle-of-view changing means (camera drive unit 108) for changing the angle of view of the image data based on the detection result of posture detection means included in the object when the object position detection means (microphone detection image processing unit 110) does not detect the object in the image data.
In the imaging apparatus (1), the object may be a microphone device (9).
An imaging method according to one aspect of the present invention includes: generating image data by imaging processing; detecting the position, in the image data, of an object having a specific shape; determining a face detection area in the image data based on the detected position of the object; detecting a face area included in the face detection area; and executing a pan/tilt operation based on the position of the detected face area.

According to the present invention, it is possible to provide an imaging apparatus and an imaging method that can be easily operated remotely.

FIG. 1 is a block diagram of the imaging apparatus according to Embodiment 1.
FIG. 2 is a block diagram of the microphone device according to Embodiment 1.
FIG. 3 is a flowchart showing the operation of the imaging apparatus according to Embodiment 1.
FIG. 4 is a diagram for explaining the shooting situation according to Embodiment 1.
FIG. 5 is a diagram for explaining the method of setting the face detection area according to Embodiment 1.
FIG. 6 is a diagram for explaining the method of setting the face detection area according to Embodiment 1.
FIG. 7 is a diagram for explaining the method of setting the face detection area according to Embodiment 1.
FIG. 8 is a diagram for explaining the method of setting the face detection area according to Embodiment 1.
FIG. 9 is a flowchart showing the operation of the imaging apparatus according to Embodiment 2.
FIG. 10 is a diagram for explaining the zoom operation according to Embodiment 2.
FIG. 11 is a diagram for explaining the zoom operation according to Embodiment 2.
FIG. 12 is a diagram for explaining the zoom operation according to Embodiment 2.
FIG. 13 is a block diagram of the imaging apparatus according to Embodiment 3.
FIG. 14 is a block diagram of the microphone device according to Embodiment 3.
FIG. 15 is a flowchart showing the operation of the imaging apparatus according to Embodiment 3.
FIG. 16 is a diagram for explaining the operation of the imaging apparatus according to Embodiment 3.
FIG. 17 is a diagram for explaining the operation of the imaging apparatus according to Embodiment 3.
FIG. 18 is a diagram for explaining the operation of the imaging apparatus according to Embodiment 3.

<Embodiment 1>
Embodiment 1 of the present invention will be described below with reference to the drawings. The imaging apparatus 1 according to this embodiment includes a digital camera capable of capturing moving images and still images. FIG. 1 shows a block diagram of the imaging apparatus 1.

<Configuration of Imaging Apparatus 1>
As shown in FIG. 1, the imaging apparatus 1 includes a lens unit 101, an imaging unit 102, an audio input unit 103, an audio circuit 104, an audio signal receiving unit 105, a receiving circuit 106, a pan/tilt drive unit 107, a camera drive unit 108, a controller 109, a microphone detection image processing unit 110, a face detection processing unit 111, and a recording/reproducing unit 112.

The lens unit 101 has a plurality of lenses, such as a zoom lens and a focus lens, and guides light to the imaging unit 102. The imaging unit 102 has an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor. The imaging unit 102 performs imaging processing and generates image data based on the light that has passed through the lens unit 101. The imaging unit 102 outputs the generated image data to the recording/reproducing unit 112.

The audio input unit 103 is, for example, a microphone, and picks up sound around the imaging apparatus 1. The audio input unit 103 converts the picked-up sound into an analog audio signal and outputs it to the audio circuit 104.

The audio circuit 104 amplifies the audio signal input from the audio input unit 103 and performs A/D conversion, thereby converting the amplified analog audio signal into digital audio data. The audio circuit 104 outputs the converted digital audio data to the recording/reproducing unit 112.

The audio signal receiving unit 105 receives the audio signal transmitted from the microphone device. The audio signal receiving unit 105 performs communication processing conforming to a wireless LAN standard such as Wi-Fi, which enables communication with the microphone device. The audio signal receiving unit 105 outputs the received audio signal to the receiving circuit 106. The receiving circuit 106 outputs the audio signal input from the audio signal receiving unit 105 to the recording/reproducing unit 112.

The pan/tilt drive unit 107 drives a motor (not shown) provided in the imaging apparatus 1 or in a pan head in response to a control signal from the controller 109. In this way, the imaging apparatus 1 performs a pan/tilt operation.

The camera drive unit 108 drives a zoom actuator and a focus actuator (not shown) provided in the imaging apparatus 1 in response to a control signal from the controller 109. This moves the zoom lens and the focus lens of the lens unit 101 along the optical axis. The camera drive unit 108 also drives an aperture actuator (not shown) to adjust the aperture.

The controller 109 is a semiconductor integrated circuit that includes a CPU, a ROM (Read Only Memory) storing various programs, a RAM (Random Access Memory) serving as a work area, and the like. It performs overall control of the processing of the entire imaging apparatus 1, such as imaging processing and the display of various images.

Specifically, based on the detection results of the microphone detection image processing unit 110 and the face detection processing unit 111, the controller 109 outputs to the pan/tilt drive unit 107 a control signal commanding a pan/tilt operation. Likewise, based on the detection results of the microphone detection image processing unit 110 and the face detection processing unit 111, the controller 109 outputs to the camera drive unit 108 a control signal commanding a zoom/tele operation.

The microphone detection image processing unit 110 (object position detection means and object direction detection means) detects the microphone device contained in the image data input from the imaging unit 102. Specifically, it detects the microphone device in the image data by image processing, based on preset features of the microphone device (shape, color, and so on). Because the microphone device is detected by image processing, it preferably has a distinctive shape or color that is easy to detect.

The microphone detection image processing unit 110 also acquires the coordinates of the microphone device in the image data. The coordinates of the microphone device are, for example, its center coordinates, expressed as an x coordinate and a y coordinate. In addition, the microphone detection image processing unit 110 detects the orientation of the microphone device. Here, the orientation of the microphone device means, for example, the direction corresponding to its directivity or, for a microphone device with a handle, the direction from the handle toward the sound pickup part. The microphone detection image processing unit 110 outputs the acquired coordinates and orientation of the microphone device to the face detection processing unit 111 and the controller 109.

Furthermore, the microphone detection image processing unit 110 detects movement of the microphone device. For example, it compares the coordinates of the microphone device every few frames and determines that the microphone device is moving when the coordinates have moved a predetermined distance from their position a few frames earlier.
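
The movement check described above can be sketched as follows. This is a minimal illustration only: the function name `has_moved` and the 20-pixel default threshold are assumptions, since the text says only that a "predetermined distance" is used.

```python
import math

def has_moved(prev_xy, curr_xy, threshold_px=20.0):
    """Return True if the microphone coordinates moved at least threshold_px.

    prev_xy is the microphone position a few frames earlier and curr_xy is
    the current position, both as (x, y) pixel coordinates.
    threshold_px is a hypothetical value standing in for the
    "predetermined distance" mentioned in the text.
    """
    dx = curr_xy[0] - prev_xy[0]
    dy = curr_xy[1] - prev_xy[1]
    return math.hypot(dx, dy) >= threshold_px
```

Comparing against a position several frames back, rather than the immediately preceding frame, makes the check less sensitive to small hand tremor.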

The face detection processing unit 111 sets the face detection area, in which face detection is performed, based on the position and orientation of the microphone device in the image data. It then detects a human face present in the face detection area and outputs the detection result to the controller 109. The face detection method used by the face detection processing unit 111 is not particularly limited, and any known face detection method can be used.

The recording/reproducing unit 112 stores the image data input from the imaging unit 102, together with the digital audio data input from the audio circuit 104, in a memory (not shown).

<Configuration of Microphone Device 9>
The configuration of the microphone device according to this embodiment will now be described. FIG. 2 shows a block diagram of the microphone device 9. The microphone device 9 includes an audio input unit 91, a transmission circuit 92, and an audio signal transmission unit 93.

The audio input unit 91 is, for example, a microphone, and picks up sound around the imaging apparatus 1. The audio input unit 91 converts the picked-up sound into an analog audio signal and outputs it to the transmission circuit 92.

The transmission circuit 92 converts the audio signal input from the audio input unit 91 into a radio signal. The audio signal transmission unit 93 has an antenna and transmits the radio signal converted by the transmission circuit 92 to the imaging apparatus 1. Thus, the microphone device 9 according to this embodiment is an ordinary microphone device that has no operation buttons, posture detection means, or the like.

<Operation of Imaging Apparatus 1>
Next, an example of the operation of the imaging apparatus 1 according to this embodiment will be described with reference to the flowchart shown in FIG. 3. FIG. 4 shows the shooting situation in this embodiment (the positional relationship among the imaging apparatus 1, the moderator 200, and the guest 300). The moderator 200 conducts an interview with the microphone device 9 pointed at the guest 300. No camera operator is present near the imaging apparatus 1 to operate it. The imaging apparatus 1 automatically executes a pan/tilt operation based on the position and orientation of the microphone device 9 held by the moderator 200, the face detection result, and so on. In FIG. 4, the imaging apparatus 1 is mounted on a pan head 400 capable of panning and tilting, but the imaging apparatus 1 itself may instead be provided with a mechanism capable of a pan/tilt operation.

First, the imaging apparatus 1 performs imaging processing (step S101), whereby the imaging unit 102 generates image data. The imaging unit 102 may start the imaging processing in response to an operation button, a self-timer, or the like provided on the imaging apparatus 1, or in response to audio input from the microphone device 9.

When the imaging processing starts, the microphone detection image processing unit 110 performs microphone detection processing on the image data generated by the imaging processing of the imaging unit 102 (step S102). That is, the microphone detection image processing unit 110 determines whether the image data contains the microphone device 9. If the microphone device 9 is not detected (step S103: No), the microphone detection image processing unit 110 continues the microphone detection processing (step S102).

If the microphone device 9 is detected (step S103: Yes), the microphone detection image processing unit 110 determines whether the microphone device 9 is moving (step S104).

If the microphone device 9 is moving (step S104: Yes), the imaging apparatus 1 performs a pan/tilt operation so that the microphone device 9 comes to the center of the image data (step S105). Specifically, based on the coordinates of the microphone device 9 input from the microphone detection image processing unit 110, the controller 109 outputs to the pan/tilt drive unit 107 a control signal that causes it to execute a pan/tilt operation bringing those coordinates to the center of the image data. As a result, the microphone device 9 moves to the center of the image data. In other words, the imaging apparatus 1 executes the pan/tilt operation so as to track the moving microphone device 9. The coordinates of the microphone device 9 need not coincide exactly with the center coordinates of the image data; it is sufficient for them to fall within a predetermined area that includes the center of the image data.
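
As an illustration of the centering rule in step S105, the sketch below computes how far a target lies from the frame center and treats a central "predetermined area" as already centered. The function name `centering_offset`, the `dead_zone` fraction, and the rectangular shape of the central area are assumptions; the text does not specify how the area is defined or how the offset is converted into motor commands.

```python
def centering_offset(target_xy, frame_size, dead_zone=0.1):
    """Return the (dx, dy) pixel offset from the frame center to the target.

    If the target already lies inside a central rectangle whose half-width
    and half-height are dead_zone times those of the frame, (0.0, 0.0) is
    returned, meaning no pan/tilt correction is needed.
    dead_zone is a hypothetical tuning value, not taken from the source.
    """
    width, height = frame_size
    cx, cy = width / 2, height / 2
    dx = target_xy[0] - cx
    dy = target_xy[1] - cy
    if abs(dx) <= dead_zone * cx and abs(dy) <= dead_zone * cy:
        return (0.0, 0.0)
    return (dx, dy)
```

A positive `dx` would call for panning toward the right of the image and a positive `dy` for tilting toward the bottom, under the usual image convention that y grows downward.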

If the microphone device 9 is not moving (step S104: No), the face detection processing unit 111 sets the face detection area in the image data based on the position and orientation of the microphone device 9 input from the microphone detection image processing unit 110 (step S106).

The face detection processing unit 111 performs face detection processing in the face detection area that it has set (step S107). If a human face is detected in the face detection area (step S108: Yes), the face detection processing unit 111 outputs the detection result to the controller 109. Specifically, it outputs the coordinates of the detected face in the image data (for example, the center coordinates of the face) to the controller 109.

The setting of the face detection area (step S106) and the face detection processing (step S107) will now be described in detail with reference to FIGS. 5 and 6. FIGS. 5 and 6 show images captured by the imaging apparatus 1.

First, the face detection processing unit 111 acquires the position 901 of the microphone device 9 and the orientation 902 of the microphone device 9 from the microphone detection image processing unit 110 (see FIG. 5). The position 901 can be expressed using, for example, the center coordinates of the microphone device 9. The orientation 902 can be expressed using, for example, the tilt angle of the microphone device 9, taking the upward direction of the image data as 0 degrees. The face detection processing unit 111 then sets, as the face detection area 903, an area that spreads out from the position 901 of the microphone device 9 in the direction of the orientation 902. The face detection area 903 spreads from the position 901 toward the orientation 902 with a spread angle 904 of 90 degrees or less. For example, the face detection area 903 is an area with a spread angle 904 of 30 degrees whose central axis is the orientation 902 of the microphone device 9.
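
The fan-shaped face detection area 903 can be illustrated with a simple point-in-sector test. This is a sketch under stated assumptions: the function name is hypothetical, image coordinates are assumed to have x growing rightward and y growing downward, and the tilt angle is assumed to be measured clockwise from the upward direction (the text fixes only that upward is 0 degrees).

```python
import math

def in_face_detection_area(point_xy, mic_xy, mic_angle_deg, spread_deg=30.0):
    """Return True if point_xy lies inside the sector opening from mic_xy.

    mic_angle_deg is the microphone orientation (0 = image up, clockwise
    positive by assumption). spread_deg is the full spread angle 904, so a
    point qualifies when it is within spread_deg / 2 of the central axis.
    """
    vx = point_xy[0] - mic_xy[0]
    vy = point_xy[1] - mic_xy[1]
    if vx == 0 and vy == 0:
        return True  # the apex itself is treated as inside
    # Angle of the vector from the mic to the point, measured from "up".
    point_angle = math.degrees(math.atan2(vx, -vy))
    # Wrap the difference into [-180, 180) before comparing.
    diff = (point_angle - mic_angle_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= spread_deg / 2.0
```

A face detector could apply such a test to candidate window centers so that only windows inside the sector are evaluated.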

Of course, the shape of the face detection area 903 is not limited to the shape shown in FIG. 6. For example, the face detection processing unit 111 may set the face detection area 903 based only on the position 901 of the microphone device 9. For example, as shown in FIG. 7, the area above the position 901 of the microphone device 9 may be used as the face detection area 903. Alternatively, as shown in FIG. 8, the face detection processing unit 111 may set, as the face detection area 903, a rectangular area that is above the position 901 of the microphone device 9 and on the side of the position 901 toward which the microphone device 9 is oriented (the side toward which it is tilted).
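
The two rectangular alternatives (FIGS. 7 and 8) might be computed as below. The function names, the `(left, top, right, bottom)` tuple layout, and the sign convention that a non-negative angle means the microphone leans toward the right of the image are all assumptions made for this sketch.

```python
def area_above(mic_xy, frame_size):
    """FIG. 7 style: the whole strip of the frame above the microphone.

    Returns (left, top, right, bottom) in pixels, with y growing downward.
    """
    width, _height = frame_size
    return (0, 0, width, mic_xy[1])

def area_above_toward(mic_xy, mic_angle_deg, frame_size):
    """FIG. 8 style: above the microphone and on the side it leans toward.

    mic_angle_deg is assumed to lie in (-180, 180], with 0 = image up and
    positive values meaning a lean toward the right edge of the image.
    """
    width, _height = frame_size
    x, y = mic_xy
    if mic_angle_deg >= 0:
        return (x, 0, width, y)  # upper-right rectangle
    return (0, 0, x, y)          # upper-left rectangle
```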

The face detection processing unit 111 then detects a face area included in the face detection area. A face area is an area of the image data presumed to be a human face; in FIG. 6 it is the area indicated by the face detection frame 905. Limiting the area of the image data in which face detection is performed in this way speeds up the face detection processing and reduces the processing load on the face detection processing unit 111. The narrower the face detection area 903, the faster the face detection processing; the wider the face detection area 903, the more reliably a face area is detected. Furthermore, the face detection processing unit 111 may start the face detection processing at the position of the microphone device 9 and proceed toward the edge of the image data. During an interview a face is located near the microphone device 9, so starting the face detection processing from the microphone device 9 speeds it up further.
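
One way to read "start at the microphone and proceed toward the edge" is to visit candidate search windows in order of increasing distance from the microphone. The sketch below shows only that ordering; the function name and the representation of candidates as window-center coordinates are assumptions, and the source does not say how the search is actually scheduled.

```python
import math

def order_candidates_from_mic(candidates, mic_xy):
    """Sort candidate window centers so the ones nearest the mic come first.

    candidates is a list of (x, y) window centers; mic_xy is the
    microphone position 901 in the same pixel coordinates.
    """
    return sorted(
        candidates,
        key=lambda c: math.hypot(c[0] - mic_xy[0], c[1] - mic_xy[1]),
    )
```

With this ordering, a detector that stops at the first hit would tend to find a face close to the microphone before scanning the rest of the area.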

Returning to the flowchart of FIG. 3, if a face is detected in the face detection area (step S108: Yes), the controller 109 causes the pan/tilt drive unit 107 to execute a pan/tilt operation based on the detection result of the face detection processing unit 111 (step S109). Specifically, the controller 109 causes the pan/tilt drive unit 107 to execute the pan/tilt operation so that the center coordinates of the detected face area are positioned at the center of the image data. This allows the imaging unit 102 to capture image data centered on the guest 300.

On the other hand, if no face is detected in the face detection area (step S108: No), the processing returns to step S104, and the microphone detection image processing unit 110 determines whether the microphone device 9 is moving.
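
The branching in steps S103 to S109 can be summarized as a small decision function. This is a simplified sketch: the function name and the returned labels are hypothetical, and `face_found` is meaningful only once the face detection of steps S106 and S107 has run for a stationary microphone.

```python
def next_action(mic_detected, mic_moving, face_found):
    """Map the flowchart conditions of FIG. 3 to the next processing step."""
    if not mic_detected:
        return "detect_microphone"        # S103: No, repeat S102
    if mic_moving:
        return "center_on_microphone"     # S104: Yes, go to S105
    if face_found:
        return "center_on_face"           # S108: Yes, go to S109
    return "check_microphone_motion"      # S108: No, return to S104
```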

As described above, with the configuration of the imaging apparatus 1 according to this embodiment, the microphone detection image processing unit 110 detects the position of the microphone device 9 in the image data. The face detection processing unit 111 sets the face detection area in the image data based on the position of the microphone device 9 and detects a face area included in the face detection area. The pan/tilt drive unit 107 then executes a pan/tilt operation based on the detection result of the face detection means (the face detection processing unit 111). Consequently, when the moderator simply points the microphone device 9 at the guest for the interview, the imaging apparatus 1 detects the guest's face and performs a pan/tilt operation according to the detection result. The moderator therefore does not need to check a monitor or operate the camera, and can easily operate the imaging apparatus 1 remotely.

<Embodiment 2>
A second embodiment of the present invention will now be described. The imaging apparatus 2 according to the present embodiment differs from the first embodiment in the operations of the controller 109 and the camera drive unit 108. The other components are the same as those of the imaging apparatus 1, so their description is omitted as appropriate.

Based on the distance between the face area and the microphone device 9 in the image data, the controller 109 outputs to the camera drive unit 108 (first and second angle-of-view changing means) a control signal that causes it to execute a zoom operation or a tele operation.

<Operation of Imaging Apparatus 2>
Next, an operation example of the imaging apparatus 2 will be described with reference to the flowchart shown in FIG. 9. The processing up to the point where the face of the guest 300 has been detected and the detected face has been moved to the center of the image data (steps S101 to S109 in FIG. 3) is the same as in the first embodiment, so its description is omitted. FIGS. 10 to 12 are diagrams for explaining changes in the captured image data.

With the face area positioned at the center of the image data, the controller 109 acquires the distance in the image data between the center coordinates of the face area and the position 901 of the microphone device 9 (hereinafter referred to as the microphone distance 906) (step S201). The microphone distance 906 can be calculated once the center coordinates 301 of the face area of the guest 300 and the position 901 of the microphone device 9 are known (see FIG. 10).
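As a minimal sketch of step S201, the microphone distance 906 can be computed as the Euclidean distance in pixels between the two points. The embodiment does not name a distance metric, so the Euclidean metric and the function name `microphone_distance` are assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def microphone_distance(face_center: Point, mic_position: Point) -> float:
    """Distance in pixels between the face-area center (301) and the microphone position (901).

    Euclidean distance is an assumption; the source only says the distance
    can be calculated from the two coordinates.
    """
    return math.hypot(
        face_center[0] - mic_position[0],
        face_center[1] - mic_position[1],
    )
```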

The controller 109 compares the acquired microphone distance 906 with a preset distance (step S202). When the acquired microphone distance 906 is shorter than the preset distance (step S203: Yes), the controller 109 outputs to the camera drive unit 108 a control signal that causes it to execute a zoom operation, and the camera drive unit 108 executes the zoom operation. The preset distance is a distance set in advance by the user and can be changed as appropriate. The preset distance need not be a single value and may instead be a value having a predetermined range.

Specifically, assume that in FIG. 10 the microphone distance 906 is longer than the preset distance, and that the moderator 200 then brings the microphone device 9 closer to the face of the guest 300. As shown in FIG. 11, the microphone distance 906 shrinks and becomes shorter than the preset distance.

The controller 109 then determines that the acquired microphone distance 906 is shorter than the preset distance, and instructs the camera drive unit 108 to execute a zoom operation so that the microphone distance 906 becomes equal to or greater than the preset distance.

The camera drive unit 108 executes the zoom operation, and the controller 109 acquires the microphone distance 906 again (step S201). When the controller 109 determines that the microphone distance 906 is equal to or greater than the preset distance (step S203: No), it stops the zoom operation of the camera drive unit 108. Image data in which the face of the guest 300 is zoomed in on is thereby generated (see FIG. 12).
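The loop over steps S201 to S203 can be sketched as follows. This is an illustration under stated assumptions, not the controller's actual implementation: the callbacks `measure_distance` and `zoom_step`, the `max_steps` safety limit, and the `SimulatedCamera` class (in which the on-screen distance scales linearly with the zoom factor) are all hypothetical.

```python
from typing import Callable


def zoom_until_threshold(
    measure_distance: Callable[[], float],
    zoom_step: Callable[[], None],
    preset_distance: float,
    max_steps: int = 100,
) -> int:
    """Zoom in step by step until the microphone distance reaches the preset distance.

    Returns the number of zoom steps issued. max_steps is an assumed guard
    against a lens that has reached its zoom limit.
    """
    steps = 0
    # S201/S202: re-acquire and compare; S203 Yes -> keep zooming, No -> stop.
    while measure_distance() < preset_distance and steps < max_steps:
        zoom_step()
        steps += 1
    return steps


class SimulatedCamera:
    """Toy stand-in for the camera drive unit 108 (hypothetical)."""

    def __init__(self, base_distance_px: float, step: float = 1.1) -> None:
        self.base_distance_px = base_distance_px
        self.step = step
        self.zoom = 1.0

    def mic_distance(self) -> float:
        # Assumption: pixel distance grows linearly with the zoom factor.
        return self.base_distance_px * self.zoom

    def zoom_in(self) -> None:
        self.zoom *= self.step
```

For example, `zoom_until_threshold(cam.mic_distance, cam.zoom_in, 150.0)` with `cam = SimulatedCamera(100.0)` keeps zooming until the simulated microphone distance is at least 150 pixels.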

As described above, in the configuration of the imaging apparatus 2 according to the present embodiment, the controller 109 acquires the distance between the detected face and the microphone device 9, and instructs the camera drive unit 108 to execute a zoom operation based on the acquired microphone distance. A zoom operation can thus be realized without providing the microphone device 9 with any special component such as an operation button. The moderator 200 can trigger the zoom operation simply by bringing the microphone device 9 closer to the face of the guest 300 to be enlarged. As a result, the imaging apparatus 1 can easily be operated remotely.

The above description covers the case where the camera drive unit 108 executes a zoom operation, but the same applies when the camera drive unit 108 executes a tele operation.

For example, when the microphone distance 906 becomes longer than the preset distance, the controller 109 outputs a control signal that causes the camera drive unit 108 to execute a tele operation, and the camera drive unit 108 executes the tele operation.
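The choice between the zoom operation and the tele operation can be sketched as a single decision function. The description says the preset distance may be a value having a predetermined range; treating that range as a band `[lower, upper]` inside which the angle of view is left unchanged is an assumption of this sketch, as are the function name and the string commands.

```python
def angle_of_view_command(mic_distance: float, lower: float, upper: float) -> str:
    """Return "zoom", "tele", or "hold" for a measured microphone distance.

    Assumption: the preset distance is given as a range [lower, upper].
    A single preset value corresponds to lower == upper.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if mic_distance < lower:
        return "zoom"  # shorter than the preset distance
    if mic_distance > upper:
        return "tele"  # longer than the preset distance
    return "hold"      # within the preset range: leave the angle of view as is
```

Using a range rather than a single value would keep the camera drive unit 108 from alternating between the two operations when the measured distance fluctuates slightly around the threshold.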

<Embodiment 3>
A third embodiment of the present invention will now be described. FIG. 13 shows a block diagram of the imaging apparatus 3 according to the present embodiment, and FIG. 14 shows a block diagram of the microphone device 9 according to the present embodiment.

In addition to the configuration shown in FIG. 1, the imaging apparatus 3 includes a posture information receiving unit 113. The posture information receiving unit 113 receives posture information transmitted from the microphone device 9 and outputs it to the controller 109.

In addition to the configuration shown in FIG. 2, the microphone device 9 includes a posture detection unit 94 and a posture information transmission unit 95.

The posture detection unit 94 is a sensor capable of detecting the posture of the microphone device 9, for example an acceleration sensor or a gyro sensor. The sensor used in the posture detection unit 94 only needs to detect a change in the posture of the microphone device 9 and does not necessarily have to detect the current posture (tilt angle and the like) in detail. The posture detection unit 94 outputs its detection result to the posture information transmission unit 95.

The posture information transmission unit 95 transmits the detection result input from the posture detection unit 94 to the imaging apparatus. The detection result (posture information) may be information indicating the posture of the microphone device 9, such as its tilt angle, or information indicating only whether the posture of the microphone device 9 has changed.
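The two forms of posture information described above (a tilt angle, or only a changed/unchanged flag) can be sketched as a small data structure. Everything in this sketch is hypothetical: the `PostureInfo` type, the use of two consecutive acceleration readings, the 10-degree threshold, and the choice of the z axis as the reference for the tilt angle are assumptions, since the embodiment does not describe how the sensor output is evaluated.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PostureInfo:
    changed: bool                     # whether the posture changed
    tilt_deg: Optional[float] = None  # optional tilt angle, if reported


def angle_between(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle in degrees between two non-zero 3D vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    cos = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def make_posture_info(
    prev_accel: Sequence[float],
    curr_accel: Sequence[float],
    threshold_deg: float = 10.0,
    include_tilt: bool = False,
) -> PostureInfo:
    """Build posture information from two acceleration readings (assumed inputs)."""
    delta = angle_between(prev_accel, curr_accel)
    # Tilt is measured against the z axis here purely as an illustrative convention.
    tilt = angle_between(curr_accel, (0.0, 0.0, 1.0)) if include_tilt else None
    return PostureInfo(changed=delta > threshold_deg, tilt_deg=tilt)
```

With `include_tilt=False` the result carries only the change flag, which corresponds to the simpler of the two forms of posture information.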

<Operation of Imaging Apparatus 3 and Microphone Device 9>
Next, an operation example of the imaging apparatus 3 and the microphone device 9 according to the present embodiment will be described with reference to the flowchart shown in FIG. 15. FIGS. 16 to 18 are diagrams for explaining changes in the captured image data. FIG. 16 shows the state after the pan/tilt operation has been performed in step S109 of the flowchart of FIG. 3 so that the face of the guest 300 is positioned at the center of the image data. At this point, the microphone device 9 is outside the angle of view (see the broken-line portion in FIG. 16).

In the state of FIG. 16, the microphone device 9 is outside the angle of view, so the microphone detection image processing unit 110 cannot detect movement of the microphone device 9 by image processing. If the posture of the microphone device 9 changes in this state, the posture detection unit 94 of the microphone device 9 detects the change and outputs the detection result to the posture information transmission unit 95. The posture information transmission unit 95 sends the detection result input from the posture detection unit 94 to the imaging apparatus 3.

The posture information receiving unit 113 of the imaging apparatus 3 receives the detection result (posture information) and outputs it to the controller 109. Based on the detection result, the controller 109 determines whether the posture of the microphone device 9 has changed (step S301). When the posture of the microphone device 9 has not changed (step S301: No), the controller 109 maintains the current state, in which the face is located at the center of the image data.

On the other hand, when the posture of the microphone device 9 has changed (step S301: Yes), the microphone detection image processing unit 110 performs microphone detection processing (step S302). That is, it determines whether the microphone device 9 is included in the image data.

When the microphone device 9 is included in the image data (step S303: Yes), the controller 109 outputs a control signal to the pan/tilt drive unit 107 so that the microphone device 9 is positioned at the center of the image data (step S304). The object located at the center of the image data thereby changes from the face to the microphone device 9.

When the microphone device 9 is not included in the image data (step S303: No), the imaging apparatus 3 widens the angle of view of the image to be captured (step S305). Specifically, the controller 109 outputs to the camera drive unit 108 a control signal that causes it to execute a tele operation. The angle of view widens as a result, and the microphone device 9 comes to be included in the image data (see FIG. 17).

The microphone detection image processing unit 110 then performs the detection processing for the microphone device 9 again (step S302). When the microphone device 9 is detected (step S303: Yes), the controller 109 outputs a control signal to the pan/tilt drive unit 107 so that the microphone device 9 is positioned at the center of the image data (step S304). As a result, the microphone device 9 comes to the center of the image (see FIG. 18).
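The flow of steps S301 to S305 can be sketched as a small control loop. The callbacks `detect_mic`, `center_on`, and `widen` are hypothetical stand-ins for the microphone detection image processing unit 110, the pan/tilt drive unit 107, and the camera drive unit 108. The convention that `widen` returns False once the angle of view cannot be widened further is an assumption added so that the loop always terminates.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


def track_microphone(
    posture_changed: bool,
    detect_mic: Callable[[], Optional[Point]],
    center_on: Callable[[Point], None],
    widen: Callable[[], bool],
) -> Optional[Point]:
    """Re-acquire the microphone after a posture change (sketch of FIG. 15).

    Returns the detected microphone position, or None if nothing was done
    or the microphone could not be found.
    """
    if not posture_changed:        # S301: No -> keep the face centered
        return None
    while True:
        position = detect_mic()    # S302: microphone detection processing
        if position is not None:   # S303: Yes
            center_on(position)    # S304: bring the microphone to the center
            return position
        if not widen():            # S305: widen the angle of view and retry
            return None            # assumed stop condition at the widest angle
```

In the situation of FIGS. 16 to 18, the first detection would fail, one or more widening steps would follow, and the loop would end once the microphone device 9 is detected and centered.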

Thereafter, the imaging apparatus 3 may, for example, perform step S104 and the subsequent steps of the flowchart shown in FIG. 3. Specifically, when the microphone device 9 has stopped (step S104: No), the face detection processing unit 111 sets the face detection area based on the position and direction of the microphone device 9 (step S106). The controller 109 may then control the pan/tilt drive unit 107 so that the face included in the face detection area (in FIG. 18, the face of the moderator 200) comes to the center of the image data.

As described above, the microphone device 9 according to the present embodiment includes the posture detection unit 94, which detects the posture of the microphone device 9. The imaging apparatus 3 includes the posture information receiving unit 113, which receives the posture information of the microphone device 9. Further, the controller 109 controls the camera drive unit 108 based on the received posture information of the microphone device 9. Thus, even when the microphone device 9 has moved out of the angle of view, the microphone device 9 can be tracked in response to a change in its posture. As a result, the moderator 200 can remotely operate the imaging apparatus 1 even when the microphone device 9 is not included in the image, which increases the freedom in choosing the angle of view for shooting.

The present invention is not limited to the above embodiments, and can be modified and combined as appropriate without departing from its spirit. For example, although the above embodiments use a microphone device as the predetermined object, the object is not limited to this. It may be, for example, a pointer stick or the moderator's hand.

The processing of the imaging apparatus described above can be executed by a computer program stored in, for example, a ROM of a main processor. In the above example, a program including a group of instructions for causing a computer (processor) to perform each process can be stored on various types of non-transitory computer readable media and supplied to the computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memory (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to the computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to the computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.

1 to 3 Imaging apparatus
9 Microphone device
91 Audio input unit
92 Transmission circuit
93 Audio signal transmission unit
94 Posture detection unit
95 Posture information transmission unit
101 Lens unit
102 Imaging unit
103 Audio input unit
104 Audio circuit
105 Audio signal reception unit
106 Reception circuit
107 Pan/tilt drive unit
108 Camera drive unit
109 Controller
110 Microphone detection image processing unit
111 Face detection processing unit
112 Recording/reproducing unit
113 Posture information receiving unit
200 Moderator
300 Guest

Claims

An imaging apparatus comprising:
imaging means for generating image data by imaging processing;
object position detecting means for detecting a position, in the image data, of an object having a specific shape;
face detection means for setting a face detection area in the image data based on a detection result of the object position detecting means and detecting a face area included in the face detection area; and
pan/tilt driving means for performing a pan/tilt operation based on a detection result of the face detection means.

The imaging apparatus according to claim 1, further comprising object direction detecting means for detecting a direction of the object in the image data,
wherein the face detection means determines the face detection area in the image data based on detection results of the object position detecting means and the object direction detecting means.

The imaging apparatus according to claim 1, wherein the face detection means sets an area above the object in the image data as the face detection area.

The imaging apparatus according to claim 1, wherein the face detection means sets, as the face detection area, an area in the image data that is above the object and on the side corresponding to the direction of the object.

The object position detecting means detects movement of the object in the image data;
5. The pan / tilt drive unit performs a pan / tilt operation so that the object is positioned at the center of the image data when the object is moving in the image data. 6. The imaging device described.

The imaging apparatus according to claim 1, wherein, when the object is not moving in the image data, the face detection means determines the face detection area and detects the face area included in the face detection area, and
the pan/tilt driving means performs a pan/tilt operation so that the detected face area is positioned at the center of the image data.

The imaging apparatus according to any one of the above, wherein the face detection means detects a position of the detected face area in the image data,
the imaging apparatus further comprising first angle-of-view changing means for changing the angle of view of the image data based on a distance from the position of the face area in the image data to the position of the object in the image data.

The imaging apparatus according to claim 1, further comprising second angle-of-view changing means for changing the angle of view of the image data based on a detection result of posture detection means of the object when the object is not detected in the image data by the object position detecting means.

The imaging apparatus according to claim 1, wherein the object is a microphone device.

An imaging method comprising:
generating image data by imaging processing;
detecting a position, in the image data, of an object having a specific shape;
determining a face detection area in the image data based on the detected position of the object, and detecting a face area included in the face detection area; and
performing a pan/tilt operation based on the detected position of the face area.