JP2013201642A - Electronic device

Info

Publication number: JP2013201642A
Application number: JP2012069264A
Authority: JP
Inventors: Masashi Tsuneishi; 将司常石; Hidemi Inohara; 秀己猪原; Hiroaki Yamamura; 宏明山村; Masaichi Sekiguchi; 政一関口
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2012-03-26
Filing date: 2012-03-26
Publication date: 2013-10-03

Abstract

PROBLEM TO BE SOLVED: To provide an easy-to-use electronic device capable of storing sound data at an appropriate timing. SOLUTION: An electronic device 100 includes a sound collection device 30 that collects sound, and a control part 70 that stores, in a storage part 41, the sound collected by the sound collection device 30 after a face of a target person is detected by a detector 20 and from a prescribed time before the mouth of the target person moves. If sound recording were started only after the target person moves or opens the mouth, it could be too late for the first vocalization; by starting the sound recording beforehand (for instance, 10 seconds before), the risk of missing the first vocalization is eliminated.

Description

The present invention relates to an electronic device.

It has been proposed to ask another person to shoot a scene that one cannot shoot oneself, or to shoot it with a fixed-point camera. For recording sound during shooting, it has also been proposed to record sound data on a storage medium in association with image data when a face is detected (see, for example, Patent Document 1).

Patent Document 1: JP 2009-177740 A

However, conventional imaging apparatuses (electronic devices) have not always been easy to use.

An electronic device according to the present invention includes a sound collecting device that collects sound, and a control unit that causes a storage unit to store the sound collected by the sound collecting device after a face of a target person is detected by a detection device and from a predetermined time before the mouth of the target person moves.

According to the present invention, an easy-to-use electronic device can be realized.

FIG. 1 is a block diagram of the imaging apparatus according to the first embodiment.
FIG. 2 is a flowchart explaining the recording process executed by the control unit.
FIG. 3 is a conceptual diagram of the imaging system according to the second embodiment.
FIG. 4 is a block diagram illustrating the configuration of the imaging system of FIG. 3.
FIG. 5 is a block diagram illustrating the configuration of the moving image capturing unit.
FIG. 6 is a flowchart explaining the recording process executed by the control unit of the master.

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings.
(First embodiment)
FIG. 1 is a block diagram of an imaging apparatus 100 according to the first embodiment. Taking the person who requested the shooting as the subject, the imaging apparatus 100 records sound relating to this subject (for example, music, a song, dialogue, or a reading) and captures a moving image. The person who requests the shooting is registered in the imaging apparatus 100 in advance. Information indicating the face of that person is stored in a flash memory 43 described later.

In FIG. 1, the imaging apparatus 100 includes an imaging unit 10, a detection unit 20, a sound collection unit 30, a recording unit 40, a display unit 50, a timer unit 60, and a control unit 70. The imaging unit 10 includes a first imaging unit 11, a drive unit 12, and a second imaging unit 13. The first imaging unit 11 includes a plurality of lenses, an image sensor such as a CCD or CMOS sensor, and an image processing circuit. In the present embodiment, the first imaging unit 11 mainly images a person's "mouth" (for example, as a moving image) and generates image data.

The drive unit 12 is an actuator that employs a linear motor, a voice coil motor, or the like, and adjusts the posture (imaging direction) of the first imaging unit 11. In the present embodiment, the drive unit 12 adjusts the posture (imaging direction) of the first imaging unit 11 according to the imaging result (image) of the second imaging unit 13. At the initial position, the imaging optical axis of the first imaging unit 11 is substantially parallel to that of the second imaging unit 13. When the drive unit 12 drives the first imaging unit 11 and changes its imaging direction, the imaging optical axis of the first imaging unit 11 comes to intersect that of the second imaging unit 13.

Note that the first imaging unit 11 may be made detachable from the housing of the imaging apparatus 100 so that it can be attached at an arbitrary position on the housing. The first imaging unit 11 may also be placed apart from the imaging apparatus 100.

The second imaging unit 13 includes a plurality of lenses capable of wide-angle shooting with a wider angle of view than the lenses of the first imaging unit 11, an image sensor such as a CCD or CMOS sensor, and an image processing circuit. In the present embodiment, the second imaging unit 13 performs wide-angle imaging (for example, of a moving image) that covers the imaging area of the first imaging unit 11, and generates image data.

The detection unit 20 includes a face detection unit 21 and a face recognition unit 22. The face detection unit 21 can detect a face by pattern matching against the captured image, using the main parts of the face (eyes, eyebrows, nose, mouth) as reference patterns, or by a learning method based on boosting that determines whether a given image is a face. The face detection unit 21 also detects whether the "mouth" is open (moving) by the same pattern matching or boosting method.
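The pattern matching mentioned above can be illustrated with a minimal sketch. The text does not say which similarity measure the face detection unit 21 uses; normalized cross-correlation (NCC) is assumed here purely for illustration, and the function names `ncc` and `best_match` are hypothetical. Images are plain nested lists of grayscale values.

```python
from math import sqrt

def ncc(patch, template):
    """Normalized cross-correlation of two equally sized grayscale grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    # A flat patch or template has zero variance; treat it as "no match".
    return num / den if den else 0.0

def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best match."""
    th, tw = len(template), len(template[0])
    best = (-1.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best
```

In such a scheme, a reference pattern for an open "mouth" would be matched against the region around the mouth, and a score above some threshold (an assumed tuning parameter) would count as "open".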

The face recognition unit 22 recognizes whose face the face detection unit 21 has detected. Specifically, face recognition is performed using a known recognition algorithm (elastic bunch graph matching, hidden Markov models, and so on). Such algorithms include methods that compare visual features directly and geometrically, and methods that convert the image statistically into numerical values and compare those values with a template; either kind can be used.

The sound collection unit 30 includes a first microphone 31, a drive unit 32, a second microphone 33, and an audio processing circuit (not shown). The first microphone 31 is a directional microphone that collects sound arriving from a specific direction with high sensitivity. For example, a super-directional dynamic microphone or a super-directional condenser microphone can be used.

The drive unit 32 is an actuator that employs a linear motor, a voice coil motor, or the like, and adjusts the sound collection direction of the first microphone 31. In the present embodiment, the sound collection direction of the first microphone 31 is adjusted according to the imaging result (image) of the first imaging unit 11. Note that the first imaging unit 11 and the first microphone 31 may be integrated into a single unit. In that case, either the drive unit 12 or the drive unit 32 can be omitted.

The second microphone 33 is an omnidirectional microphone and collects sound over a wider range than the first microphone 31. The recording unit 40 includes a buffer memory 41, a recording I/F (interface) 42, and a flash memory 43.

The audio signals collected by the first microphone 31 and the second microphone 33 are each amplified by the audio processing circuit (not shown). The amplified audio signals are converted into digital audio data by an A/D conversion circuit (not shown).

The buffer memory 41 temporarily stores image data captured by the first imaging unit 11 and the second imaging unit 13, and is also used as a work memory of the control unit 70 described later. A plurality of buffer memories 41 may be provided, with a buffer memory 41a for the first imaging unit 11 and a buffer memory 41b for the second imaging unit 13 configured separately. The buffer memory 41 also temporarily stores image data before and after image processing by the image processing circuits described above. A volatile semiconductor memory or the like can be selected as appropriate for the buffer memory 41.

The recording I/F 42 has a connector for connecting a storage medium 99 (for example, an SD card). The recording I/F 42 writes data to and reads data from the storage medium 99 connected to the connector. In the present embodiment, the recording I/F 42 records (saves) part of the image captured by the second imaging unit 13 on the storage medium 99, and does not record (save) the image captured by the first imaging unit 11 (in this example, the image of the person's "mouth") on the storage medium 99 (recording it on the storage medium 99 is prohibited).

The storage medium 99 may be a hard disk, a memory card with a built-in semiconductor memory, or the like; a memory card is used in the present embodiment. The storage medium 99 is used for saving image data and audio data.

The flash memory 43 is a nonvolatile memory. In the present embodiment, the flash memory 43 stores in advance: image data representing shapes used to detect that the "mouth" is open or moving (referred to as reference image data); face data registered in advance for face recognition (face information); and acoustic data prepared in advance for detecting sounds such as musical instruments, whistles, and the starting pistol at an athletic meet (sound information).

The display unit 50 is, for example, a liquid crystal panel and displays images, operation menu screens, and the like. In the present embodiment, the display unit 50 displays the image captured by the second imaging unit 13 and does not display the image captured by the first imaging unit 11 (in this example, the image of the "mouth"). A transparent touch panel may be laminated on the surface of the display unit 50. In that case, the user can touch the touch panel while viewing the operation menu on the display unit 50, and thereby select the operation menu item displayed at the touched coordinates.

The timer unit 60 is, for example, a crystal oscillation circuit and performs timekeeping. In the present embodiment, it outputs to the control unit 70 (described later) timing information for the periods before and after the point at which the first imaging unit 11 captured an image of the "mouth" opening (moving). In other words, the timer unit 60 outputs time stamp information referenced to the point at which the first imaging unit 11 captured the "mouth" opening (moving).

The control unit 70 has a CPU (not shown) and controls the entire imaging apparatus 100. In the present embodiment, it controls imaging according to the movement of the subject's "mouth".

<Recording process>
The recording process executed by the control unit 70 of the imaging apparatus 100 configured as described above will be described with reference to the flowchart of FIG. 2. When, for example, a recording switch (not shown) is turned on, the control unit 70 starts a program that performs the recording process of FIG. 2. In step S1 of FIG. 2, the control unit 70 starts imaging by the second imaging unit 13, has the moving image data stored sequentially in the buffer memory 41, and proceeds to step S2. In step S2, the control unit 70 starts recording in which the audio data collected via the second microphone 33 is stored sequentially in the buffer memory 41, and proceeds to step S3.

In step S3, the control unit 70 starts timekeeping by the timer unit 60 together with the start of the above imaging and recording, and proceeds to step S4. In step S4, the control unit 70 uses the face detection unit 21 to determine whether a face has been detected in the image captured by the second imaging unit 13. If the image captured by the second imaging unit 13 contains a face, the control unit 70 makes an affirmative determination in step S4 and proceeds to step S5; if it contains no face, the control unit 70 makes a negative determination in step S4 and repeats the determination.

In step S5, the control unit 70 uses the face recognition unit 22 to determine whether the face detected in step S4 has been recognized. If the face information of the person who requested the shooting (hereinafter referred to as the target person) stored in the flash memory 43 matches the face detected in step S4, the control unit 70 makes an affirmative determination in step S5 and proceeds to step S6. Proceeding to step S6 therefore means that the target person registered in advance has been recognized. If, on the other hand, the face information of the target person stored in the flash memory 43 does not match the face detected in step S4, the control unit 70 makes a negative determination in step S5 and returns to step S4. That is, steps S4 and S5 are repeated until face recognition has been completed for the plurality of faces detected in step S4.

Having recognized the face of the target person, the control unit 70, in step S6, has the positions of the first imaging unit 11 and the first microphone 31 adjusted (posture control that adjusts the imaging direction and the sound collection direction). Specifically, the position of the "mouth" is detected or estimated from the identified face (the face recognized as that of the target person), and the control unit 70 controls the drive unit 12 and the drive unit 32 so that the first imaging unit 11 and the first microphone 31 face that "mouth". Estimation here means obtaining the position of the "mouth" from the position of the "eyes" or the "nose" when the "mouth" cannot be detected but the "eyes" or the "nose" can.
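As an illustration of the estimation described for step S6, the sketch below derives a mouth position from eye or nose landmarks. The text does not give any formula; the upright-face assumption, the use of the face bounding-box height as a scale, and the two ratio constants are all hypothetical values chosen for the example.

```python
# Hypothetical proportions of an upright face (image y axis pointing down).
EYES_TO_MOUTH = 0.35  # eye line to mouth, as a fraction of face height (assumed)
NOSE_TO_MOUTH = 0.20  # nose tip to mouth, as a fraction of face height (assumed)

def locate_mouth(face_height, mouth=None, left_eye=None, right_eye=None, nose=None):
    """Return an (x, y) mouth position, detected if available, otherwise estimated."""
    if mouth is not None:
        return mouth  # the mouth itself was detected; no estimation needed
    if left_eye is not None and right_eye is not None:
        # Estimate from the midpoint between the eyes.
        cx = (left_eye[0] + right_eye[0]) / 2
        cy = (left_eye[1] + right_eye[1]) / 2
        return (cx, cy + EYES_TO_MOUTH * face_height)
    if nose is not None:
        # Estimate from the nose position.
        return (nose[0], nose[1] + NOSE_TO_MOUTH * face_height)
    return None  # nothing to estimate from
```

The returned coordinates would then be converted into pan and tilt targets for the drive unit 12 and the drive unit 32; that conversion depends on the optics and is not sketched here.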

Having adjusted the positions of the first imaging unit 11 and the first microphone 31, the control unit 70, in step S7, has the first imaging unit 11 start imaging the "mouth" detected or estimated in step S6, has the moving image data stored sequentially in the buffer memory 41, and proceeds to step S8.

In step S8, the control unit 70 starts recording in which the audio data collected via the first microphone 31 is stored sequentially in the buffer memory 41, and proceeds to step S9. At the point when recording by the first microphone 31 starts in step S8, the person whose "mouth" is being imaged by the first imaging unit 11 may not yet be making any sound. However, as described later, if recording by the first microphone 31 were started only after the person imaged by the first imaging unit 11 moved or opened the "mouth", the start of recording might be too late for the first utterance. Recording is therefore started in step S8, before movement of the "mouth" is detected, and the audio data is accumulated in the buffer memory 41.

In step S9, the control unit 70 detects whether the "mouth" being imaged by the first imaging unit 11 has moved (opened). Here too, the control unit 70 refers to the reference image data stored in the flash memory 43 and uses the detection unit 20 to detect whether the "mouth" has moved (opened). If movement of the "mouth" is detected, the control unit 70 makes an affirmative determination in step S9 and proceeds to step S10; if not, it makes a negative determination in step S9 and returns to step S7. Steps S7 to S9 are thus repeated until movement of the "mouth" is detected.

Having detected movement of the "mouth", the control unit 70, in step S10, checks the time elapsed since timekeeping was started in step S3, instructs saving, and proceeds to step S11. For example, if 10 seconds have elapsed since the start of timekeeping in step S3, the control unit 70 sends an instruction to the recording I/F 42 so that the image data captured by the second imaging unit 13 and stored in the buffer memory 41 is recorded (saved) on the storage medium 99 starting from a predetermined time before the movement of the "mouth" was detected (1 second before in this example), in other words, from 9 seconds after the start of timekeeping. Likewise, the audio data collected via the first microphone 31 and the second microphone 33 and stored in the buffer memory 41 is recorded (saved) on the storage medium 99 starting from 9 seconds after the start of timekeeping. The image data and the audio data are included in, for example, a single file.
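The pre-roll behaviour of step S10 can be sketched as a timestamped buffer that is flushed from "trigger time minus pre-roll". This is a minimal sketch, not the patented implementation: the class name `PreRollBuffer`, the eviction policy (dropping chunks older than `capacity_s`), and the function `save_start_time` are assumptions introduced for illustration; the text only states that data is held in the buffer memory 41 and saved from a predetermined time before the trigger.

```python
from collections import deque

def save_start_time(trigger_t, pre_roll_s=1.0):
    """Time from which buffered data is saved: trigger time minus the pre-roll."""
    return max(0.0, trigger_t - pre_roll_s)

class PreRollBuffer:
    """Holds the most recent `capacity_s` seconds of timestamped chunks."""

    def __init__(self, capacity_s):
        self.capacity_s = capacity_s
        self.chunks = deque()  # entries are (timestamp_s, payload)

    def push(self, t, payload):
        self.chunks.append((t, payload))
        # Evict chunks that are older than the buffer capacity (assumed policy).
        while t - self.chunks[0][0] > self.capacity_s:
            self.chunks.popleft()

    def flush_from(self, start_t):
        """Return the buffered chunks with timestamp >= start_t and empty the buffer."""
        kept = [(t, p) for t, p in self.chunks if t >= start_t]
        self.chunks.clear()
        return kept
```

With the numbers from the text (mouth movement detected 10 seconds after timekeeping starts, pre-roll of 1 second), `save_start_time(10.0)` gives 9.0, and flushing a buffer that received one chunk per second returns the chunks stamped 9 and 10. The buffer capacity must be at least as long as the pre-roll for this to work.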

Note that the audio data recorded on the storage medium 99 may be only one of the audio data collected via the first microphone 31 and the audio data collected via the second microphone 33. Of the audio data collected via the two different microphones, for example, only audio data within a predetermined sound pressure level range may be stored, or only audio data that includes (or does not include) a predetermined frequency band may be stored. In this way, the necessary sound is recorded while the recording of unnecessary sound is avoided.
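Selecting audio by level range could look like the following sketch. It assumes the audio has already been digitized into float samples in [-1, 1] and uses the RMS level in dB relative to full scale (dBFS) as a stand-in for sound pressure level; the mapping from dBFS to actual sound pressure depends on microphone calibration, which the text does not describe. The function names and thresholds are hypothetical.

```python
import math

def rms_dbfs(samples):
    """RMS level of float samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def select_by_level(chunks, low_db, high_db):
    """Keep only the chunks whose RMS level lies within [low_db, high_db]."""
    return [c for c in chunks if low_db <= rms_dbfs(c) <= high_db]
```

For example, with a range of -40 dBFS to -6 dBFS, a chunk at about -60 dBFS (background hiss) and a chunk at 0 dBFS (clipping-level noise) would both be dropped, while a chunk at about -20 dBFS would be kept.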

At the timing when movement of the "mouth" is detected, the control unit 70 may also move a zoom optical system (not shown) of the second imaging unit 13 toward the telephoto side, or zoom in by electronic zoom processing. This allows the second imaging unit 13 to capture, for example, a close-up of the identified face (that is, the face of the target person). In addition to or instead of this zoom operation, the frame rate at which the second imaging unit 13 captures images may be raised. This improves responsiveness to the movement of the target person.

Furthermore, the control unit 70 may change the sensitivity (gain) of the first microphone 31 according to how the "mouth" moves. Because a louder voice is expected as the "mouth" opens wider, lowering the sensitivity (gain) of the first microphone 31 according to the size of the "mouth" opening prepares for loud voice input. Such gain control makes it possible to widen the dynamic range of the sound collection unit 30. The change of sensitivity (gain) according to how wide the "mouth" opens may also be applied to the second microphone 33. The gain can be changed, for example, by changing the amplification gain of the audio processing circuit (not shown) of the sound collection unit 30.
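A minimal sketch of such gain control follows. The text only says that the gain is lowered as the "mouth" opens wider; the linear mapping, the normalized opening ratio, and the two gain limits are assumptions made for this example.

```python
def mic_gain_db(mouth_open_ratio, max_gain_db=30.0, min_gain_db=10.0):
    """Map a mouth opening ratio (0 = closed, 1 = fully open) to an amplifier gain.

    The wider the mouth, the lower the gain, leaving headroom for loud input.
    The linear mapping and the default limits are illustrative assumptions.
    """
    r = min(1.0, max(0.0, mouth_open_ratio))  # clamp to [0, 1]
    return max_gain_db - r * (max_gain_db - min_gain_db)
```

With the default limits, a closed mouth gives 30 dB, a half-open mouth 20 dB, and a fully open mouth 10 dB.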

The "mouth" of the target person imaged by the second imaging unit 13 sometimes moves and sometimes does not. In step S11, therefore, the control unit 70 detects whether the target person's "mouth" has remained still (closed) for a predetermined time. This predetermined time can be set arbitrarily; in the present embodiment it is, for example, 5 seconds. If the detection unit 20 detects no movement of the "mouth" for 5 seconds, the control unit 70 makes an affirmative determination in step S11 and proceeds to step S12. If the detection unit 20 is detecting movement of the "mouth", the control unit 70 makes a negative determination in step S11 and repeats the determination.

In other words, while movement of the target person's "mouth" is being detected, the control unit 70 makes a negative determination in step S11 and repeats the determination. During this time, the image data captured by the second imaging unit 13 and stored in the buffer memory 41 is recorded (saved) on the storage medium 99, and the audio data collected via the first microphone 31 and the second microphone 33 and stored in the buffer memory 41 is recorded (saved) on the storage medium 99.
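The step S11 check (no mouth movement for a predetermined time, 5 seconds in the embodiment) can be sketched as a small timer fed with per-frame detection results. The class name `MouthIdleTimer` and its interface are hypothetical; timestamps are assumed to come from the timer unit 60.

```python
class MouthIdleTimer:
    """Reports when no mouth movement has been seen for `timeout_s` seconds."""

    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self.last_motion_t = None

    def update(self, t, mouth_moving):
        """Feed one detection result at time t; return True once the timeout elapses."""
        if mouth_moving or self.last_motion_t is None:
            # Movement (or the very first sample) restarts the idle period.
            self.last_motion_t = t
        return t - self.last_motion_t >= self.timeout_s
```

While `update` returns False, saving to the storage medium 99 continues; once it returns True, processing would move on to the end-of-shooting check of step S12.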

In step S12, the control unit 70, having found no movement of the "mouth" for the predetermined time, determines whether to end imaging. For example, when the scene being captured is a chorus or play at a school arts festival, there is loud applause at the end and an announcement is made that the performance is over. For this reason, the control unit 70 of the present embodiment determines that imaging by the first imaging unit 11 and the second imaging unit 13 should end when loud applause or the word "end" is input via the second microphone 33.

Acoustic data representing "loud applause" and voice dictionary data representing the word "end" may be stored in the flash memory 43 in advance as the sound information described above. When capturing an athletic meet or the like, an event may end with a pistol signal, so acoustic data representing the "pistol signal" may likewise be stored in the flash memory 43 as sound information.

If the control unit 70 makes an affirmative determination in step S12, it performs a predetermined off process and ends the process of FIG. 2. The off process ends imaging by the first imaging unit 11 and the second imaging unit 13, ends recording by the first microphone 31 and the second microphone 33, and also ends the process of recording (saving) the image data and audio data accumulated in the buffer memory 41 on the storage medium 99.

If, on the other hand, the control unit 70 makes a negative determination in step S12, it proceeds to step S13. In step S13, the control unit 70 uses the face recognition unit 22 to determine whether the face of the target person can be recognized. If the face information of the target person stored in the flash memory 43 matches a face included in the image captured by the second imaging unit 13, the control unit 70 makes an affirmative determination in step S13 and returns to step S9. When returning to step S9, imaging continues and the processing described above is repeated.

If no face matching the face information of the target person stored in the flash memory 43 is included in the image captured by the second imaging unit 13, the control unit 70 makes a negative determination in step S13 and proceeds to step S14. In step S14, the control unit 70 ends imaging by the first imaging unit 11 and proceeds to step S15. In step S15, the control unit 70 ends recording by the first microphone 31 and returns to step S3.
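The control flow of steps S4 to S15 can be condensed into a small state machine. This is a simplified sketch based only on the transitions described above: the state names, the grouping of steps into states, and the boolean inputs are all assumptions introduced for illustration, and hardware actions (starting imaging, adjusting the drive units, saving data) are left out.

```python
from enum import Enum, auto

class State(Enum):
    WAIT_FACE = auto()   # S4-S5: wait until the target person's face is recognized
    WAIT_MOUTH = auto()  # S6-S9: track the mouth, buffer audio, wait for movement
    SAVING = auto()      # S10-S11: save buffered data while the mouth keeps moving
    DONE = auto()        # off process after an affirmative S12

def next_state(state, *, target_recognized=False, mouth_moved=False,
               mouth_idle=False, end_sound=False):
    """Return the next state for one evaluation of the flowchart conditions."""
    if state is State.WAIT_FACE:
        return State.WAIT_MOUTH if target_recognized else State.WAIT_FACE
    if state is State.WAIT_MOUTH:
        return State.SAVING if mouth_moved else State.WAIT_MOUTH
    if state is State.SAVING:
        if not mouth_idle:       # S11 negative: keep saving
            return State.SAVING
        if end_sound:            # S12 affirmative: applause, "end", pistol, ...
            return State.DONE
        # S13: target still visible -> back to S9; otherwise S14, S15, then S3.
        return State.WAIT_MOUTH if target_recognized else State.WAIT_FACE
    return State.DONE
```

A caller would evaluate the detection results once per frame and feed them to `next_state`, performing the corresponding start and stop actions whenever the state changes.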

According to the first embodiment described above, the following effects are obtained.
(1) The imaging apparatus 100 includes the sound collection unit 30 and the control unit 70, which causes the buffer memory 41 to store the sound collected by the sound collection unit 30 after the face of the target person is detected by the detection unit 20 and from a predetermined time before the target person's "mouth" moves. Audio data can therefore be accumulated at an appropriate timing, which improves usability. Specifically, if recording were started only after the target person moved or opened the "mouth", it might be too late for the first utterance; by starting recording beforehand (for example, 10 seconds before), the risk of missing the first utterance can be eliminated.

(2) The imaging apparatus 100 of (1) above includes the drive unit 32, which adjusts the direction of the sound collection unit 30 based on the detection result of the detection unit 20, so that, for example, the voice of the target person can be collected appropriately.

(3) The imaging apparatus 100 of (1) or (2) above includes the control unit 70, which adjusts the gain of the sound collection unit 30 based on the detection result of the detection unit 20. For example, by lowering the gain of the first microphone 31 as the target person's "mouth" opens wider, in preparation for loud voice input, the dynamic range of the sound collection unit 30 can be widened.

(4) In the imaging apparatus 100 of (1) to (3) above, the control unit 70 ends the storing of sound in the buffer memory 41 after the face of the target person is no longer detected by the detection unit 20. Recording stops automatically instead of continuing needlessly, which improves usability.

(5) In the imaging apparatus 100 of (1) to (3) above, the control unit 70 ends the storing of sound in the buffer memory 41 in response to a predetermined sound. Recording can thus stop automatically in response to, for example, applause or a closing announcement, which improves usability.

(6) In the imaging apparatus 100 of (1) to (5) above, the sound collection unit 30 includes the directional first microphone 31, so sound from a specific direction can also be recorded selectively.

(7) In the imaging apparatus 100 of (6) above, the sound collection unit 30 includes the second microphone 33, which has a wider sound collection range than the first microphone 31, and the apparatus further includes the control unit 70, which starts sound collection by the first microphone 31 later than sound collection by the second microphone 33. This allows control such that, for example, sound is first collected over a wide range and collection of sound from a specific direction starts afterwards.

(8) The imaging apparatus 100 of (1) to (7) above includes the flash memory 43, which stores data on the shape of the “mouth”, so the “mouth” and its movement can be detected more appropriately than when no such data is available.
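
By way of illustration only, the pre-roll behaviour of effect (1) can be sketched with a fixed-length ring buffer. This is a minimal sketch under stated assumptions, not the implementation of the apparatus: the class name `PreRollRecorder`, the chunk-based interface, and the `mouth_moved` flag are hypothetical, `buffer` stands in for the buffer memory 41, and `saved` stands in for the storage medium 99.

```python
from collections import deque


class PreRollRecorder:
    """Keeps the most recent audio chunks and commits them once the mouth moves."""

    def __init__(self, pre_roll_s=10.0, chunk_s=0.5):
        # Number of chunks that cover the pre-roll window (assumed fixed chunk length).
        self.capacity = int(round(pre_roll_s / chunk_s))
        self.buffer = deque(maxlen=self.capacity)  # stand-in for buffer memory 41
        self.saved = []                            # stand-in for storage medium 99
        self.recording = False

    def push(self, chunk, mouth_moved=False):
        if not self.recording and mouth_moved:
            # Commit everything collected during the pre-roll window.
            self.saved.extend(self.buffer)
            self.buffer.clear()
            self.recording = True
        if self.recording:
            self.saved.append(chunk)
        else:
            # Oldest chunk is dropped automatically once the buffer is full.
            self.buffer.append(chunk)
```

With a 10 second pre-roll, the chunks collected in the 10 seconds before the first detected mouth movement are kept, so the first utterance is not lost even if detection is slightly late.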

(Modification 1)
In the first embodiment described above, the buffer memory 41 is interposed so that only part of the image captured by the second imaging unit 13 is recorded (saved) in the storage medium 99, and the image captured by the first imaging unit 11 (in this example, the image of the mouth) is not recorded (saved) in the storage medium 99. Alternatively, the image data from the first imaging unit 11 and the second imaging unit 13 and the audio data from the first microphone 31 and the second microphone 33 may be recorded in the storage medium 99, with the timing data confirmed in step S10 recorded in the storage medium 99 together with metadata, so that the image data and audio data can be edited later on a personal computer or the like based on the timing data (that is, whether to keep the data is decided afterwards).

(Second embodiment)
Next, a second embodiment will be described with reference to FIGS. 3 to 6. FIG. 3 is a conceptual diagram of an imaging system 200 according to the second embodiment. The imaging system 200 takes the person who requested the shooting as its subject, records sound related to this subject (for example, music, singing, dialogue, or a reading), and captures moving images.

In FIG. 3, the imaging system 200 includes a motion detection imaging unit 210 that detects the “mouth” of the subject, and a plurality of moving image capturing units 220 (220-1 to 220-3 in this example) that capture moving images of the subject based on the detection result of the motion detection imaging unit 210. The requester is registered in advance, and face information representing the requester's face is stored in the flash memory 43 in each of the moving image capturing units 220-1 to 220-3. In the present embodiment, the moving image capturing unit 220 consists of three moving image capturing units 220-1 to 220-3 with the same basic configuration. The number of moving image capturing units 220 is not limited to three and may be five or ten.

FIG. 4 is a block diagram illustrating the configuration of the imaging system 200 of FIG. 3. FIG. 5 is a block diagram illustrating the configuration of one moving image capturing unit 220-n of the three moving image capturing units 220-1 to 220-3. In FIGS. 4 and 5, components of the motion detection imaging unit 210 and the moving image capturing units 220-n that are the same as those of the imaging apparatus 100 described in the first embodiment are given the same reference numerals, and their description is omitted.

In FIGS. 4 and 5, the motion detection imaging unit 210 corresponds to the imaging unit 10 of the first embodiment. The motion detection imaging unit 210 includes the first imaging unit 11, the first microphone 31, a drive unit 211, a communication unit 212, and a control unit 213. The first imaging unit 11 captures images mainly for detecting movement of the subject's “mouth”. The drive unit 211 drives, for example, the first imaging unit 11 and the first microphone 31 as a single unit to adjust the attitude (shooting direction) of the first imaging unit 11 and the sound collection direction of the first microphone 31.

The communication unit 212 has a wireless communication unit and communicates wirelessly with the communication unit 221-n of the moving image capturing unit 220-n. The communication unit 212 also transmits the audio data collected through the first microphone 31 to the moving image capturing unit 220-n. The control unit 213 has a CPU (not shown) and, in addition to controlling the motion detection imaging unit 210 as a whole, performs cooperative control with the moving image capturing unit 220-n.

The moving image capturing unit 220-n is described in detail below with reference to FIG. 5. In FIG. 5, the moving image capturing unit 220-n corresponds to the blocks of the first embodiment other than the imaging unit 10. The moving image capturing unit 220-n includes the second imaging unit 13, the detection unit 20, the second microphone 33, the recording unit 40, the display unit 50, the time measuring unit 60, a communication unit 221, and a control unit 222. The communication unit 221 has a wireless communication unit and communicates wirelessly with the communication unit 212 of the motion detection imaging unit 210. The communication unit 221 also communicates with the communication units 221 of the other moving image capturing units 220.

The control unit 222 has a CPU (not shown) and, in addition to controlling the moving image capturing unit 220-n as a whole, performs cooperative control with the control unit 213 of the motion detection imaging unit 210. In the following description of the present embodiment, the control unit 222-1 of the moving image capturing unit 220-1 (FIG. 4) is the master control unit that controls the entire imaging system 200, and the control unit 222-2 of the moving image capturing unit 220-2 (FIG. 4) and the control unit 222-3 of the moving image capturing unit 220-3 (FIG. 4) are slave control units.

The video recording process executed by the control unit 222-1 of the moving image capturing unit 220-1 (master) in the imaging system 200 configured as above is described with reference to the flowchart of FIG. 6. When, for example, a recording switch (not shown) is turned on, the control unit 222-1 starts a program that performs the video recording process of FIG. 6.

In step S101 of FIG. 6, the control unit 222-1 (master) instructs the moving image capturing units 220-2 and 220-3 via the communication unit 221-1 to capture images, instructs its own second imaging unit 13-1 to capture images, and proceeds to step S102. The second imaging units 13-1 to 13-3 of the moving image capturing units 220-1 to 220-3 thereby each start capturing moving images. It is not always necessary to instruct all of the moving image capturing units 220-1 to 220-3 to capture images; the number of units instructed may be made configurable.

In step S102, the control unit 222-1 (master) detects whether at least one of the moving image capturing unit 220-1 itself and the moving image capturing units 220-2 and 220-3 has captured a face. Specifically, it causes the detection unit 20-1 to detect whether the imaging result of the second imaging unit 13-1 contains a face. The control unit 222-1 (master) further causes the slave moving image capturing units 220-2 and 220-3 to detect, with the detection units 20-2 and 20-3, whether the imaging results of the second imaging units 13-2 and 13-3 contain a face, and to transmit the detection results from the communication units 221-2 and 221-3 to the master moving image capturing unit 220-1. The control unit 222-1 (master) repeats step S102 until a face is detected by at least one of its own second imaging unit 13-1, the second imaging unit 13-2 of the moving image capturing unit 220-2, and the second imaging unit 13-3 of the moving image capturing unit 220-3.

In step S103, the control unit 222-1 (master) has it determined whether the face detected in step S102 has been recognized. Specifically, it instructs its own detection unit 20-1 to perform face recognition and instructs the slave moving image capturing units 220-2 and 220-3 to perform face recognition with the detection units 20-2 and 20-3, respectively. Face recognition is performed in each of the moving image capturing units 220-1 to 220-3 by the same method as in the first embodiment. Performing face recognition in each of the moving image capturing units 220-1 to 220-3 shortens the time required compared with performing all face recognition processing in a single moving image capturing unit.

If the face of the subject (the requester) is recognized, the control unit 222-1 (master) makes an affirmative determination in step S103 and proceeds to step S104; if the subject's face is not recognized, it makes a negative determination in step S103 and returns to step S102. That is, steps S102 and S103 are repeated until face recognition succeeds in at least one of the moving image capturing units 220-1, 220-2, and 220-3.

In step S104, the control unit 222-1 (master) selects, from among the moving image capturing units 220-1 to 220-3, the devices that are to capture moving images. If all of the moving image capturing units 220-1 to 220-3 have recognized the subject's face, the control unit 222-1 (master) has all of them continue capturing. However, the moving image capturing units 220-1 to 220-3 do not always all recognize the subject's face. The present embodiment is therefore described using the example in which only the moving image capturing unit 220-3 in FIG. 3 has recognized the subject's face.

In this case, the control unit 222-1 (master) selects the second imaging unit 13-2 of the adjacent moving image capturing unit 220-2 in addition to the second imaging unit 13-3 of the moving image capturing unit 220-3 and has them continue capturing, while stopping (turning off) capture by the second imaging unit 13-1 of the moving image capturing unit 220-1. This is because, if the subject moves, the moving image capturing unit 220-2 adjacent to the moving image capturing unit 220-3 may capture the subject, whereas the moving image capturing unit 220-1, which is farther from the moving image capturing unit 220-3, is unlikely to.

If the subject's face is recognized by the moving image capturing unit 220-2, which is adjacent to both of the moving image capturing units 220-1 and 220-3, a motion vector of the subject may be detected from the images captured by the moving image capturing unit 220-2, and the moving image capturing unit 220-1 or 220-3 may be selected based on that motion vector. The purpose is to choose, when several moving image capturing units might capture the moving subject, the one more likely to do so.

Also, when the subject's face is recognized by the moving image capturing unit 220-3 in FIG. 3 and the subject is not expected to move (the magnitude of the motion vector is less than a predetermined value), for example in a scene where the subject is singing in a chorus, capture by the second imaging unit 13-2 of the moving image capturing unit 220-2 adjacent to the moving image capturing unit 220-3 may be stopped (turned off).
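
The selection rules of step S104 described above can be sketched as follows. This is an illustrative sketch only. It assumes, beyond what the specification states, that the units are arranged in a row in index order (1 to `n_units`), that the motion vector is reduced to a single signed value `motion` whose positive direction points toward higher indices, and that `still_threshold` plays the role of the predetermined value; the function and parameter names are hypothetical.

```python
def select_units(recognized, n_units, motion=None, still_threshold=1.0):
    """Return the set of unit indices that should keep capturing.

    recognized: indices of units that recognized the subject's face.
    motion: signed motion of the subject, or None if not measured.
    """
    recognized = set(recognized)
    if len(recognized) == n_units:
        # Every unit sees the subject: all of them continue.
        return set(range(1, n_units + 1))
    if motion is not None and abs(motion) < still_threshold:
        # Subject is not expected to move: neighbours may be turned off.
        return recognized
    active = set(recognized)
    for unit in recognized:
        neighbours = [v for v in (unit - 1, unit + 1)
                      if 1 <= v <= n_units and v not in recognized]
        if len(neighbours) == 1 or motion is None:
            # Only one candidate, or no motion information: keep what is adjacent.
            active.update(neighbours)
        elif len(neighbours) == 2:
            # Two candidates: choose the one in the direction of motion.
            active.add(unit + 1 if motion > 0 else unit - 1)
    return active
```

For the example in the text, `select_units({3}, 3)` keeps units 2 and 3 and drops unit 1, and `select_units({2}, 3, motion=5.0)` keeps unit 2 together with unit 3.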

When stopping (turning off) capture by the second imaging unit of the moving image capturing unit 220-1 or 220-2, the control unit 222-1 (master) may additionally prompt the user operating that moving image capturing unit 220-1 or 220-2 to move to a position from which the subject can be captured. For example, it displays the message “Let's move closer to the subject and shoot” on the display unit 50-1 of the moving image capturing unit 220-1, or sends that message to the moving image capturing unit 220-2 via the communication unit 221-1. The control unit 222-2 of the moving image capturing unit 220-2 that receives the message displays it on the display unit 50-2.

Furthermore, if the moving image capturing units 220-1 and 220-2 are fixed cameras that shoot automatically without user operation and are provided in advance with drive members (not shown) for controlling the attitude (shooting direction) of the second imaging units 13-1 and 13-2, these drive members may be driven to automatically turn the shooting direction of the moving image capturing units 220-1 and 220-2 toward a direction in which the subject can be captured.

In step S105 of FIG. 6, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to the moving image capturing units 220-3 and 220-2 selected in step S104, has the second microphone 33-3 of the moving image capturing unit 220-3 and the second microphone 33-2 of the moving image capturing unit 220-2 start recording, and proceeds to step S106.

In step S106, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to the moving image capturing units 220-3 and 220-2 selected in step S104, has the time measuring units 60-3 and 60-2 start timing, and proceeds to step S107.

In step S107, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to the motion detection imaging unit 210 to adjust the positions of the first imaging unit 11 and the first microphone 31 (attitude control for adjusting the shooting direction and the sound collection direction). Specifically, the control unit 213 controls the drive unit 211 so that the first imaging unit 11 and the first microphone 31 face the position of the “mouth” detected or estimated from the face recognized as that of the subject. The position information of the “mouth” is acquired by, for example, the moving image capturing unit that recognized the face (220-3 in this example), and the result is transmitted from the communication unit 221-3 to the motion detection imaging unit 210. Estimation here means obtaining the position of the “mouth” from the positions of the “eyes” and the “nose”, as described above.

If the subject's face is detected by a plurality of moving image capturing units 220, the control unit 213 may control the attitude of the first imaging unit 11 and the first microphone 31 based on the imaging result of the moving image capturing unit 220 that captured the subject's face from the front. Alternatively, the control unit 213 may estimate the frontal position of the subject's face from the imaging results of the plurality of moving image capturing units 220 and control the attitude of the first imaging unit 11 and the first microphone 31 accordingly. When the imaging results of a plurality of moving image capturing units 220 are used, the respective imaging results are transmitted to the motion detection imaging unit 210 from, for example, the moving image capturing units that recognized the face (for example, 220-2 and 220-3).
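
The estimation of the “mouth” position from the “eyes” and the “nose” mentioned for step S107 is not given as a formula in this section. One possible approach, shown purely as an assumption, is to extend the line from the midpoint of the eyes through the nose by a fixed fraction. The function name `estimate_mouth`, the image-coordinate convention, and the default `ratio` of 0.7 are all hypothetical.

```python
def estimate_mouth(left_eye, right_eye, nose, ratio=0.7):
    """Estimate the mouth position by extrapolating from eye midpoint to nose.

    All points are (x, y) image coordinates. `ratio` is an assumed fraction
    of the eye-to-nose distance by which the mouth lies beyond the nose.
    """
    mid_x = (left_eye[0] + right_eye[0]) / 2
    mid_y = (left_eye[1] + right_eye[1]) / 2
    # Continue along the eye-midpoint -> nose direction.
    mouth_x = nose[0] + ratio * (nose[0] - mid_x)
    mouth_y = nose[1] + ratio * (nose[1] - mid_y)
    return (mouth_x, mouth_y)
```

Because the extrapolation follows the eye-to-nose direction, the estimate follows an in-plane tilt of the head. The ratio would need tuning against real face data.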

In step S108, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to the motion detection imaging unit 210 and has the first imaging unit 11 start capturing the “mouth” detected or estimated in step S107. The resulting moving image data is transmitted from the communication unit 212 of the motion detection imaging unit 210 to the moving image capturing unit 220-1 and stored sequentially in the buffer memory 41-1 in the moving image capturing unit 220-1.

In step S109, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to the motion detection imaging unit 210 and has recording of the audio data collected through the first microphone 31 started. This audio data is transmitted from the communication unit 212 of the motion detection imaging unit 210 to the moving image capturing unit 220-1 and stored sequentially in the buffer memory 41-1 in the moving image capturing unit 220-1.

In step S110, the control unit 222-1 (master) detects whether the “mouth” being captured by the first imaging unit 11 of the motion detection imaging unit 210 has moved (opened). Specifically, reference image data like that of the first embodiment is stored in advance in the flash memory 43 in the moving image capturing unit 220-1, and the detection unit 20-1 refers to this reference image data to detect whether the “mouth” has moved (opened).

If the control unit 222-1 (master) detects movement of the “mouth” in the moving image data transmitted from the motion detection imaging unit 210 and stored sequentially in the buffer memory 41-1, it makes an affirmative determination in step S110 and proceeds to step S111; if it detects no movement of the “mouth”, it makes a negative determination in step S110 and returns to step S108. Steps S108 to S110 are thus repeated until movement of the “mouth” is detected.

Having detected movement of the “mouth”, the control unit 222-1, in step S111, checks the elapsed time of the timing started in step S106, issues a save instruction, and proceeds to step S112. For example, if 10 seconds have elapsed since timing started in step S106, the control unit 222-1 (master) sends an instruction (including information on the time at which movement of the “mouth” was detected) to the moving image capturing units 220-3 and 220-2 via the communication unit 221-1, and has the image data from the second imaging units 13-3 and 13-2 held in the buffer memories 41-3 and 41-2, starting from a predetermined time (1 second in this example) before the time at which movement of the “mouth” was detected, recorded (saved) in the storage media 99-3 and 99-2, respectively.
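
The save instruction of step S111 can be illustrated as a timestamp filter over the buffered frames. This is a sketch under assumptions, not the disclosed implementation: the function name `frames_to_save`, the representation of the buffer as a list of `(timestamp, frame)` pairs, and timestamps measured in seconds from the start of timing in step S106 are all hypothetical.

```python
def frames_to_save(buffered, t_mouth, pre_roll_s=1.0):
    """Select buffered frames from `pre_roll_s` seconds before mouth movement.

    buffered: list of (timestamp_s, frame) pairs held in the buffer memory.
    t_mouth: time at which movement of the mouth was detected.
    """
    # Clamp at zero in case the mouth moves sooner than the pre-roll length.
    start = max(0.0, t_mouth - pre_roll_s)
    return [(t, frame) for t, frame in buffered if t >= start]
```

With the values in the text (movement detected at 10 seconds, a 1 second pre-roll), every frame stamped 9 seconds or later would be written to the storage medium.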

When it detects movement of the subject's mouth in step S110, the control unit 222-1 (master) may also send an instruction to the moving image capturing unit 220-3 to move the zoom optical system (not shown), or electronic zoom, of the second imaging unit 13-3 to the telephoto side, or may send an instruction to either one of the moving image capturing units 220-3 and 220-2 to move the zoom optical system (or electronic zoom) of the second imaging unit 13-3 or 13-2 to the telephoto side. When one zoom optical system (or electronic zoom) is moved to the telephoto side, the other may be left unchanged or moved to the wide-angle side. In this way, one moving image capturing unit can capture a close-up of the subject's face while the other captures a wide-angle view.

The “mouth” of the subject captured by the moving image capturing units 220-3 and 220-2 sometimes moves and sometimes does not. In step S112, the control unit 222-1 (master) therefore detects whether the subject's “mouth” has remained motionless (closed) for a predetermined time. The predetermined time can be set arbitrarily and is, for example, 5 seconds in the present embodiment. If the detection unit 20-1 detects no movement of the “mouth” for 5 seconds, the control unit 222-1 (master) makes an affirmative determination in step S112 and proceeds to step S113. If the detection unit 20-1 detects movement of the “mouth”, the control unit 222-1 (master) makes a negative determination in step S112 and repeats the determination.

In other words, while movement of the subject's “mouth” is being detected, the control unit 222-1 (master) makes a negative determination in step S112 and repeats the determination; during this time, the moving image capturing units 220-3 and 220-2 record (save) the image data from the second imaging units 13-3 and 13-2 held in the buffer memories 41-3 and 41-2 in the storage media 99-3 and 99-2, respectively. They also record (save) the audio data collected through the second microphones 33-3 and 33-2 and held in the buffer memories 41-3 and 41-2 in the storage media 99-3 and 99-2, respectively.
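
The determination of step S112 amounts to an idle timer that is reset whenever the “mouth” moves. The following is a minimal sketch, assuming the detector is polled with an explicit timestamp; the class name `MouthIdleDetector` and its interface are hypothetical.

```python
class MouthIdleDetector:
    """Reports when the mouth has been motionless for `timeout_s` seconds."""

    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self.last_motion = None  # time of the most recent detected movement

    def update(self, moved, now):
        """Return True (affirmative in S112) once the idle time is reached."""
        if moved or self.last_motion is None:
            # Movement, or the very first observation, restarts the idle timer.
            self.last_motion = now
            return False
        return now - self.last_motion >= self.timeout_s
```

Passing the time in explicitly keeps the sketch easy to test; a real controller might instead read the time measuring unit.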

In step S113, as in the first embodiment, the control unit 222-1 (master) determines whether to end imaging when the “mouth” has not moved for the predetermined time. On an affirmative determination in step S113, the control unit 222-1 (master) performs a predetermined off process and ends the process of FIG. 6. The off process ends imaging and recording (sound collection) by all of the moving image capturing units 220-1 to 220-3 and the motion detection imaging unit 210.

On a negative determination in step S113, the control unit 222-1 (master) proceeds to step S114. In step S114, the control unit 222-1 (master) has it determined whether the subject's face can be recognized. As in step S103, it instructs the moving image capturing units 220-2 and 220-3, which are the devices currently capturing, to perform face recognition with the detection units 20-2 and 20-3, respectively.

If the face of the subject (the requester) is recognized, the control unit 222-1 (master) makes an affirmative determination in step S114 and returns to step S110; if the subject's face is not recognized, it makes a negative determination in step S114 and proceeds to step S115. On returning to step S110, capturing continues and the processing described above is repeated.

In step S115, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to end imaging by the motion detection imaging unit 210 and proceeds to step S116. In step S116, the control unit 222-1 (master) sends an instruction from the communication unit 221-1 to end recording by the motion detection imaging unit 210 and returns to step S102.
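
The control flow of steps S101 to S116 described above can be summarized as a small transition table. This is an illustrative sketch of the branching only, as read from the text; the names `LINEAR`, `BRANCH`, and `next_step` are hypothetical, and the actual work of each step (imaging, recording, communication) is omitted.

```python
# Steps that always proceed to a single successor.
LINEAR = {
    "S101": "S102", "S104": "S105", "S105": "S106", "S106": "S107",
    "S107": "S108", "S108": "S109", "S109": "S110", "S111": "S112",
    "S115": "S116", "S116": "S102",
}

# Decision steps: (next step on affirmative, next step on negative).
BRANCH = {
    "S102": ("S103", "S102"),  # face detected by at least one unit?
    "S103": ("S104", "S102"),  # subject's face recognized?
    "S110": ("S111", "S108"),  # mouth moved?
    "S112": ("S113", "S112"),  # mouth motionless for the predetermined time?
    "S113": ("END", "S114"),   # end imaging? (off process on affirmative)
    "S114": ("S110", "S115"),  # subject's face still recognized?
}


def next_step(step, affirmative=False):
    """Return the step that follows `step` in the flow of FIG. 6."""
    if step in LINEAR:
        return LINEAR[step]
    yes, no = BRANCH[step]
    return yes if affirmative else no
```

For example, `next_step("S110")` returns `"S108"`, which reflects the loop over steps S108 to S110 while no mouth movement is detected.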

According to the second embodiment described above, the following operational effects are obtained.
(1) The imaging system 200 includes the second microphone 33 and the first microphone 31, and the control unit 222, which stores in the buffer memory 41 the sound collected by the second microphone 33 after the detection unit 20 has detected the subject's face and from 1 second before the subject's “mouth” moves, so audio data can be accumulated at an appropriate timing.

(2) The imaging system 200 of (1) above includes the drive unit 211, which adjusts the orientation of the first microphone 31 based on the detection result of the detection unit 20, so that, for example, the subject's voice can be collected appropriately.

(3) In the imaging system 200 of (1) or (2) above, the control unit 222 ends the storing of sound in the buffer memory 41 after the detection unit 20 no longer detects the subject's face. Recording thus stops automatically instead of continuing needlessly, which improves usability.

（４）上記（１）または（２）の撮像システム２００において、制御部２２２は、所定の音に応じて、バッファメモリ４１に対する音の記憶を終了させるようにしたので、例えば、拍手や終了アナウンスに応じて録音を自動停止できるため、使い勝手を向上できる。 (4) In the imaging system 200 of (1) or (2) above, the control unit 222 ends the storage of sound in the buffer memory 41 in response to a predetermined sound. Recording can therefore be stopped automatically in response to, for example, applause or a closing announcement, which improves usability.

（５）上記（１）から（４）の撮像システム２００において、集音部は、指向性を有する第１マイク３１を含むようにしたので、特定方向からの音声を選択的に録音することもできる。 (5) In the imaging system 200 of any one of (1) to (4) above, the sound collection unit includes the first microphone 31 having directivity, so that sound from a specific direction can also be recorded selectively.

（６）上記（５）の撮像システム２００において、集音部は、第１マイク３１よりも集音範囲が広い第２マイク３３を含み、第２マイク３３による集音開始よりも遅く第１マイク３１による集音を開始させる制御部２２２をさらに備えるようにしたので、例えば、対象者を認識した後で特定方向から音声の集音を始めるように制御できる。 (6) In the imaging system 200 of (5) above, the sound collection unit includes the second microphone 33, which has a wider sound collection range than the first microphone 31, and the system further includes the control unit 222 that starts sound collection by the first microphone 31 later than the start of sound collection by the second microphone 33. Thus, for example, control can be performed so that sound collection from a specific direction begins after the subject has been recognized.

（７）上記（１）から（６）の撮像システム２００において、「口」の形状に関するデータを記憶しているフラッシュメモリ４３を備えたので、データを備えない場合に比べて、適切に「口」およびその動きを検出できる。 (7) The imaging system 200 of any one of (1) to (6) above includes the flash memory 43 that stores data relating to the shape of the "mouth", so that the "mouth" and its movement can be detected more appropriately than when no such data is provided.
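
Effects (1) and (3) above amount to pre-roll buffering: audio is held briefly so that storage can begin from a point before the mouth starts to move, and storage ends once the face is lost. The following is a minimal sketch under stated assumptions, not the implementation described in the embodiment. The class `PreRollRecorder`, its frame-based interface, and the per-frame `face_detected` and `mouth_moving` flags are hypothetical; with 100 ms audio frames, for instance, `pre_roll_frames=10` would correspond to the one-second lead described in (1).

```python
from collections import deque
from typing import Deque, List


class PreRollRecorder:
    """Holds recent audio frames so storage can start before mouth movement."""

    def __init__(self, pre_roll_frames: int) -> None:
        # Ring buffer: the oldest frame is dropped once the buffer is full.
        self._pre_roll: Deque[bytes] = deque(maxlen=pre_roll_frames)
        self.stored: List[bytes] = []  # stands in for the buffer memory 41
        self.storing = False
        self.finished = False

    def push(self, frame: bytes, face_detected: bool, mouth_moving: bool) -> None:
        if self.finished:
            return
        if self.storing:
            if not face_detected:
                # Effect (3): end storage once the face is no longer detected.
                self.finished = True
                return
            self.stored.append(frame)
            return
        if not face_detected:
            # Only audio collected after face detection is eligible for storage.
            self._pre_roll.clear()
            return
        if mouth_moving:
            # Effect (1): store the held pre-roll first, then the current frame.
            self.stored.extend(self._pre_roll)
            self._pre_roll.clear()
            self.stored.append(frame)
            self.storing = True
        else:
            self._pre_roll.append(frame)
```

A fixed-length `deque` keeps memory use bounded while the subject is in view but silent, which is one simple way to make the first utterance available without recording continuously.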

（変形例２） (Modification 2)
上述した第二の実施形態においては、複数の動画撮像部２２０−ｎによって1人の対象者を撮像する場合を例にして説明したが、対象者を複数として、複数の対象者を撮像するようにしてもよい。複数の対象者を撮像する場合には、どの対象者をどの動画撮像部２２０で撮像するかについて割り当ててもよい。例えば、動画撮像部２２０が撮像しやすい対象者を撮像するようにしてもよい。 In the second embodiment described above, the case where one subject is imaged by the plurality of moving image capturing units 220-n has been described as an example. However, there may be a plurality of subjects, and the plurality of subjects may be imaged. When a plurality of subjects are imaged, it may be assigned which subject is imaged by which moving image capturing unit 220. For example, each moving image capturing unit 220 may image the subject that it can image most easily.
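
One way to read the assignment in Modification 2 is that each moving image capturing unit takes the subject it can image most easily. The sketch below assumes a hypothetical per-unit "ease" score for each subject (for example, how large or how frontal the face appears). The description does not specify how ease of imaging is measured, so the scores, the function name `assign_subjects`, and the subject labels are illustrative only.

```python
from typing import Dict


def assign_subjects(ease: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each capturing unit to the subject with its highest ease score."""
    return {
        unit: max(scores, key=lambda subject: scores[subject])
        for unit, scores in ease.items()
        if scores  # skip units that currently see no subject
    }


# Hypothetical scores for units 220-1 to 220-3 and two subjects.
example = assign_subjects({
    "220-1": {"subject A": 0.9, "subject B": 0.2},
    "220-2": {"subject A": 0.3, "subject B": 0.7},
    "220-3": {"subject A": 0.5, "subject B": 0.4},
})
```

This per-unit choice allows several units to cover the same subject. A system that needs every subject covered would require an additional constraint that this sketch does not attempt.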

上記説明では、電子機器の例として撮像装置１００、動作検出撮像部２１０、動画撮像部２２０を例示したが、多機能携帯電話機やタブレット型コンピュータなどを用いて撮像装置１００や撮像システム２００を構成してもよい。 In the above description, the imaging device 100, the motion detection imaging unit 210, and the moving image capturing unit 220 are given as examples of electronic devices. However, the imaging device 100 or the imaging system 200 may instead be configured using a multi-function mobile phone, a tablet computer, or the like.

以上の説明はあくまで一例であり、上記の実施形態の構成に何ら限定されるものではない。また、上述した第一実施形態と第二実施形態とを適宜組み合わせてもよいことは言うまでもない。 The above description is merely an example, and the invention is in no way limited to the configurations of the above embodiments. Needless to say, the first embodiment and the second embodiment described above may be combined as appropriate.

１０…撮像部
１１…第１撮像部
１２、３２、２１１…駆動部
１３…第２撮像部
２０…検出部
２１…顔検出部
２２…顔認識部
３０…集音部
３１…第１マイク
３３…第２マイク
４０…記録部
４１…バッファメモリ
４２…記録Ｉ／Ｆ
４３…フラッシュメモリ
５０…表示部
６０…計時部
７０、２１３、２２２…制御部
９９…記憶媒体
１００…撮像装置
２００…撮像システム
２１０…動作検出撮像部
２１２、２２１…通信部
２２０（２２０−１〜２２０−３）…動画撮像部
DESCRIPTION OF SYMBOLS
10 ... Imaging unit
11 ... First imaging unit
12, 32, 211 ... Drive unit
13 ... Second imaging unit
20 ... Detection unit
21 ... Face detection unit
22 ... Face recognition unit
30 ... Sound collection unit
31 ... First microphone
33 ... Second microphone
40 ... Recording unit
41 ... Buffer memory
42 ... Recording I/F
43 ... Flash memory
50 ... Display unit
60 ... Timing unit
70, 213, 222 ... Control unit
99 ... Storage medium
100 ... Imaging device
200 ... Imaging system
210 ... Motion detection imaging unit
212, 221 ... Communication unit
220 (220-1 to 220-3) ... Moving image capturing unit

Claims

An electronic device comprising:
a sound collecting device that collects sound; and
a control unit that causes a storage unit to store the sound collected by the sound collecting device after a face of a subject has been detected by a detection device, starting from a predetermined time before the subject's mouth moves.

The electronic device according to claim 1,
further comprising a posture control unit that adjusts a direction of the sound collecting device based on a detection result from the detection device.

The electronic device according to claim 1 or 2,
further comprising a gain control unit that adjusts a gain of the sound collecting device based on a detection result from the detection device.

The electronic device according to any one of claims 1 to 3,
wherein the control unit ends the storage of the sound in the storage unit after the detection device no longer detects the face of the subject.

The electronic device according to any one of claims 1 to 3,
wherein the control unit ends the storage of the sound in the storage unit in response to a predetermined sound.

The electronic device according to any one of claims 1 to 5,
wherein the sound collecting device includes a first microphone having directivity.

The electronic device according to claim 6,
wherein the sound collecting device includes a second microphone having a wider sound collection range than the first microphone,
the electronic device further comprising a sound collection control unit that starts sound collection by the first microphone later than the start of sound collection by the second microphone.

The electronic device according to any one of claims 1 to 7,
further comprising a storage member that stores data relating to a mouth shape.