JP7403218B2 - Imaging device, its control method, program, storage medium

Info

Publication number: JP7403218B2
Application number: JP2018217522A
Authority: JP
Inventors: 将浩高山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2017-12-18
Filing date: 2018-11-20
Publication date: 2023-12-22
Anticipated expiration: 2038-11-20
Also published as: JP2019110525A

Description

The present invention relates to automatic shooting technology in an imaging device.

When shooting still images or video with an imaging device such as a camera, the user typically chooses the subject through a viewfinder or the like, checks the shooting conditions, adjusts the framing, and then shoots. Such imaging devices have conventionally included mechanisms that detect user operation errors and the external environment and, when conditions are unsuitable for shooting, either notify the user or control the camera so that it enters a state suitable for shooting.

In contrast to imaging devices that shoot in response to such user operations, a life log camera (Patent Document 1) is known that shoots periodically and continuously without the user giving a shooting instruction. A life log camera is worn on the user's body with a strap or the like and records the scenes the user sees in daily life as images at fixed time intervals. Because it shoots at fixed intervals rather than at a moment the user chooses, such as when pressing the shutter, it can capture unexpected moments that would not normally be photographed.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2016-536868

However, when a life log camera worn by the user shoots automatically at regular intervals, the following problems arise.

First, because shooting occurs at fixed intervals regardless of the user's intention, the camera may miss the moments the user really wants to capture. Second, shortening the shooting interval to avoid missed shots increases the power consumed by shooting and shortens the available shooting time.

The present invention has been made in view of the above problems, and its object is to enable an imaging device that performs automatic shooting to minimize missed shots of scenes the user wants to capture.

An imaging device according to the present invention comprises: imaging means for capturing a subject image and outputting image data; control means for controlling whether to perform a shooting operation that records the image data output by the imaging means; acquisition means for acquiring information indicating the cumulative number of shots taken since a first time; and determination means for determining a target number of shots for a predetermined period. The control means changes a threshold used to decide whether to perform the shooting operation, based on the target number of shots and the information indicating the cumulative number of shots.
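
The threshold adjustment described above can be illustrated with a minimal sketch. The text here does not give a formula, so everything below is an assumption made for illustration: the linear pacing model, the names `ShotBudget`, `adjusted_threshold`, and `should_shoot`, the `gain` parameter, and the clamping range are all hypothetical. The idea shown is only that the threshold rises when the cumulative count runs ahead of the target pace and falls when it lags behind.

```python
from dataclasses import dataclass


@dataclass
class ShotBudget:
    """Hypothetical container for the target number of shots in a period."""
    target_shots: int   # target number of shots for the predetermined period
    period_s: float     # length of the predetermined period, in seconds


def adjusted_threshold(base_threshold, budget, shots_so_far, elapsed_s,
                       gain=0.5, lo=0.1, hi=0.9):
    """Return a shooting threshold adjusted by how far the cumulative
    shot count is from an assumed linear pace toward the target."""
    # Number of shots expected by now if shots were spread evenly.
    expected = budget.target_shots * min(elapsed_s / budget.period_s, 1.0)
    # Positive deviation: more shots than expected, so be stricter.
    deviation = (shots_so_far - expected) / max(budget.target_shots, 1)
    threshold = base_threshold + gain * deviation
    # Clamp so the camera never shoots always or never.
    return min(max(threshold, lo), hi)


def should_shoot(scene_score, threshold):
    """Shoot only when the (hypothetical) scene score exceeds the threshold."""
    return scene_score > threshold
```

With a target of 100 shots per hour, a camera that has already taken 80 shots after 30 minutes gets a higher threshold than one that has taken only 20, so the same scene score may trigger a shot in the second case but not the first.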

According to the present invention, an imaging device that performs automatic shooting can minimize missed shots of scenes the user wants to capture.

FIG. 1: A diagram schematically showing the appearance of a camera that is an embodiment of the imaging device of the present invention.
FIG. 2: A block diagram showing the overall configuration of the camera of the embodiment.
FIG. 3: A diagram showing a configuration example of a wireless communication system between the camera and an external device.
FIG. 4: A diagram showing the configuration of the external device.
FIG. 5: A diagram showing the configuration of the camera and an external device.
FIG. 6: A diagram showing the configuration of the external device.
FIG. 7: A flowchart illustrating the operation of the first control unit.
FIG. 8: A flowchart illustrating the operation of the second control unit.
FIG. 9: A flowchart illustrating the operation of shooting mode processing.
FIG. 10: A diagram for explaining area division within a captured image.
FIG. 11: A diagram for explaining control of the shooting frequency.
FIG. 12: A diagram explaining a neural network.
FIG. 13: A diagram showing images being viewed on an external device.
FIG. 14: A flowchart illustrating learning mode determination.
FIG. 15: A flowchart illustrating learning processing.

An embodiment of the present invention is described in detail below with reference to the accompanying drawings.

<Camera configuration>
FIG. 1 is a diagram schematically showing the appearance of a camera that is an embodiment of the imaging device of the present invention. The camera 101 shown in FIG. 1(a) is provided with a power switch, operation members for operating the camera, and the like. A lens barrel 102, which integrally contains the photographic lens group and image sensor that form the imaging optical system for capturing a subject image, is movably attached to a fixed part 103 of the camera 101. Specifically, the lens barrel 102 is attached to the fixed part 103 via a tilt rotation unit 104 and a pan rotation unit 105, which are mechanisms that can rotate the barrel relative to the fixed part 103.

The tilt rotation unit 104 includes a motor drive mechanism that can rotate the lens barrel 102 in the pitch direction shown in FIG. 1(b), and the pan rotation unit 105 includes a motor drive mechanism that can rotate the lens barrel 102 in the yaw direction shown in FIG. 1(b). That is, the camera 101 has a mechanism that rotates the lens barrel 102 about two axes. Each axis shown in FIG. 1(b) is defined relative to the position of the fixed part 103. An angular velocity meter 106 and an accelerometer 107 are arranged on the fixed part 103 of the camera 101. Vibration of the camera 101 is detected from the output signals of the angular velocity meter 106 and the accelerometer 107, and the tilt rotation unit 104 and the pan rotation unit 105 are rotationally driven to correct shake and tilt of the lens barrel 102. The angular velocity meter 106 and the accelerometer 107 are also used to detect movement of the camera based on measurements taken over a certain period.

FIG. 2 is a block diagram showing the overall configuration of the camera 101 of this embodiment. In FIG. 2, the first control unit 223 includes, for example, a CPU (MPU) and memory (DRAM, SRAM). In accordance with a program stored in the nonvolatile memory (EEPROM) 216, it executes various processes to control each block of the camera 101 and to control data transfer between blocks. The nonvolatile memory 216 is an electrically erasable and recordable memory that stores the constants, programs, and the like used for the operation of the first control unit 223.

In FIG. 2, the zoom unit 201 includes a zoom lens that changes magnification (enlarges or reduces the formed subject image). The zoom drive control unit 202 drives and controls the zoom unit 201 and detects the focal length at that time. The focus unit 203 includes a focus lens that performs focus adjustment. The focus drive control unit 204 drives and controls the focus unit 203. The imaging unit 206 includes an image sensor, receives light incident through each lens group, and outputs charge information corresponding to the amount of light to the image processing unit 207 as an analog image signal. The zoom unit 201, the focus unit 203, and the imaging unit 206 are arranged inside the lens barrel 102.

The image processing unit 207 applies image processing such as distortion correction, white balance adjustment, and color interpolation to the digital image data obtained by A/D-converting the analog image signal, and outputs the processed digital image data. The digital image data output from the image processing unit 207 is converted by the image recording unit 208 into a recording format such as JPEG, and is stored in the memory 215 or sent to the video output unit 217 described later.

The lens barrel rotation drive unit 205 drives the tilt rotation unit 104 and the pan rotation unit 105 to rotate the lens barrel 102 in the tilt and pan directions. The device shake detection unit 209 includes the angular velocity meter (gyro sensor) 106, which detects the angular velocity of the camera 101 about three axes, and the accelerometer (acceleration sensor) 107, which detects the acceleration of the camera 101 along three axes. The rotation angle of the device, the shift amount of the device, and the like are calculated from the signals detected by these sensors.

The audio input unit 213 acquires audio signals from around the camera 101 through a microphone provided in the camera 101, converts them into digital audio signals, and sends them to the audio processing unit 214. The audio processing unit 214 performs audio-related processing, such as optimization of the input digital audio signal. The audio signal processed by the audio processing unit 214 is then sent to the memory 215 by the first control unit 223. The memory 215 temporarily stores the image signals and audio signals obtained by the image processing unit 207 and the audio processing unit 214.

The image processing unit 207 and the audio processing unit 214 read the image and audio signals temporarily stored in the memory 215 and encode them to generate a compressed image signal and a compressed audio signal. The first control unit 223 sends these compressed image and audio signals to the recording/reproducing unit 220.

The recording/reproducing unit 220 records, on the recording medium 221, the compressed image signal and compressed audio signal generated by the image processing unit 207 and the audio processing unit 214, along with other control data related to shooting. When the audio signal is not compression-encoded, the first control unit 223 sends the audio signal generated by the audio processing unit 214 and the compressed image signal generated by the image processing unit 207 to the recording/reproducing unit 220, which records them on the recording medium 221.

The recording medium 221 may be built into the camera 101 or removable, and can record various data generated by the camera 101, such as compressed image signals, compressed audio signals, and audio signals. Generally, a medium with a larger capacity than the nonvolatile memory 216 is used as the recording medium 221. For example, the recording medium 221 may be any type of recording medium, such as a hard disk, optical disc, magneto-optical disc, CD-R, DVD-R, magnetic tape, nonvolatile semiconductor memory, or flash memory.

The recording/reproducing unit 220 reads (reproduces) the compressed image signals, compressed audio signals, audio signals, various data, and programs recorded on the recording medium 221. The first control unit 223 then sends the read compressed image and audio signals to the image processing unit 207 and the audio processing unit 214. The image processing unit 207 and the audio processing unit 214 temporarily store the compressed image and audio signals in the memory 215, decode them according to a predetermined procedure, and send the decoded signals to the video output unit 217.

The audio input unit 213 has a plurality of microphones, and the audio processing unit 214 can detect the direction of a sound relative to the plane on which the microphones are installed. This direction is used for the subject search and automatic shooting described later. The audio processing unit 214 also detects specific voice commands. In addition to several commands registered in advance, the camera may be configured so that the user can register a specific voice as a command. The audio processing unit 214 also performs sound scene recognition, in which a sound scene is determined by a network trained in advance by machine learning on a large amount of audio data. For example, networks for detecting specific scenes such as "cheering", "clapping", and "speaking" are set in the audio processing unit 214, which detects specific sound scenes and specific voice commands. When the audio processing unit 214 detects a specific sound scene or a specific voice command, it outputs a detection trigger signal to the first control unit 223 and the second control unit 211.

Separately from the first control unit 223, which controls the entire main system of the camera 101, a second control unit 211 is provided to control the power supply to the first control unit 223. The first power supply unit 210 and the second power supply unit 212 supply power for operating the first control unit 223 and the second control unit 211, respectively. Pressing the power button on the camera 101 first supplies power to both the first control unit 223 and the second control unit 211, but, as described later, the first control unit 223 can also instruct the first power supply unit 210 to turn OFF its own power supply. The second control unit 211 keeps operating while the first control unit 223 is not operating, and receives information from the device shake detection unit 209 and the audio processing unit 214. Based on this input, the second control unit 211 determines whether to start the first control unit 223 and, if so, instructs the first power supply unit 210 to supply power to the first control unit 223.
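
The division of labor between the two control units can be sketched as follows. This is only an illustration under stated assumptions: the class names `WakeInputs` and `SubController`, the scalar `shake_magnitude`, and the `shake_wake_level` threshold are hypothetical, and the actual start-up determination may use different inputs and criteria.

```python
from dataclasses import dataclass


@dataclass
class WakeInputs:
    """Hypothetical summary of what the second control unit observes."""
    shake_magnitude: float  # derived from the device shake detection unit 209
    sound_trigger: bool     # detection trigger from the audio processing unit 214


class SubController:
    """Sketch of the always-on second control unit 211."""

    def __init__(self, shake_wake_level=1.5):
        # Assumed threshold; not taken from the source text.
        self.shake_wake_level = shake_wake_level
        # Pressing the power button powers both control units.
        self.main_powered = True

    def main_requests_power_off(self):
        """The first control unit 223 turns its own power supply off."""
        self.main_powered = False

    def poll(self, inputs):
        """Return True if this call powers the first control unit back on."""
        if self.main_powered:
            return False  # nothing to wake
        if inputs.sound_trigger or inputs.shake_magnitude >= self.shake_wake_level:
            # Stand-in for instructing the first power supply unit 210.
            self.main_powered = True
            return True
        return False
```

Keeping the wake-up decision in a small always-on controller lets the power-hungry main system stay off until a sound or motion event suggests something worth shooting.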

The audio output unit 218 outputs a preset audio pattern from a speaker built into the camera 101, for example at the time of shooting. The LED control unit 224 lights an LED provided on the camera 101 according to a preset lighting or blinking pattern, for example at the time of shooting. The video output unit 217 consists of, for example, a video output terminal, and outputs an image signal to display video on a connected external display or the like. The audio output unit 218 and the video output unit 217 may be combined into a single terminal, such as an HDMI (registered trademark: High-Definition Multimedia Interface) terminal.

The communication unit 222 handles communication between the camera 101 and external devices, and transmits and receives data such as audio signals, image signals, compressed audio signals, and compressed image signals. It also receives shooting-related control signals, such as commands to start and end shooting and pan/tilt and zoom drive commands, and the camera 101 is driven according to these instructions from the external device. In addition, it transmits and receives information between the camera 101 and the external device, such as various learning-related parameters processed by the learning processing unit 219 described later. The communication unit 222 includes wireless communication modules such as an infrared communication module, a Bluetooth (registered trademark) communication module, a wireless LAN communication module, Wireless USB (registered trademark), and a GPS receiver.

The environment sensor 226 detects the state of the environment around the camera 101 at a predetermined cycle. It includes a temperature sensor that detects the temperature around the camera 101, an atmospheric pressure sensor that detects changes in air pressure around the camera 101, and an illuminance sensor that detects the brightness around the camera 101. It also includes a humidity sensor that detects the humidity around the camera 101, a UV sensor that detects the amount of ultraviolet light around the camera 101, and the like. In addition to the detected temperature, pressure, brightness, humidity, and UV information, the amounts of change in temperature, pressure, brightness, humidity, and ultraviolet light, calculated as rates of change over a predetermined time interval, are used in determinations such as the automatic shooting described later.
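
As a rough illustration of deriving a change amount from periodic sensor readings, the sketch below computes how much a value changed over the most recent interval. The function names, the `(timestamp, value)` sample layout, and the windowing rule are assumptions made for this example; the text does not specify how the rate of change is calculated.

```python
def change_over_interval(samples, interval_s):
    """Change in a sensor value over the last `interval_s` seconds.

    `samples` is a time-ordered list of (timestamp_s, value) pairs.
    Returns None when fewer than two samples fall inside the window.
    """
    if len(samples) < 2:
        return None
    latest_t, latest_v = samples[-1]
    window_start = latest_t - interval_s
    # Use the oldest sample that still lies inside the window as reference.
    for t, v in samples:
        if t >= window_start:
            if t == latest_t:
                return None  # only the latest sample is inside the window
            return latest_v - v
    return None


def exceeds(change, limit):
    """True when a computed change is at least `limit` in magnitude."""
    return change is not None and abs(change) >= limit
```

For example, with temperature readings every 10 seconds, `change_over_interval(readings, 20)` compares the latest reading with the one taken 20 seconds earlier, and `exceeds` could flag a sudden change as one input to an automatic-shooting decision.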

<Communication with external devices>
FIG. 3 is a diagram showing a configuration example of a wireless communication system between the camera 101 and an external device 301. The camera 101 is a digital camera with a shooting function, and the external device 301 is a smart device that includes a Bluetooth communication module and a wireless LAN communication module.

The camera 101 and the external device 301 can communicate via a first communication 302, for example a wireless LAN conforming to the IEEE 802.11 standard series, and a second communication 303 that has a master-slave relationship between a control station and a dependent station, for example Bluetooth Low Energy (hereinafter "BLE"). Wireless LAN and BLE are only examples of communication methods. Each communication device has two or more communication functions, and other methods may be used as long as one communication function, for example one that communicates within a control-station/dependent-station relationship, can control the other. However, the first communication 302, such as wireless LAN, is capable of faster communication than the second communication 303, such as BLE, and the second communication 303 either consumes less power or has a shorter communication range than the first communication 302, or both.
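
One way to picture the trade-off between the two links is a simple selection rule: small control messages go over the low-power link, and bulk data goes over the fast link, which the low-power link brings up first if needed. This is a hypothetical sketch, not a procedure described in the text; `choose_link`, the `bulk_threshold` value, and the wake-up flag are all assumptions.

```python
from enum import Enum


class Link(Enum):
    WLAN = "first communication 302 (wireless LAN)"
    BLE = "second communication 303 (BLE)"


def choose_link(payload_bytes, wlan_up, bulk_threshold=4096):
    """Return (link, need_wlan_wakeup) for a payload of the given size.

    Assumed policy: payloads below `bulk_threshold` use BLE; larger ones
    use wireless LAN, and if it is down, BLE is used to request it first.
    """
    if payload_bytes < bulk_threshold:
        return Link.BLE, False
    return Link.WLAN, not wlan_up
```

Under this policy a short notification would travel over BLE, while a captured image would use wireless LAN, with BLE acting as the control channel that starts the faster link.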

The configuration of the external device 301 is described with reference to FIG. 4. The external device 301 includes, for example, a wireless LAN control unit 401 for wireless LAN and a BLE control unit 402 for BLE, as well as a public wireless control unit 406 for public wireless communication. The external device 301 further includes a packet transmitting/receiving unit 403. The wireless LAN control unit 401 performs RF control of the wireless LAN, communication processing, driver processing for the various controls of wireless LAN communication conforming to the IEEE 802.11 standard series, and protocol processing for wireless LAN communication. The BLE control unit 402 performs RF control of BLE, communication processing, driver processing for the various controls of BLE communication, and protocol processing for BLE communication. The public wireless control unit 406 performs RF control of public wireless communication, communication processing, driver processing for the various controls of public wireless communication, and protocol processing related to public wireless communication. Public wireless communication conforms to, for example, the IMT (International Mobile Telecommunications) standard or the LTE (Long Term Evolution) standard. The packet transmitting/receiving unit 403 performs processing for transmitting and/or receiving packets for wireless LAN communication, BLE communication, and public wireless communication. In this embodiment, the external device 301 is described as transmitting and/or receiving packets, but communication formats other than packet switching, such as circuit switching, may also be used.

The external device 301 further includes, for example, a control unit 411, a storage unit 404, a GPS receiving unit 405, a display unit 407, an operation unit 408, an audio input/processing unit 409, and a power supply unit 410. The control unit 411 controls the entire external device 301, for example by executing a control program stored in the storage unit 404. The storage unit 404 stores, for example, the control program executed by the control unit 411 and various information such as parameters required for communication. The various operations described later are realized by the control unit 411 executing the control program stored in the storage unit 404.

The power supply unit 410 supplies power to the external device 301. The display unit 407 can output visually recognizable information, as with an LCD or LED, or output sound, as with a speaker, and displays various information. The operation unit 408 includes, for example, buttons that accept the user's operation of the external device 301. The display unit 407 and the operation unit 408 may be implemented as a common component, such as a touch panel.

The audio input/processing unit 409 may be configured to acquire the user's voice through, for example, a general-purpose microphone built into the external device 301 and to identify the user's operation command by voice recognition. A dedicated application in the external device 301 can also be used to acquire a voice command spoken by the user and register it, via the first communication 302 over wireless LAN, as a specific voice command to be recognized by the audio processing unit 214 of the camera 101.

The GPS (Global Positioning System) receiving unit 405 receives GPS signals from satellites, analyzes them, and estimates the current position (longitude and latitude) of the external device 301. Alternatively, the current position of the external device 301 may be estimated from information about surrounding wireless networks, using WPS (Wi-Fi Positioning System) or the like. When the acquired current GPS position lies within a preset position range (within a predetermined radius centered on the detected position), or when the GPS position has changed by a predetermined amount or more, movement information is sent to the camera 101 via the BLE control unit 402. This information is used as a parameter for the automatic shooting and automatic editing described later.
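
The two notification conditions above (being inside a preset range, and having moved by a predetermined amount or more) can be sketched with a great-circle distance check. The haversine formula is a standard choice that is swapped in here for illustration; the text does not say how distance is computed. The function names, event labels, and the mean Earth radius constant are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def movement_events(current, previous, preset_center, radius_m, min_move_m):
    """Return the (hypothetical) events that would be notified to the camera."""
    events = []
    # Condition 1: current position lies within the preset range.
    if haversine_m(*current, *preset_center) <= radius_m:
        events.append("in_preset_area")
    # Condition 2: position changed by the predetermined amount or more.
    if previous is not None and haversine_m(*current, *previous) >= min_move_m:
        events.append("moved")
    return events
```

In this sketch the external device would evaluate `movement_events` on each position fix and send any resulting events to the camera over BLE.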

As described above, the camera 101 and the external device 301 exchange data through communication using the wireless LAN control unit 401 and the BLE control unit 402. For example, they transmit and receive data such as audio signals, image signals, compressed audio signals, and compressed image signals. The external device 301 also sends the camera 101 shooting instructions, voice command registration data, notifications that a predetermined position has been detected based on GPS position information, notifications of a change of location, and the like. Learning data is also transmitted and received using a dedicated application in the external device 301.

<Configuration of accessories>
FIG. 5 is a diagram showing a configuration example of an external device 501 that can communicate with the camera 101. The camera 101 is a digital camera with a shooting function, and the external device 501 is a wearable device that includes various sensing units and can communicate with the camera 101 through, for example, a Bluetooth communication module.

The external device 501 is configured to be worn, for example, on the user's arm. It is equipped with sensors that detect biological information such as the user's pulse, heartbeat, and blood flow at a predetermined cycle, an acceleration sensor that can detect the user's state of motion, and the like.

The biological information detection unit 602 includes, for example, a pulse sensor that detects the pulse, a heartbeat sensor that detects the heartbeat, a blood flow sensor that detects blood flow, and a sensor that uses a conductive polymer to detect changes in electric potential through contact with the skin. In this embodiment, a heartbeat sensor is used as the biological information detection unit 602. The heartbeat sensor detects the user's heartbeat by irradiating the skin with infrared light from, for example, an LED, detecting the infrared light that has passed through the body tissue with a light receiving sensor, and processing the resulting signal. The biological information detection unit 602 outputs the detected biological information as a signal to the control unit 607 (see FIG. 6).

The shaking detection unit 603, which detects the user's motion state, includes, for example, an acceleration sensor and a gyro sensor, and can detect motion based on acceleration information, such as whether the user is moving or performing an action by swinging an arm. The external device 501 is also equipped with an operation unit 605 that accepts the user's operations on the external device 501, and a display unit 604, such as a monitor, that outputs visually recognizable information, for example an LCD or LED.

図６は、外部装置５０１の構成を示す図である。上述したように、外部装置５０１は、例えば、制御部６０７、通信部６０１、生体情報検出部６０２、揺れ検出部６０３、表示部６０４、操作部６０５、電源部６０６、記憶部６０８を有する。 FIG. 6 is a diagram showing the configuration of external device 501. As described above, the external device 501 includes, for example, a control section 607, a communication section 601, a biological information detection section 602, a shaking detection section 603, a display section 604, an operation section 605, a power supply section 606, and a storage section 608.

制御部６０７は、例えば、記憶部６０８に記憶された制御プログラムを実行することにより、外部装置５０１全体を制御する。記憶部６０８は、例えば制御部６０７が実行する制御プログラムと、通信に必要なパラメータ等の各種情報とを記憶している。後述する各種動作は、例えば記憶部６０８に記憶された制御プログラムを制御部６０７が実行することにより、実現される。 The control unit 607 controls the entire external device 501, for example, by executing a control program stored in the storage unit 608. The storage unit 608 stores, for example, a control program executed by the control unit 607 and various information such as parameters necessary for communication. Various operations described below are realized, for example, by the control unit 607 executing a control program stored in the storage unit 608.

電源部６０６は、外部装置５０１に電力を供給する。表示部６０４は、例えば、ＬＣＤやＬＥＤのように視覚で認知可能な情報の出力部、又はスピーカー等の音出力が可能な出力部を有し、各種情報の表示を行う。操作部６０５は、例えばユーザによる外部装置５０１の操作を受け付けるボタン等を備える。なお、表示部６０４及び操作部６０５は、例えばタッチパネルなどの共通する部材によって構成されていてもよい。また、操作部６０５は、例えば外部装置５０１に内蔵された汎用的なマイクによりユーザが発した音声を取得し、音声認識処理により、ユーザの操作命令を識別するように構成されていてもよい。 The power supply unit 606 supplies power to the external device 501. The display unit 604 has, for example, an output unit for visually perceivable information such as an LCD or LED, or an output unit for outputting sound such as a speaker, and displays various information. The operation unit 605 includes, for example, buttons that accept operations on the external device 501 by the user. Note that the display unit 604 and the operation unit 605 may be configured by a common member such as a touch panel, for example. Further, the operation unit 605 may be configured to acquire the voice emitted by the user using, for example, a general-purpose microphone built into the external device 501, and identify the user's operation command through voice recognition processing.

生体情報検出部６０２や揺れ検出部６０３により取得され制御部６０７で処理された各種検出情報は、通信部６０１により、カメラ１０１へ送信される。例えば、ユーザの心拍の変化を検出したタイミングで検出情報をカメラ１０１に送信したり、歩行移動／走行移動／立ち止まりなどの移動状態の変化のタイミングで検出情報を送信したりすることができる。また、予め設定された腕ふりのモーションを検出したタイミングで検出情報を送信したり、予め設定された距離の移動を検出したタイミングで検出情報を送信したりすることもできる。 Various detection information acquired by the biological information detection unit 602 and the shaking detection unit 603 and processed by the control unit 607 is transmitted to the camera 101 by the communication unit 601. For example, the detection information can be transmitted to the camera 101 at the timing when a change in the user's heartbeat is detected, or the detection information can be transmitted at the timing when the movement state changes such as walking/running/stopping. Further, the detection information may be transmitted at the timing when a preset arm swing motion is detected, or the detection information may be transmitted at the timing when movement of a preset distance is detected.
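The event-driven transmission described here, where detection information is sent only when something changes, might be sketched as follows in Python. The snapshot fields, the event names, and the `heart_rate_delta` threshold are hypothetical; the source only states that transmission is triggered by a change in heartbeat or in movement state.

```python
def detection_events(prev, curr, heart_rate_delta=10):
    """Compare two sensor snapshots and list the events worth sending to the camera.

    prev and curr are dicts with keys "heart_rate" (beats per minute) and
    "movement" (for example "walking", "running", "stopped").
    heart_rate_delta is an assumed threshold, not a value from the source.
    """
    events = []
    # A heartbeat change is reported only when it is large enough
    if abs(curr["heart_rate"] - prev["heart_rate"]) >= heart_rate_delta:
        events.append(("heart_rate_changed", curr["heart_rate"]))
    # A movement state change (walking / running / stopped) is always reported
    if curr["movement"] != prev["movement"]:
        events.append(("movement_changed", curr["movement"]))
    return events
```

Sending only on change, rather than streaming every sample, keeps the BLE link mostly idle, which suits a battery-powered wearable.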

<Camera operation sequence>
FIG. 7 is a flowchart illustrating an example of the operation handled by the first control unit 223 of the camera 101 in this embodiment.

When the user operates the power button provided on the camera 101, power is supplied from the first power supply unit 210 to the first control unit 223 and each block of the camera 101. Similarly, power is supplied from the second power supply unit 212 to the second control unit 211. Details of the operation of the second control unit 211 will be described later using the flowchart of FIG. 8.

When power is supplied, the process of FIG. 7 starts. In step S701, the startup condition is read. In this embodiment, the power is started under one of the following three conditions.
(1) The power button is pressed manually and the power is started.
(2) A startup instruction is sent from an external device (for example, the external device 301) by external communication (for example, BLE communication), and the power is started.
(3) The power is started by an instruction from the second control unit 211.
When the power is started by an instruction from the second control unit 211 as in (3), the startup condition calculated in the second control unit 211 is read; the details will be described later with reference to FIG. 8. The startup condition read here is also used as one parameter element during subject search and automatic shooting, which will likewise be described later. When reading of the startup condition is complete, the process advances to step S702.

In step S702, detection signals from various sensors are read. The signals read here include signals from sensors that detect vibration, such as the gyro sensor and the acceleration sensor in the device shake detection unit 209, and signals indicating the rotational positions of the tilt rotation unit 104 and the pan rotation unit 105. They also include the audio signal detected by the audio processing unit 214, the detection trigger signal for specific voice recognition, the sound direction detection signal, and the environmental information detection signal from the environment sensor 226. When the detection signals of the various sensors have been read in step S702, the process advances to step S703.

In step S703, it is detected whether a communication instruction has been sent from an external device, and if so, communication with the external device is performed. For example, the following are read from the external device 301: remote operations via wireless LAN or BLE; transmission and reception of audio signals, image signals, compressed audio signals, and compressed image signals; operation instructions such as shooting; voice command registration data; predetermined position detection notifications based on GPS position information; location movement notifications; and transmission and reception of learning data. In addition, when the external device 501 has updated the user's exercise information, arm action information, or biological information such as heartbeat, that information is read via BLE. Note that the environment sensor 226 described above may be mounted on the camera 101, or on the external device 301 or the external device 501. In the latter case, environmental information is also read via BLE in step S703. When the communication from the external device has been read in step S703, the process advances to step S704.

In step S704, mode setting determination is performed, and the process advances to step S705. In step S705, it is determined whether the operation mode was set to the low power consumption mode in step S704. The low power consumption mode is selected when the mode is none of the "automatic shooting mode", "automatic editing mode", "automatic image transfer mode", "learning mode", and "automatic file deletion mode" described later. If it is determined in step S705 that the mode is the low power consumption mode, the process advances to step S706.

ステップＳ７０６では、第２制御部２１１（ＳｕｂＣＰＵ）へ、第２制御部２１１内で判定する起動要因に係る各種パラメータ（揺れ検出判定用パラメータ、音検出用パラメータ、時間経過検出パラメータ）を通知する。各種パラメータは後述する学習処理で学習されることによって値が変化する。ステップＳ７０６の処理を終了すると、ステップＳ７０７に進み、第１制御部２２３（ＭａｉｎＣＰＵ）の電源をＯＦＦして、処理を終了する。 In step S706, the second control unit 211 (SubCPU) is notified of various parameters related to the activation factor determined within the second control unit 211 (shake detection determination parameters, sound detection parameters, and time elapsed detection parameters). The values of various parameters change as they are learned in a learning process described later. When the process in step S706 is completed, the process proceeds to step S707, where the power to the first control unit 223 (Main CPU) is turned off, and the process ends.

ステップＳ７０５で、低消費電力モードでないと判定されると、ステップＳ７０４におけるモード設定が自動撮影モードか否かを判定する。ここで、ステップＳ７０４でのモード設定判定の処理について説明する。判定されるモードは、以下の中から選択される。 If it is determined in step S705 that it is not the low power consumption mode, it is determined whether the mode setting in step S704 is automatic shooting mode. Here, the mode setting determination process in step S704 will be explained. The mode to be determined is selected from the following.

(1) Automatic shooting mode
<Mode determination conditions>
The automatic shooting mode is set when it is determined that automatic shooting should be performed, based on information such as the detection information set through learning (image, sound, time, vibration, location, change in the body, change in the environment), the elapsed time since the transition to the automatic shooting mode, and past shooting information and the number of shots taken.

<In-mode processing>
In automatic shooting mode processing (step S710), pan, tilt, and zoom are driven to search for a subject automatically based on the detection information (image, sound, time, vibration, location, change in the body, change in the environment). When it is determined that the timing allows a shot matching the user's preferences, shooting is performed automatically.

(2) Automatic editing mode
<Mode determination conditions>
The automatic editing mode is set when it is determined that automatic editing should be performed, based on the elapsed time since the previous automatic editing and information on previously captured images.

<In-mode processing>
In automatic editing mode processing (step S712), still images and moving images are selected based on learning, and automatic editing is performed to create a highlight video that combines them into a single video, with the image effects, the duration of the edited video, and the like also determined based on learning.

(3) Automatic image transfer mode
<Mode determination conditions>
When automatic image transfer has been enabled by an instruction from the dedicated application in the external device 301, and it is determined from the elapsed time since the previous image transfer and information on previously captured images that automatic transfer should be performed, the automatic image transfer mode is set.

<In-mode processing>
In automatic image transfer mode processing (step S714), the camera 101 automatically extracts images likely to match the user's preferences and automatically transfers them to the external device 301. The images are extracted using a score, attached to each image, that rates the image against the user's preferences, as described later.

(4) Learning mode
<Mode determination conditions>
The learning mode is set when it is determined that automatic learning should be performed, based on the elapsed time since the previous learning process, the information attached to images that can be used for learning, the amount of learning data, and the like. This mode is also set when an instruction to set the learning mode is received via communication from the external device 301.

<In-mode processing>
In learning mode processing (step S716), learning matched to the user's preferences is performed using a neural network, based on operation information from the external device 301 (information on images acquired from the camera, information edited manually via the dedicated application, and judgment values entered by the user for images in the camera), notifications of learning information from the external device 301, and the like. Learning related to detection, such as personal authentication registration, voice registration, sound scene registration, and general object recognition registration, and learning of the low power consumption mode conditions described above are performed at the same time.

(5) Automatic file deletion mode
<Mode determination conditions>
The automatic file deletion mode is set when it is determined that automatic file deletion should be performed, based on the elapsed time since the previous automatic file deletion and the remaining capacity of the nonvolatile memory 216 in which images are recorded.

<In-mode processing>
In automatic file deletion mode processing (step S718), the files to be deleted automatically are designated from among the images in the nonvolatile memory 216, based on the tag information of each image, the shooting date and time, and the like, and are then deleted.

以上の各モードにおける処理の詳細については、後述する。 Details of the processing in each of the above modes will be described later.

図７の説明に戻り、ステップＳ７０５で低消費電力モードでないと判定されると、ステップＳ７０９に進み、モード設定が自動撮影モードであるか否かを判定する。判定の結果、自動撮影モードであればステップＳ７１０に進み、自動撮影モード処理が行われる。処理が終了すると、ステップＳ７０２に戻り、処理を繰り返す。ステップＳ７０９で、自動撮影モードでないと判定されると、ステップＳ７１１に進む。 Returning to the explanation of FIG. 7, if it is determined in step S705 that the low power consumption mode is not set, the process proceeds to step S709, and it is determined whether the mode setting is automatic shooting mode. As a result of the determination, if it is the automatic photographing mode, the process advances to step S710, and automatic photographing mode processing is performed. When the process ends, the process returns to step S702 and repeats the process. If it is determined in step S709 that the mode is not automatic shooting mode, the process advances to step S711.

ステップＳ７１１では、モード設定が自動編集モードであるか否かを判定し、自動編集モードであればステップＳ７１２に進み、自動編集モード処理が行われる。処理が終了すると、ステップＳ７０２に戻り、処理を繰り返す。ステップＳ７１１で、自動編集モードでないと判定されると、ステップＳ７１３に進む。なお、自動編集モードは、本発明の主旨に直接関係しないため、詳細な説明は省略する。 In step S711, it is determined whether the mode setting is automatic editing mode, and if it is automatic editing mode, the process advances to step S712, where automatic editing mode processing is performed. When the process ends, the process returns to step S702 and repeats the process. If it is determined in step S711 that the mode is not automatic editing mode, the process advances to step S713. Note that the automatic editing mode is not directly related to the gist of the present invention, so a detailed explanation will be omitted.

ステップＳ７１３では、モード設定が画像自動転送モードであるか否かを判定し、画像自動転送モードであればステップＳ７１４に進み、画像自動転送モード処理が行われる。処理が終了すると、ステップＳ７０２に戻り、処理を繰り返す。ステップＳ７１３で、画像自動転送モードでないと判定されると、ステップＳ７１５に進む。なお、画像自動転送モードは、本発明の主旨に直接関係しないため、詳細な説明は省略する。 In step S713, it is determined whether the mode setting is automatic image transfer mode, and if it is automatic image transfer mode, the process advances to step S714, where automatic image transfer mode processing is performed. When the process ends, the process returns to step S702 and repeats the process. If it is determined in step S713 that the mode is not automatic image transfer mode, the process advances to step S715. Note that the automatic image transfer mode is not directly related to the gist of the present invention, so a detailed explanation will be omitted.

ステップＳ７１５では、モード設定が学習モードであるか否かを判定し、学習モードであればステップＳ７１６に進み、学習モード処理が行われる。処理が終了すると、ステップＳ７０２に戻り、処理を繰り返す。ステップＳ７１５で、学習モードでないと判定されると、ステップＳ７１７に進む。 In step S715, it is determined whether the mode setting is learning mode, and if it is learning mode, the process advances to step S716, where learning mode processing is performed. When the process ends, the process returns to step S702 and repeats the process. If it is determined in step S715 that the mode is not learning mode, the process advances to step S717.

ステップＳ７１７では、モード設定がファイル自動削除モードであるか否かを判定し、ファイル自動削除モードであればステップＳ７１８に進み、ファイル自動削除モード処理が行われる。処理が終了すると、ステップＳ７０２に戻り、処理を繰り返す。ステップＳ７１７で、ファイル自動削除モードでないと判定されると、ステップＳ７０２に戻り、処理を繰り返す。なお、ファイル自動削除モードは、本発明の主旨に直接関係しないため、詳細な説明は省略する。 In step S717, it is determined whether the mode setting is automatic file deletion mode, and if it is automatic file deletion mode, the process advances to step S718, where automatic file deletion mode processing is performed. When the process ends, the process returns to step S702 and repeats the process. If it is determined in step S717 that the file automatic deletion mode is not set, the process returns to step S702 and repeats the process. Note that the automatic file deletion mode is not directly related to the gist of the present invention, so detailed explanation will be omitted.
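The branching of steps S705 to S718 amounts to a dispatch on the mode chosen in step S704. The following Python sketch is one way to picture it; the `Mode` enum and `next_action` function are illustrative names, and only the step labels come from the flowchart description.

```python
from enum import Enum, auto


class Mode(Enum):
    LOW_POWER = auto()
    AUTO_SHOOTING = auto()
    AUTO_EDITING = auto()
    AUTO_TRANSFER = auto()
    LEARNING = auto()
    AUTO_DELETE = auto()


# Processing step executed for each mode, in the order the flowchart tests them
MODE_STEPS = {
    Mode.AUTO_SHOOTING: "S710",
    Mode.AUTO_EDITING: "S712",
    Mode.AUTO_TRANSFER: "S714",
    Mode.LEARNING: "S716",
    Mode.AUTO_DELETE: "S718",
}


def next_action(mode):
    """Return (processing step, following step) for one pass of the FIG. 7 loop."""
    if mode is Mode.LOW_POWER:
        # S706: pass parameters to the sub CPU, then S707: power off the main CPU
        return ("S706", "S707")
    # Every other mode runs its processing step and loops back to S702;
    # an unmatched mode performs no processing and also returns to S702
    return (MODE_STEPS.get(mode), "S702")
```

A lookup table is used here instead of the chained tests of the flowchart; the outcome is the same because exactly one mode is set per pass.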

図８は、本実施形態におけるカメラ１０１の第２制御部２１１が受け持つ動作の例を説明するフローチャートである。 FIG. 8 is a flowchart illustrating an example of the operation handled by the second control unit 211 of the camera 101 in this embodiment.

ユーザがカメラ１０１に設けられた電源ボタンを操作すると、第１電源部２１０から第１制御部２２３及びカメラ１０１の各ブロックに電力が供給される。また、同様に、第２電源部２１２から第２制御部２１１に電力が供給される。 When the user operates the power button provided on the camera 101, power is supplied from the first power supply section 210 to the first control section 223 and each block of the camera 101. Similarly, power is supplied from the second power supply section 212 to the second control section 211.

電力が供給されると、第２制御部（ＳｕｂＣＰＵ）２１１が起動され、図８の処理がスタートする。ステップＳ８０１では、所定サンプリング周期が経過したか否かを判定する。所定サンプリング周期は、例えば１０ｍｓｅｃに設定され、１０ｍｓｅｃ周期で、ステップＳ８０２に進む。所定サンプリング周期が経過していないと判定されると、第２制御部２１１はそのまま待機する。 When power is supplied, the second control unit (SubCPU) 211 is activated and the process shown in FIG. 8 starts. In step S801, it is determined whether a predetermined sampling period has elapsed. The predetermined sampling period is set to, for example, 10 msec, and the process proceeds to step S802 at a 10 msec period. If it is determined that the predetermined sampling period has not elapsed, the second control unit 211 remains on standby.

In step S802, learning information is read. The learning information is the information transferred to the second control unit 211 in step S706 of FIG. 7 and includes, for example, the following.
(1) Determination conditions for specific shaking detection (used in step S804 described later)
(2) Determination conditions for specific sound detection (used in step S805 described later)
(3) Determination conditions for the passage of time (used in step S807 described later)
When the learning information has been read in step S802, the process advances to step S803, where a shaking detection value is acquired. The shaking detection value is an output value from a sensor such as the gyro sensor or the acceleration sensor in the device shake detection unit 209.

ステップＳ８０３で揺れ検出値が取得されると、ステップＳ８０４に進み、予め設定された特定の揺れ状態の検出処理を行う。ここでは、ステップＳ８０２で読み込まれた学習情報によって、判定処理を変更する。いくつかの例について説明する。 When the shaking detection value is acquired in step S803, the process advances to step S804, and a preset specific shaking state detection process is performed. Here, the determination process is changed based on the learning information read in step S802. Some examples will be explained.

<Tap detection>
A state in which the user taps the camera 101 with, for example, a fingertip (tap state) can be detected from the output value of the acceleration sensor 107 attached to the camera 101. By passing the output of the three-axis acceleration sensor 107, sampled at a predetermined period, through a band-pass filter (BPF) set to a specific frequency range, the signal component of the acceleration change caused by a tap can be extracted. Tap detection is performed depending on whether the number of times the band-pass-filtered acceleration signal exceeds a predetermined threshold ThreshA during a predetermined time TimeA equals a predetermined count CountA. For a double tap, CountA is set to 2, and for a triple tap, CountA is set to 3. TimeA and ThreshA can also be changed according to the learning information.
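The tap criterion can be sketched in Python as follows. This is a minimal illustration that assumes the samples passed in have already been band-pass filtered and cover exactly one TimeA window; counting each upward crossing of the threshold as one "exceedance" is also an assumption, since the source does not say how consecutive samples above ThreshA are counted.

```python
def count_crossings(samples, thresh):
    """Count how many times the signal rises from at or below thresh to above it."""
    count = 0
    above = False
    for s in samples:
        now_above = s > thresh
        if now_above and not above:
            count += 1  # a new excursion above the threshold begins
        above = now_above
    return count


def detect_tap(window_samples, thresh_a, count_a):
    """True if the filtered signal exceeds thresh_a exactly count_a times in the window.

    window_samples: band-pass-filtered acceleration covering one TimeA window.
    count_a: 2 for a double tap, 3 for a triple tap.
    """
    return count_crossings(window_samples, thresh_a) == count_a
```

For example, `detect_tap([0.0, 0.1, 2.0, 0.2, 0.0, 1.8, 0.1, 0.0], 1.0, 2)` evaluates to `True`, because the signal rises above 1.0 twice within the window.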

<Detection of shaking state>
The shaking state of the camera 101 can be detected from the output values of the gyro sensor 106 and the acceleration sensor 107 attached to the camera 101. The outputs of the gyro sensor 106 and the acceleration sensor 107 are band-limited with a high-pass filter (HPF) and a low-pass filter (LPF), and then converted to absolute values. Vibration is detected depending on whether the number of times the calculated absolute value exceeds a predetermined threshold ThreshB during a predetermined time TimeB is equal to or greater than a predetermined count CountB. This makes it possible to determine, for example, whether the shaking is small, as when the camera 101 is placed on a desk, or large, as when the camera 101 is worn on the body as a wearable camera while the user is walking. By providing multiple conditions for the determination threshold and the determination count, finer shaking states corresponding to the shaking level can also be detected. TimeB, ThreshB, and CountB can also be changed according to the learning information.
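A minimal Python sketch of this shake criterion, including the multi-level variant, is given here. It assumes the input samples are already filtered and cover one TimeB window; the level names and the `levels` structure are hypothetical.

```python
def shake_detected(filtered, thresh_b, count_b):
    """True if |sample| exceeds thresh_b at least count_b times in the TimeB window."""
    exceed = sum(1 for s in filtered if abs(s) > thresh_b)
    return exceed >= count_b


def shake_level(filtered, levels):
    """Classify the shaking state using several (name, threshold, count) conditions.

    levels must be ordered from the weakest to the strongest condition;
    the strongest condition that is met wins, and "still" is returned if none is.
    """
    result = "still"
    for name, thresh, count in levels:
        if shake_detected(filtered, thresh, count):
            result = name
    return result
```

With `levels = [("small", 0.5, 2), ("large", 2.0, 2)]`, a window such as `[1.0, -3.0, 2.5, -0.8, 3.1]` is classified as `"large"`, while `[0.1, -0.2, 0.05, -0.1]` is classified as `"still"`.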

The above describes a method of detecting a specific shaking state by evaluating the detection values of the shaking detection sensor. Alternatively, data from the shaking detection sensor sampled within a predetermined time can be input to a shaking state determiner that uses a neural network, so that the trained network detects a specific shaking state registered in advance. In that case, the learning information read in step S802 consists of the weight parameters of the neural network.

ステップＳ８０４で特定の揺れ状態の検出処理が行われると、ステップＳ８０５に進み、予め設定された特定の音の検出処理を行う。ここでは、ステップＳ８０２で読み込まれた学習情報によって、検出判定処理を変更する。いくつかの例について説明する。 When a specific shaking state detection process is performed in step S804, the process advances to step S805, and a preset specific sound detection process is performed. Here, the detection determination process is changed based on the learning information read in step S802. Some examples will be explained.

<Specific voice command detection>
A specific voice command is detected. In addition to several commands registered in advance, the user can register a specific voice in the camera as a voice command.

<Specific sound scene recognition>
Sound scenes are determined by a network trained in advance through machine learning on a large amount of audio data. For example, specific scenes such as "cheering", "applause", and "someone speaking" are detected. The scenes to be detected change through learning.

<Sound level determination>
Sound level detection is performed by determining whether the sound level exceeds a predetermined magnitude for a predetermined time. The predetermined time, the predetermined magnitude, and the like change through learning.
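The sound level criterion can be illustrated with a short Python sketch. It assumes that "for a predetermined time" means the level must stay above the threshold continuously; the function name and the use of integer milliseconds are illustrative choices.

```python
def loud_for_duration(levels, sample_period_ms, min_level, min_duration_ms):
    """True if the level stays above min_level for at least min_duration_ms without a break.

    levels: sound level per sample, taken every sample_period_ms milliseconds.
    """
    run_ms = 0
    for level in levels:
        # Extend the current loud run, or reset it when the level drops
        run_ms = run_ms + sample_period_ms if level > min_level else 0
        if run_ms >= min_duration_ms:
            return True
    return False
```

Both `min_level` and `min_duration_ms` correspond to the values that the text says are adjusted through learning.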

<Sound direction determination>
The direction of a sound of a predetermined loudness is detected using multiple microphones arranged on a plane.

音声処理部２１４内で上記の判定処理が行われ、事前に学習された各設定により、特定の音の検出がされたかをステップＳ８０５で判定する。 The above-mentioned determination process is performed within the audio processing unit 214, and it is determined in step S805 whether a specific sound has been detected based on each setting learned in advance.

ステップＳ８０５で特定の音の検出処理が行われると、ステップＳ８０６に進み、第１制御部２２３の電源がＯＦＦ状態であるか否かを判定する。第１制御部２２３（ＭａｉｎＣＰＵ）がＯＦＦ状態であれば、ステップＳ８０７に進み、予め設定された時間の経過検出処理を行う。ここでは、ステップＳ８０２で読み込まれた学習情報によって、検出判定処理を変更する。学習情報は、図７で説明したステップＳ７０６での第２制御部２１１へ情報を通信する際に転送された情報である。第１制御部２２３がＯＮからＯＦＦへ遷移したときからの経過時間が計測され、経過時間が所定の時間ＴｉｍｅＣ以上であれば、時間が経過したと判定し、ＴｉｍｅＣより短かければ、時間が経過していないと判定される。ＴｉｍｅＣは、学習情報によって変化するパラメータである。 When the specific sound detection processing is performed in step S805, the process advances to step S806, and it is determined whether the power of the first control unit 223 is in the OFF state. If the first control unit 223 (Main CPU) is in the OFF state, the process advances to step S807, and a preset time elapse detection process is performed. Here, the detection determination process is changed based on the learning information read in step S802. The learning information is information transferred when communicating information to the second control unit 211 in step S706 described in FIG. 7. The elapsed time from when the first control unit 223 transitions from ON to OFF is measured, and if the elapsed time is greater than or equal to a predetermined time TimeC, it is determined that the time has elapsed, and if it is shorter than TimeC, the time has elapsed. It is determined that this has not been done. TimeC is a parameter that changes depending on learning information.

When the time elapse detection process has been performed in step S807, the process advances to step S808, where it is determined whether a condition for canceling the low power consumption mode is satisfied. Cancellation of the low power consumption mode is determined based on the following conditions.
(1) A specific shaking has been detected.
(2) A specific sound has been detected.
(3) A predetermined time has elapsed.
For (1), whether a specific shaking has been detected is determined by the specific shaking state detection process in step S804. For (2), whether a specific sound has been detected is determined by the specific sound detection process in step S805. For (3), whether the predetermined time has elapsed is determined by the time elapse detection process in step S807. If at least one of (1) to (3) is satisfied, it is determined that the low power consumption mode should be canceled.

When cancellation of the low power consumption mode is determined in step S808, the process advances to step S809, where the first control unit 223 is powered on, and in step S810 the condition on which cancellation was determined (shaking, sound, or time) is notified to the first control unit 223. The process then returns to step S801 and loops. If none of the cancellation conditions applies in step S808 and it is determined that the low power consumption mode should not be canceled, the process returns to step S801 and loops.
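Steps S807 to S810 can be summarized in a small Python sketch: the three cancellation conditions are combined with a logical OR, and the conditions that fired are passed on to the main CPU. The function names and the returned dictionary are illustrative; TimeC is the learned time parameter from step S807.

```python
def wake_reasons(shake_detected, sound_detected, off_elapsed_s, time_c_s):
    """List which low-power cancellation conditions currently hold.

    off_elapsed_s: seconds since the first control unit switched from ON to OFF.
    time_c_s: the TimeC parameter (changes through learning).
    """
    reasons = []
    if shake_detected:
        reasons.append("shake")
    if sound_detected:
        reasons.append("sound")
    if off_elapsed_s >= time_c_s:
        reasons.append("time")
    return reasons


def sub_cpu_decision(shake_detected, sound_detected, off_elapsed_s, time_c_s):
    """Decide whether to power on the main CPU (S809) and what to notify (S810)."""
    reasons = wake_reasons(shake_detected, sound_detected, off_elapsed_s, time_c_s)
    # At least one satisfied condition cancels the low power consumption mode
    return {"power_on_main_cpu": bool(reasons), "notify": reasons}
```

Returning the list of reasons mirrors the notification in step S810, which lets the main CPU use the wake-up cause as a parameter for subject search and automatic shooting.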

On the other hand, if it is determined in step S806 that the first control unit 223 is ON, the process advances to step S811, where the information acquired in steps S803 to S805 is notified to the first control unit 223, and the process then returns to step S801 and loops.

In this embodiment, even when the first control unit 223 is ON, shaking detection and specific-sound detection are performed by the second control unit 211, and the detection results are notified to the first control unit 223. However, when the first control unit 223 is ON, a configuration may be adopted in which steps S803 to S805 are not performed and shaking detection and specific-sound detection are instead performed within the first control unit 223 (step S702 in FIG. 7).

As described above, by performing steps S704 to S707 in FIG. 7 and the processing in FIG. 8, the conditions for shifting to the low power consumption mode and the conditions for canceling it are learned based on the user's operations. This makes it possible to operate the camera in a manner suited to the usage of the user who owns the camera 101. The learning method will be described later.

Although the method of canceling the low power consumption mode by shaking detection, sound detection, or the elapse of time has been described in detail above, the low power consumption mode may also be canceled based on environmental information. For environmental information, the determination can be made according to whether the absolute value or the amount of change of temperature, atmospheric pressure, brightness, humidity, or ultraviolet light exceeds a predetermined threshold, and the threshold can also be changed by the learning described later.
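A minimal sketch of such an environment-based check is shown below. The function name, the dictionary layout, and the idea of keeping separate limits for absolute values and for changes are assumptions made for illustration; the description does not specify how the thresholds are stored.

```python
from typing import Dict


def environment_wake(previous: Dict[str, float],
                     current: Dict[str, float],
                     abs_limits: Dict[str, float],
                     delta_limits: Dict[str, float]) -> bool:
    """Return True if any reading exceeds its absolute or change threshold.

    All names and limits are hypothetical; in the description the thresholds
    may be changed by learning.
    """
    for name, value in current.items():
        # Absolute-value check (e.g. brightness above a limit).
        if name in abs_limits and value > abs_limits[name]:
            return True
        # Change check against the previous reading of the same quantity.
        if name in delta_limits and name in previous:
            if abs(value - previous[name]) > delta_limits[name]:
                return True
    return False
```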

Alternatively, the detection information from shaking detection, sound detection, and time-elapse detection, together with the absolute values and amounts of change of each item of environmental information, may be evaluated by a neural network to determine whether to cancel the low power consumption mode. The determination conditions of this processing can be changed by the learning described later.

<Automatic shooting mode processing>
The automatic shooting mode processing will be described with reference to FIG. 9. First, in step S901, the image processing unit 207 performs image processing on the signal captured by the imaging unit 206 to generate an image for subject detection. Subject detection processing for detecting persons, objects, and the like is performed on the generated image.

When detecting a person, the face or body of the subject is detected. In face detection processing, a pattern for identifying a person's face is defined in advance, and a portion of the captured image that matches the pattern can be detected as a face region. A reliability value indicating the likelihood that the region is the subject's face is calculated at the same time. The reliability is calculated from, for example, the size of the face region in the image and the degree of match with the face pattern. Similarly, in object recognition, an object that matches a pre-registered pattern can be recognized.

There is also a method of extracting a characteristic subject using histograms of hue, saturation, and the like in the captured image. For the image of a subject captured within the shooting angle of view, the distribution derived from the histograms of hue, saturation, and the like is divided into a plurality of sections, and the captured image is classified section by section. For example, histograms of a plurality of color components are created for the captured image and partitioned by their peak-shaped distribution ranges, the captured image is classified into regions belonging to the same combination of sections, and the image regions of subjects are recognized. By calculating an evaluation value for each recognized subject image region, the region with the highest evaluation value can be determined to be the main subject region. With the above methods, information on each subject can be obtained from the captured image information.

In step S902, an image blur correction amount is calculated. Specifically, the absolute angle of camera shake is first calculated from the angular velocity and acceleration information acquired by the device shake detection unit 209. Then, the angle by which the tilt rotation unit 104 and the pan rotation unit 105 must be moved in the direction that cancels the absolute angle is obtained and used as the image blur correction amount. The method of calculating the image blur correction amount here can be changed by the learning processing described later.

In step S903, the state of the camera is determined. The current vibration/motion state of the camera is determined from the camera angle, the amount of camera movement, and the like detected from angular velocity information, acceleration information, GPS position information, and so on. For example, when shooting with the camera 101 mounted on a car, subject information such as the surrounding scenery changes greatly with the distance traveled. It is therefore determined whether the camera is in a "vehicle moving state", in which it is mounted on a car or the like and moving at high speed, and the result is used in the automatic subject search described later. It is also determined whether the change in camera angle is large, and thus whether the camera 101 is in a "stationary shooting state" with almost no shaking. In the "stationary shooting state", the position of the camera 101 itself can be assumed not to change, so a subject search suited to stationary shooting can be performed. When the change in camera angle is relatively large, the camera is determined to be in a "hand-held state", and a subject search suited to hand-held use can be performed.
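The three camera states of step S903 could be distinguished roughly as in the sketch below. The function name, the two input quantities, and both threshold values are hypothetical; the description names the states but gives no numeric criteria.

```python
def classify_camera_state(speed_m_s: float,
                          angle_change_deg: float,
                          vehicle_speed_m_s: float = 5.0,
                          still_angle_deg: float = 1.0) -> str:
    """Classify the camera as 'vehicle', 'stationary', or 'handheld'.

    speed_m_s: movement speed estimated from e.g. GPS (assumed input).
    angle_change_deg: recent change in camera angle from the gyro (assumed input).
    The default thresholds are placeholders, not values from the description.
    """
    if speed_m_s >= vehicle_speed_m_s:
        return "vehicle"      # mounted on a car or the like, moving fast
    if angle_change_deg <= still_angle_deg:
        return "stationary"   # almost no shaking: placed on a desk, etc.
    return "handheld"         # relatively large angle changes
```

The returned state would then select which subject search strategy is used in step S904.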

In step S904, subject search processing is performed. The subject search consists of the following processing.
(1) Area division
(2) Calculation of an importance level for each area
(3) Determination of the search target area
Each process is described in turn below.

(1) Area division
Area division will be described with reference to FIG. 10A. As shown in FIG. 10A(a), the entire periphery is divided into areas centered on the camera position (the origin O is the camera position). In the example of FIG. 10A(a), the tilt direction and the pan direction are each divided every 22.5 degrees. With the division of FIG. 10A(a), the horizontal circumference becomes smaller as the tilt angle moves away from 0 degrees, so the areas become smaller. Therefore, as shown in FIG. 10A(b), when the tilt angle is 45 degrees or more, the horizontal area range is set larger than 22.5 degrees.
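The area division can be illustrated with a small sketch that maps a pan/tilt direction to an area index. The 22.5-degree step comes from the description; the 45-degree pan step used for tilt bands at 45 degrees or more is an assumption, since the description only says the range is "larger than 22.5 degrees". The function and constant names are hypothetical.

```python
import math
from typing import Tuple

BASE_STEP_DEG = 22.5      # from the description
WIDE_PAN_STEP_DEG = 45.0  # assumption: any value larger than 22.5 would fit


def area_index(pan_deg: float, tilt_deg: float) -> Tuple[int, int]:
    """Return (tilt_band, pan_band) for a direction relative to the reference axis."""
    tilt_band = math.floor(tilt_deg / BASE_STEP_DEG)
    lower = tilt_band * BASE_STEP_DEG
    upper = lower + BASE_STEP_DEG
    # Edge of this tilt band that is closest to the horizon (0 degrees).
    nearest_edge = min(abs(lower), abs(upper))
    # Bands lying entirely at 45 degrees or more use the wider pan step.
    pan_step = WIDE_PAN_STEP_DEG if nearest_edge >= 45.0 else BASE_STEP_DEG
    pan_band = math.floor((pan_deg % 360.0) / pan_step)
    return tilt_band, pan_band
```

For instance, a direction at pan 100 degrees and tilt 50 degrees falls in a wide band, so it shares an area with every pan angle from 90 up to (but not including) 135 degrees.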

FIGS. 10A(c) and 10A(d) show an example of the divided areas within the shooting angle of view. The axis 1301 is the orientation of the camera 101 at initialization, and area division is performed with this direction as the reference position. Reference numeral 1302 denotes the angle-of-view area of the image being captured, and an example of the image at that time is shown in FIG. 10A(d). Within the captured angle of view, the image is divided based on the area division as indicated by reference numerals 1303 to 1318 in FIG. 10A(d).

(2) Calculation of an importance level for each area
For each area divided as described above, an importance level indicating the search priority is calculated according to the state of the subjects present in the area and the state of the scene. The importance level based on the state of the subjects is calculated from, for example, the number of persons in the area, the size of their faces, the orientation of the faces, the certainty of face detection, facial expressions, and personal authentication results. The importance level based on the state of the scene is calculated from, for example, general object recognition results, scene discrimination results (blue sky, backlight, evening scene, etc.), the level of sound coming from the direction of the area and voice recognition results, and motion detection information within the area.

When camera vibration is detected in the camera state determination of FIG. 9 (step S903), the importance level can also be made to change according to the vibration state. For example, when the "stationary shooting state" is determined, the subject search is performed centering on a high-priority subject among those registered for face authentication (for example, the owner of the camera). The automatic shooting described later is also performed giving priority to, for example, the face of the camera's owner. As a result, even if the owner usually wears and carries the camera while shooting, many images that include the owner can be obtained by taking the camera off and placing it on a desk or the like. Because faces can be searched for by panning and tilting, images of the owner, group photos containing many faces, and the like can be obtained simply by placing the camera casually, without considering the angle at which it is set down.

With the above conditions alone, as long as nothing changes in the areas, the area with the highest importance level stays the same, and as a result the searched area never changes. The importance level is therefore changed according to past shooting information. Specifically, the importance level of an area that has been continuously designated as the search area for a predetermined time may be lowered, and the importance level of an area in which shooting was performed in step S910, described later, may be lowered for a predetermined time.
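The history-based adjustment can be sketched as below. The function name, the time limits, and the use of a multiplicative penalty are all assumptions for illustration; the description only states that the importance level is lowered in these two situations.

```python
from typing import Optional


def adjusted_importance(base_level: float,
                        seconds_as_search_area: float,
                        seconds_since_shot: Optional[float],
                        dwell_limit_s: float = 10.0,
                        shot_cooldown_s: float = 30.0,
                        penalty: float = 0.5) -> float:
    """Lower an area's importance level based on past search and shooting.

    seconds_since_shot is None when no shot has been taken in this area.
    All limits and the penalty factor are hypothetical placeholders.
    """
    level = base_level
    # The area has stayed the search target for too long.
    if seconds_as_search_area >= dwell_limit_s:
        level *= penalty
    # A shot was taken in this area recently (step S910).
    if seconds_since_shot is not None and seconds_since_shot < shot_cooldown_s:
        level *= penalty
    return level
```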

(3) Determination of the search target area
Once the importance level of each area has been calculated as described above, an area with a high importance level is determined as the search target area. Then, the pan/tilt search target angles required to capture the search target area within the angle of view are calculated.

Returning to FIG. 9, in step S905, pan/tilt driving is performed. Specifically, the pan/tilt drive amount is calculated by adding, at the control sampling frequency, the image blur correction amount and the drive angle based on the pan/tilt search target angles. The lens barrel rotation drive unit 205 then drives and controls the tilt rotation unit 104 and the pan rotation unit 105.
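Selecting the search target area and combining the two drive components can be sketched as follows for a single axis. The function names are hypothetical, and limiting the search movement to a maximum step per control sample is an assumption; the description only states that the blur correction amount and the search drive angle are added.

```python
from typing import Dict, Tuple


def pick_search_area(importance: Dict[Tuple[float, float], float]) -> Tuple[float, float]:
    """Return the (pan, tilt) centre of the area with the highest importance level."""
    return max(importance, key=importance.get)


def drive_amount(blur_correction_deg: float,
                 current_deg: float,
                 target_deg: float,
                 max_step_deg: float = 2.0) -> float:
    """Drive amount for one axis in one control sample.

    The search component moves toward the target angle, clamped to
    max_step_deg per sample (an assumed limit), and is added to the
    image blur correction amount.
    """
    error = target_deg - current_deg
    search_step = max(-max_step_deg, min(max_step_deg, error))
    return blur_correction_deg + search_step
```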

In step S906, the zoom unit 201 is controlled to perform zoom driving. Specifically, the zoom is driven according to the state of the search target subject determined in step S904. For example, when the search target subject is a person's face, a face that is too small in the image may fall below the minimum detectable size, so that it cannot be detected and is lost. In that case, the zoom is moved toward the telephoto side so that the face becomes larger in the image. Conversely, when the face is too large in the image, the subject easily leaves the angle of view because of movement of the subject or of the camera itself. In that case, the zoom is moved toward the wide-angle side so that the face becomes smaller in the image. Controlling the zoom in this way maintains a state suitable for tracking the subject.
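The zoom rule for a face subject reduces to a simple comparison, sketched below. The function name and both pixel limits are hypothetical; the description gives no concrete sizes.

```python
def zoom_command(face_size_px: float,
                 min_size_px: float = 40.0,
                 max_size_px: float = 200.0) -> str:
    """Return 'tele', 'wide', or 'hold' from the face size in the image.

    min_size_px stands in for a size near the detector's minimum; max_size_px
    for a size at which the face easily leaves the frame. Both are placeholders.
    """
    if face_size_px < min_size_px:
        return "tele"   # face too small: zoom in so it stays detectable
    if face_size_px > max_size_px:
        return "wide"   # face too large: zoom out so it stays in frame
    return "hold"
```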

In step S907, it is determined whether a manual shooting instruction has been given, and if so, the process advances to step S910. The manual shooting instruction may be given by pressing the shutter button, by lightly tapping the camera housing with a finger or the like, by voice command input, or by an instruction from an external device. A shooting instruction triggered by a tap is determined by the device shake detection unit 209 detecting continuous high-frequency acceleration over a short period when the user taps the camera housing. Voice command input is a shooting instruction method in which, when the user utters a predetermined phrase instructing shooting (for example, "take a picture"), the audio processing unit 214 recognizes the voice and uses it as a shooting trigger. An instruction from an external device is a shooting instruction method triggered by a shutter instruction signal transmitted via a dedicated application from, for example, a smartphone connected to the camera by Bluetooth.

If no manual shooting instruction was given in step S907, the process advances to step S908, where automatic shooting determination is performed. In the automatic shooting determination, it is determined whether to perform automatic shooting and which shooting method to use (still image shooting, moving image shooting, continuous shooting, panoramic shooting, or the like).

<Determination of whether to perform automatic shooting>
Whether to perform automatic shooting (a shooting operation that records image data output by the imaging unit) is determined as follows. Specifically, automatic shooting is determined to be performed in the following two cases. First, based on the importance level of each area obtained in step S904, automatic shooting is determined to be performed when the importance level exceeds a predetermined value. The second is a determination based on a neural network, which will be described later. The recording referred to here may be recording of image data in the memory 215 or in the nonvolatile memory 216. It also includes automatically transferring the image to the external device 301 and recording the image data on the external device 301 side.

In this embodiment, as described above, shooting is controlled so as to be performed automatically according to automatic shooting determination parameters such as the importance level. An imaging device that shoots automatically whenever a predetermined condition is satisfied has the following problems.

One is the case where the frequency of automatic shooting is too high. Even when shots are wanted evenly throughout a fixed period, shooting is performed whenever the predetermined condition is satisfied, so the shooting frequency can become very high in the first half of the period, and in the second half shooting may become impossible because the remaining battery level or card capacity is insufficient.

The other is the case where the frequency of automatic shooting is too low. Even when, for example, a commercial operator wants to take a predetermined number of shots, the predetermined condition for automatic shooting may rarely be satisfied, so that too few shots are taken.

Therefore, to control the shooting frequency, it may be preferable to change the automatic shooting determination parameters depending on the situation at the scene and the state of the camera.

For example, at a time-limited event such as a wedding, automatic shooting control such as the following tends to be preferred.
(1) Take a relatively large number of images, including both people and objects
(2) Shoot without worrying about the remaining battery level or recording media capacity, because the shooting time is short
(3) Pan and tilt actively to search for subjects
In contrast to such time-limited shooting, when the events of a whole day are to be recorded, automatic shooting control such as the following tends to be preferred.
(1) Be somewhat selective about subjects, because the shooting time is long
(2) Shoot in an energy-saving manner, taking the remaining battery level and recording media capacity into account
(3) Limit pan and tilt control, because it consumes more battery power than normal control
An example of the above control will be described with reference to FIG. 10B(a). First, shooting conditions such as the shooting time T (total shooting time) (for example, wedding: 2 hours) are input by a user instruction (for example, via the external device 301 or voice input). The target number of shots S is determined based on this input, the remaining battery level (the first power supply unit 210 or the like), and the remaining capacity of the recording medium (the recording medium 221 or the like).
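The description does not give a formula for the target number of shots S. One plausible way to combine the three inputs, shown purely as an assumption, is to take the number of shots wanted for the period and cap it by what the battery and the recording medium can still support. The function and parameter names are hypothetical.

```python
def target_shots(duration_h: float,
                 desired_shots_per_hour: float,
                 battery_shots_left: int,
                 storage_shots_left: int) -> int:
    """Hypothetical rule for S: desired count capped by battery and storage.

    battery_shots_left and storage_shots_left are assumed estimates of how
    many more shots each resource allows.
    """
    wanted = int(duration_h * desired_shots_per_hour)
    return max(0, min(wanted, battery_shots_left, storage_shots_left))
```

With a 2-hour wedding, 100 desired shots per hour, battery for 500 shots, and storage for 150 shots, this rule would give S = 150, limited by the recording medium.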

Monitoring is performed at fixed time intervals, and the automatic shooting determination threshold and the camera control parameters are updated as needed based on the monitoring results. The initial values of the automatic shooting determination threshold and the camera control parameters are determined from the learning results so far or by a determination using a neural network.

In the example shown in FIG. 10B(a), the parameters are updated so that the number of shots falls within the region R indicated by the broken lines. In FIG. 10B(a), the horizontal axis represents elapsed time and the vertical axis the cumulative number of shots. The number of shots is kept within the region R because, if shooting is performed so that the cumulative number of shots increases roughly linearly with time, shots are considered to be spread almost evenly over the entire shooting time.

In the example of FIG. 10B(a), at monitoring time A the number of shots is judged to be insufficient, so the automatic shooting determination threshold is lowered and the pan/tilt movable range is widened to search actively for the main subject, raising the automatic shooting frequency. At monitoring time B, the number of shots is judged to be large, so the automatic shooting determination threshold is raised and the pan/tilt movable range is narrowed, lowering the shooting frequency. At monitoring time C, an appropriate number of shots is judged to have been obtained, so the automatic shooting determination threshold and the control parameters for the pan/tilt movable range are maintained. In this way, the number of shots in each fixed period is monitored at fixed intervals, and the shooting frequency is controlled as needed toward the target number S for the shooting time T, which is the predetermined period.
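The monitoring-time update can be sketched as a comparison against a linear pace with a tolerance band standing in for the region R. The function name, the band width, and the step size are assumptions; only the threshold is shown, although the description adjusts the pan/tilt movable range in the same direction.

```python
def update_threshold(threshold: float,
                     elapsed: float,
                     shots_taken: int,
                     total_time: float,
                     target_shots: int,
                     tolerance: float = 0.1,
                     step: float = 0.05) -> float:
    """Update the automatic shooting threshold at one monitoring time.

    The expected count grows linearly from 0 to target_shots over total_time.
    The band of +/- tolerance * target_shots around it is an assumed model of
    region R; step is an assumed adjustment size. The result stays in [0, 1].
    """
    expected = target_shots * elapsed / total_time
    margin = tolerance * target_shots
    if shots_taken < expected - margin:
        return max(0.0, threshold - step)  # too few shots: shoot more readily
    if shots_taken > expected + margin:
        return min(1.0, threshold + step)  # too many shots: be more selective
    return threshold                       # on pace: keep the parameters
```

Called at monitoring times A, B, and C of FIG. 10B(a), the three branches correspond to lowering, raising, and maintaining the threshold.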

For example, the CPU that controls the camera 101 has a detection unit that detects the face of a subject based on image information, a determination unit that recognizes the facial expression and determines whether it is in a specific state (for example, whether a feature value for a state of joy, sadness, anger, or surprise exceeds a threshold), and a control unit that performs a subject recording operation (automatic shooting) according to the determination result of the determination unit. In this case, the automatic shooting determination threshold is adjusted according to the shooting frequency. With this adjustment, even if the facial expression determined by the determination unit is the same, the shooting operation is performed when the shooting frequency is a first frequency and is not performed when the shooting frequency is a second frequency. This makes it possible to obtain the desired number of shots and to reduce the likelihood of running out of recording memory.

As another example, the CPU that controls the camera 101 has a detection unit that detects the face of a subject based on image information, a determination unit that recognizes the orientation of the face and determines whether the face is turned in a specific direction, in particular toward the front, and a control unit that performs automatic shooting according to the determination result of the determination unit. In this case, the automatic shooting determination threshold (the threshold for determining whether the face is turned toward the front) is adjusted according to the shooting frequency. With this adjustment, even if the face orientation determined by the determination unit is the same, the shooting operation is performed when the shooting frequency is a first frequency and is not performed when the shooting frequency is a second frequency. This makes it possible to obtain the desired number of shots and to reduce the likelihood of running out of recording memory.

The same applies when the state of the subject's eyes is recognized and automatic shooting is performed when the eyes are in a predetermined state, in particular when the eyes are wide open and the gaze is directed at the camera. The same also applies when the posture of the subject is recognized and automatic shooting is performed when the subject assumes a predetermined posture, and when the motion of the subject is recognized and automatic shooting is performed when the subject performs a predetermined motion.

In this way, when the state of the subject is recognized, it is determined whether the subject is in a specific state, and automatic shooting is performed according to the determination result, control is performed so that the shooting operation is performed when the shooting frequency is the first frequency and is not performed when the shooting frequency is the second frequency. This makes it possible to obtain the desired number of shots in the shooting time T and to reduce the likelihood of running out of recording memory partway through the shooting time.

As another control example, the example shown in FIG. 10B(b) will be described. As in the control example of FIG. 10B(a), the shooting time T is first input by a user instruction, and the target number S is determined. Monitoring is performed at fixed time intervals and the results are retained. What is retained is the evaluation value used for the automatic shooting determination within the monitoring period, which indicates whether an image is worth shooting for the user. In FIG. 10B(b), the result is that only one image, with an evaluation value equal to or greater than the threshold Th, was shot automatically. Therefore, at time T1 the number of shots is judged to be insufficient, and the evaluation value determination threshold is lowered. At time T2, as a result of lowering the threshold, five images were shot automatically and the number of shots is judged to be excessive relative to the target number, so the evaluation value determination threshold is raised to lower the shooting frequency. In this way, the evaluation values used in past automatic shooting determinations are monitored, and the determination threshold is updated as needed so that an appropriate number of shots is obtained. This suppresses excessive shooting within a short time. It also makes it possible to obtain the desired number of shots in the shooting time T and to reduce the likelihood of running out of recording memory partway through the shooting time.

As yet another control example, a configuration in which captured image data is saved only when the evaluation value of the image exceeds a threshold will be described. During automatic shooting, the number of shots is monitored at fixed intervals (for example, at times T1, T2, T3, and T4), and the threshold for the image evaluation value is changed toward the target number. For example, if the number of shots is insufficient at time T1, the threshold for the image evaluation value is lowered so that images are saved more readily. If the number of shots is judged to be excessive at time T2, the threshold for the image evaluation value is raised so that images are saved less readily. This makes it possible to obtain the desired number of shots in the shooting time T and to reduce the likelihood of running out of recording memory partway through the shooting time.

このように撮影状況に応じて撮影頻度をコントロールする（変更する）ことにより、適切な撮影枚数が得られる自動撮影を行うことができる。これにより、自動撮影を行う撮像装置において、ユーザが撮影したい映像の撮り逃しを極力抑制できる。 By controlling (changing) the shooting frequency according to the shooting situation in this way, it is possible to perform automatic shooting that allows an appropriate number of shots to be taken. As a result, in an imaging device that performs automatic shooting, it is possible to minimize the possibility of missing a video that the user wants to shoot.

なお、上述では、撮影状況に応じて撮影頻度を変更するように制御したが、合わせて、カメラ１０１と外部装置との間の通信の性能も考慮して、撮影頻度を変更するように制御してもよい。 Note that, although the shooting frequency is controlled above according to the shooting situation, it may additionally be controlled in consideration of the communication performance between the camera 101 and the external device.

次に、２つめの判定である、ニューラルネットワークに基づく判定について説明する。ニューラルネットワークの一例として、多層パーセプトロンによるネットワークの例を図１１に示す。ニューラルネットワークは、入力値から出力値を予測することに使用されるものであり、予め入力値と、その入力に対して模範となる出力値とを学習しておくことで、新たな入力値に対して、学習した模範に倣った出力値を推定することができる。なお、学習の方法は後述する。図１１の１２０１およびその縦に並ぶ丸は入力層のニューロンを示し、１２０３およびその縦に並ぶ丸は中間層のニューロンを示し、１２０４は出力層のニューロンを示す。１２０２で示すような矢印は各ニューロンを繋ぐ結合を示している。ニューラルネットワークに基づく判定では、入力層のニューロンに対して、現在の画角中に写る被写体や、シーンやカメラの状態に基づいた特徴量を入力として与え、多層パーセプトロンの順伝播則に基づく演算を経て出力層から出力された値を得る。そして、出力の値が閾値以上であれば、自動撮影を実施する判定を下す。なお、被写体の特徴としては、現在のズーム倍率、現在の画角における一般物体認識結果、顔検出結果、現在画角に写る顔の数、顔の笑顔度、目瞑り度、顔角度、顔認証ＩＤ番号、被写体人物の視線角度、シーン判別結果、前回撮影時からの経過時間、現在時刻、ＧＰＳ位置情報および前回撮影位置からの変化量、現在の音声レベル、声を発している人物、拍手、歓声が上がっているか否か、振動情報（加速度情報、カメラ状態）、環境情報（温度、気圧、照度、湿度、紫外線量）等を使用する。更に、外部装置５０１からの情報通知がある場合、通知情報（ユーザの運動情報、腕のアクション情報、心拍などの生体情報など）も特徴として使用する。この特徴を所定の範囲の数値に変換し、特徴量として入力層の各ニューロンに与える。そのため、入力層の各ニューロンは上記使用する特徴量の数だけ必要となる。 Next, the second determination, which is based on a neural network, will be explained. As an example of a neural network, an example of a network using a multilayer perceptron is shown in FIG. Neural networks are used to predict output values from input values, and by learning input values and model output values for those inputs in advance, they can be used to predict output values from new input values. On the other hand, it is possible to estimate the output value based on the learned model. Note that the learning method will be described later. In FIG. 11, 1201 and the circles arranged vertically therein indicate neurons in the input layer, 1203 and the circles arranged vertically therein indicate neurons in the intermediate layer, and 1204 indicate neurons in the output layer. Arrows such as 1202 indicate connections connecting each neuron. In neural network-based determination, the input layer neurons are given features based on the subject in the current field of view, the scene, and the camera state, and then perform calculations based on the forward propagation law of a multilayer perceptron. 
The value output from the output layer is then obtained. If the output value is equal to or greater than a threshold, a determination is made to perform automatic shooting. The subject features used include the current zoom magnification, the general object recognition result at the current angle of view, the face detection result, the number of faces in the current angle of view, the degree of smile, the degree of eye closure, the face angle, the face authentication ID number, the gaze angle of the subject person, the scene determination result, the time elapsed since the previous shot, the current time, GPS position information and the amount of change from the previous shooting position, the current audio level, the person who is speaking, whether there is applause or cheering, vibration information (acceleration information, camera state), and environmental information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light). Furthermore, when information is notified from the external device 501, the notified information (the user's motion information, arm action information, biological information such as heart rate, and so on) is also used as a feature. Each feature is converted into a numerical value within a predetermined range and given to a neuron of the input layer as a feature amount. The input layer therefore needs as many neurons as there are feature amounts in use.
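
The forward-propagation decision described above can be illustrated with a small sketch. It assumes a single intermediate layer with sigmoid activations and a single output neuron; the activation function, the `[0, 1]` normalization range, and all function names are assumptions made for the example, since the specification does not fix them.

```python
import math
from typing import List, Sequence, Tuple


def normalize(value: float, lo: float, hi: float) -> float:
    """Map a raw feature into [0, 1], clipping values outside [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def sigmoid(x: float) -> float:
    # Written in two branches so large negative inputs do not overflow exp().
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dense(inputs: Sequence[float],
          weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """One fully connected layer: one weight row and one bias per neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def should_shoot(features: Sequence[float],
                 w_hidden: Sequence[Sequence[float]],
                 b_hidden: Sequence[float],
                 w_out: Sequence[float],
                 b_out: float,
                 threshold: float = 0.5) -> Tuple[bool, float]:
    """Forward pass input -> intermediate -> output, then threshold the output."""
    hidden = dense(features, w_hidden, b_hidden)
    score = dense(hidden, [w_out], [b_out])[0]
    return score >= threshold, score
```

Each entry of `features` corresponds to one input-layer neuron, so the length of every row in `w_hidden` must equal the number of feature amounts in use.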

なお、このニューラルネットワークに基づく判断は、後述する学習処理で各ニューロン間の結合重みを変化させることによって、出力値を変化させることができ、判断の結果を学習結果に適応させることが出来る。 Note that in the judgment based on this neural network, the output value can be changed by changing the connection weight between each neuron in a learning process described later, and the judgment result can be adapted to the learning result.

また、図７のステップＳ７０２で読み込んだ第１制御部２２３の起動条件によって、自動撮影の判定も変化する。例えば、タップ検出による起動や特定音声コマンドによる起動の場合は、ユーザが現在撮影してほしいための操作である可能性が非常に高い。そこで、撮影頻度が多くなるように設定される。 The automatic shooting determination also changes depending on the activation condition of the first control unit 223 read in step S702 of FIG. 7. For example, when the camera was activated by tap detection or by a specific voice command, it is very likely that the user performed the operation because they want a picture taken now. The shooting frequency is therefore set higher in such cases.

＜撮影方法の判定＞
撮影方法の判定では、ステップＳ９０１～Ｓ９０４において検出した、カメラの状態や周辺の被写体の状態に基づいて、静止画撮影、動画撮影、連写撮影、パノラマ撮影などの内どれを実行するかを判定する。例えば、被写体（人物）が静止している場合は静止画撮影を実行し、被写体が動いている場合は動画撮影または連写撮影を実行する。また、被写体がカメラを取り囲むように複数存在している場合や、前述したＧＰＳ情報に基づいて景勝地であるということが判断出来ている場合には、パン・チルトを操作させながら順次撮影した画像を合成してパノラマ画像を生成するパノラマ撮影処理を実行してもよい。なお、＜自動撮影を行うか否かの判定＞での判定方法と同様に、撮影前に検出した各種情報をニューラルネットワークに基づいて判断し、撮影方法を決定することもできる。また、この判定処理では、後述する学習処理によって、判定条件を変更することも出来る。 <Determination of shooting method>
In determining the shooting method, it is determined which of still image shooting, video shooting, continuous shooting, panorama shooting, and so on is to be performed, based on the state of the camera and the state of surrounding subjects detected in steps S901 to S904. For example, if the subject (a person) is stationary, still image shooting is performed, and if the subject is moving, video shooting or continuous shooting is performed. In addition, if multiple subjects surround the camera, or if the location has been determined to be a scenic spot based on the GPS information mentioned above, panorama shooting may be performed, in which images captured sequentially while operating pan and tilt are combined to generate a panoramic image. Note that, as in the determination method described in <Determining whether or not to perform automatic photographing>, the shooting method may also be decided by evaluating the various information detected before shooting with a neural network. The determination conditions of this process can also be changed by the learning process described later.
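
The rule-based part of this determination might be sketched as below. The precedence of the rules (panorama checked before subject motion) and the `prefer_video` switch between video and continuous shooting are assumptions for illustration; the text only lists the rules as examples.

```python
from enum import Enum


class ShootingMethod(Enum):
    STILL = "still"
    VIDEO = "video"
    BURST = "burst"        # continuous shooting
    PANORAMA = "panorama"


def choose_method(subject_moving: bool,
                  subjects_surround_camera: bool,
                  at_scenic_spot: bool,
                  prefer_video: bool = True) -> ShootingMethod:
    """Pick a shooting method from the detected camera and subject state."""
    # Assumed precedence: the panorama conditions are checked first.
    if subjects_surround_camera or at_scenic_spot:
        return ShootingMethod.PANORAMA
    if subject_moving:
        return ShootingMethod.VIDEO if prefer_video else ShootingMethod.BURST
    return ShootingMethod.STILL
```

A learned variant would replace these fixed rules with the output of a neural network fed with the same detection information.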

図９の説明に戻って、ステップＳ９０９では、ステップＳ９０８の自動撮影判定により自動撮影する判定が下された場合、ステップＳ９１０に進み、自動撮影する判定が下されなかった場合、自動撮影モード処理を終了する。 Returning to the description of FIG. 9, in step S909, if the automatic shooting determination of step S908 resulted in a decision to shoot automatically, the process proceeds to step S910; otherwise, the automatic shooting mode processing ends.

ステップＳ９１０では、自動撮影を開始する。この時、ステップＳ９０８において判定された撮影方法による撮影を開始する。その際、フォーカス駆動制御部２０４によるオートフォーカス制御を行う。また、不図示の絞り制御部およびセンサゲイン制御部、シャッター制御部を用いて、被写体が適切な明るさになるような露出制御を行う。さらに、撮影後には画像処理部２０７において、オートホワイトバランス処理、ノイズリダクション処理、ガンマ補正処理等、種々の公知の画像処理が行われ、画像が生成される。 In step S910, automatic photographing is started. At this time, photographing using the photographing method determined in step S908 is started. At this time, autofocus control is performed by the focus drive control unit 204. Further, using an aperture control section, a sensor gain control section, and a shutter control section (not shown), exposure control is performed so that the subject has appropriate brightness. Further, after photographing, the image processing unit 207 performs various known image processing such as auto white balance processing, noise reduction processing, and gamma correction processing to generate an image.

なお、この撮影の際に、所定の条件を満たした場合、カメラが撮影対象となる人物に対し撮影を行う旨を報知した上で撮影するようにしてもよい。報知の方法として、例えば、音声出力部２１８からの発音やＬＥＤ制御部２２４によるＬＥＤ点灯等を使用してもよい。所定の条件は、例えば、画角内における顔の数、顔の笑顔度、目瞑り度、被写体人物の視線角度や顔角度、顔認証ＩＤ番号、個人認証登録されている人物の数、撮影時の一般物体認識結果、シーン判別結果、前回撮影時からの経過時間、撮影時刻、ＧＰＳ情報に基づく現在位置が景勝地であるか否か、撮影時の音声レベル、声を発している人物の有無、拍手、歓声が上がっているか否か、振動情報（加速度情報、カメラ状態）、環境情報（温度、気圧、照度、湿度、紫外線量）等である。これらの条件に基づいて報知撮影を行うことによって、重要性が高いシーンにおいて好ましいカメラ目線の画像を残すことが出来る。 Note that when a predetermined condition is satisfied during this photographing, the camera may notify the person to be photographed that the photograph will be taken, and then the photograph may be taken. As a method of notification, for example, sound output from the audio output unit 218 or lighting of an LED by the LED control unit 224 may be used. The predetermined conditions include, for example, the number of faces within the angle of view, the degree of smile on the face, the degree of closed eyes, the gaze angle and face angle of the subject person, the facial recognition ID number, the number of people registered for personal authentication, and the time of shooting. General object recognition results, scene classification results, elapsed time since the previous shooting, shooting time, whether the current location is a scenic spot based on GPS information, audio level at the time of shooting, presence or absence of a person making a voice. , whether there is applause or cheering, vibration information (acceleration information, camera status), environmental information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet rays), etc. By performing notification photography based on these conditions, it is possible to leave a preferable camera-looking image in a highly important scene.

このような撮影前の報知についても、撮影画像の情報、或いは撮影前に検出した各種情報をニューラルネットワークに基づいて判断し、報知の方法やタイミングを決定することもできる。また、この判定処理では、後述する学習処理によって、判定条件を変更することも出来る。 Regarding such notification before photography, information on the photographed image or various information detected before photography can be determined based on a neural network, and the method and timing of notification can be determined. Further, in this judgment process, the judgment conditions can also be changed by a learning process to be described later.

ステップＳ９１１では、ステップＳ９１０において生成した画像を加工したり、動画に追加したりといった編集処理を行う。画像加工については、具体的には、人物の顔や合焦位置に基づいたトリミング処理、画像の回転処理、ＨＤＲ（ハイダイナミックレンジ）効果処理、ボケ効果処理、色変換フィルタ効果処理などである。画像加工では、ステップＳ９１０において生成した画像に基づいて、上記の処理の組み合わせによって複数の加工画像を生成し、ステップＳ９１０において生成した画像とは別に保存するようにしてもよい。また、動画処理については、撮影した動画または静止画を、生成済みの編集動画にスライド、ズーム、フェードの特殊効果処理をつけながら追加するといった処理をしてもよい。ステップＳ９１１での編集についても、撮影画像の情報、或いは撮影前に検出した各種情報をニューラルネットワークに基づいて判断し、画像加工の方法を決定することもできる。また、この判定処理では、後述する学習処理によって、判定条件を変更することも出来る。 In step S911, editing processing such as processing the image generated in step S910 and adding it to a video is performed. Specifically, image processing includes trimming processing based on a person's face or focus position, image rotation processing, HDR (high dynamic range) effect processing, blurring effect processing, color conversion filter effect processing, and the like. In image processing, a plurality of processed images may be generated based on the image generated in step S910 by a combination of the above processes, and stored separately from the image generated in step S910. Further, regarding video processing, processing may be performed in which a captured video or still image is added to an already generated edited video while applying special effects such as slide, zoom, and fade. Regarding the editing in step S911, it is also possible to determine the image processing method by determining information on the photographed image or various information detected before photographing based on a neural network. Further, in this judgment process, the judgment conditions can also be changed by a learning process to be described later.

ステップＳ９１２では、撮影画像の学習情報生成処理を行う。ここでは、後述する学習処理に使用する情報を生成し、記録する。具体的には、今回の撮影画像における、撮影時のズーム倍率、撮影時の一般物体認識結果、顔検出結果、撮影画像に写る顔の数、顔の笑顔度、目瞑り度、顔角度、顔認証ＩＤ番号、被写体人物の視線角度、シーン判別結果、前回撮影時からの経過時間、撮影時刻、ＧＰＳ位置情報および前回撮影位置からの変化量、撮影時の音声レベル、声を発している人物、拍手、歓声が上がっているか否か、振動情報（加速度情報、カメラ状態）、環境情報（温度、気圧、照度、湿度、紫外線量）、動画撮影時間、手動撮影指示によるものか否か、等である。更にユーザの画像の好みを数値化したニューラルネットワークの出力であるスコアも演算する。これらの情報を生成し、撮影画像ファイルへタグ情報として記録する。あるいは、不揮発性メモリ２１６へ書き込むか、記録媒体２２１内に、所謂カタログデータとして各々の撮影画像の情報をリスト化した形式で保存するようにしてもよい。 In step S912, learning information is generated for the captured image. Here, information to be used in the learning process described later is generated and recorded. Specifically, for the image just captured, this includes the zoom magnification at the time of shooting, the general object recognition result at the time of shooting, the face detection result, the number of faces in the captured image, the degree of smile, the degree of eye closure, the face angle, the face authentication ID number, the gaze angle of the subject person, the scene determination result, the time elapsed since the previous shot, the shooting time, GPS position information and the amount of change from the previous shooting position, the audio level at the time of shooting, the person who is speaking, whether there is applause or cheering, vibration information (acceleration information, camera state), environmental information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light), the video shooting duration, and whether the shot was triggered by a manual shooting instruction. A score, which is the output of a neural network that quantifies the user's image preferences, is also calculated. This information is generated and recorded as tag information in the captured image file. Alternatively, it may be written to the nonvolatile memory 216, or stored in the recording medium 221 as so-called catalog data in which the information of each captured image is listed.

ステップＳ９１３では過去の撮影情報の更新を行う。具体的には、ステップＳ９０８で説明したエリア毎の撮影枚数、個人認証登録された人物毎の撮影枚数、一般物体認識で認識された被写体毎の撮影枚数、シーン判別のシーン毎の撮影枚数について、今回撮影された画像が該当する枚数のカウントを１つ増やす。 In step S913, past photographic information is updated. Specifically, regarding the number of shots for each area described in step S908, the number of shots for each person registered for personal authentication, the number of shots for each subject recognized by general object recognition, and the number of shots for each scene for scene discrimination, The count of the number of images corresponding to the image taken this time is increased by one.
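
The bookkeeping in step S913 amounts to incrementing a set of counters. A minimal sketch, assuming hypothetical field names and that each person or object is counted once per image, could look like this:

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class ShotHistory:
    """Past shooting information: how many images fall into each category."""
    per_area: Counter = field(default_factory=Counter)
    per_person: Counter = field(default_factory=Counter)
    per_object: Counter = field(default_factory=Counter)
    per_scene: Counter = field(default_factory=Counter)

    def record(self,
               area: int,
               person_ids: Iterable[str] = (),
               object_labels: Iterable[str] = (),
               scene: Optional[str] = None) -> None:
        """Add one to every count that the newly captured image belongs to."""
        self.per_area[area] += 1
        # set() so a person or object detected twice in one image counts once.
        self.per_person.update(set(person_ids))
        self.per_object.update(set(object_labels))
        if scene is not None:
            self.per_scene[scene] += 1
```

Because `Counter` returns zero for unseen keys, categories that have never been photographed can be queried without special handling.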

＜学習処理＞
次に、本実施形態におけるユーザの好みに合わせた学習について説明する。本実施形態では、図１１に示すようなニューラルネットワークを用い、機械学習アルゴリズムを使用して、学習処理部２１９においてユーザの好みに合わせた学習を行う。ニューラルネットワークは、入力値から出力値を予測することに使用されるものであり、予め入力値の実績値と出力値の実績値を学習しておくことで、新たな入力値に対して、出力値を推定することができる。ニューラルネットワークを用いることにより、前述の自動撮影や自動編集、被写体探索に対して、ユーザの好みに合わせた学習を行う。また、ニューラルネットワークに入力する特徴データともなる被写体情報（顔認証や一般物体認識などの結果）の登録や、撮影報知制御や低消費電力モード制御やファイル自動削除を学習により変更する動作も行う。 <Learning process>
Next, learning tailored to the user's preferences in this embodiment will be explained. In this embodiment, a neural network such as the one shown in FIG. 11 is used, and the learning processing unit 219 performs learning tailored to the user's preferences using a machine learning algorithm. A neural network is used to predict output values from input values; by learning actual input values and the corresponding actual output values in advance, it can estimate output values for new input values. By using a neural network, learning tailored to the user's preferences is performed for the automatic shooting, automatic editing, and subject search described above. In addition, the registration of subject information (results of face authentication, general object recognition, and so on) that also serves as feature data input to the neural network, as well as shooting notification control, low power consumption mode control, and automatic file deletion, are changed through learning.

本実施形態において、学習処理が適用される動作は、以下の動作である。
（１）自動撮影
（２）自動編集
（３）被写体探索
（４）被写体登録
（５）撮影報知制御
（６）低消費電力モード制御
（７）ファイル自動削除
（８）像ブレ補正
（９）画像自動転送
なお、上記の学習処理が適用される動作のうち、自動編集、ファイル自動削除、画像自動転送については、本発明の主旨と直接関係しないので、説明を省略する。 In this embodiment, the operations to which learning processing is applied are the following operations.
(1) Automatic shooting
(2) Automatic editing
(3) Subject search
(4) Subject registration
(5) Shooting notification control
(6) Low power consumption mode control
(7) Automatic file deletion
(8) Image blur correction
(9) Automatic image transfer
Of the operations listed above to which learning is applied, automatic editing, automatic file deletion, and automatic image transfer are not directly related to the gist of the present invention, so their description is omitted.

＜自動撮影＞
自動撮影に対する学習について説明する。自動撮影では、ユーザの好みに合った画像の撮影を自動で行うための学習を行う。図９のフローチャートを用いて説明したように、撮影後（ステップＳ９１０の後）に学習用情報生成処理（ステップＳ９１２）が行われている。後述する方法により学習させる画像を選択させ、画像に含まれる学習情報に基づいて、ニューラルネットワークの重みを変化させることにより学習を行わせる。 <Automatic shooting>
Learning for automatic shooting will be described. For automatic shooting, learning is performed so that images matching the user's preferences are captured automatically. As explained using the flowchart of FIG. 9, the learning information generation process (step S912) is performed after shooting (after step S910). Images to be learned are selected by a method described later, and learning is performed by changing the weights of the neural network based on the learning information contained in those images.

学習は、自動撮影タイミングの判定を行うニューラルネットワークの変更と、撮影方法（静止画撮影、動画撮影、連写、パノラマ撮影など）の判定を行うニューラルネットワークの変更により行われる。 Learning is performed by changing the neural network that determines the automatic shooting timing and the neural network that determines the shooting method (still image shooting, video shooting, continuous shooting, panoramic shooting, etc.).

＜被写体探索＞
被写体探索に対する学習について説明する。被写体探索では、ユーザの好みに合った被写体の探索を自動的に行うための学習を行う。図９のフローチャートを用いて説明したように、被写体探索処理（ステップＳ９０４）において、各エリアの重要度レベルを算出し、パン・チルト、ズームを駆動し、被写体探索を行う。学習は撮影画像や探索中の検出情報に基づいて行われ、ニューラルネットワークの重みを変化させることで学習結果として反映される。探索動作中の各種検出情報をニューラルネットワークに入力し、重要度レベルの判定を行うことにより、学習を反映した被写体探索を行う。また、重要度レベルの算出以外にも、例えば、パン・チルト探索方法（速度、動かす頻度）の制御も行う。 <Subject search>
Learning for object search will be explained. In object search, learning is performed to automatically search for objects that match the user's preferences. As explained using the flowchart of FIG. 9, in the subject search process (step S904), the importance level of each area is calculated, pan/tilt, and zoom are driven, and the subject search is performed. Learning is performed based on captured images and detected information during search, and is reflected in the learning results by changing the weights of the neural network. By inputting various detection information during search operations into a neural network and determining the level of importance, object searches that reflect learning are performed. In addition to calculating the importance level, it also controls, for example, the pan/tilt search method (speed, frequency of movement).

＜被写体登録＞
被写体登録に対する学習について説明する。被写体登録では、ユーザの好みに合った被写体の登録やランク付けを自動的に行うための学習を行う。学習として、例えば、顔認証登録や一般物体認識の登録、ジェスチャーや音声認識、音によるシーン認識の登録を行う。人と物体に対する認証登録を行い、画像の取得される回数や頻度、手動撮影される回数や頻度、探索中の被写体の現れる頻度からランク付けの設定を行う。登録された情報は、各ニューラルネットワークを用いた判定のための入力として登録されることになる。 <Subject registration>
Learning for subject registration will be explained. In subject registration, learning is performed to automatically register and rank subjects that match the user's preferences. As learning, for example, facial recognition registration, general object recognition registration, gesture and voice recognition, and scene recognition using sound are registered. Authentication registration is performed for people and objects, and rankings are set based on the number and frequency of images taken, the number and frequency of manual shooting, and the frequency with which the subject appears during the search. The registered information will be registered as input for determination using each neural network.

＜撮影報知制御＞
撮影報知に対する学習について説明する。図９のステップＳ９１０で説明したように、撮影直前に、所定の条件を満たしたとき、カメラが撮影対象となる人物に対して撮影を行う旨を報知した上で撮影することを行う。例えば、パン・チルトを駆動することにより視覚的に被写体の視線を誘導したり、音声出力部２１８から発するスピーカー音や、ＬＥＤ制御部２２４によるＬＥＤ点灯光を使用して被写体の注意を誘導したりする。上記の報知の直後に、被写体の検出情報（例えば、笑顔度、目線検出、ジェスチャー）が得られたか否かに基づいて、検出情報を学習に使用するかを判定し、ニューラルネットワークの重みを変化させることで学習する。 <Photography notification control>
Learning for shooting notification will be described. As described for step S910 of FIG. 9, when a predetermined condition is satisfied immediately before shooting, the camera notifies the person to be photographed that a picture is about to be taken, and then shoots. For example, the subject's line of sight is guided visually by driving pan and tilt, or the subject's attention is drawn using a speaker sound emitted from the audio output unit 218 or LED light controlled by the LED control unit 224. Based on whether detection information about the subject (for example, degree of smile, gaze detection, gesture) is obtained immediately after the notification, it is determined whether to use that detection information for learning, and learning is performed by changing the weights of the neural network.

撮影直前の各検出情報をニューラルネットワークに入力し、報知を行うか否かの判定や、各動作（音（音レベル／音の種類／タイミング）、光（点灯時間、スピード）、カメラの向き（パン・チルトモーション））の判定を行う。 Each detection information immediately before shooting is input into a neural network, and it is used to determine whether or not to issue a notification, as well as each movement (sound (sound level/sound type/timing), light (lighting time, speed), camera direction ( Pan/tilt motion)) is determined.

＜低消費電力モード制御＞
図７、図８を用いて、説明したようにＭａｉｎＣＰＵ（第１制御部２２３）への電源供給をＯＮ／ＯＦＦする制御を行うが、低消費電力モードからの復帰条件や、低消費電力状態への遷移条件の学習も行う。低消費電力モードを解除する条件の学習について説明する。 <Low power consumption mode control>
As explained with reference to FIGS. 7 and 8, the power supply to the Main CPU (first control unit 223) is switched ON and OFF; the conditions for returning from the low power consumption mode and the conditions for transitioning to the low power consumption state are also learned. Learning of the conditions for releasing the low power consumption mode will be described.

＜音検出＞
ユーザが特定音声や検出したい特定音シーンや特定音レベルを、例えば外部装置３０１の専用アプリケーションを用いた通信により、手動で設定することで学習することができる。また、複数の検出方法を音声処理部に予め設定しておき、後述する方法により学習させる画像を選択させ、画像に含まれる前後音情報を学習し、起動要因とする音判定（特定音コマンドや、「歓声」、「拍手」などの音シーン）を設定することで学習することもできる。 <Sound detection>
Learning can be performed by the user manually setting a specific voice, a specific sound scene to be detected, or a specific sound level, for example through communication using a dedicated application on the external device 301. Alternatively, a plurality of detection methods may be set in the audio processing unit in advance; images to be learned are then selected by a method described later, the sound information before and after each image is learned, and the sound determinations to be used as activation triggers (a specific sound command, or a sound scene such as "cheering" or "applause") are set accordingly.

＜環境情報検出＞
ユーザが起動条件としたい環境情報変化を、例えば外部装置３０１の専用アプリケーションを用いた通信により、手動で設定することで学習することができる。例えば、温度、気圧、明るさ、湿度、紫外線量の絶対量や変化量等の特定条件によって起動させることができる。各環境情報に基づく判定閾値を学習することもできる。環境情報による起動後のカメラ検出情報から、起動要因ではなかったと判定されると、各判定閾値のパラメータを環境変化を検出し難いように設定する。 <Environmental information detection>
Learning can be performed by the user manually setting the changes in environmental information to be used as activation conditions, for example through communication using a dedicated application on the external device 301. For example, activation can be triggered by specific conditions on the absolute value or amount of change of temperature, atmospheric pressure, brightness, humidity, or amount of ultraviolet light. A determination threshold can also be learned for each piece of environmental information. If the camera detection information obtained after an activation triggered by environmental information indicates that the change was not a valid activation trigger, the parameters of the corresponding determination thresholds are set so that environmental changes are detected less readily.

また、上記の各パラメータは、電池の残容量によっても変化する。例えば、電池残量が少ないときは各種判定に入り難くなり、電池残量が多いときは各種判定に入り易くなる。具体的には、ユーザが必ずカメラを起動してほしい要因ではない揺れ状態検出結果や、音シーン検出結果でも、電池残量が多い場合には、カメラを起動すると判定されてしまう場合もある。 Furthermore, each of the above parameters changes depending on the remaining capacity of the battery. For example, when the remaining battery power is low, it is difficult to make various judgments, and when the remaining battery power is large, it is easy to make various judgments. Specifically, even if the shaking state detection result or the sound scene detection result is not a factor that the user definitely wants to start the camera, if the battery level is high, it may be determined that the camera should be started.

また、低消費電力モード解除条件の判定は、揺れ検出、音検出、時間経過検出の情報、各環境情報、電池残量等からニューラルネットワークに基づいて行うこともできる。その場合、後述する方法により学習させる画像を選択させ、画像に含まれる学習情報に基づいて、ニューラルネットワークの重みを変化させることにより学習する。 Further, the determination of the low power consumption mode release condition can also be performed based on a neural network based on information on shaking detection, sound detection, time elapsed detection, various environmental information, remaining battery level, etc. In that case, learning is performed by selecting images to be trained using a method described later and changing the weights of the neural network based on learning information included in the images.

次に、低消費電力状態への遷移条件の学習について説明する。図７に示したとおり、ステップＳ７０４のモード設定判定において、「自動撮影モード」「自動編集モード」「画像自動転送モード」「学習モード」「ファイル自動削除モード」の何れでもないと判定されると、低消費電力モードに入る。各モードの判定条件については、上述したとおりであるが、各モードが判定される条件についても学習によって変化する。 Next, learning of transition conditions to a low power consumption state will be explained. As shown in FIG. 7, in the mode setting determination in step S704, if it is determined that the mode is not one of "automatic shooting mode," "automatic editing mode," "automatic image transfer mode," "learning mode," and "automatic file deletion mode." , enter low power mode. The conditions for determining each mode are as described above, but the conditions for determining each mode also change through learning.

＜自動撮影モード＞
上述したとおり、エリア毎の重要度レベルを判定し、パン・チルトで被写体探索をしながら自動撮影を行うが、撮影される被写体が存在しないと判定されると、自動撮影モードが解除される。例えば、すべのエリアの重要度レベルや、各エリアの重要度レベルを加算した値が、所定閾値以下になったとき、自動撮影モードを解除する。このとき、自動撮影モードに遷移してからの経過時間によって所定閾値を下げていくことも行われる。自動撮影モードに遷移してからの経過時間が長くなるにつれて低消費電力モードへ移行し易くしている。 <Automatic shooting mode>
As described above, the importance level of each area is determined and automatic photography is performed while searching for a subject using panning and tilting, but if it is determined that there is no subject to be photographed, the automatic photography mode is canceled. For example, when the importance level of all areas or the sum of the importance levels of each area becomes less than or equal to a predetermined threshold, the automatic shooting mode is canceled. At this time, the predetermined threshold value is also lowered depending on the elapsed time after transitioning to the automatic shooting mode. The longer the time elapsed after transitioning to automatic shooting mode, the easier the transition to low power consumption mode becomes.

また、電池の残容量によって所定閾値を変化させることにより、電池もちを考慮した低消費電力モード制御を行うことができる。例えば、電池残量が少ないときは閾値を大きくして低消費電力モードに移行しやすくし、電池残量が多いときは閾値を小さくして低消費電力モードに移行し難くする。ここで、前回自動撮影モードに遷移してからの経過時間と撮影枚数によって、第２制御部２１１（ＳｕｂＣＰＵ）に対して、次回の低消費電力モード解除条件のパラメータ（経過時間閾値ＴｉｍｅＣ）を設定する。上記の各閾値は学習によって変化する。学習は、例えば外部装置３０１の専用アプリケーションを用いた通信により、手動で撮影頻度や起動頻度などを設定することで行われる。 Furthermore, by changing the predetermined threshold according to the remaining battery capacity, low power consumption mode control that takes battery life into account can be performed. For example, when the remaining battery level is low, the threshold is increased to make the transition to the low power consumption mode easier, and when it is high, the threshold is decreased to make the transition harder. Here, a parameter of the next low power consumption mode release condition (elapsed time threshold TimeC) is set in the second control unit 211 (SubCPU) according to the elapsed time since the previous transition to the automatic shooting mode and the number of images captured. Each of the above thresholds changes through learning. Learning is performed, for example, by manually setting the shooting frequency, activation frequency, and so on through communication using a dedicated application on the external device 301.
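
The battery-dependent exit condition can be sketched as follows. The sketch covers only the rule that the automatic shooting mode is released when the summed importance level of the areas falls to or below a threshold, with that threshold raised as the battery drains. The linear scaling and the parameter names `base` and `low_battery_gain` are assumptions for illustration, not values from the specification.

```python
from typing import Iterable


def exit_threshold(battery_fraction: float,
                   base: float = 1.0,
                   low_battery_gain: float = 1.0) -> float:
    """Threshold for leaving automatic shooting mode.

    battery_fraction is the remaining capacity in [0, 1]. The threshold grows
    as the battery drains (assumed linear), so the camera drops into the low
    power consumption mode more readily when little charge is left.
    """
    battery_fraction = min(1.0, max(0.0, battery_fraction))
    return base * (1.0 + low_battery_gain * (1.0 - battery_fraction))


def should_leave_auto_shooting(area_importance: Iterable[float],
                               battery_fraction: float) -> bool:
    """Release automatic shooting mode when total importance is at or below the threshold."""
    return sum(area_importance) <= exit_threshold(battery_fraction)
```

With area importance levels summing to 1.2, the camera stays in automatic shooting mode on a full battery (threshold 1.0) but leaves it at 20% charge (threshold 1.8).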

また、カメラ１０１の電源ボタンをＯＮしてから、電源ボタンをＯＦＦするまでの経過時間の平均値や時間帯ごとの分布データを蓄積し、各パラメータを学習する構成にしてもよい。その場合、電源ＯＮからＯＦＦまでの時間が短いユーザに対しては低消費電力モードからの復帰や、低消費電力状態への遷移の時間間隔が短くなり、電源ＯＮからＯＦＦまでの時間が長いユーザに対しては間隔が長くなるように学習される。 Alternatively, the configuration may accumulate the average elapsed time from when the power button of the camera 101 is turned ON until it is turned OFF, together with distribution data for each time period, and learn each parameter from them. In that case, learning shortens the time interval for returning from the low power consumption mode and for transitioning to the low power consumption state for users whose time from power ON to OFF is short, and lengthens it for users whose time from power ON to OFF is long.

また、探索中の検出情報によっても学習される。学習によって設定された重要となる被写体が多いと判断されている間は、低消費電力モードからの復帰や、低消費電力状態への遷移の時間間隔が短くなり、重要となる被写体が少ない間は、間隔が長くなるように学習される。 It is also learned from detected information during the search. While it is determined that there are many important subjects set by learning, the time interval for returning from low power consumption mode or transitioning to low power consumption state will be shortened, and while there are few important subjects, the time interval will be shortened. , the interval is learned to be longer.

＜像ブレ補正＞
像ブレ補正に対する学習について説明する。像ブレ補正は、図９のステップＳ９０２で補正量を算出し、補正量に基づいてステップＳ９０５でパン・チルトを駆動することにより行われる。像ブレ補正では、ユーザの揺れの特徴に合わせた補正を行うための学習を行う。撮影画像に対して、例えば、ＰＳＦ（ＰｏｉｎｔＳｐｒｅａｄＦｕｎｃｔｉｏｎ）を用いることにより、ブレの方向及び大きさを推定することが可能である。図９のステップＳ９１２の学習用情報生成では、推定したブレの方向と大きさが、情報として画像に付加される。 <Image shake correction>
Learning for image blur correction will be explained. Image blur correction is performed by calculating a correction amount in step S902 of FIG. 9, and driving pan/tilt in step S905 based on the correction amount. In image blur correction, learning is performed to perform corrections tailored to the user's shaking characteristics. For example, by using PSF (Point Spread Function) for a photographed image, it is possible to estimate the direction and magnitude of blur. In the learning information generation in step S912 in FIG. 9, the estimated direction and magnitude of blur are added to the image as information.

図７のステップＳ７１６での学習モード処理内で、推定したブレの方向と大きさを出力として、撮影時の各検出情報（撮影前所定時間における画像の動きベクトル情報、検出した被写体（人や物体）の動き情報、振動情報（ジャイロ出力、加速度出力、カメラ状態）を入力として、像ブレ補正用のニューラルネットワークの重みを学習させる。他にも、環境情報（温度、気圧、照度、湿度）、音情報（音シーン判定、特定音声検出、音レベル変化）、時間情報（起動からの経過時間、前回撮影時からの経過時間）、場所情報（ＧＰＳ位置情報、位置移動変化量）なども入力に加えて判定してもよい。 In the learning mode process of step S716 in FIG. 7, the weights of the neural network for image blur correction are learned using the estimated direction and magnitude of blur as the output and the detection information at the time of shooting as the input (motion vector information of the image over a predetermined time before shooting, motion information of the detected subject (person or object), and vibration information (gyro output, acceleration output, camera state)). In addition, environmental information (temperature, atmospheric pressure, illuminance, humidity), sound information (sound scene determination, specific voice detection, sound level change), time information (time elapsed since activation, time elapsed since the previous shot), and location information (GPS position information, amount of change in position) may also be added to the inputs for the determination.

ステップＳ９０２での像ブレ補正量の算出時において、上記各検出情報をニューラルネットワークに入力することにより、その瞬間撮影したときのブレの大きさを推定することができる。そして、推定したブレの大きさが大きいときは、シャッター速度を速くするなどの制御が可能となる。また、推定したブレの大きさが大きいときはブレ画像になってしまうので撮影を禁止するなどの方法もとることができる。 When calculating the amount of image blur correction in step S902, by inputting each of the above detection information to the neural network, it is possible to estimate the magnitude of blur at the time of photographing at that moment. When the estimated magnitude of blur is large, control such as increasing the shutter speed becomes possible. Furthermore, if the estimated magnitude of blur is large, the resulting image will be blurred, so a method such as prohibiting photography may be taken.
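
How an estimated blur magnitude could drive the shutter speed or inhibit shooting is sketched below. The pixel thresholds, the fastest available shutter time, and the assumption that blur scales linearly with exposure time are all hypothetical; in the described system the estimate itself would come from the neural network.

```python
from typing import Optional


def plan_exposure(estimated_blur_px: float,
                  base_shutter_s: float,
                  blur_limit_px: float = 2.0,
                  reject_blur_px: float = 20.0,
                  fastest_shutter_s: float = 1.0 / 4000.0) -> Optional[float]:
    """Return the exposure time to use, or None if shooting should be inhibited.

    estimated_blur_px is the blur predicted for an exposure of base_shutter_s.
    """
    if estimated_blur_px >= reject_blur_px:
        return None                      # blur too large: do not shoot
    if estimated_blur_px <= blur_limit_px:
        return base_shutter_s            # blur acceptable: keep the exposure
    # Assumption: blur grows linearly with exposure time, so shorten the
    # exposure by the ratio needed to bring blur down to the limit.
    scaled = base_shutter_s * blur_limit_px / estimated_blur_px
    return max(fastest_shutter_s, scaled)
```

For instance, a predicted blur of 8 px at 1/50 s with a 2 px limit yields an exposure of 1/200 s.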

また、パン・チルト駆動角度には制限があるため、駆動端に到達してしまうとそれ以上補正を行うことができないが、撮影時のブレの大きさと方向を推定することにより、露光中の像ブレを補正するためのパン・チルト駆動に必要な範囲を推定することができる。露光中の可動範囲の余裕がない場合は、像ブレ補正量を算出するフィルタのカットオフ周波数を大きくして、可動範囲を超えないように設定することにより、大きなブレを抑制することもできる。また、可動範囲を超えそうな場合は、露光直前にパン・チルトの角度を可動範囲を超えそうな方向とは逆の方向に回転してから露光開始することにより、可動範囲を確保してブレのない撮影を行うこともできる。これにより、ユーザの撮影時の特徴や使い方に合わせて像ブレ補正を学習することができるので、撮影画像がブレてしまうことを防止できる。 Additionally, since there is a limit to the pan/tilt drive angle, once the drive end is reached, no further correction can be made. It is possible to estimate the range required for pan/tilt driving to correct blur. If there is not enough room in the movable range during exposure, large blur can be suppressed by increasing the cutoff frequency of the filter that calculates the amount of image blur correction so that it does not exceed the movable range. In addition, if the movable range is likely to be exceeded, just before exposure, rotate the pan/tilt angle in the opposite direction to the direction in which the movable range is likely to be exceeded, and then start exposure to ensure the movable range and prevent blurring. You can also take pictures without. This allows image blur correction to be learned in accordance with the user's shooting characteristics and usage, thereby preventing blur in the shot image.
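
The pre-rotation idea can be made concrete with a one-axis sketch. It assumes the blur estimate has already been converted into a predicted range of correction displacement during the exposure; the function name and the choice of the smallest shift that fits are illustrative assumptions.

```python
from typing import Optional


def pre_exposure_shift(angle: float,
                       predicted_min: float,
                       predicted_max: float,
                       limit_low: float,
                       limit_high: float) -> Optional[float]:
    """Angle to rotate before exposure so the predicted correction stays in range.

    angle is the current pan (or tilt) angle; predicted_min and predicted_max
    are the expected correction displacements relative to that angle during
    the exposure. Returns None when the predicted travel is wider than the
    mechanical range, in which case another measure, such as raising the
    filter cutoff frequency, is needed.
    """
    if predicted_min > predicted_max:
        raise ValueError("predicted_min must not exceed predicted_max")
    shift_low = limit_low - predicted_min - angle    # smallest admissible shift
    shift_high = limit_high - predicted_max - angle  # largest admissible shift
    if shift_low > shift_high:
        return None
    # Choose the admissible shift closest to zero (least movement).
    return min(max(0.0, shift_low), shift_high)
```

At 28 degrees with limits of plus or minus 30 degrees and a predicted correction of -1 to +5 degrees, the result is -3: the head rotates 3 degrees away from the drive end before the exposure starts.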

また、上述した＜撮影方法の判定＞において、動いている被写体はブレがなく、動いていない背景が流れる撮影を行う、流し撮り撮影を行うか否かを判定してもよい。その場合、撮影前までの検出情報から、被写体をブレなく撮影するためのパン・チルト駆動速度を推定して、被写体ブレ補正を行ってもよい。この時、上記各検出情報を既に学習させているニューラルネットワークに入力することにより、駆動速度を推定することができる。学習は、画像を各ブロックに分割して、各ブロックのＰＳＦを推定することにより、主被写体が位置するブロックでのブレの方向及び大きさを推定し、その情報に基づいて行われる。 Furthermore, in the above-described <determination of photographing method>, it may be determined whether or not to perform panning photography, in which a moving subject is not blurred and a non-moving background appears flowing. In that case, the pan/tilt drive speed for photographing the subject without blur may be estimated from the detection information before photographing, and subject blur correction may be performed. At this time, the driving speed can be estimated by inputting each of the above detection information to a neural network that has already been trained. Learning is performed based on the information obtained by dividing the image into blocks and estimating the PSF of each block to estimate the direction and magnitude of blur in the block where the main subject is located.

また、ユーザが選択した画像の情報から、背景流し量を学習することもできる。その場合、主被写体が位置しないブロックでのブレの大きさを推定し、その情報に基づいてユーザの好みを学習することができる。学習した好みの背景流し量に基づいて、撮影時のシャッター速度を設定することにより、ユーザの好みにあった流し撮り効果が得られる撮影を自動で行うことができる。 Additionally, the amount of background flow can be learned from information on the image selected by the user. In that case, it is possible to estimate the magnitude of blur in blocks where the main subject is not located, and learn the user's preferences based on that information. By setting the shutter speed during photographing based on the learned preferred background panning amount, it is possible to automatically perform photographing that provides a panning effect that suits the user's preference.
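
As one way to picture the last step, the sketch below converts a learned background flow amount into an exposure time. The relation used (background streak length equals panning speed times pixels per degree times exposure time) is a small-angle approximation chosen for this illustration, and the function name, parameters, and shutter limits are hypothetical rather than taken from the patent.

```python
def panning_shutter_s(flow_px: float, pan_speed_deg_s: float,
                      px_per_deg: float,
                      min_s: float = 1 / 4000, max_s: float = 1 / 4) -> float:
    """Exposure time that yields roughly flow_px pixels of background streak.

    Assumes the camera tracks the subject at pan_speed_deg_s, so a static
    background moves across the sensor at pan_speed_deg_s * px_per_deg
    pixels per second.
    """
    if pan_speed_deg_s <= 0 or px_per_deg <= 0:
        raise ValueError("pan speed and pixels-per-degree must be positive")
    exposure = flow_px / (pan_speed_deg_s * px_per_deg)
    # Clamp to the shutter range the camera is assumed to support.
    return min(max(exposure, min_s), max_s)
```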

次に、学習方法について説明する。学習方法としては、「カメラ内の学習」と「通信機器との連携による学習」がある。 Next, the learning methods will be described. There are two: "in-camera learning" and "learning in cooperation with a communication device."

カメラ内学習の方法について、以下説明する。本実施形態におけるカメラ内学習には、以下の方法がある。 The in-camera learning method will be explained below. In-camera learning in this embodiment includes the following methods.
（１）手動撮影時の検出情報による学習 (1) Learning using detection information during manual shooting
（２）被写体探索時の検出情報による学習 (2) Learning using detection information during subject search
＜手動撮影時の検出情報による学習＞ <Learning using detection information during manual shooting>
図９のステップＳ９０７～ステップＳ９１３で説明したとおり、本実施形態においては、カメラ１０１は、手動撮影と自動撮影の２つの撮影を行うことができる。ステップＳ９０７で手動撮影指示があった場合には、ステップＳ９１２において、撮影画像は手動で撮影された画像であるとの情報が付加される。また、ステップＳ９０９において自動撮影ＯＮと判定されて撮影された場合においては、ステップＳ９１２において、撮影画像は自動で撮影された画像であると情報が付加される。 As described in steps S907 to S913 in FIG. 9, in this embodiment, the camera 101 can perform two types of photography: manual photography and automatic photography. If there is a manual photographing instruction in step S907, information indicating that the photographed image is a manually photographed image is added in step S912. Furthermore, if it is determined in step S909 that automatic photography is ON and the image is captured, information is added to the captured image in step S912 to indicate that the captured image is an automatically captured image.

ここで、手動撮影される場合、ユーザの好みの被写体、好みのシーン、好みの場所や時間間隔に基づいて撮影された可能性が非常に高い。よって、手動撮影時に得られた各特徴データや撮影画像の学習情報を基にした学習が行われるようにする。また、手動撮影時の検出情報から、撮影画像における特徴量の抽出や個人認証の登録、個人ごとの表情の登録、人の組み合わせの登録に関して学習を行う。また、被写体探索時の検出情報からは、例えば、個人登録された被写体の表情から、近くの人や物体の重要度を変更するような学習を行う。 Here, in the case of manual photographing, there is a very high possibility that the photograph was taken based on the user's favorite subject, favorite scene, favorite place, or time interval. Therefore, learning is performed based on each feature data obtained during manual photography and the learning information of the photographed image. It also learns about extracting features from captured images, registering personal authentication, registering facial expressions for each individual, and registering combinations of people from the information detected during manual photography. Also, from the detection information during the subject search, for example, learning is performed to change the importance of nearby people and objects based on the facial expressions of the personally registered subjects.

＜被写体探索時の検出情報による学習＞ <Learning using detection information during subject search>
被写体探索動作中において、個人認証登録されている被写体が、どんな人物、物体、シーンと同時に写っているかを判定し、同時に画角内に写っている時間比率を算出しておく。例えば、個人認証登録被写体の人物Ａが、個人認証登録被写体の人物Ｂと同時に写っている時間比率を計算する。そして、人物Ａと人物Ｂが画角内に入る場合は、自動撮影判定の点数が高くなるように、各種検出情報を学習データとして保存して、学習モード処理（ステップＳ７１６）で学習する。 During the subject search operation, the camera determines which persons, objects, and scenes appear together with a subject registered for personal authentication, and calculates the ratio of time during which they appear together within the angle of view. For example, the ratio of time during which person A, a subject registered for personal authentication, appears together with person B, another subject registered for personal authentication, is calculated. Then, various detection information is saved as learning data and learned in the learning mode process (step S716) so that the automatic shooting determination score becomes high when person A and person B are both within the angle of view.

他の例では、個人認証登録被写体の人物Ａが、一般物体認識により判定された被写体「猫」と同時に写っている時間比率を計算する。そして、人物Ａと「猫」が画角内に入る場合は、自動撮影判定の点数が高くなるように、各種検出情報を学習データとして保存して、学習モード処理（ステップＳ７１６）で学習する。 In another example, the time ratio in which person A, who is a subject registered for personal authentication, is photographed at the same time as a subject "cat" determined by general object recognition is calculated. If the person A and the "cat" are within the angle of view, various detection information is stored as learning data and learned in learning mode processing (step S716) so that the automatic shooting determination score is high.
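
The time ratio in these two examples can be sketched as below. The representation of the search results as one set of labels per frame, the equal time step per frame, and the choice of denominator (frames in which the registered person appears) are assumptions for illustration; the patent does not define how the ratio is normalized.

```python
def cooccurrence_ratio(frames, person: str, other: str) -> float:
    """Fraction of the frames containing `person` that also contain `other`.

    frames: sequence of sets of labels detected in each search frame,
    sampled at equal time intervals (hypothetical representation).
    """
    with_person = [labels for labels in frames if person in labels]
    if not with_person:
        return 0.0
    together = sum(1 for labels in with_person if other in labels)
    return together / len(with_person)
```

A subject whose ratio with person A is high would then be stored with learning data that raises the automatic shooting score when both are in the angle of view.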

また、個人認証登録被写体の人物Ａの高い笑顔度を検出した場合や、「喜び」「驚き」などの表情が検出された場合に、同時に写っている被写体は重要であると学習される。あるいは、「怒り」「真顔」などの表情が検出された場合に、同時に写っている被写体は重要である可能性が低いので学習することはしないなどの処理が行われる。 Furthermore, when a high degree of smile of Person A, who is a registered subject for personal authentication, is detected, or when an expression such as "joy" or "surprise" is detected, the subjects photographed at the same time are learned to be important. Alternatively, when facial expressions such as "angry" or "serious face" are detected, processing is performed such as not learning the subjects that are photographed at the same time because they are unlikely to be important.

次に、本実施形態における外部装置との連携による学習について説明する。本実施形態における外部装置との連携による学習には、以下の方法がある。 Next, learning by cooperation with an external device in this embodiment will be explained. In this embodiment, there are the following methods for learning through cooperation with an external device.
（１）外部装置で画像を取得したことによる学習 (1) Learning by acquiring images with an external device
（２）外部装置を介して画像に判定値を入力することによる学習 (2) Learning by inputting judgment values into images via an external device
（３）外部装置内の保存されている画像を解析することによる学習 (3) Learning by analyzing images stored in an external device
（４）外部装置でＳＮＳのサーバにアップロードされた情報からの学習 (4) Learning from information uploaded to an SNS server with an external device
（５）外部装置でカメラパラメータを変更することによる学習 (5) Learning by changing camera parameters with an external device
（６）外部装置で画像が手動編集された情報からの学習 (6) Learning from information on images manually edited with an external device
＜外部装置で画像を取得したことによる学習＞ <Learning by acquiring images with an external device>
図３で説明したとおり、カメラ１０１と外部装置３０１は、第１及び第２の通信３０２，３０３を行う通信手段を有している。そして、主に第１の通信３０２によって画像の送受信が行われ、外部装置３０１内の専用のアプリケーションを介して、カメラ１０１内の画像を外部装置３０１に送信することができる。また、カメラ１０１内の保存されている画像データのサムネイル画像を外部装置３０１内の専用のアプリケーションを用いて、閲覧可能である。ユーザは、このサムネイル画像の中から、自分が気に入った画像を選んで、画像確認し、画像取得指示を操作することで外部装置３０１に画像を送信させることができる。 As explained in FIG. 3, the camera 101 and the external device 301 have communication means for performing first and second communications 302 and 303. Images are sent and received mainly through the first communication 302, and images in the camera 101 can be sent to the external device 301 via a dedicated application in the external device 301. Further, thumbnail images of image data stored in the camera 101 can be viewed using a dedicated application in the external device 301. The user can select an image he/she likes from among the thumbnail images, confirm the image, and have the image sent to the external device 301 by operating an image acquisition instruction.

このとき、ユーザが画像を選んで取得しているので、取得された画像はユーザの好みの画像である可能性が非常に高い。よって取得された画像は、学習すべき画像であると判定し、取得された画像の学習情報に基づいて学習することにより、ユーザの好みの各種学習を行うことができる。 At this time, since the user selects and acquires the image, there is a very high possibility that the acquired image is an image of the user's preference. Therefore, by determining that the acquired image is an image to be studied and learning based on the learning information of the acquired image, various types of learning according to the user's preference can be performed.

ここで、操作例について説明する。外部装置３０１の専用のアプリケーションを用いて、カメラ１０１内の画像を閲覧している例を図１２に示す。表示部４０７にカメラ内に保存されている画像データのサムネイル画像（１６０４～１６０９）が表示されており、ユーザは自分が気に入った画像を選択し取得することができる。このとき、表示方法を変更する表示方法変更部を構成するボタン１６０１，１６０２，１６０３が設けられている。 Here, an example of operation will be explained. FIG. 12 shows an example in which images in the camera 101 are viewed using a dedicated application of the external device 301. Thumbnail images (1604 to 1609) of image data stored in the camera are displayed on the display unit 407, and the user can select and obtain an image that he or she likes. At this time, buttons 1601, 1602, and 1603 forming a display method changing section for changing the display method are provided.

ボタン１６０１を押下すると日時優先表示モードに変更され、カメラ１０１内の画像の撮影日時の順番で表示部４０７に画像が表示される。例えば、１６０４で示される位置には日時が新しい画像が表示され、１６０９で示される位置には日時が古い画像が表示される。 When the button 1601 is pressed, the display mode is changed to date and time priority display mode, and images are displayed on the display unit 407 in the order of the shooting date and time of the images in the camera 101. For example, an image with a newer date and time is displayed at a position indicated by 1604, and an image with an older date and time is displayed at a position indicated by 1609.

ボタン１６０２を押下すると、おすすめ画像優先表示モードに変更される。図９のステップＳ９１２で演算した各画像に対するユーザの好みを判定したスコアに基づいて、カメラ１０１内の画像が、スコアの高い順番で表示部４０７に表示される。例えば、１６０４で示される位置にはスコアが高い画像が表示され、１６０９で示される位置にはスコアが低い画像が表示される。 When button 1602 is pressed, the mode is changed to recommended image priority display mode. Based on the score calculated in step S912 of FIG. 9 that determines the user's preference for each image, the images in the camera 101 are displayed on the display unit 407 in the order of the highest score. For example, an image with a high score is displayed at a position indicated by 1604, and an image with a low score is displayed at a position indicated by 1609.

ボタン１６０３を押下すると、人物や物体被写体を指定でき、続いて特定の人物や物体被写体を指定すると特定の被写体のみを表示することもできる。ボタン１６０１～１６０３は同時に設定をＯＮすることもできる。例えばすべての設定がＯＮされている場合、指定された被写体のみを表示し、且つ、撮影日時が新しい画像が優先され、且つ、スコアの高い画像が優先され、表示されることになる。このように、撮影画像に対してもユーザの好みを学習しているため、撮影された大量の画像の中から簡単な確認作業でユーザの好みの画像のみを抽出することが可能である。 By pressing the button 1603, a person or object can be specified, and by subsequently specifying a specific person or object, only the specific object can be displayed. The settings of buttons 1601 to 1603 can also be turned on at the same time. For example, if all settings are ON, only the specified subject will be displayed, images with newer shooting dates and times will be given priority, and images with higher scores will be given priority and displayed. In this way, since the user's preferences are learned for the captured images, it is possible to extract only the user's preferred images from a large number of captured images with a simple confirmation process.
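
The three display settings can be pictured with the sketch below. The patent does not state how date priority and score priority are combined when both are ON; treating the score as the primary key and the shooting date as the tie-breaker is an assumption of this illustration, as are the `Thumb` record and the function name.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Thumb:
    name: str
    shot_at: datetime
    score: float          # preference score attached to the image
    subjects: frozenset   # labels of the subjects recognized in the image


def order_thumbnails(images, by_date=False, by_score=False, subject=None):
    """Return the thumbnails to show, first position (1604) first."""
    # Button 1603: show only images containing the specified subject.
    shown = [im for im in images if subject is None or subject in im.subjects]

    def key(im):
        # Button 1602 (score) is the primary key, button 1601 (date) breaks ties.
        return (im.score if by_score else 0.0,
                im.shot_at if by_date else datetime.min)

    return sorted(shown, key=key, reverse=True)
```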

＜外部装置を介して画像に判定値を入力することによる学習＞ <Learning by inputting judgment values into images via an external device>
上記で説明したとおり、カメラ１０１と外部装置３０１は、通信手段を有しており、カメラ１０１内に保存されている画像を外部装置３０１内の専用のアプリケーションを用いて、閲覧可能である。ここで、ユーザは、各画像に対して点数付けを行う構成にしてもよい。ユーザが好みと思った画像に対して高い点数（例えば５点）を付けたり、好みでないと思った画像に対して低い点数（例えば１点）を付けることができ、ユーザの操作によって、カメラが学習していくような構成にする。各画像の点数は、カメラ内で学習情報と共に再学習に使用される。指定した画像情報からの特徴データを入力にした、ニューラルネットワークの出力がユーザが指定した点数に近づくように学習される。 As explained above, the camera 101 and the external device 301 have communication means, and images stored in the camera 101 can be viewed using a dedicated application in the external device 301. Here, the configuration may allow the user to assign a score to each image. The user can give a high score (for example, 5 points) to an image the user likes and a low score (for example, 1 point) to an image the user does not like, and the camera is configured to learn from these user operations. The score of each image is used, together with its learning information, for relearning in the camera. Learning is performed so that the output of the neural network, which takes feature data from the specified image information as input, approaches the score specified by the user.
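
The idea of training so that the output approaches the user's score can be sketched with a deliberately tiny model. A single linear unit trained by stochastic gradient descent on squared error stands in for the neural network here; the feature vectors, learning rate, and function names are hypothetical, and the patent does not specify the network structure or loss.

```python
def predict(weights, bias, features):
    """Output of a single linear unit (stand-in for the neural network)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def train_score_model(feature_rows, user_scores, lr=0.05, epochs=2000):
    """Fit weights so that predict() approaches the scores given by the user."""
    weights = [0.0] * len(feature_rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(feature_rows, user_scores):
            error = predict(weights, bias, features) - target
            # Gradient step on 0.5 * error**2.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias
```

With one image scored 5 and another scored 1, the fitted model reproduces both scores for their respective feature vectors.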

本実施形態では、外部装置３０１を介して、撮影済み画像にユーザが判定値を入力する構成にしたが、カメラ１０１を操作して、直接、画像に判定値を入力する構成にしてもよい。その場合、例えば、カメラ１０１にタッチパネルディスプレイを設け、タッチパネルディスプレイの画面表示部に表示されたＧＵＩボタンをユーザが押下して、撮影済み画像を表示するモードに設定する。そして、ユーザは撮影済み画像を確認しながら、各画像に判定値を入力するなどの方法により、同様の学習を行うことができる。 In this embodiment, the user inputs the determination value into the photographed image via the external device 301, but the user may input the determination value directly into the image by operating the camera 101. In that case, for example, the camera 101 is provided with a touch panel display, and the user presses a GUI button displayed on the screen display section of the touch panel display to set a mode for displaying captured images. Then, the user can perform similar learning by inputting judgment values for each image while checking the captured images.

＜外部装置内の保存されている画像を解析することによる学習＞ <Learning by analyzing images stored in an external device>
外部装置３０１は、記憶部４０４を有し、記憶部４０４にはカメラ１０１で撮影された画像以外の画像も記録される構成とする。このとき、外部装置３０１内に保存されている画像は、ユーザが閲覧し易く、公衆無線制御部４０６を介して、共有サーバに画像をアップロードすることも容易なため、ユーザの好みの画像が多く含まれる可能性が非常に高い。 The external device 301 has a storage unit 404, and the storage unit 404 is configured to also record images other than those captured by the camera 101. The images stored in the external device 301 are easy for the user to view and easy to upload to a shared server via the public wireless control unit 406, so it is very likely that they include many images the user likes.

外部装置３０１の制御部４１１は、専用のアプリケーションを用いて、記憶部４０４に保存されている画像を、カメラ１０１内の学習処理部２１９と同等の能力で処理可能に構成される。そして、処理された学習用データをカメラ１０１に通信することにより、学習を行う。あるいは、カメラ１０１に学習させたい画像やデータを送信して、カメラ１０１内で学習するような構成にしてもよい。また、専用のアプリケーションを用いて、記録部４０４に保存されている画像の中から、学習させたい画像をユーザが選択して学習する構成にすることもできる。 The control unit 411 of the external device 301 is configured to be able to process images stored in the storage unit 404, using a dedicated application, with a capability equivalent to that of the learning processing unit 219 in the camera 101. Learning is then performed by communicating the processed learning data to the camera 101. Alternatively, the images and data to be learned may be transmitted to the camera 101 so that learning is performed within the camera 101. A configuration is also possible in which the user uses the dedicated application to select, from among the images stored in the storage unit 404, the images to be learned.

＜外部装置でＳＮＳのサーバにアップロードされた情報からの学習＞ <Learning from information uploaded to an SNS server with an external device>
次に、人と人の繋がりに主眼をおいた社会的なネットワークを構築できるサービスやウェブサイトであるソーシャル・ネットワーキング・サービス（ＳＮＳ）における情報を学習に使用する方法について説明する。画像をＳＮＳにアップロードする際に、外部装置３０１から画像に関するタグを入力した上で、画像と共に送信する技術がある。また、他のユーザがアップロードした画像に対して好き嫌いを入力する技術もあり、他のユーザがアップロードした画像が、外部装置３０１を所有するユーザの好みの写真であるかも判定できる。 Next, we will explain how to use information in social networking services (SNS), which are services and websites that allow you to build social networks that focus on connections between people, for learning. When uploading an image to SNS, there is a technique in which a tag related to the image is input from the external device 301 and then transmitted along with the image. There is also a technique for inputting likes and dislikes for images uploaded by other users, and it can be determined whether the images uploaded by other users are the photos that the user who owns the external device 301 likes.

外部装置３０１内にダウンロードされた専用のＳＮＳアプリケーションで、上記のようにユーザが自らアップロードした画像と画像についての情報を取得することができる。また、ユーザが他のユーザがアップロードした画像に対して好きか否かを入力することにより、ユーザの好みの画像やタグ情報を取得することもできる。それらの画像やタグ情報を解析し、カメラ１０１内で学習できるようにする。 With a dedicated SNS application downloaded into the external device 301, it is possible to acquire images and information about the images that the user himself has uploaded as described above. Further, by inputting whether or not the user likes images uploaded by other users, the user's favorite images and tag information can be acquired. The images and tag information are analyzed and learned within the camera 101.

外部装置３０１の制御部４１１は、上記のようにユーザがアップロードした画像や、ユーザが好きと判定した画像を取得し、カメラ１０１内の学習処理部２１９と同等の能力で処理可能に構成される。そして、処理された学習用データをカメラ１０１に通信することで、学習を行う。あるいは、カメラ１０１に学習させたい画像を送信して、カメラ１０１内で学習するような構成にしてもよい。 The control unit 411 of the external device 301 is configured to be able to acquire images uploaded by the user and images determined to be liked by the user as described above, and process them with the same ability as the learning processing unit 219 in the camera 101. . Learning is then performed by communicating the processed learning data to the camera 101. Alternatively, a configuration may be adopted in which images to be learned by the camera 101 are transmitted and the learning is performed within the camera 101.

また、タグ情報に設定された被写体情報（例えば、犬、猫などの被写体物体情報、ビーチなどのシーン情報、スマイルなどの表情情報など）から、ユーザが好みであろう被写体情報を推定する。そして、ニューラルネットワークに入力する検出すべき被写体として登録することによる学習を行う。 Further, from the subject information set in the tag information (for example, subject object information such as a dog or cat, scene information such as a beach, facial expression information such as a smiley face, etc.), subject information that the user is likely to like is estimated. Then, learning is performed by registering the object to be detected and input into the neural network.

また、上記ＳＮＳでのタグ情報（画像フィルタ情報や被写体情報）の統計値から、世の中で今現在流行っている画像情報を推定し、カメラ１０１内で学習できる構成にすることもできる。 Furthermore, it is also possible to estimate image information that is currently popular in the world from statistical values of tag information (image filter information and subject information) in the SNS, and to learn it within the camera 101.
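
One simple statistic that could serve this purpose is a tag frequency count, sketched below. The input format (one list of tags per post) and the function name are hypothetical; the patent only says that statistical values of the tag information are used and does not name a specific statistic.

```python
from collections import Counter


def trending_tags(posts, top_n=3):
    """Return the top_n tags ranked by the number of posts that carry them.

    posts: iterable of tag lists, one list per uploaded image.
    """
    # set() ensures a tag repeated within one post is counted only once.
    counts = Counter(tag for tags in posts for tag in set(tags))
    return [tag for tag, _ in counts.most_common(top_n)]
```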

＜外部装置でカメラパラメータを変更することによる学習＞ <Learning by changing camera parameters with an external device>
上記で説明したとおり、カメラ１０１と外部装置３０１は、通信手段を有している。そして、カメラ１０１内に現在設定されている学習パラメータ（ニューラルネットワークの重みや、ニューラルネットワークに入力する被写体の選択など）を外部装置３０１に通信し、外部装置３０１の記憶部４０４に保存することができる。また、外部装置３０１内の専用のアプリケーションを用いて、専用のサーバにセットされた学習パラメータを公衆無線制御部４０６を介して取得し、カメラ１０１内の学習パラメータに設定することもできる。これにより、ある時点でのパラメータを外部装置３０１に保存しておいて、カメラ１０１に設定することで、学習パラメータを戻すこともできる。また、他のユーザが持つ学習パラメータを、専用のサーバを介して取得し、自身のカメラ１０１に設定することもできる。 As explained above, the camera 101 and the external device 301 have communication means. The learning parameters currently set in the camera 101 (such as the weights of the neural network and the selection of subjects to be input to the neural network) can be communicated to the external device 301 and stored in the storage unit 404 of the external device 301. Furthermore, using a dedicated application in the external device 301, learning parameters set in a dedicated server can be obtained via the public wireless control unit 406 and set as the learning parameters in the camera 101. Thus, by saving the parameters at a certain point in time in the external device 301 and later setting them in the camera 101, the learning parameters can be restored to that point. Learning parameters held by other users can also be acquired via the dedicated server and set in the user's own camera 101.

また、外部装置３０１の専用のアプリケーションを用いて、ユーザが登録した音声コマンドや認証登録、ジェスチャーを登録できるようにしてもよいし、重要な場所を登録してもよい。これらの情報は、自動撮影モード処理（図９）で説明した撮影トリガーや自動撮影判定の入力データとして扱われる。また、撮影頻度や起動間隔、静止画と動画の割合や好みの画像などを設定することができる構成とし、＜低消費電力モード制御＞で説明した起動間隔などの設定を行ってもよい。 Further, a dedicated application of the external device 301 may be used to allow the user to register voice commands, authentication registration, and gestures, or to register important locations. These pieces of information are treated as input data for the shooting trigger and automatic shooting determination described in the automatic shooting mode process (FIG. 9). In addition, a configuration may be adopted in which the shooting frequency, activation interval, ratio of still images to moving images, favorite images, etc. can be set, and settings such as the activation interval described in <Low power consumption mode control> may be made.

＜外部装置で画像が手動編集された情報からの学習＞ <Learning from information on images manually edited with an external device>
外部装置３０１の専用のアプリケーションにユーザの操作により手動で編集できる機能を持たせ、編集作業の内容を学習にフィードバックすることもできる。例えば、画像効果付与（トリミング処理、回転処理、スライド、ズーム、フェード、色変換フィルタ効果、時間、静止画動画比率、ＢＧＭ）の編集が可能である。そして、画像の学習情報に対して、手動で編集した画像効果付与が判定されるように、自動編集のニューラルネットワークを学習させる。 The dedicated application of the external device 301 may also be given a function that allows manual editing by user operation, and the contents of the editing work may be fed back to the learning. For example, image effects (trimming, rotation, slide, zoom, fade, color conversion filter effect, time, still image/video ratio, BGM) can be edited. The neural network for automatic editing is then trained so that, given the learning information of an image, it selects the image effects that were applied manually.

次に、学習処理シーケンスについて説明する。図７のステップＳ７０４のモード設定判定において、学習処理を行うべきか否かを判定し、学習処理を行うべきと判定された場合、ステップＳ７１６の学習モード処理を行う。 Next, the learning processing sequence will be explained. In the mode setting determination in step S704 in FIG. 7, it is determined whether learning processing should be performed, and if it is determined that learning processing should be performed, learning mode processing in step S716 is performed.

学習モードの判定条件について説明する。学習モードに移行するか否かは、前回学習処理を行ってからの経過時間と、学習に使用できる情報の数、通信機器を介して学習処理指示があったかなどから判定される。ステップＳ７０４のモード設定判定処理内で判定される、学習モードに移行すべきか否かの判定処理フローを図１３に示す。 The learning mode determination conditions will be explained. Whether or not to shift to the learning mode is determined based on the elapsed time since the last learning process, the number of information that can be used for learning, whether a learning process instruction has been received via a communication device, etc. FIG. 13 shows a process flow for determining whether or not to shift to learning mode, which is determined in the mode setting determination process of step S704.

ステップＳ７０４のモード設定判定処理内で学習モード判定が開始指示されると、図１３の処理がスタートする。ステップＳ１４０１では、外部装置３０１からの登録指示があるか否かを判定する。ここでの登録は、上記で説明した＜外部装置で画像を取得したことによる学習＞や、＜外部装置を介して画像に判定値を入力することによる学習＞や、＜外部装置内の保存されている画像を解析することによる学習＞などの、学習するための登録指示があったか否かの判定である。 When the start of learning mode determination is instructed in the mode setting determination process of step S704, the process of FIG. 13 starts. In step S1401, it is determined whether there is a registration instruction from the external device 301. Registration here means a registration instruction for learning, such as <Learning by acquiring images with an external device>, <Learning by inputting judgment values into images via an external device>, or <Learning by analyzing images stored in an external device> described above, and this step determines whether such an instruction has been given.

ステップＳ１４０１で、外部装置３０１からの登録指示があった場合、ステップＳ１４０８に進み、学習モード判定をＴＲＵＥにして、ステップＳ７１６の処理を行うように設定し、学習モード判定処理を終了する。ステップＳ１４０１で外部装置からの登録指示がない場合、ステップＳ１４０２に進む。 In step S1401, if there is a registration instruction from the external device 301, the process advances to step S1408, sets the learning mode determination to TRUE, sets to perform the process of step S716, and ends the learning mode determination process. If there is no registration instruction from the external device in step S1401, the process advances to step S1402.

ステップＳ１４０２では外部装置からの学習指示があるか否かを判定する。ここでの学習指示は＜外部装置でカメラパラメータを変更することによる学習＞のように、学習パラメータをセットする指示があったか否かの判定である。ステップＳ１４０２で、外部装置からの学習指示があった場合、ステップＳ１４０８に進み、学習モード判定をＴＲＵＥにして、ステップＳ７１６の処理を行うように設定し、学習モード判定処理を終了する。ステップＳ１４０２で外部装置からの学習指示がない場合、ステップＳ１４０３に進む。 In step S1402, it is determined whether there is a learning instruction from an external device. The learning instruction here is a determination as to whether or not there is an instruction to set learning parameters, such as <learning by changing camera parameters with an external device>. In step S1402, if there is a learning instruction from an external device, the process advances to step S1408, sets the learning mode determination to TRUE, sets the process of step S716 to be performed, and ends the learning mode determination process. If there is no learning instruction from the external device in step S1402, the process advances to step S1403.

ステップＳ１４０３では、前回の学習処理（ニューラルネットワークの重みの再計算）が行われてからの経過時間ＴｉｍｅＮを取得し、ステップＳ１４０４に進む。ステップＳ１４０４では、学習する新規のデータ数ＤＮ（前回の学習処理が行われてからの経過時間ＴｉｍｅＮの間で、学習するように指定された画像の数）を取得し、ステップＳ１４０５に進む。ステップＳ１４０５では、経過時間ＴｉｍｅＮから学習モードに入るか否かを判定する閾値ＤＴを演算する。閾値ＤＴの値が小さいほど学習モードに入りやすく設定されている。例えば、ＴｉｍｅＮが所定値よりも小さい場合の閾値ＤＴの値であるＤＴａが、ＴｉｍｅＮが所定値よりも大きい場合の閾値ＤＴの値であるＤＴｂよりも大きく設定されており、時間の経過とともに、閾値が小さくなるように設定されている。これにより、学習データが少ない場合においても、時間経過が大きいと学習モードに入りやすくして、再度学習することで、使用時間に応じてカメラが学習変化し易いようにされている。 In step S1403, the elapsed time TimeN since the previous learning process (recalculation of the neural network weights) is acquired, and the process advances to step S1404. In step S1404, the number DN of new data items to be learned (the number of images designated for learning during the elapsed time TimeN since the previous learning process) is acquired, and the process advances to step S1405. In step S1405, a threshold DT for determining whether to enter the learning mode is calculated from the elapsed time TimeN. The smaller the threshold DT, the easier it is to enter the learning mode. For example, DTa, the value of DT when TimeN is smaller than a predetermined value, is set larger than DTb, the value of DT when TimeN is larger than the predetermined value, so that the threshold decreases as time passes. As a result, even when there is little learning data, the camera enters the learning mode more readily once a long time has elapsed and learns again, so that what the camera has learned changes readily with the length of use.

ステップＳ１４０５で閾値ＤＴを演算すると、ステップＳ１４０６に進み、学習するデータ数ＤＮが、閾値ＤＴよりも大きいか否かを判定する。データ数ＤＮが、閾値ＤＴよりも大きい場合、ステップＳ１４０７に進み、ＤＮを０に設定する。その後、ステップＳ１４０８に進み、学習モード判定をＴＲＵＥにして、ステップＳ７１６（図７）の処理を行うように設定し、学習モード判定処理を終了する。 After calculating the threshold value DT in step S1405, the process advances to step S1406, and it is determined whether the number of data to be learned DN is larger than the threshold value DT. If the data number DN is larger than the threshold DT, the process advances to step S1407 and DN is set to 0. Thereafter, the process advances to step S1408, where the learning mode determination is set to TRUE, the process of step S716 (FIG. 7) is set to be performed, and the learning mode determination process is ended.

ステップＳ１４０６でＤＮが閾値ＤＴ以下の場合、ステップＳ１４０９に進む。外部装置からの登録指示も、外部装置からの学習指示もなく、且つ学習データ数も所定値以下であるので、学習モード判定をＦＡＬＳＥにし、ステップＳ７１６の処理は行わないように設定し、学習モード判定処理を終了する。 If DN is equal to or less than the threshold DT in step S1406, the process advances to step S1409. Since there is neither a registration instruction nor a learning instruction from the external device, and the number of learning data items is equal to or less than the predetermined value, the learning mode determination is set to FALSE so that the process of step S716 is not performed, and the learning mode determination process ends.
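
The determination flow of FIG. 13 (steps S1401 to S1409) can be summarized in the following sketch. The step order follows the text above; the two-level threshold with `dt_a`, `dt_b`, and a one-day switching point uses hypothetical values, since the patent states only that DTa is larger than DTb.

```python
def threshold_dt(time_n_s: float, switch_s: float = 24 * 3600,
                 dt_a: int = 50, dt_b: int = 10) -> int:
    """S1405: the threshold DT shrinks once enough time has passed (DTa > DTb)."""
    return dt_a if time_n_s < switch_s else dt_b


def should_enter_learning_mode(registration: bool, learning_instruction: bool,
                               time_n_s: float, dn: int):
    """Return (learning_mode, dn_after) following steps S1401 to S1409."""
    if registration:            # S1401 -> S1408
        return True, dn
    if learning_instruction:    # S1402 -> S1408
        return True, dn
    dt = threshold_dt(time_n_s)  # S1403 to S1405
    if dn > dt:                  # S1406
        return True, 0           # S1407 resets DN, then S1408
    return False, dn             # S1409
```

With these example values, 20 new images do not trigger learning shortly after the previous learning process but do trigger it once more than a day has elapsed.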

次に、学習モード処理（ステップＳ７１６）内の処理について説明する。学習モード処理の動作を示す詳細なフローチャートを図１４に示す。 Next, the processing within the learning mode process (step S716) will be explained. FIG. 14 shows a detailed flowchart of the operation of the learning mode process.

図７のステップＳ７１５で学習モードと判定され、ステップＳ７１６に進むと、図１４の処理がスタートする。ステップＳ１５０１では、外部装置３０１からの登録指示があるか否かを判定する。ステップＳ１５０１で、外部装置３０１からの登録指示があった場合、ステップＳ１５０２に進む。ステップＳ１５０２では、各種登録処理を行う。 When the learning mode is determined in step S715 in FIG. 7 and the process proceeds to step S716, the process in FIG. 14 starts. In step S1501, it is determined whether there is a registration instruction from the external device 301. If there is a registration instruction from the external device 301 in step S1501, the process advances to step S1502. In step S1502, various registration processes are performed.

各種登録は、ニューラルネットワークに入力する特徴の登録であり、例えば顔認証の登録や、一般物体認識の登録や、音情報の登録や、場所情報の登録などである。登録処理を終了すると、ステップＳ１５０３に進み、ステップＳ１５０２で登録された情報から、ニューラルネットワークへ入力する要素を変更する。ステップＳ１５０３の処理を終了すると、ステップＳ１５０７に進む。 The various registrations are the registration of features to be input to the neural network, such as registration of face authentication, registration of general object recognition, registration of sound information, and registration of location information. When the registration process is completed, the process advances to step S1503, and the elements to be input to the neural network are changed from the information registered in step S1502. After completing the process in step S1503, the process advances to step S1507.

ステップＳ１５０１で外部装置３０１からの登録指示がない場合、ステップＳ１５０４に進み、外部装置３０１からの学習指示があるか否かを判定する。外部装置３０１からの学習指示があった場合、ステップＳ１５０５に進み、外部装置３０１から通信された学習パラメータを各判定器（ニューラルネットワークの重みなど）に設定し、ステップＳ１５０７に進む。 If there is no registration instruction from the external device 301 in step S1501, the process advances to step S1504, and it is determined whether there is a learning instruction from the external device 301. If there is a learning instruction from the external device 301, the process advances to step S1505, where the learning parameters communicated from the external device 301 are set in each determiner (neural network weights, etc.), and the process advances to step S1507.

ステップＳ１５０４で外部装置３０１からの学習指示がない場合、ステップＳ１５０６で学習（ニューラルネットワークの重みの再計算）を行う。ステップＳ１５０６の処理に入るのは、図１３を用いて説明したように、学習するデータ数ＤＮが閾値ＤＴを超えて、各判定器の再学習を行う場合である。誤差逆伝搬法或いは、勾配降下法などの方法を使って再学習させ、ニューラルネットワークの重みを再計算して、各判定器のパラメータを変更する。学習パラメータが設定されると、ステップＳ１５０７に進む。 If there is no learning instruction from the external device 301 in step S1504, learning (recalculation of neural network weights) is performed in step S1506. As explained using FIG. 13, the process of step S1506 is entered when the number of data to be learned DN exceeds the threshold value DT and re-learning of each determiner is performed. Relearning is performed using a method such as error backpropagation or gradient descent, the weights of the neural network are recalculated, and the parameters of each determiner are changed. Once the learning parameters are set, the process advances to step S1507.

ステップＳ１５０７では、ファイル内の画像を再スコア付けする。本実施形態においては、学習結果に基づいてファイル（記録媒体２２１）内に保存されている全ての撮影画像にスコアを付けておき、付けられたスコアに応じて、自動編集や自動ファイル削除を行う構成となっている。よって、再学習や外部装置からの学習パラメータのセットが行われた場合には、撮影済み画像のスコアも更新を行う必要がある。よって、ステップＳ１５０７では、ファイル内に保存されている撮影画像に対して新たなスコアを付ける再計算が行われ、処理が終了すると学習モード処理を終了する。 In step S1507, the images in the file are re-scored. In this embodiment, every captured image stored in the file (recording medium 221) is given a score based on the learning results, and automatic editing and automatic file deletion are performed according to those scores. Therefore, when relearning is performed or learning parameters are set from an external device, the scores of the captured images must also be updated. In step S1507, new scores are recalculated for the captured images stored in the file, and when this is finished the learning mode process ends.
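
The branching of FIG. 14 (steps S1501 to S1507) can be condensed into the sketch below, which only returns the sequence of steps that would run. The function name and the step descriptions are paraphrases for illustration; the actual registration, parameter setting, retraining, and re-scoring routines are not shown.

```python
def learning_mode_steps(registration: bool, learning_instruction: bool):
    """Return the steps of FIG. 14 that run for the given instructions."""
    if registration:                       # S1501: registration instruction
        steps = ["S1502 register features",
                 "S1503 change network inputs"]
    elif learning_instruction:             # S1504: learning instruction
        steps = ["S1505 set received learning parameters"]
    else:                                  # DN exceeded DT in FIG. 13
        steps = ["S1506 recalculate network weights"]
    # Every branch ends by re-scoring the stored images.
    steps.append("S1507 re-score stored images")
    return steps
```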

本実施形態においては、ユーザが好むと思われるシーンを抽出し、その特徴を学習し、自動撮影や自動編集といったカメラ動作に反映させることにより、ユーザの好みの映像を提案する方法を説明したが、本発明はこの用途に限定されるものではない。例えば、あえてユーザ自身の好みとは異なる映像を提案する用途に用いることもできる。その実現方法の例としては、以下のとおりである。 In this embodiment, we have described a method for proposing videos that the user likes by extracting scenes that the user seems to like, learning their characteristics, and reflecting them in camera operations such as automatic shooting and automatic editing. However, the invention is not limited to this application. For example, it can also be used to propose a video that is different from the user's own preference. An example of how to achieve this is as follows.

<Method using a neural network that has learned preferences>
The user's preferences are learned by the method described above. Then, in S908 of <Automatic Photography>, automatic shooting is performed when the output value of the neural network indicates a difference from the user's preferences represented by the teacher data. For example, if images the user liked are used as teacher images and the network is trained to output a high value for features similar to the teacher images, automatic shooting is instead performed on the condition that the output value is lower than a predetermined value. Likewise, in the subject search processing and the automatic editing processing, processing is executed for which the output value of the neural network indicates a difference from the user's preferences represented by the teacher data.
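The inverted shooting condition can be sketched as below, assuming the network outputs a single value that is high for scenes resembling the teacher images. The function name `should_auto_shoot` and the `prefer_different` flag are hypothetical.

```python
def should_auto_shoot(nn_output: float, threshold: float,
                      prefer_different: bool = False) -> bool:
    """Decide whether to shoot from one neural-network output value.

    Normal mode shoots when the output exceeds the threshold. When
    `prefer_different` is set, the condition is inverted so that scenes
    unlike the user's preferences trigger shooting.
    """
    if prefer_different:
        return nn_output < threshold
    return nn_output > threshold


if __name__ == "__main__":
    print(should_auto_shoot(0.2, 0.5, prefer_different=True))
    print(should_auto_shoot(0.2, 0.5))
```

The network and its training are unchanged in this method; only the comparison applied to its output differs.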

<Method using a neural network trained on situations that differ from the user's preferences>
In this method, situations that differ from the user's preferences are learned as teacher data at the time of the learning processing. For example, the description above covered a learning method in which manually captured images are treated as scenes the user chose to capture and are used as teacher data. Here, by contrast, manually captured images are not used as teacher data; instead, scenes in which no manual shooting took place for a predetermined time or longer are added as teacher data. Alternatively, any scene in the teacher data whose features resemble those of a manually captured image may be deleted from the teacher data. Further, images whose features differ from those of images acquired by the external device may be added to the teacher data, or images whose features resemble those of the acquired images may be deleted from it. In this way, data that differs from the user's preferences accumulates in the teacher data, and as a result of learning the neural network becomes able to discriminate situations that differ from the user's preferences. In automatic shooting, shooting according to the output value of that neural network then captures scenes that differ from the user's preferences.
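One way to delete teacher data that resembles manually captured images is sketched below. The embodiment does not specify how feature similarity is measured, so cosine similarity and the 0.9 cutoff are assumptions, as are the function names.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def remove_similar_to_manual(teacher: List[Sequence[float]],
                             manual: List[Sequence[float]],
                             max_similarity: float = 0.9) -> List[Sequence[float]]:
    """Keep only teacher samples that are dissimilar to every manual shot.

    The similarity measure and the cutoff are illustrative assumptions.
    """
    return [t for t in teacher
            if all(cosine_similarity(t, m) < max_similarity for m in manual)]


if __name__ == "__main__":
    teacher = [[1.0, 0.0], [0.0, 1.0]]
    manual = [[1.0, 0.0]]
    print(remove_similar_to_manual(teacher, manual))
```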

As described above, deliberately proposing video that differs from the user's own preferences captures scenes that the user would not shoot manually, which reduces missed shots. In addition, proposing shots in scenes the user would not have thought of can be expected to give the user new insights and broaden their tastes.

Furthermore, combining the above methods makes it possible to propose situations that resemble the user's preferences but differ in part, and makes it easy to adjust the degree of conformity to the user's preferences. The degree of conformity may be changed according to the mode setting, the state of the various sensors, and the state of the detected information.

This embodiment has described a configuration in which learning is performed within the camera 101. However, the same learning effect can be achieved by a configuration in which the external device 301 has the learning function, the data needed for learning is communicated to the external device 301, and learning is executed only on the external device side. In that case, as described above in <Learning by changing camera parameters with an external device>, learning may be performed by setting parameters learned on the external device side, such as neural network weights, in the camera 101 via communication.

Alternatively, both the camera 101 and the external device 301 may have learning functions. For example, at the timing when the learning mode processing (step S716) is performed in the camera 101, the learning information held by the external device 301 may be communicated to the camera 101, and learning may be performed by merging the learning parameters.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Examples of embodiments of the present invention are listed below.

(Embodiment 1)
An imaging device comprising:
an imaging means for capturing a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation that records the image data output by the imaging means; and
an acquisition means for acquiring information regarding the frequency of the photographing operation,
wherein the control means changes a threshold value for determining whether or not to perform the photographing operation according to the information regarding the frequency.

(Embodiment 2)
The imaging device according to embodiment 1, further comprising a detection means for detecting information about the subject,
wherein the control means determines whether or not to perform the photographing operation by comparing the information about the subject with the threshold value.

(Embodiment 3)
The imaging device according to embodiment 2, wherein the detection means detects the information about the subject based on at least one of detected sound and image data captured by the imaging means.

(Embodiment 4)
The imaging device according to any one of embodiments 1 to 3, wherein the initial value of the threshold value is determined based on past learning results.

(Embodiment 5)
The imaging device according to any one of embodiments 1 to 4, wherein the information regarding the frequency of the photographing operation is the number of images taken per fixed period.

(Embodiment 6)
The imaging device according to embodiment 5, wherein the control means determines the threshold value for the next fixed period based on the number of images taken in the past.

(Embodiment 7)
The imaging device according to any one of embodiments 1 to 6, further comprising a determining means for determining a target number of shots in a predetermined period,
wherein the control means changes the threshold value for determining whether or not to perform the photographing operation based on the target number of shots and the information regarding the frequency.

(Embodiment 8)
The imaging device according to embodiment 7, wherein the control means changes the threshold value so that the number of shots increases linearly toward the target number of shots as the shooting time elapses.

(Embodiment 9)
The imaging device according to embodiment 7 or 8, wherein the determining means determines the target number of shots based on photographing conditions set by manual input or voice input from a user.

(Embodiment 10)
The imaging device according to embodiment 9, wherein the manual input or voice input from the user is performed using a smart device.

(Embodiment 11)
The imaging device according to embodiment 9 or 10, wherein the photographing conditions include information on a total shooting time.

(Embodiment 12)
The imaging device according to embodiment 11, wherein the photographing conditions further include information on the remaining capacity of the recording medium and of the battery.

(Embodiment 13)
An imaging device comprising:
an imaging means for capturing a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation that records the image data output by the imaging means;
a detection means for detecting a face of a subject;
a determination means for determining the state of the face of the subject detected by the detection means; and
an acquisition means for acquiring information regarding the frequency of the photographing operation,
wherein, even if the state of the face of the subject determined by the determination means is the same, the control means performs control so that the photographing operation is performed when the frequency is a first frequency and is not performed when the frequency is a second frequency.

(Embodiment 14)
The imaging device according to embodiment 13, wherein the state of the face of the subject is a state of the subject's facial expression, face direction, degree of eye opening, line of sight, posture, and movement.

(Embodiment 15)
The imaging device according to any one of embodiments 1 to 14, further comprising a changing means for changing the orientation of the imaging means so as to point the imaging means toward the subject,
wherein the changing means changes, according to the frequency, the movable range over which the orientation of the imaging means is changed.

(Embodiment 16)
The imaging device according to embodiment 15, wherein the changing means rotates the imaging means in a pan direction and a tilt direction.

(Embodiment 17)
The imaging device according to any one of embodiments 1 to 16, further comprising a zoom means for enlarging or reducing the subject image on the imaging means,
wherein the zoom means changes the control of the enlargement or reduction according to the frequency.

(Embodiment 18)
A method for controlling an imaging device including an imaging means for capturing a subject image and outputting image data, the method comprising:
a control step of controlling whether or not to perform a photographing operation that records the image data output by the imaging means; and
an acquisition step of acquiring information regarding the frequency of the photographing operation,
wherein, in the control step, a threshold value for determining whether or not to perform the photographing operation is changed according to the information regarding the frequency.

(Embodiment 19)
A method for controlling an imaging device including an imaging means for capturing a subject image and outputting image data, the method comprising:
a detection step of detecting a face of a subject;
a determination step of determining the state of the face of the subject detected in the detection step;
a control step of controlling whether or not to perform a photographing operation that records the image data output by the imaging means; and
an acquisition step of acquiring information regarding the frequency of the photographing operation,
wherein, in the control step, even if the state of the face of the subject determined in the determination step is the same, control is performed so that the photographing operation is performed when the frequency is a first frequency and is not performed when the frequency is a second frequency.

(Embodiment 20)
A program for causing a computer to execute each step of the control method according to embodiment 18 or 19.

(Embodiment 21)
A computer-readable storage medium storing a program for causing a computer to execute each step of the control method according to embodiment 18 or 19.
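The threshold control of embodiments 7 and 8 can be sketched as follows, under the assumption that the threshold is nudged by a fixed step whenever the cumulative number of shots runs ahead of or behind a straight line toward the target. The function names, the step size, and the clamping range are hypothetical and are not specified in the embodiments.

```python
def expected_count(elapsed: float, total_time: float, target: int) -> float:
    """Shots expected by now if the count rises linearly toward the target."""
    fraction = min(max(elapsed / total_time, 0.0), 1.0)
    return target * fraction


def adjust_threshold(threshold: float, shots_taken: int, elapsed: float,
                     total_time: float, target: int, step: float = 0.05,
                     lo: float = 0.0, hi: float = 1.0) -> float:
    """Raise the threshold when ahead of the linear pace, lower it when behind.

    The fixed step and the [lo, hi] clamp are illustrative assumptions.
    """
    expected = expected_count(elapsed, total_time, target)
    if shots_taken > expected:
        threshold += step  # ahead of pace: make shooting harder to trigger
    elif shots_taken < expected:
        threshold -= step  # behind pace: make shooting easier to trigger
    return min(max(threshold, lo), hi)


if __name__ == "__main__":
    # Halfway through a 60-minute session with a target of 10 shots.
    print(adjust_threshold(0.5, shots_taken=8, elapsed=30, total_time=60, target=10))
    print(adjust_threshold(0.5, shots_taken=2, elapsed=30, total_time=60, target=10))
```

Calling `adjust_threshold` once per fixed period corresponds to determining the threshold for the next period from the shots taken so far.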

101: Camera, 301: Smart device, 501: Wearable device, 104: Tilt rotation unit, 105: Pan rotation unit

Claims

An imaging device comprising:
an imaging means for imaging a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation for recording the image data output by the imaging means;
an acquisition means for acquiring information indicating the cumulative number of shots taken since a first time; and
a determining means for determining a target number of shots in a predetermined period,
wherein the control means changes a threshold value for determining whether or not to perform the photographing operation based on the target number of shots and the information indicating the cumulative number of shots.

An imaging device comprising:
an imaging means for imaging a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation for recording the image data output by the imaging means;
an acquisition means for acquiring information indicating the cumulative number of shots taken since a first time; and
a communication unit that communicates with an external device,
wherein the control means changes a threshold value for determining whether or not to perform the photographing operation based on the communication performance of the communication unit with the external device and the information indicating the cumulative number of shots.

An imaging device comprising:
an imaging means for imaging a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation for recording the image data output by the imaging means; and
an acquisition means for acquiring information indicating the cumulative number of shots taken since a first time,
wherein the acquisition means acquires information indicating the cumulative number of shots taken until a certain time elapses from the first time, and
the control means sets the threshold value to be used after the certain time has elapsed from the first time, based on the threshold value for determining whether or not to perform the photographing operation until the certain time elapses from the first time and on the information indicating the cumulative number of shots taken until the certain time elapses from the first time.

The imaging device according to any one of claims 1 to 3, further comprising a detection means for detecting information about the subject,
wherein the control means determines whether or not to perform the photographing operation by comparing the information about the subject with the threshold value.

The imaging device according to claim 4, wherein the detection means detects information about the subject based on at least one of detected sound and image data captured by the imaging means.

The imaging device according to any one of claims 1 to 5, wherein the initial value of the threshold value is determined based on a past learning result.

The imaging device, wherein the control means determines the threshold value for the next certain period based on the information indicating the cumulative number of shots taken since the first time.

The imaging device according to any one of claims 1 to 7, further comprising a determining means for determining a target number of shots in a predetermined period,
wherein the control means changes the threshold value for determining whether or not to perform the photographing operation based on the target number of shots and the information indicating the cumulative number of shots taken since the first time.

The imaging device according to claim 8, wherein the control means changes the threshold value so that the number of shots increases linearly toward the target number of shots as the shooting time elapses.

The imaging device according to claim 8, wherein the determining means determines the target number of shots based on photographing conditions set by manual input or voice input from a user.

The imaging device according to claim 10, wherein the manual input or voice input from the user is performed using a smart device.

The imaging device according to claim 10 or 11, wherein the photographing conditions include information on a total shooting time.

The imaging device according to claim 12, wherein the photographing conditions further include information on a recording medium and remaining battery power.

The imaging device according to any one of claims 1 to 13, further comprising:
a detection means for detecting a face of a subject; and
a determination means for determining the state of the face of the subject detected by the detection means,
wherein, even if the state of the face of the subject determined by the determination means is the same, the control means performs control so that the photographing operation is performed when the cumulative number of shots taken since the first time is a first number, and is not performed when the cumulative number of shots is a second number greater than the first number.

The imaging device according to claim 14, wherein the state of the face of the subject is a state of the subject's facial expression, face direction, degree of eye opening, line of sight, posture, and movement.

The imaging device, further comprising a changing means for changing the orientation of the imaging means so as to point the imaging means toward the subject,
wherein the changing means changes the movable range over which the orientation of the imaging means is changed, based on the information indicating the cumulative number of shots taken since the first time.

The imaging device according to claim 16, wherein the changing means rotates the imaging means in a pan direction and a tilt direction.

The imaging device according to claim 1, further comprising a zoom means for enlarging or reducing the subject image on the imaging means,
wherein the zoom means changes the control of the enlargement or reduction based on the information indicating the cumulative number of shots.

The imaging device, wherein the acquisition means acquires information indicating the cumulative number of shots taken until a certain time elapses from the first time, and
the control means changes the threshold value based on the information indicating the cumulative number of shots taken until the certain time has elapsed from the first time.

An imaging device comprising:
an imaging means for imaging a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation for recording the image data output by the imaging means;
an acquisition means for acquiring information regarding the frequency of the photographing operation; and
a changing means for changing the orientation of the imaging means so as to point the imaging means toward the subject,
wherein the control means changes a threshold value for determining whether or not to perform the photographing operation according to the information regarding the frequency, and
the changing means changes, according to the frequency, the movable range over which the orientation of the imaging means is changed.

An imaging device comprising:
an imaging means for imaging a subject image and outputting image data;
a control means for controlling whether or not to perform a photographing operation for recording the image data output by the imaging means;
an acquisition means for acquiring information regarding the frequency of the photographing operation; and
a zoom means for enlarging or reducing the subject image on the imaging means,
wherein the control means changes a threshold value for determining whether or not to perform the photographing operation according to the information regarding the frequency, and
the zoom means changes the control of the enlargement or reduction according to the frequency.

A method for controlling an imaging device including an imaging means for imaging a subject image and outputting image data, the method comprising:
a control step of controlling whether or not to perform a photographing operation for recording the image data output by the imaging means;
an acquisition step of acquiring information indicating the cumulative number of shots taken since a first time; and
a determination step of determining a target number of shots in a predetermined period,
wherein, in the control step, a threshold value for determining whether or not to perform the photographing operation is changed based on the target number of shots and the information indicating the cumulative number of shots.

A method for controlling an imaging device including an imaging means for imaging a subject image and outputting image data, the method comprising:
a control step for controlling whether or not to perform a photographing operation for recording image data output by the imaging means;
an acquisition step of acquiring information regarding the frequency of the photographing operation;
a changing step of changing the orientation of the imaging means in order to direct the imaging means toward the subject;
In the control step, a threshold value for determining whether to perform the photographing operation is changed according to the information regarding the frequency;
A method for controlling an imaging device, characterized in that, in the changing step, a movable range in which the direction of the imaging means is changed is changed according to the frequency.

A method for controlling an imaging device including an imaging means for imaging a subject image and outputting image data, the method comprising:
a control step for controlling whether or not to perform a photographing operation for recording image data output by the imaging means;
an acquisition step of acquiring information regarding the frequency of the photographing operation;
a zoom step for enlarging or reducing the subject image on the imaging means,
In the control step, a threshold value for determining whether to perform the photographing operation is changed according to the information regarding the frequency;
A method for controlling an imaging apparatus, characterized in that in the zooming step, the enlargement or reduction control is changed depending on the frequency.

A method for controlling an imaging device including an imaging means for imaging a subject image and outputting image data, and a communication unit for communicating with an external device, the method comprising:
a control step of controlling whether or not to perform a photographing operation for recording the image data output by the imaging means; and
an acquisition step of acquiring information indicating the cumulative number of shots taken since a first time,
wherein, in the control step, a threshold value for determining whether or not to perform the photographing operation is changed based on the communication performance of the communication unit with the external device and the information indicating the cumulative number of shots.

A method for controlling an imaging device including an imaging means for imaging a subject image and outputting image data, the method comprising:
a control step of controlling whether or not to perform a photographing operation for recording the image data output by the imaging means; and
an acquisition step of acquiring information indicating the cumulative number of shots taken since a first time,
wherein, in the acquisition step, information indicating the cumulative number of shots taken until a certain time elapses from the first time is acquired, and
in the control step, the threshold value to be used after the certain time has elapsed from the first time is set based on the threshold value for determining whether or not to perform the photographing operation until the certain time elapses from the first time and on the information indicating the cumulative number of shots taken until the certain time elapses from the first time.

A program for causing a computer to execute each step of the control method according to any one of claims 22 to 26.

A computer-readable storage medium storing a program for causing a computer to execute each step of the control method according to any one of claims 22 to 26.