JP7393133B2 - Image processing device, image processing method, imaging device, program, storage medium

Info

Publication number: JP7393133B2
Application number: JP2019100722A
Authority: JP
Inventors: 将浩高山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-05-29
Filing date: 2019-05-29
Publication date: 2023-12-06
Anticipated expiration: 2039-05-29
Also published as: JP2020195099A

Description

The present invention relates to a technique for selecting images using machine learning.

When shooting still images or video with an imaging device such as a camera, the user typically decides on the subject through a viewfinder or the like, checks the shooting conditions, adjusts the framing, and operates the shutter button to capture the image.

In contrast to such imaging devices, which shoot in response to user operations, Patent Document 1 discloses a so-called lifelog camera, which shoots periodically and continuously without the user giving a shooting instruction. A lifelog camera is worn on the user's body with a strap or the like and records the scenes the user sees in daily life as images at fixed time intervals. Because a lifelog camera shoots at fixed intervals rather than at a timing the user intends, such as when the user releases the shutter, it can capture unexpected moments that would not normally be photographed.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2016-536868

When a user wears a lifelog camera that automatically shoots at regular intervals, the following problems may arise.

For example, when captured images are recorded on the camera itself, or on a smartphone or tablet connected to the camera, the storage media may fill up without the user noticing. Once the media has no free space, the user cannot shoot when intended, and automatic shooting cannot capture the desired scene.

Alternatively, even when captured images are transferred to a server where running out of capacity is less of a concern, automatic shooting can produce an enormous number of images, so checking them and sorting out disliked or failed images takes considerable effort.

To solve these problems, one approach is to assign a score to each automatically captured image and to select the images to be recorded or displayed according to that score. When machine learning is used, however, the way scores are assigned may change over time. In that case, the selection criteria may differ between images selected for recording or display in the past and images newly selected for recording or display.

The present invention has been made in view of the above problems, and its object is to provide an image processing device that can automatically select images considered unimportant to the user.

An image processing device according to the present invention comprises: storage means for storing images obtained by imaging a subject; learning means for learning, by machine learning, images that the user prefers; evaluation means for acquiring, based on learning parameters of the learning means, a score that is an evaluation value of each image stored in the storage means; selection means for selecting deletion candidates from the images stored in the storage means by referring to the scores; and deletion means for deleting the images selected by the selection means from the storage means. The image processing device has a learning mode, in which the learning means learns the user's preferred images, and an automatic deletion mode, in which the deletion means automatically deletes images stored in the storage means. In the learning mode, the evaluation means sets the specified learning parameters when an external device has issued an instruction to set the learning parameters of the learning means, sets the learned learning parameters when no such instruction has been issued, and then performs a re-evaluation that assigns the score to all images again. In the automatic deletion mode, the selection means selects deletion candidates by referring to the scores of all images when the learning parameters of the learning means have been updated, and by referring to the scores of only those images obtained after the previous automatic file deletion process when the learning parameters have not been updated.
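The two branches of the selection means in the automatic deletion mode can be illustrated with a minimal Python sketch. The names `StoredImage`, `select_deletion_candidates`, and `score_threshold` are hypothetical and do not appear in the specification. The text above says only that candidates are selected "by referring to the score", so the simple threshold rule used here is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredImage:
    name: str
    captured_at: float  # capture time, in seconds since an arbitrary epoch
    score: float        # evaluation value assigned by the evaluation means


def select_deletion_candidates(
    images: List[StoredImage],
    params_updated: bool,
    last_auto_delete_at: float,
    score_threshold: float,
) -> List[StoredImage]:
    """Pick deletion candidates following the two branches described above."""
    if params_updated:
        # Learning parameters were updated: every stored image is in scope.
        targets = images
    else:
        # Parameters unchanged: only images obtained after the previous
        # automatic file deletion process are in scope.
        targets = [im for im in images if im.captured_at > last_auto_delete_at]
    # Assumption: "referring to the score" is modelled as a plain threshold.
    return [im for im in targets if im.score < score_threshold]
```

Restricting the scope when the parameters are unchanged avoids re-examining images that were already judged under the same scoring criteria.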

According to the present invention, it is possible to provide an image processing device that can automatically select images considered unimportant to the user.

FIG. 1 is a diagram schematically showing an imaging device.
FIG. 2 is a diagram showing the configuration of the imaging device.
FIG. 3 is a diagram showing the configuration of the imaging device and an external device.
FIG. 4 is a diagram showing the configuration of the external device.
FIG. 5 is a flowchart explaining the operation of the control circuit.
FIG. 6 is a flowchart explaining automatic shooting mode processing.
FIG. 7 is a diagram explaining a neural network.
FIG. 8 is a diagram for explaining image display processing.
FIG. 9 is a flowchart explaining learning mode determination.
FIG. 10 is a flowchart explaining learning mode processing.
FIG. 11 is a flowchart explaining automatic file deletion mode processing.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. The following embodiments do not limit the claimed invention. Although a plurality of features are described in the embodiments, not all of them are essential to the invention, and the features may be combined arbitrarily. In the accompanying drawings, the same or similar components are given the same reference numerals, and redundant description is omitted.

<Configuration of imaging device>
FIG. 1 is a diagram schematically showing an imaging device that is one embodiment of the image processing device of the present invention. The present invention is applicable not only to digital cameras and digital video cameras but also to surveillance cameras, web cameras, mobile phones, and the like. This embodiment is described on the premise that the imaging device itself also serves as the image processing device that performs machine learning. However, the machine learning for the imaging device may instead be performed by an image processing device that is separate from the imaging device and can communicate with it.

The imaging device 101 shown in FIG. 1(a) is provided with, among other things, an operation member for operating the power switch (hereinafter called the power button, although the operation may instead be a tap, flick, or swipe on a touch panel). A lens barrel 102, which is a housing containing the imaging lens group and the image sensor, is attached to the imaging device 101, and a rotation mechanism is provided that can rotationally drive the lens barrel 102 relative to a fixed portion 103. The tilt rotation unit 104 is a motor drive mechanism that rotates the lens barrel 102 in the pitch direction shown in FIG. 1(b), and the pan rotation unit 105 is a motor drive mechanism that rotates the lens barrel 102 in the yaw direction. The lens barrel 102 can therefore rotate about one or more axes. FIG. 1(b) shows the axis definitions at the position of the fixed portion 103. The angular velocity meter 106 and the accelerometer 107 are both mounted on the fixed portion 103 of the imaging device 101. Vibration of the imaging device 101 is detected from the outputs of the angular velocity meter 106 and the accelerometer 107, and the tilt rotation unit 104 and the pan rotation unit 105 are rotationally driven based on the detected shake angle. This makes it possible to correct shake and tilt of the lens barrel 102, which is the movable part.

FIG. 2 is a block diagram showing the configuration of the imaging device of this embodiment. In FIG. 2, the control circuit 221 comprises a processor (for example, a CPU, GPU, microprocessor, or MPU) and memory (for example, DRAM or SRAM). It executes various processes to control each block of the imaging device 101 and to control data transfer between blocks. The nonvolatile memory (EEPROM) 214 is an electrically erasable and recordable memory that stores constants, programs, and the like for the operation of the control circuit 221.

In FIG. 2, the zoom unit 201 includes a zoom lens that changes magnification. The zoom drive control circuit 202 drives and controls the zoom unit 201. The focus unit 203 includes a lens that adjusts focus. The focus drive control circuit 204 drives and controls the focus unit 203.

The imaging unit 206 includes an image sensor and an A/D converter. The image sensor receives light entering through each lens group and outputs charge information corresponding to the amount of light to the image processing circuit 207 as an analog image signal. The image processing circuit 207 is an arithmetic circuit equipped with multiple ALUs (Arithmetic and Logic Units). It applies image processing such as distortion correction, white balance adjustment, and color interpolation to the digital image data output by the A/D conversion, and outputs the processed digital image data. The digital image data output from the image processing circuit 207 is converted into a recording format such as JPEG by the image recording circuit 208 and sent to the memory 213 and to the video output circuit 215 described later.

The lens barrel rotation drive circuit 205 drives the tilt rotation unit 104 and the pan rotation unit 105 to move the lens barrel 102 in the tilt and pan directions.

The device shake detection circuit 209 includes, for example, the angular velocity meter (gyro sensor) 106, which detects the angular velocity of the imaging device 101 about three axes, and the accelerometer (acceleration sensor) 107, which detects the acceleration of the device along three axes. Based on the detected signals, the device shake detection circuit 209 calculates the rotation angle of the device, the shift amount of the device, and the like.
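The specification does not state how the rotation angle is derived from the gyro signal. As one illustration only, a rotation angle can be estimated by accumulating angular velocity samples over time. The following Python sketch assumes a fixed sampling period and simple rectangular integration. The function name `integrate_angular_velocity` and the units are hypothetical choices, not details taken from the specification.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def integrate_angular_velocity(samples: Iterable[Vec3], dt: float) -> Vec3:
    """Accumulate 3-axis angular velocity samples (deg/s) into angles (deg).

    Assumption: samples arrive at a fixed period dt (seconds) and plain
    rectangular integration is accurate enough for illustration.
    """
    angle_x = angle_y = angle_z = 0.0
    for wx, wy, wz in samples:
        angle_x += wx * dt
        angle_y += wy * dt
        angle_z += wz * dt
    return (angle_x, angle_y, angle_z)
```

A practical implementation would also have to handle gyro bias and drift, for example by combining the result with the accelerometer output, which this sketch leaves out.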

The audio input circuit 211 acquires audio signals from around the imaging device 101 through a microphone provided on the imaging device 101, performs analog-to-digital conversion, and sends the result to the audio processing circuit 212. The audio processing circuit 212 performs audio-related processing, such as optimization of the input digital audio signal. The audio signal processed by the audio processing circuit 212 is sent to the memory 213 by the control circuit 221. The memory 213 temporarily stores the image signals and audio signals obtained by the image processing circuit 207 and the audio processing circuit 212.

The image processing circuit 207 and the audio processing circuit 212 read out the image signals and audio signals temporarily stored in the memory 213, encode them, and generate a compressed image signal and a compressed audio signal. The control circuit 221 sends the compressed image signal and the compressed audio signal to the recording/reproducing circuit 218.

The recording/reproducing circuit 218 records, on the recording medium 219, the compressed image signal and compressed audio signal generated by the image processing circuit 207 and the audio processing circuit 212, along with other control data related to shooting. When the audio signal is not compression-encoded, the control circuit 221 sends the audio signal generated by the audio processing circuit 212 and the compressed image signal generated by the image processing circuit 207 to the recording/reproducing circuit 218, which records them on the recording medium 219.

The recording medium 219 may be built into the imaging device 101 or may be removable. It can record various data generated by the imaging device 101, such as compressed image signals, compressed audio signals, and audio signals, and a medium with a larger capacity than the nonvolatile memory 214 is generally used. The recording medium 219 may be of any type, for example a hard disk, optical disk, magneto-optical disk, CD-R, DVD-R, magnetic tape, nonvolatile semiconductor memory, or flash memory.

The recording/reproducing circuit 218 reads out (reproduces) the compressed image signals, compressed audio signals, audio signals, various data, and programs recorded on the recording medium 219. The control circuit 221 then sends the read compressed image signal and compressed audio signal to the image processing circuit 207 and the audio processing circuit 212. The image processing circuit 207 and the audio processing circuit 212 temporarily store the compressed image signal and compressed audio signal in the memory 213, decode them by a predetermined procedure, and send the decoded signals to the video output circuit 215 and the audio output circuit 216.

A plurality of microphones mounted on the imaging device 101 are connected to the audio input circuit 211, and the audio processing circuit 212 can detect the direction of a sound on the plane in which the microphones are installed. This information is used for the subject search and automatic shooting described later. The audio processing circuit 212 also detects specific voice commands. In addition to a set of commands registered in advance, the device may be configured so that the user can register a specific voice in the imaging device. The audio processing circuit 212 also performs sound scene recognition, in which a sound scene is determined by a network trained in advance by machine learning on a large amount of audio data. For example, networks for detecting specific scenes such as "cheering", "clapping", and "someone speaking" are set in the audio processing circuit 212. When a specific sound scene or a specific voice command is detected, a detection trigger signal is output to the control circuit 221. The power supply circuit 210 supplies power for operating the control circuit 221.

The audio output circuit 216 outputs a preset audio pattern from a speaker built into the imaging device 101, for example at the time of shooting. The LED control circuit 222 controls an LED provided on the imaging device 101 in a preset lighting and blinking pattern, for example at the time of shooting. The video output circuit 215 comprises, for example, a video output terminal and sends an image signal so that video is displayed on a connected external display or the like. The audio output circuit 216 and the video output circuit 215 may be a single combined terminal, such as an HDMI (registered trademark) (High-Definition Multimedia Interface) terminal.

The communication circuit 220 performs communication between the imaging device 101 and an external device, sending and receiving data such as audio signals, image signals, compressed audio signals, and compressed image signals. It also receives shooting-related control signals, such as shooting start and end commands and pan, tilt, and zoom drive commands, so that the imaging device 101 is driven according to instructions from an external device that can communicate with it. In addition, it sends and receives, between the imaging device 101 and the external device, information such as various learning-related parameters processed by the learning processing circuit 217 described later. The communication circuit 220 is a wireless communication module such as an infrared communication module, a Bluetooth (registered trademark) communication module, a wireless LAN communication module, Wireless USB, or a GPS receiver.

<Configuration with external communication device>
FIG. 3 is a diagram showing a configuration example of a wireless communication system between the imaging device 101 and the external device 301. The imaging device 101 is a digital camera with a shooting function, and the external device 301 is a smart device that includes a Bluetooth communication module and a wireless LAN communication module.

The imaging device 101 and the external device 301 can communicate via communication 302, a wireless LAN conforming to, for example, the IEEE 802.11 standard series, and via communication 303, which has a master-slave relationship between a control station and a subordinate station, such as Bluetooth Low Energy (hereinafter "BLE"). Wireless LAN and BLE are only examples of communication methods. Each communication device has two or more communication functions, and other communication methods may be used as long as one communication function, for example the one that communicates in the control station and subordinate station relationship, can control the other. Without loss of generality, however, it is assumed that the first communication, such as wireless LAN, is faster than the second communication, such as BLE, and that the second communication has at least one of lower power consumption and a shorter communication range than the first.

The configuration of the external device 301 is described with reference to FIG. 4. The external device 301 has, for example, a wireless LAN control circuit 401 for wireless LAN and a BLE control circuit 402 for BLE, as well as a public line control circuit 406 for public wireless communication. The external device 301 further has a packet transmitting/receiving circuit 403. The wireless LAN control circuit 401 performs RF control and communication processing for the wireless LAN, and runs a driver that performs various controls of wireless LAN communication conforming to the IEEE 802.11 standard series, together with the related protocol processing. The BLE control circuit 402 performs RF control and communication processing for BLE, and runs a driver that performs various controls of BLE communication, together with the related protocol processing. The public line control circuit 406 performs RF control and communication processing for public wireless communication, and runs a driver that performs various controls of public wireless communication, together with the related protocol processing. The public wireless communication conforms to, for example, the IMT (International Mobile Telecommunications) standard or the LTE (Long Term Evolution) standard. The packet transmitting/receiving circuit 403 performs processing for at least one of transmitting and receiving packets for wireless LAN communication, BLE communication, and public wireless communication. In this example the external device 301 is described as performing at least one of packet transmission and packet reception, but communication formats other than packet switching, such as circuit switching, may also be used.

The external device 301 further has, for example, a control circuit 411, a storage circuit 404, a GPS reception circuit 405, a display device 407, an operation member 408, an audio input/processing circuit 409, and a power supply circuit 410. The control circuit 411 controls the entire external device 301, for example by executing a control program stored in the storage circuit 404. The storage circuit 404 stores, for example, the control program executed by the control circuit 411 and various information such as parameters required for communication. The various operations described later are realized by the control circuit 411 executing the control program stored in the storage circuit 404.

The power supply circuit 410 supplies power to the external device 301. The display device 407 can output visually perceivable information, as with an LCD or LEDs, or output sound, as with a speaker, and presents various information. The operation member 408 is, for example, a button that accepts the user's operations on the external device 301. The display device 407 and the operation member 408 may be implemented as a common member such as a touch panel.

The audio input/processing circuit 409 may be configured to acquire the user's voice, for example from a general-purpose microphone built into the external device 301, and to obtain the user's operation command through voice recognition processing.

A voice command can also be acquired from the user's utterance through a dedicated application in the external device 301 and registered, via the wireless LAN communication 302, as a specific voice command to be recognized by the audio processing circuit 212 of the imaging device 101.

The GPS (Global Positioning System) reception circuit 405 receives GPS signals transmitted from satellites, analyzes them, and estimates the current position (longitude and latitude) of the external device 301. Alternatively, the current position of the external device 301 may be estimated from information on surrounding wireless networks, using WPS (Wi-Fi Positioning System) or the like. When the acquired current GPS position lies within a preset position range (within a predetermined radius), movement information is notified to the imaging device 101 via the BLE control circuit 402 and used as a parameter for the automatic shooting and automatic editing described later. Likewise, when the GPS position changes by a predetermined amount or more, movement information is notified to the imaging device 101 via the BLE control circuit 402 and used as a parameter for the automatic shooting and automatic editing described later.
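The two notification conditions described above (the position lies within a preset radius, or the position has changed by a predetermined amount or more) can be sketched in Python as follows. The specification does not say how distance is computed. The haversine great-circle formula, the function names `haversine_m` and `should_notify`, and all parameter names are assumptions made for illustration.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def should_notify(
    current: LatLon,
    previous: LatLon,
    preset_center: LatLon,
    radius_m: float,
    move_threshold_m: float,
) -> bool:
    """Return True when either notification condition is met."""
    # Condition 1: the current position lies within the preset radius.
    inside_preset = haversine_m(*current, *preset_center) <= radius_m
    # Condition 2: the position changed by the threshold amount or more.
    moved = haversine_m(*current, *previous) >= move_threshold_m
    return inside_preset or moved
```

In the described system, a True result would correspond to sending movement information to the imaging device over BLE. The transport itself is outside the scope of this sketch.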

As described above, the imaging device 101 and the external device 301 exchange data with each other through communication using the wireless LAN control circuit 401 and the BLE control circuit 402. For example, they send and receive data such as audio signals, image signals, compressed audio signals, and compressed image signals. The external device 301 also sends the imaging device 101 operation instructions such as shooting, voice command registration data, and notifications of predetermined position detection and of location movement based on GPS position information. Learning data is also sent and received through a dedicated application in the external device 301.

<Sequence of imaging operation>
FIG. 5 is a flowchart illustrating an example of the operations handled by the control circuit 221 of the imaging device 101 in this embodiment.

When the user operates the power button provided on the imaging device 101, the power supply circuit 210 supplies power to the control circuit 221 and to each block of the imaging device 101. When power is supplied, the process of FIG. 5 starts. In step S501 (hereinafter, "step S" is abbreviated to "S"), the startup condition is read. In this embodiment, the power may be turned on by manually pressing the power button, or by an instruction received from an external device (for example, 301) through external communication (for example, BLE communication). Alternatively, the power may be turned on when a tap on the imaging device 101 by the user is detected, or when input of a specific voice command is detected. The startup condition read here is used as one parameter element in subject search and automatic shooting, as described later. When reading of the startup condition is complete, the process advances to S502.

In S502, the detection values of various sensors are read. These include the detection values of the sensors in the device shake detection circuit 209 that detect vibration, such as the gyro sensor and the acceleration sensor; the rotational positions of the tilt rotation unit 104 and the pan rotation unit 105; and the audio level, the detection trigger for specific voice recognition, and the sound direction detected by the audio processing circuit 212.

Although not shown in FIGS. 1 to 4, information is also acquired from sensors that detect environmental information. For example, the imaging device 101 includes a temperature sensor that detects the temperature around the imaging device 101 at a predetermined cycle and an atmospheric pressure sensor that detects changes in the atmospheric pressure around the imaging device 101. It may also include an illuminance sensor that detects the surrounding brightness, a humidity sensor that detects the surrounding humidity, a UV sensor that detects the surrounding amount of ultraviolet light, and the like. In addition to the detected temperature, atmospheric pressure, brightness, humidity, and UV information, the amounts of change in temperature, atmospheric pressure, brightness, humidity, and ultraviolet light, calculated from the detected information as rates of change over a predetermined time interval, are used for determinations such as the automatic shooting described later.
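As an illustration only, the amount of change over a predetermined time interval could be tracked per sensor as sketched below. The class name `ChangeTracker`, the window length, and the sample values are hypothetical and are not taken from the embodiment.

```python
from collections import deque


class ChangeTracker:
    """Keeps recent (time, value) samples and reports the change over a window.

    Hypothetical helper: one instance per environmental sensor
    (temperature, pressure, illuminance, humidity, UV).
    """

    def __init__(self, window_s: float):
        self.window_s = window_s
        self.samples = deque()  # (timestamp in seconds, value)

    def add(self, t: float, value: float) -> None:
        self.samples.append((t, value))
        # Drop old samples, but keep one sample at or before the window start
        # so that the change always spans the full window when possible.
        while len(self.samples) >= 2 and self.samples[1][0] <= t - self.window_s:
            self.samples.popleft()

    def change(self) -> float:
        """Difference between the newest and the oldest retained value."""
        if len(self.samples) < 2:
            return 0.0
        return self.samples[-1][1] - self.samples[0][1]
```

A temperature tracker with a 10-second window that receives 20.0, 21.0, and 23.0 at t = 0, 5, and 10 would report a change of 3.0, which could then be supplied as one input to the automatic shooting determination.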

When the detection values of the various sensors have been read in S502, the process proceeds to S503. In S503, it is detected whether communication has been instructed by an external device, and if so, communication with the external device is performed. For example, remote operations are received from the external device 301 via wireless LAN or BLE, and data such as audio signals, image signals, compressed audio signals, and compressed image signals are transmitted and received. It is also read whether the external device 301 has issued operation instructions for the imaging device 101 such as shooting, transmitted voice command registration data, sent a notification of predetermined position detection or location movement based on GPS position information, or instructed the transmission or reception of learning data.

The various sensors that detect the environmental information described above may be mounted on the imaging device 101 or on the external device 301; in the latter case, the environmental information is also read via BLE. When the communication from the external device has been read in S503, the process proceeds to S504.

In S504, the mode setting is determined. The mode set in S504 is selected from the following.

(1) Manual shooting mode
[Mode determination conditions]
The manual shooting mode is set when it is detected that a command to set the manual shooting mode has been transmitted from the external device 301.

[Processing in the mode]
In the manual shooting mode process (S506), pan, tilt, or zoom is driven according to the user's input, and a still image is shot or moving image recording is started according to the user's shooting instruction.

(2) Automatic shooting mode
[Mode determination conditions]
The automatic shooting mode is set when it is determined that automatic shooting should be performed, based on the detection information set by the learning described later (image, sound, time, vibration, location, changes in the body, changes in the environment), the elapsed time since the transition to the automatic shooting mode, past shooting information, and the like.

[Processing in the mode]
In the automatic shooting mode process (S508), pan, tilt, and zoom are driven to search for a subject automatically, based on the detection information (image, sound, time, vibration, location, changes in the body, changes in the environment). When it is determined that the timing allows a shot matching the user's preferences, shooting is performed automatically. If the user gives a shooting instruction, shooting is performed in accordance with that instruction.

(3) Learning mode
[Mode determination conditions]
The learning mode is set when it is determined that learning should be performed, based on the elapsed time since the previous learning process, the information associated with the images usable for learning, the number of pieces of learning data, and the like. This mode is also set when an instruction to set learning parameters is received via communication from the external device 301.

[Processing in the mode]
In the learning mode process (S510), learning tailored to the user's preferences is performed. This learning uses a neural network and is based on information such as the operations performed on the external device 301 and the learning data notifications from the external device 301. The information on operations performed on the external device 301 includes, for example, information on images acquired from the imaging device 101, information on manual editing instructions given via the dedicated application, and information on judgment values that the user entered for images in the imaging device.

(4) Automatic file deletion mode
[Mode determination conditions]
The automatic file deletion mode is set when it is determined that automatic file deletion should be performed, based on the elapsed time since the previous automatic file deletion, the amount of data recorded on the recording medium 219 or in the nonvolatile memory 214, and whether the learning data has been updated.

[Processing in the mode]
In the automatic file deletion mode process (S512), the files to be deleted automatically are designated from among the images on the recording medium 219 or in the nonvolatile memory 214, based on the tag information of each image, its shooting date and time, and the like, and are then deleted.

The automatic shooting mode process, the learning mode process, and the automatic file deletion mode process are described in detail later.

In S505 of FIG. 5, it is determined whether the mode set in S504 is the manual shooting mode. If so, the process proceeds to S506 and the manual shooting mode process is performed. In the manual shooting mode process, the imaging device 101 is driven according to the user's input, as described above. When the process ends, the process returns to S502.

If it is determined in S505 that the mode is not the manual shooting mode, the process proceeds to S507, where it is determined whether the mode setting is the automatic shooting mode. If so, the process proceeds to S508 and the automatic shooting mode process is performed. When the process ends, the process returns to S502. If it is determined in S507 that the mode is not the automatic shooting mode, the process proceeds to S509.

In S509, it is determined whether the mode setting is the learning mode. If so, the process proceeds to S510 and the learning mode process is performed. When the process ends, the process returns to S502 and is repeated. If it is determined in S509 that the mode is not the learning mode, the process proceeds to S511.

In S511, it is determined whether the mode setting is the automatic file deletion mode. If so, the process proceeds to S512 and the automatic file deletion mode process is performed. When the process ends, the process returns to S502 and is repeated. If it is determined in S511 that the mode is not the automatic file deletion mode, the process returns to S502 and is repeated.
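The loop of S502 to S512 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the control program of the embodiment: the `Mode` enumeration, `run_iteration`, and the callback parameters are hypothetical names, and the sensor reading, communication, and mode determination are passed in as placeholder functions.

```python
from enum import Enum, auto


class Mode(Enum):
    MANUAL = auto()       # handled in S506
    AUTO = auto()         # handled in S508
    LEARNING = auto()     # handled in S510
    FILE_DELETE = auto()  # handled in S512
    NONE = auto()         # no mode selected in S504


def run_iteration(read_sensors, poll_external, decide_mode, handlers):
    """One pass through S502 to S512; returns the mode whose handler ran."""
    sensors = read_sensors()               # S502: read sensor values
    commands = poll_external()             # S503: communication with external device
    mode = decide_mode(sensors, commands)  # S504: mode setting determination
    # S505, S507, S509, S511: the modes are checked in this fixed order.
    for candidate in (Mode.MANUAL, Mode.AUTO, Mode.LEARNING, Mode.FILE_DELETE):
        if mode is candidate:
            handlers[candidate](sensors, commands)
            return candidate
    return Mode.NONE  # nothing to do; the caller loops back to S502
```

The caller would invoke `run_iteration` repeatedly, which corresponds to every branch of the flowchart returning to S502.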

<Automatic shooting mode process>
The automatic shooting mode process of S508 in FIG. 5 is described in detail with reference to FIG. 6. As described above, the following process is controlled by the control circuit 221 of the imaging device 101 in this embodiment.

In S601, the image processing circuit 207 is made to perform image processing on the image signal captured by the imaging unit 206 and to generate an image for subject recognition. Subject recognition, such as person recognition and object recognition, is performed on the generated image.

When a person is to be recognized, the face or body of the subject is detected. In the face detection process, a pattern for identifying a person's face is defined in advance, and a portion of the captured image that matches this pattern can be detected as a face image of a person. A reliability indicating how likely the detected region is to be a face is calculated at the same time. The reliability is calculated from, for example, the size of the face region in the image and the degree of matching with the face pattern.

Object recognition likewise recognizes objects that match patterns registered in advance. Another method extracts a characteristic subject by using histograms of hue, saturation, and the like in the captured image. In this case, for the image of the subject captured within the shooting angle of view, the distribution derived from the histograms of hue, saturation, and the like is divided into a plurality of sections, and the captured image is classified section by section.

For example, histograms of a plurality of color components are created for the captured image and divided at the boundaries of their mountain-shaped distributions; the captured image is then classified into regions belonging to the same combination of sections, and the image region of the subject is recognized.

By calculating an evaluation value for each recognized subject image region, the subject image region with the highest evaluation value can be determined to be the main subject region.

With the above methods, information on each subject can be obtained from the imaging information.

In S602, the shake correction amount is calculated. Specifically, the absolute angle of the change in attitude of the imaging device 101 is first calculated from the angular velocity and acceleration information acquired by the device shake detection circuit 209. Then, the shake correction angle by which the tilt rotation unit 104 and the pan rotation unit 105 are moved in the angular direction that cancels the absolute angle is obtained and used as the shake correction amount.
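The embodiment does not state how the absolute angle is derived from the angular velocity and acceleration. One common approach, shown here purely as an assumption, is a complementary filter that blends the integrated gyro rate with the tilt angle implied by gravity; the correction is then the angle that cancels the deviation. The function names, the blending factor `alpha`, and the single-axis treatment are all hypothetical.

```python
import math


def update_angle(prev_angle_deg, gyro_dps, accel_y, accel_z, dt, alpha=0.98):
    """Complementary filter for one axis (an assumed technique, not from the source).

    gyro_dps: angular velocity in degrees per second
    accel_y, accel_z: acceleration components used to estimate tilt from gravity
    dt: sampling interval in seconds
    """
    gyro_angle = prev_angle_deg + gyro_dps * dt               # short-term estimate
    accel_angle = math.degrees(math.atan2(accel_y, accel_z))  # long-term reference
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle


def shake_correction(angle_deg, reference_deg=0.0):
    """Angle that cancels the deviation of the attitude from the reference."""
    return -(angle_deg - reference_deg)
```

The same calculation would be carried out separately for the tilt axis and the pan axis.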

In S603, the state of the imaging device 101 is determined. The current vibration and motion state of the imaging device 101 is determined from the angles, amounts of movement, and the like detected from the angular velocity information, acceleration information, GPS position information, and so on. For example, when the imaging device 101 is mounted on a car for shooting, subject information such as the surrounding scenery changes greatly with the distance traveled.

For this reason, it is determined whether the imaging device is in a "vehicle moving state," in which it is mounted on a car or the like and moving at high speed, and the result can be used in the automatic subject search described later.

It is also determined whether the change in angle is large, and thus whether the imaging device 101 is in a "stationary shooting state," in which there is almost no shake angle. In the "stationary shooting state," the angle of the imaging device 101 itself can be assumed not to change, so a subject search suited to stationary shooting can be performed. When the change in angle is relatively large, a "handheld state" is determined, and a subject search suited to handheld shooting can be performed.
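The three-way determination described in S603 can be pictured with the sketch below. The threshold values, the units, and the function name `classify_state` are illustrative assumptions; the embodiment does not specify them.

```python
def classify_state(speed_mps, angle_change_deg,
                   vehicle_speed_mps=5.0, still_angle_deg=0.5):
    """Return 'vehicle', 'stationary', or 'handheld'.

    speed_mps: movement speed estimated from GPS or acceleration (assumed input)
    angle_change_deg: recent change in device angle (assumed input)
    Both thresholds are hypothetical example values.
    """
    if speed_mps >= vehicle_speed_mps:
        return "vehicle"      # "vehicle moving state"
    if angle_change_deg <= still_angle_deg:
        return "stationary"   # "stationary shooting state"
    return "handheld"         # "handheld state"
```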

In S604, the subject search process is performed. The control circuit 221 divides the entire surroundings into areas centered on the position of the imaging device 101 (the origin O in FIG. 1 is taken as the position of the imaging device). For each divided area, an importance level indicating the search priority is calculated according to the subjects present in the area and the scene conditions of the area.

The importance level based on the state of the subject is calculated from, for example, the number of persons in the area, the size and orientation of their faces, the certainty of face detection, their facial expressions, and their personal authentication results. The importance level based on the state of the scene is calculated from, for example, general object recognition results, scene discrimination results (blue sky, backlight, evening scene, etc.), the level of sound coming from the direction of the area and its voice recognition results, and motion detection information within the area. In addition, because the vibration state of the imaging device 101 is detected in the state determination of the imaging device 101 (S603), the importance level can also be varied according to the vibration state. For example, when the "stationary shooting state" is determined, the importance level is set high when the face of a specific person is detected, so that the subject search centers on a subject with high priority among those registered for face authentication (for example, the user of the imaging device). The automatic shooting described later is also performed with priority given to the face of the specific person. Therefore, even if the user of the imaging device 101 spends much of the time wearing and carrying the imaging device while shooting, many images showing the user can be obtained by taking the imaging device off and placing it on a desk or the like. Because the search can be performed by panning and tilting, images showing the user, group photos showing many faces, and the like can be obtained simply by setting the imaging device down casually, without considering its placement angle.

Under the above conditions alone, the area with the highest importance level stays the same as long as nothing changes in any area, and as a result the searched area never changes. The importance level is therefore varied according to past shooting information. Specifically, the importance level may be lowered for an area that has continuously been designated as the search area for a predetermined time, and may be lowered for a predetermined time for an area in which shooting was performed in S610, described later.

Once the importance level of each area has been calculated as described above, an area with a high importance level is determined to be the search target area. The pan and tilt search target angles required to bring the search target area within the angle of view are then calculated.
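The selection of the search target area, including the lowering of the importance level based on past shooting information, could look like the following sketch. The multiplicative penalty, the time limits, and the function name `pick_search_area` are assumptions made for illustration, and the base importance levels are assumed to be non-negative.

```python
def pick_search_area(base_importance, searched_for_s, last_shot_age_s,
                     max_search_s=30.0, shot_cooldown_s=60.0, penalty=0.5):
    """Return (area, adjusted_importance) for the area to search next.

    base_importance: area id -> non-negative importance level
    searched_for_s:  area id -> seconds the area has been the search target
    last_shot_age_s: area id -> seconds since a shot was taken in the area
    All thresholds and the penalty factor are hypothetical values.
    """
    best_area, best_score = None, float("-inf")
    for area, score in base_importance.items():
        # Lower the level of an area that has been searched for too long.
        if searched_for_s.get(area, 0.0) >= max_search_s:
            score *= penalty
        # Lower the level for a while after shooting in the area.
        if last_shot_age_s.get(area, float("inf")) < shot_cooldown_s:
            score *= penalty
        if score > best_score:
            best_area, best_score = area, score
    return best_area, best_score
```

With base levels of 10 for area A and 8 for area B, area A is selected at first; once A has been the search target for longer than the limit, its level drops to 5 and area B is selected instead.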

In S605, pan and tilt driving is performed. Specifically, the pan and tilt drive amounts are calculated by adding the image blur correction amount and the drive angle per control sample that is based on the pan and tilt search target angles. The lens barrel rotation drive circuit 205 then drives and controls the tilt rotation unit 104 and the pan rotation unit 105.
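A minimal sketch of the drive amount for one axis and one control sample is given below. It assumes that the step toward the search target angle is limited to a maximum angle per sample, which the embodiment does not specify; the function and parameter names are hypothetical.

```python
def drive_step(current_deg, target_deg, shake_correction_deg, max_step_deg):
    """Drive amount for one control sample on one axis (pan or tilt).

    The search component moves toward the target by at most max_step_deg
    (an assumed rate limit); the shake correction is then added to it.
    """
    error = target_deg - current_deg
    search_step = max(-max_step_deg, min(max_step_deg, error))
    return search_step + shake_correction_deg
```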

In S606, the zoom unit 201 is controlled to drive the zoom. Specifically, the zoom is driven according to the state of the search target subject determined in S604. For example, when the search target subject is a person's face, a face that is too small in the image falls below the minimum detectable size, cannot be detected, and may be lost. In such a case, the zoom is moved toward the telephoto side so that the face becomes larger in the image. Conversely, when the face is too large in the image, the subject easily moves out of the angle of view because of the movement of the subject or of the imaging device 101 itself. In such a case, the zoom is moved toward the wide-angle side so that the face becomes smaller in the image. Controlling the zoom in this way maintains a state suitable for tracking the subject.
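The zoom rule described above amounts to keeping the face size within a band. In the sketch below, the pixel limits and the name `zoom_decision` are hypothetical example values, not figures from the embodiment.

```python
def zoom_decision(face_px, min_px=40, max_px=200):
    """Return 'tele', 'wide', or 'hold' from the face size in pixels.

    min_px: below this the face risks dropping under the detectable size
    max_px: above this the face risks leaving the angle of view
    """
    if face_px < min_px:
        return "tele"   # zoom in so the face becomes larger
    if face_px > max_px:
        return "wide"   # zoom out so the face becomes smaller
    return "hold"
```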

S604 to S606 describe a method of searching for a subject by pan, tilt, and zoom driving, but the subject search may instead be performed by an imaging system that uses a plurality of wide-angle lenses to capture all directions at once. With an omnidirectional camera, performing image processing such as subject detection on all the signals obtained by imaging as the input image requires an enormous amount of processing. A part of the image is therefore cut out, and the subject search process is performed within the cut-out image range. As in the method described above, the importance level of each area is calculated, the cut-out position is changed based on the importance level, and the automatic shooting determination described later is performed. This reduces the power consumed by image processing and enables a fast subject search.
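For the omnidirectional case, the cut-out position might be derived from the direction of the selected area as sketched below. The sketch assumes a 360-degree panorama whose columns map linearly to azimuth and a field of view smaller than 360 degrees; neither assumption, nor the name `crop_columns`, comes from the embodiment.

```python
def crop_columns(center_deg, fov_deg, image_width):
    """Column ranges [(start, end), ...] covering fov_deg around center_deg.

    Assumes a panorama in which column 0 corresponds to azimuth 0 degrees
    and the columns wrap around at 360 degrees. Returns two ranges when
    the window crosses the wrap-around seam.
    """
    px_per_deg = image_width / 360.0
    start_deg = (center_deg - fov_deg / 2.0) % 360.0
    start = int(round(start_deg * px_per_deg)) % image_width
    width = int(round(fov_deg * px_per_deg))
    end = start + width
    if end <= image_width:
        return [(start, end)]
    return [(start, image_width), (0, end - image_width)]
```

Only the returned column ranges would then be passed to subject detection, rather than the full panorama.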

In S607, it is determined whether the user has given a (manual) shooting instruction while the automatic shooting mode is set; if so, the process proceeds to S610. The user's (manual) shooting instruction may be given by pressing the shutter button, by lightly tapping the housing of the imaging device 101 with a finger or the like, by voice command input, by an instruction from an external device, and so on. In a shooting instruction by tap operation, the vibration produced when the user taps the housing of the imaging device 101 is detected by the device shake detection circuit 209 as high-frequency acceleration that continues for a short period, and is used as the shooting trigger. In a shooting instruction by voice command input, when the user utters a predetermined phrase that instructs shooting (for example, "take a picture"), the audio processing circuit 212 recognizes the voice and uses it as the shooting trigger. In a shooting instruction from an external device, a shutter instruction signal transmitted via a dedicated application from, for example, a smartphone connected to the imaging device 101 by Bluetooth is used as the trigger.
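The tap trigger can be pictured as looking for several large sample-to-sample jumps in acceleration within a short window. This is only one possible reading of "high-frequency acceleration that continues for a short period"; the threshold, the number of jumps, the window length, and the name `detect_tap` are hypothetical.

```python
def detect_tap(accel, threshold=2.0, min_spikes=2, window=5):
    """True if at least min_spikes large jumps fall within `window` differences.

    accel: acceleration samples along one axis, in sampling order
    threshold: minimum absolute sample-to-sample change counted as a jump
    """
    jumps = [abs(b - a) >= threshold for a, b in zip(accel, accel[1:])]
    for i in range(len(jumps)):
        if sum(jumps[i:i + window]) >= min_spikes:
            return True
    return False
```

A sharp spike such as `[0, 0, 3, 0, 0]` produces two large jumps in succession and is treated as a tap, whereas a slow ramp such as `[0, 0.5, 1.0, 1.5, 2.0]` is not.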

If there is no shooting instruction in S607, the process proceeds to S608, where the automatic shooting determination is performed. The automatic shooting determination decides whether to perform automatic shooting.

Whether to perform automatic shooting is determined using a neural network, which is one form of machine learning. FIG. 7 shows a network based on a multilayer perceptron as an example of a neural network. A neural network is used to predict output values from input values; by learning in advance input values and the model output values for those inputs, it can estimate, for a new input value, an output value that follows the learned model. The learning method is described later.

In FIG. 7, 701 and the circles aligned vertically with it are neurons of the input layer, 703 and the circles aligned vertically with it are neurons of the intermediate layer, and 704 is a neuron of the output layer. Arrows such as 702 indicate the connections between neurons. In the determination based on the neural network, feature values based on the subject in the current angle of view, the scene, and the state of the imaging device are given as inputs to the neurons of the input layer, and the value output from the output layer is obtained through calculation based on the forward propagation rule of the multilayer perceptron. If the output value is equal to or greater than a threshold, it is determined that automatic shooting is to be performed.

The features of the subject that are used include the current zoom magnification, the general object recognition result in the current angle of view, the face detection result, the number of faces in the current angle of view, the degree of smile and degree of eye closure of each face, the face angle, the face authentication ID number, the gaze angle of the subject person, the scene discrimination result, and the detection result of a specific composition. The elapsed time since the previous shot, the current time, the GPS position information and the amount of change from the previous shooting position, the current audio level, the person who is speaking, and whether there is applause or cheering may also be used. Vibration information (acceleration information, state of the imaging device) and environmental information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light) may be used as well. These features are converted into numerical values within a predetermined range and given to the neurons of the input layer as feature values. The input layer therefore needs as many neurons as the number of feature values used.
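The conversion of a feature into a numerical value within a predetermined range could be as simple as the clamped min-max scaling below. The target range of 0 to 1, the example limits, and the name `normalize` are assumptions; the embodiment does not state the range or the conversion method.

```python
def normalize(value, lo, hi):
    """Scale value from [lo, hi] to [0.0, 1.0], clamping out-of-range inputs."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


# Hypothetical example: three features turned into an input vector.
features = [
    normalize(3, 0, 10),          # number of faces, assumed range 0 to 10
    normalize(120.0, 0.0, 600.0), # seconds since the previous shot, assumed cap
    normalize(25.0, -10.0, 40.0), # temperature in degrees Celsius, assumed range
]
```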

In the determination based on this neural network, the learning process described later changes the connection weights between neurons, which changes the output value, so that the determination result can be adapted to the learning result.
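The forward propagation and threshold comparison of the automatic shooting determination can be sketched for a network with one intermediate layer as follows. The sigmoid activation, the layer sizes, the threshold of 0.5, and all function names are assumptions made for illustration; the embodiment does not specify them, and the weights here are placeholders for values that learning would provide.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a multilayer perceptron with one intermediate layer.

    w_hidden: one weight row per intermediate neuron (len == len(features))
    b_hidden: one bias per intermediate neuron
    w_out:    one weight per intermediate neuron, for the single output neuron
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def should_shoot(features, params, threshold=0.5):
    """Automatic shooting is chosen when the output reaches the threshold."""
    return forward(features, *params) >= threshold
```

Changing the connection weights in `params` changes the output for the same features, which is how a learning result would alter the shooting decision.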

The automatic shooting determination also varies according to the startup condition read in S501 of FIG. 5. For example, when the device is started by tap detection or by a specific voice command, it is very likely that the user performed the operation because the user wants shooting to take place now. The shooting frequency is therefore set higher.

In S609, if the automatic shooting determination of S608 resulted in a decision to shoot, the process proceeds to S610; otherwise, the automatic shooting mode process ends and the process proceeds to S502 of FIG. 5.

In S610, shooting starts. For manual shooting, a still image is shot or shooting is performed by the method manually set by the user; for automatic shooting, shooting starts at the timing determined in S608. At that time, autofocus control is performed by the focus drive control circuit 204. Exposure control is also performed so that the subject has appropriate brightness, using an aperture control circuit, a sensor gain control circuit, and a shutter control circuit, none of which are shown. After shooting, the image processing circuit 207 performs various kinds of image processing, such as auto white balance processing, noise reduction processing, and gamma correction processing, to generate an image.

When a predetermined condition is satisfied at the time of this shooting, the imaging device 101 may notify the person to be photographed that shooting is about to take place before shooting. The notification may use, for example, sound from the audio output circuit 216 or LED light from the LED control circuit 222, or a motion that visually draws the subject's gaze by driving pan and tilt. The predetermined conditions are, for example, the number of faces in the angle of view, the degree of smile and degree of eye closure of each face, the gaze angle and face angle of the subject person, the face authentication ID number, and the number of persons registered for personal authentication. Further conditions are the general object recognition result at the time of shooting, the scene discrimination result, the elapsed time since the previous shot, the shooting time, whether the current position based on GPS information is a scenic spot, the audio level at the time of shooting, whether someone is speaking, and whether there is applause or cheering. Vibration information (acceleration information, state of the imaging device) and environmental information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light) may also serve as conditions. By performing notified shooting based on these conditions, a desirable image in which the subject looks at the camera can be obtained in highly important scenes.

Alternatively, a plurality of predetermined conditions may be provided, and the sound, the LED lighting method (color, blinking time, etc.), or the pan/tilt motion (manner of movement and driving speed) may be changed according to each condition.

In S611, editing processing is performed, such as processing the image generated in S610 or adding it to a moving image. Specifically, the image processing includes trimming based on a person's face or the in-focus position, image rotation, and the addition of various effects such as an HDR (high dynamic range) effect, a blur effect, and a color conversion filter effect. A plurality of processed images may be generated from the image generated in S610 by combining the above processes, and they may be saved separately from the image generated in S610. As for moving image processing, a captured moving image or still image may be added to an already generated edited moving image while applying special effects such as slide, zoom, and fade. In the editing in S611 as well, the image processing method can be determined by a neural network based on information on the captured image or on various information detected before shooting. The determination conditions of this determination process can be changed by the learning process described later.

In S612, learning data is generated from the captured image. Here, information to be used in the learning process described later is generated and recorded. Specifically, for the current captured image, the information includes the zoom magnification at the time of shooting, the general object recognition result at the time of shooting, the face detection result, the number of faces in the captured image, the degree of smiling and of eye closure of a face, the face angle, the face authentication ID number, and the gaze angle of the subject person. It also includes the scene discrimination result, the elapsed time since the previous shooting, the shooting time, GPS position information and the amount of change from the previous shooting position, the audio level at the time of shooting, the person speaking, and whether there is applause or cheering. It further includes vibration information (acceleration information, state of the imaging device), environmental information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light), the moving image shooting time, and whether the shooting resulted from a manual shooting instruction. In addition, a score is calculated, which is the output of a neural network that quantifies the user's image preferences.

これらの情報を生成し、撮影画像ファイルへタグ情報として記録する。あるいは、不揮発性メモリ２１４へ書き込むか、記録媒体２１９内に、所謂カタログデータとして各々の撮影画像の情報をリスト化した形式で保存するようにしてもよい。 This information is generated and recorded as tag information in the photographed image file. Alternatively, the information on each photographed image may be written in the nonvolatile memory 214 or stored in the recording medium 219 in a list format as so-called catalog data.

In S613, past shooting information is updated. Specifically, for the number of shots per area described in S608, the number of shots per person registered for personal authentication, the number of shots per subject recognized by general object recognition, and the number of shots per scene in scene discrimination, each count to which the currently captured image corresponds is incremented by one.
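The count update in S613 can be pictured with a minimal Python sketch. The class name `ShotHistory`, its field names, and the shape of the arguments are hypothetical and chosen only for illustration; the source does not specify how the counts are stored.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ShotHistory:
    """Hypothetical container for the per-category shot counts updated in S613."""
    per_area: Counter = field(default_factory=Counter)
    per_person: Counter = field(default_factory=Counter)
    per_object: Counter = field(default_factory=Counter)
    per_scene: Counter = field(default_factory=Counter)

    def record(self, area, person_ids, object_labels, scene):
        """Increment by one every count that the new image corresponds to."""
        self.per_area[area] += 1
        # A person or object appearing several times in one image is
        # counted once per image (an assumption of this sketch).
        for pid in set(person_ids):
            self.per_person[pid] += 1
        for label in set(object_labels):
            self.per_object[label] += 1
        self.per_scene[scene] += 1


history = ShotHistory()
history.record(area="area_3", person_ids=[7, 7], object_labels=["dog"], scene="beach")
history.record(area="area_3", person_ids=[2], object_labels=[], scene="indoor")
```

After these two calls, `history.per_area["area_3"]` is 2 and `history.per_person[7]` is 1.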

<Learning mode processing>
Next, learning tailored to the user's preferences in this embodiment will be described.

In this embodiment, a neural network such as that shown in FIG. 7 is used, and the learning processing circuit 217 performs learning tailored to the user's preferences using a machine learning algorithm. The learning processing circuit 217 uses, for example, NVIDIA's Jetson TX2. A neural network is used to predict output values from input values; by learning actual input values and the corresponding actual output values in advance, it can estimate an output value for a new input value. By using the neural network, learning tailored to the user's preferences is performed for the automatic shooting and subject search described above.

また、ニューラルネットワークに入力する特徴データともなる被写体登録（顔認証や一般物体認識など）も行う。 It also performs subject registration (facial recognition, general object recognition, etc.), which also serves as feature data input to the neural network.

本実施形態における自動撮影に対する学習について説明する。自動撮影では、ユーザの好みに合った画像の撮影を自動で行うための学習を行う。図６のフローチャートを用いて説明したように、撮影後に学習データを生成する処理（Ｓ６１２）が行われている。後述する方法により学習させる画像を選択させ、画像に含まれる学習データに基づいて、ニューラルネットワークのニューロン間の結合重みを変化させることにより学習する。 Learning for automatic shooting in this embodiment will be explained. Automatic shooting involves learning to automatically take images that match the user's preferences. As explained using the flowchart of FIG. 6, the process of generating learning data (S612) is performed after photographing. Learning is performed by selecting images to be trained using a method described later, and changing the connection weights between neurons of the neural network based on learning data included in the images.

次に、学習方法について説明する。学習方法としては、「撮像装置内の学習」と「通信機器との連携による学習」がある。撮像装置内学習の方法について、以下説明する。 Next, the learning method will be explained. Learning methods include "learning within the imaging device" and "learning by linking with communication equipment." The method of learning within the imaging device will be described below.

本実施形態における撮像装置内学習は、以下の方法がある。 The learning within the imaging device in this embodiment includes the following methods.

(1) Learning based on detection information at the time of a user's shooting instruction
As described for S607 to S613 in FIG. 6, in this embodiment the imaging device 101 can perform two types of shooting: manual shooting and automatic shooting. If there is a shooting instruction by manual operation in S607 (made based on the three determinations described above), information indicating that the captured image was shot manually is added in S612. If the image is shot after automatic shooting is determined to be ON in S609, information indicating that the captured image was shot automatically is added in S612. Information indicating manual shooting is also added to images shot in the manual shooting mode processing of S506.

ここで手動撮影される場合、ユーザの好みの被写体、好みのシーン、好みの場所や時間間隔を基に撮影された可能性が非常に高い。よって、手動撮影時に得られた各特徴データや撮影画像の学習データを基にした学習が行われるようにする。 When manual photography is performed here, there is a very high possibility that the photography is based on the user's favorite subject, favorite scene, favorite place, or time interval. Therefore, learning is performed based on each feature data obtained during manual photography and the learning data of the photographed image.

また、手動撮影時の検出情報から、撮影画像における特徴量の抽出や個人認証の登録、個人ごとの表情の登録、人の組み合わせの登録に関して学習を行う。また、被写体探索時の検出情報からは、例えば、個人登録された被写体の表情から、近くの人や物体の重要度を変更するような学習を行う。 It also learns about extracting features from captured images, registering personal authentication, registering facial expressions for each individual, and registering combinations of people from the information detected during manual photography. Also, from the detection information during the subject search, for example, learning is performed to change the importance of nearby people and objects based on the facial expressions of the personally registered subjects.

次に、本実施形態における外部通信機器との連携による学習について説明する。本実施形態における外部通信機器との連携による学習には、以下の方法がある。 Next, learning by cooperation with an external communication device in this embodiment will be explained. In this embodiment, there are the following methods for learning through cooperation with an external communication device.

(2) Learning based on acquisition of an image by an external communication device
As described with reference to FIG. 3, the imaging device 101 and the external device 301 have communication means for performing communications 302 and 303. Images are transmitted and received mainly through communication 302, and the external device 301 can acquire images in the imaging device 101 by communication via a dedicated application in the external device 301. The external device 301 can also display thumbnail images of the image data stored in the imaging device 101 via the dedicated application. The user can thus select a preferred image from among the thumbnail images, check it, and operate an acquisition instruction to acquire that image on the external device 301.

At this time, because the user has selected the image and issued a transmission instruction (transmission request) to acquire it, the acquired image is very likely to be one the user prefers. The acquired image is therefore determined to be an image to be learned, learning data is generated from it in the same manner as in S612 of FIG. 6, and learning is performed based on this learning data. In this way, various kinds of learning of the user's preferences can be performed.

An operation example will be described. FIG. 8 shows an example in which images in the imaging device 101 are browsed via the dedicated application of the external device 301, which is a smart device. Thumbnail images (804 to 809) of the image data stored in the imaging device 101 are displayed on the display device 407, and the user can select and acquire a preferred image. Change button icons 801, 802, and 803 for changing the display method are provided. When the change button icon 801 is pressed, the display order changes to a date-and-time priority display mode, and the images in the imaging device 101 are displayed on the display device 407 in order of shooting date and time. For example, image 804 is displayed as the newest and image 809 as the oldest. When the change button icon 802 is pressed, the display changes to a recommended-image priority display mode. Based on the score calculated in S612 of FIG. 6, which is the evaluation result of the user's preference for each image, the images in the imaging device 101 are displayed on the display device 407 in descending order of score. For example, image 804 is displayed as having the highest score and image 809 as having the lowest. When the change button icon 803 is pressed, a person or object subject can be designated; when a specific person or object subject is then designated, only images of that subject can be displayed.

The settings of the change button icons 801 to 803 can also be turned ON simultaneously. For example, when all the settings are ON, only the designated subject is displayed, with priority given both to images with newer shooting dates and times and to images with higher scores.
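The three display settings can be illustrated with a short Python sketch. The source does not say how the date priority and the score priority are combined when both are ON; the sketch assumes, purely for illustration, that images are grouped by calendar day (newest first) and ordered by score within a day. The names `Thumb` and `arrange` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Thumb:
    name: str
    shot_at: datetime
    score: float          # preference score of the kind computed in S612
    subjects: frozenset   # recognized persons/objects in the image


def arrange(thumbs, by_date=False, by_score=False, subject=None):
    """Filter and order thumbnails according to the ON/OFF display settings."""
    # Setting 803: show only images that contain the designated subject.
    shown = [t for t in thumbs if subject is None or subject in t.subjects]

    def key(t):
        parts = []
        if by_date:   # setting 801: newest calendar day first (assumed grouping)
            parts.append(-t.shot_at.date().toordinal())
        if by_score:  # setting 802: higher score first
            parts.append(-t.score)
        if by_date:   # remaining ties: newer time of day first
            parts.append(-t.shot_at.timestamp())
        return tuple(parts)

    # sorted() is stable, so with no setting ON the stored order is kept.
    return sorted(shown, key=key)


thumbs = [
    Thumb("a", datetime(2019, 5, 1, 10), 0.20, frozenset({"dog"})),
    Thumb("b", datetime(2019, 5, 2, 9), 0.90, frozenset({"alice"})),
    Thumb("c", datetime(2019, 5, 2, 8), 0.95, frozenset({"alice", "dog"})),
]
names = [t.name for t in arrange(thumbs, by_date=True, by_score=True, subject="dog")]
```

A different combination rule (for example a weighted sum of recency and score) would fit the description equally well; the lexicographic key is only one simple choice.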

このように、撮影画像についてもユーザの好みを学習するため、撮影された大量の画像の中から簡単な確認作業で、ユーザの好みの画像のみを簡単に抽出することが可能である。 In this way, since the user's preferences are learned for captured images as well, it is possible to easily extract only the user's preferred images from a large number of captured images with a simple confirmation process.

(3) Learning based on input of a judgment value for an image via an external communication device
As described above, the imaging device 101 and the external device 301 have communication means, and the images stored in the imaging device 101 can be browsed via the dedicated application in the external device 301. Here, a configuration may be adopted in which the user assigns a score to each image. The user can give a high score (for example, 5 points) to an image the user likes and a low score (for example, 1 point) to an image the user does not like, so that the imaging device 101 learns through the user's operations. The score of each image is used, together with the learning data, for relearning within the imaging device. Learning is performed so that the output of the neural network, which takes feature data from the designated image information as input, approaches the score designated by the user.

本実施形態では、外部機器３０１を介して、撮影済み画像にユーザが点数を入力する構成にしたが、撮像装置１０１を操作して、直接、画像に点数を入力する構成にしてもよい。その場合、例えば、撮像装置１０１にタッチパネルディスプレイを設け、タッチパネルディスプレイに表示されたＧＵＩボタンをユーザが押下して、撮影済み画像を表示するモードに設定する。そして、ユーザが撮影済み画像を確認しながら、各画像に点数を入力するなどの方法により、同様の学習を行うことができる。 In the present embodiment, the user inputs a score to a captured image via the external device 301, but a configuration may also be adopted in which the user operates the imaging device 101 and directly inputs a score to the image. In that case, for example, the imaging apparatus 101 is provided with a touch panel display, and the user presses a GUI button displayed on the touch panel display to set a mode for displaying captured images. Similar learning can be performed by the user inputting points for each image while checking the captured images.

(4) Learning based on changing parameters with an external communication device
As described above, the imaging device 101 and the external device 301 have communication means, and the learning parameters currently set in the imaging device 101 can be transmitted to the external device 301 and stored in the storage circuit 404 of the external device 301. Examples of learning parameters include the connection weights between neurons of the neural network and the selection of subjects to be input to the neural network. In addition, learning parameters set in a dedicated server can be acquired through the public line control circuit 406 via the dedicated application in the external device 301 and set as the learning parameters in the imaging device 101. This makes it possible to store the parameters at a certain point in time in the external device 301 and later restore them by setting them in the imaging device 101, and also to acquire learning parameters held by another user via the dedicated server and set them in one's own imaging device 101.

次に、学習処理シーケンスについて説明する。図５のＳ５０４のモード設定判定において、学習処理を行うべきか否かを判定し、学習処理を行う場合、学習モードであると判定され、Ｓ５１０の学習モード処理を行う。 Next, the learning processing sequence will be explained. In the mode setting determination at S504 in FIG. 5, it is determined whether or not learning processing should be performed. If learning processing is to be performed, it is determined that the learning mode is selected, and learning mode processing is performed at S510.

学習モードの判定条件について説明する。学習モードに移行するか否かは、前回学習処理を行ってからの経過時間と、学習に使用できる情報の数、通信機器を介して学習処理指示があったかなどから判定される。Ｓ５０４のモード設定判定処理内で判定される、学習モードに移行すべきか否かの判定処理フローを図９に示す。 The learning mode determination conditions will be explained. Whether or not to shift to the learning mode is determined based on the elapsed time since the last learning process, the number of information that can be used for learning, whether a learning process instruction has been received via a communication device, etc. FIG. 9 shows a process flow for determining whether to shift to learning mode, which is determined in the mode setting determination process of S504.

Ｓ５０４のモード設定判定処理内において学習モード判定が開始指示されると、図９の処理がスタートする。Ｓ９０１では外部機器からの学習指示があるか否かを判定する。ここでの学習指示の有無の判定は、＜（４）外部通信機器でパラメータを変更することによる学習＞のように、学習パラメータをセットする指示があったか否かの判定である。Ｓ９０１で、外部機器３０１からの学習指示があった場合、Ｓ９０７に進み、学習モード判定をＴＲＵＥにして、Ｓ５１０の処理を行うように設定し、学習モード判定処理を終了する。Ｓ９０１で外部機器３０１からの学習指示がない場合、Ｓ９０２に進む。 When the learning mode determination is instructed to start in the mode setting determination process of S504, the process of FIG. 9 starts. In S901, it is determined whether there is a learning instruction from an external device. The determination of the presence or absence of a learning instruction here is a determination of whether or not there was an instruction to set learning parameters, as in <(4) Learning by changing parameters with an external communication device>. In S901, if there is a learning instruction from the external device 301, the process advances to S907, sets the learning mode determination to TRUE, sets to perform the process of S510, and ends the learning mode determination process. If there is no learning instruction from the external device 301 in S901, the process advances to S902.

Ｓ９０２では、前回学習モード処理が行われてからの経過時間ＴｉｍｅＮを取得し、Ｓ９０３に進む。Ｓ９０３では、学習する新規のデータ数ＤＮ（前回学習処理が行われてからの経過時間ＴｉｍｅＮの間で、学習するように指定された画像の数）を取得し、Ｓ９０４に進む。Ｓ９０４では、ＴｉｍｅＮから閾値ＤＴを算出する。あるいは、ＴｉｍｅＮから閾値ＤＴを得るためのテーブルを用意しておいてもよい。例えば、ＴｉｍｅＮが所定値よりも小さい場合の閾値ＤＴａが、所定値よりも大きい場合の閾値ＤＴｂよりも大きく設定されており、時間経過によって、閾値が小さくなるように設定されている。これにより、学習データが少ない場合においても、時間経過が大きいと再度学習するようにすることで、使用時間が長くなると撮像装置が学習モードに変化し易いようにすることができる。なお、学習モード処理が行われてから暫くの期間は学習モードに移行しないように、閾値ＤＴを大きくするとよい。 In S902, the elapsed time TimeN since the last learning mode process was performed is acquired, and the process advances to S903. In S903, the new number of data to be learned DN (the number of images designated to be learned during the elapsed time TimeN since the previous learning process was performed) is acquired, and the process advances to S904. In S904, a threshold value DT is calculated from TimeN. Alternatively, a table for obtaining the threshold value DT from TimeN may be prepared. For example, the threshold value DTa when TimeN is smaller than a predetermined value is set larger than the threshold value DTb when TimeN is larger than the predetermined value, and the threshold value is set to become smaller as time passes. As a result, even when there is little learning data, learning is performed again when a large amount of time has elapsed, thereby making it easier for the imaging device to change to the learning mode when the usage time becomes longer. Note that it is preferable to increase the threshold value DT so as not to shift to the learning mode for a while after the learning mode processing is performed.

Ｓ９０４において閾値ＤＴが算出されると、Ｓ９０５に進み、学習するデータ数ＤＮが、閾値ＤＴ以上であるか否かを判定する。データ数ＤＮが、閾値ＤＴ以上である場合、Ｓ９０６に進み、ＤＮを０に設定する。その後、Ｓ９０７に進み、学習モード判定をＴＲＵＥにして、Ｓ５１０の処理を行うように設定し、学習モード判定処理を終了する。 When the threshold value DT is calculated in S904, the process proceeds to S905, and it is determined whether the number of data to be learned DN is greater than or equal to the threshold value DT. If the data number DN is equal to or greater than the threshold DT, the process advances to S906 and DN is set to 0. Thereafter, the process proceeds to S907, where the learning mode determination is set to TRUE, the process of S510 is set to be performed, and the learning mode determination process is ended.

Ｓ９０５においてデータ数ＤＮが、閾値ＤＴ未満の場合、Ｓ９０８に進む。Ｓ９０８では、外部機器３０１からの登録指示も、外部機器からの学習指示もなく、且つ学習データの数も所定値未満であるので、学習モード判定をＦＡＬＳＥにし、Ｓ５１０の処理は行わないように設定し、学習モード判定処理を終了する。 If the data number DN is less than the threshold DT in S905, the process advances to S908. In S908, since there is neither a registration instruction from the external device 301 nor a learning instruction from the external device, and the number of learning data is less than a predetermined value, the learning mode determination is set to FALSE and the process of S510 is set not to be performed. Then, the learning mode determination process ends.
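The determination flow of FIG. 9 (S901 to S908) can be summarized in a minimal Python sketch. The function names, the use of hours as the time unit, and the concrete values for the switching time and the thresholds DTa and DTb are assumptions made for illustration; the source only states that the threshold becomes smaller as time elapses.

```python
def threshold_from_elapsed(elapsed_hours, switch_hours=24.0, dt_a=50, dt_b=10):
    """S904: derive the threshold DT from the elapsed time TimeN.

    dt_a (short elapsed time) is larger than dt_b (long elapsed time),
    so the threshold decreases as time passes. All values are illustrative.
    """
    return dt_a if elapsed_hours < switch_hours else dt_b


def should_enter_learning_mode(external_instruction, elapsed_hours, new_data_count):
    """Return (learning_mode, new_data_count_after) following FIG. 9."""
    # S901: a learning instruction from the external device goes straight to S907.
    if external_instruction:
        return True, new_data_count
    # S902 to S904: elapsed time TimeN and data count DN give the threshold DT.
    dt = threshold_from_elapsed(elapsed_hours)
    # S905: compare DN with DT.
    if new_data_count >= dt:
        # S906: reset DN to 0, then S907: learning mode determination is TRUE.
        return True, 0
    # S908: learning mode determination is FALSE.
    return False, new_data_count


decision_soon = should_enter_learning_mode(False, elapsed_hours=1.0, new_data_count=20)
decision_later = should_enter_learning_mode(False, elapsed_hours=48.0, new_data_count=20)
```

With these illustrative values, the same 20 new learning images do not trigger learning one hour after the previous learning but do trigger it after 48 hours, which reflects the intent that longer use makes the shift to learning mode easier.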

次に、学習モード処理（Ｓ５１０）内の処理について説明する。学習モード処理の詳細なフローを図１０に示す。 Next, the processing in the learning mode processing (S510) will be explained. FIG. 10 shows a detailed flow of the learning mode process.

図５のＳ５０９で学習モードと判定され、Ｓ５１０に進むと、図１０の処理がスタートする。Ｓ１００１では、外部機器３０１からの学習パラメータの設定指示があるか否かを判定する。外部機器３０１から学習パラメータの設定指示があった場合、Ｓ１００６に進み、外部機器から送信された学習パラメータを各判定器（ニューラルネットワークのニューロン間の結合重みなど）に設定し、Ｓ１００７に進む。Ｓ１００１で外部機器３０１からの学習指示がない場合、Ｓ１００２に進む。 When the learning mode is determined in S509 of FIG. 5 and the process proceeds to S510, the process of FIG. 10 starts. In S1001, it is determined whether there is a learning parameter setting instruction from the external device 301. If there is an instruction to set learning parameters from the external device 301, the process advances to S1006, where the learning parameters transmitted from the external device are set in each determiner (such as connection weights between neurons of a neural network), and the process advances to S1007. If there is no learning instruction from the external device 301 in S1001, the process advances to S1002.

In S1002, one piece of learning data is selected and machine learning is performed. The learning data includes learning data generated from captured images tagged as manually shot, learning data generated from images acquired by an external communication device, and learning data generated from captured images for which a judgment value was input via an external communication device. Learning is performed using a method such as error backpropagation or gradient descent, the connection weights between neurons of the neural network are recalculated, and the parameters of each determiner are changed. If the user has assigned a score to the image from which the learning data was generated, learning is performed taking that score into account.

Ｓ１００３では、機械学習のために用意した全ての学習データを用いて学習を行ったかを判定する。まだ残っている学習データがあればＳ１００２に戻り、全ての学習データを用いて学習を行っていればＳ１００４に進む。 In S1003, it is determined whether learning has been performed using all the learning data prepared for machine learning. If there is still learning data remaining, the process returns to S1002, and if learning has been performed using all the learning data, the process proceeds to S1004.

Ｓ１００４では、機械学習により得られた学習パラメータを、基準回数に対応付けて不揮発性メモリ２１４に記憶する。 In S1004, the learning parameters obtained by machine learning are stored in the nonvolatile memory 214 in association with the reference number of times.

Ｓ１００５では、Ｓ１００４で記憶した最新の学習パラメータを各判定器（ニューラルネットワークのニューロン間の結合重みなど）に設定し、Ｓ１００７に進む。 In S1005, the latest learning parameters stored in S1004 are set in each determiner (such as connection weights between neurons of a neural network), and the process advances to S1007.
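The loop of S1002 to S1005 can be sketched in Python as follows. This is not the network of FIG. 7: as a stand-in, the sketch uses a single linear unit trained by gradient descent on the squared error so that its output approaches the user-assigned score. The function names, the learning rate, and the repetition over several passes are assumptions for illustration.

```python
def predict(weights, features):
    """Output of a single linear unit (a stand-in for the neural network)."""
    return sum(w * x for w, x in zip(weights, features))


def train(weights, samples, lr=0.05, epochs=200):
    """Gradient descent on squared error; returns the updated weights.

    samples is a list of (features, target_score) pairs, where the target
    is, for example, the score a user assigned to the image.
    """
    w = list(weights)
    for _ in range(epochs):
        # S1002/S1003: take each piece of learning data in turn until all are used.
        for features, target in samples:
            error = predict(w, features) - target
            # Move each weight against the gradient of 0.5 * error**2.
            w = [wi - lr * error * xi for wi, xi in zip(w, features)]
    return w


# Two toy feature vectors: one image scored 5 by the user, one scored 1.
samples = [([1.0, 0.0], 5.0), ([0.0, 1.0], 1.0)]
stored_parameters = train([0.0, 0.0], samples)   # S1004: parameters to store
active_parameters = list(stored_parameters)      # S1005: set in the determiner
```

After training, `predict(active_parameters, [1.0, 0.0])` is close to 5 and `predict(active_parameters, [0.0, 1.0])` is close to 1. A real determiner would have hidden layers and use backpropagation to obtain the gradients.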

In S1007, the images in the recording medium 219 or the nonvolatile memory 214 are re-scored (re-evaluated). In this embodiment, scores are assigned, based on the new learning result, to all the captured images stored in the recording medium 219 or the nonvolatile memory 214, and automatic editing and automatic file deletion are performed according to the assigned scores. Accordingly, when relearning is performed or learning parameters are set from an external device, the scores of already captured images also need to be updated. In S1007, therefore, new scores are recalculated for the captured images stored in the recording medium 219 or the nonvolatile memory 214, and when this is complete the learning mode processing ends. Note that the recalculation of new scores may be performed in response to a user's instruction.

Although this embodiment has been described based on a configuration in which learning is performed within the imaging device 101, a similar learning effect can be achieved with a configuration in which the external device 301 has the learning function, the data necessary for learning is transmitted to the external device 301, and learning is executed only on the external device side. In that case, as described in <(4) Learning based on changing parameters with an external communication device> above, learning may be performed by transmitting parameters such as the connection weights between neurons of the neural network learned on the external device side to the imaging device 101 and setting them there.

また、撮像装置１０１内と、外部機器３０１内の両方に、それぞれ学習処理機能をもつ構成にしてもよい。例えば撮像装置１０１内で学習モード処理が行われるタイミングで外部機器３０１が持つ学習データを撮像装置１０１に通信し、学習パラメータをマージすることで学習を行う構成にしてもよい。 Further, a configuration may be adopted in which both the imaging device 101 and the external device 301 have learning processing functions. For example, a configuration may be adopted in which learning data held by the external device 301 is communicated to the imaging device 101 at the timing when learning mode processing is performed within the imaging device 101, and learning is performed by merging the learning parameters.

<Automatic file deletion processing>
Next, the automatic file deletion processing will be described in detail with reference to FIG. 11. FIG. 11 is a flowchart showing the operation of the automatic file deletion processing executed in S512 of FIG. 5.

図５のＳ５１１において、前回ファイル自動削除を行ってからの経過時間や、記録媒体２１９あるいは不揮発性メモリ２１４に記録したデータ量、学習データの更新の有無等により、ファイル自動削除を行うべきと判定されると、Ｓ５１２に進み、図１１のファイル自動削除モードの動作が実行される。 In S511 of FIG. 5, it is determined that automatic file deletion should be performed based on the elapsed time since the last automatic file deletion, the amount of data recorded on the recording medium 219 or nonvolatile memory 214, whether learning data has been updated, etc. If so, the process advances to S512, and the automatic file deletion mode operation shown in FIG. 11 is executed.

Ｓ１１０２では、前回のファイル自動削除処理を実行した際に参照した学習データに対して、学習データが更新されているか否かを判断する。学習データつまり学習パラメータが更新されている場合は、ユーザの好みが更新されている可能性があり、削除候補も変化している可能性がある。そのため、前回から学習パラメータが更新されていればＳ１１０３に進み、更新されていなければ、Ｓ１１０８に進む。 In S1102, it is determined whether the learning data referenced when the previous automatic file deletion process was executed has been updated. If the learning data, that is, the learning parameters have been updated, the user's preferences may have been updated, and the deletion candidates may also have changed. Therefore, if the learning parameters have been updated since the last time, the process advances to S1103, and if they have not been updated, the process advances to S1108.

In S1103, because the learning parameters have been updated since the previous automatic file deletion processing, the user's preferences may also have been updated. Therefore, for all the captured images stored in the recording medium 219 or the nonvolatile memory 214, the scores assigned in S1007 of FIG. 10 are referred to, and captured images whose scores are lower than a first reference value are selected as deletion candidate images. The first reference value may be a constant value, a value set by the user, or a value set according to the histogram distribution of the scores of all the captured images. Note that, to simplify the description, all the captured images stored in the recording medium 219 or the nonvolatile memory 214 are targeted here, but the target captured images may be limited by a user instruction, by date, or by the folder in which the images are stored.

In S1104, if the number of images that would remain without being deleted is equal to or greater than a second reference value, the process advances to S1105; if it is less than the second reference value, the process advances to S1106. S1104 determines whether too many images would remain as recorded images. The second reference value may be a fixed value, but because it is desirable for more images to remain as the number of shots increases, it is desirable to increase the second reference value as the number of shots increases.

Ｓ１１０５では、Ｓ１１０３とは異なる評価基準により、削除候補画像の選択を行う。学習データに基づくスコアはすでに算出されているため、例えば、Ｓ１１０３で削除候補画像とならなかった撮影画像間の類似度を比較し、類似度が高い撮影画像が存在する場合には、いずれか（例えば、スコアの低いほう）の類似画像を削除する。類似度の判断基準としては、例えば、撮影された場所、撮影された日時、撮影画像内の被写体の構図、撮影画像内の被写体の種類などが考えられるが、これに限られるものではない。なお、スコアが高い画像を全て残したいのであれば、Ｓ１１０４およびＳ１１０５は、省略しても構わない。 In S1105, deletion candidate images are selected using an evaluation criterion different from that of S1103. Since the scores based on the learning data have already been calculated, the captured images that did not become deletion candidates in S1103 are, for example, compared with one another for similarity, and if highly similar captured images exist, one of them (for example, the one with the lower score) is deleted. Possible criteria for judging similarity include, but are not limited to, the location where the image was captured, the date and time of capture, the composition of the subject in the captured image, and the type of subject in the captured image. Note that S1104 and S1105 may be omitted if all images with high scores are to be kept.
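
The check in S1104 and the similarity-based pruning in S1105 might be sketched as follows. The function name, the `(name, score)` tuples, and the caller-supplied `is_similar` predicate are hypothetical; the description leaves the similarity criterion open (location, date and time, composition, subject type, and so on).

```python
def prune_similar(kept, second_reference, is_similar):
    """Return names of additional deletion candidates among `kept`.

    kept: list of (name, score) pairs that were not selected in S1103.
    second_reference: pruning runs only if len(kept) reaches this count (S1104).
    is_similar: hypothetical predicate taking two image names.
    """
    if len(kept) < second_reference:
        return []  # not too many images remain, so skip S1105

    survivors = []
    extra_candidates = []
    # Visiting images from highest to lowest score means that, within a
    # group of similar images, the lower-scored ones are the ones dropped.
    for name, _score in sorted(kept, key=lambda item: item[1], reverse=True):
        if any(is_similar(name, other) for other in survivors):
            extra_candidates.append(name)
        else:
            survivors.append(name)
    return extra_candidates
```

In this sketch one image per similarity group survives, which is only one reading of "delete one of the similar images".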

Ｓ１１０６では、前回のファイル自動削除処理では削除されなかったが、今回のファイル自動削除処理で削除候補となった画像がある場合には、それらの画像が削除候補となったことをユーザに報知する。報知処理の一例としては、音声出力回路２１６から発するスピーカー音を用いて報知する方法や、ＬＥＤ制御回路２２２によるＬＥＤ点灯光を用いて報知する方法などが考えられる。また、削除候補を確認する手法としては、スマートデバイスである外部機器３０１の専用のアプリケーションを介して、カメラ１０１内に記録されている削除候補を一覧表示する方法などが考えられる。一例としては、削除対象の画像を一覧表示し、過去に削除対象とならなかった画像がわかるように色分けやアイコン表示を行い、削除候補をユーザに視覚的に認識させる。これは、過去のファイル自動削除処理において削除されなかった画像については、ユーザは、今回のファイル自動削除処理においても削除されないと思い込んでいる可能性があるためである。そして、ユーザからの指示があれば、その指示に応じて画像を削除対象から除外する。 In S1106, if there are images that were not deleted in the previous automatic file deletion process but have become deletion candidates in the current one, the user is notified that these images have become deletion candidates. Examples of notification include a speaker sound emitted from the audio output circuit 216 and LED light controlled by the LED control circuit 222. As a way of checking the deletion candidates, the deletion candidates recorded in the camera 101 may be listed via a dedicated application on the external device 301, which is a smart device. For example, the images to be deleted are listed, with color coding or icons identifying the images that were not deletion targets in the past, so that the user can visually recognize the deletion candidates. This is because the user may assume that images that were not deleted in a past automatic file deletion process will not be deleted in the current one either. If the user gives an instruction, the image is excluded from the deletion targets in accordance with that instruction.
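
The bookkeeping behind S1106 can be sketched as two small set operations. Both function names and the idea of tracking the survivors of the previous run as a list of names are assumptions; the description does not say how the device records which images survived earlier runs.

```python
def newly_flagged(current_candidates, survivors_of_previous_run):
    """Images that survived the previous run but are candidates now.
    These are the ones the user would be notified about."""
    return sorted(set(current_candidates) & set(survivors_of_previous_run))


def apply_user_exclusions(current_candidates, keep_requests):
    """Drop any candidate the user asked to keep before deletion in S1107."""
    keep = set(keep_requests)
    return [name for name in current_candidates if name not in keep]
```

A companion application could highlight the result of `newly_flagged` in its list view and pass the user's selections to `apply_user_exclusions`.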

そして、Ｓ１１０７において、削除対象画像を記録媒体２１９あるいは不揮発性メモリ２１４から削除して、本フローの動作を終了する。 Then, in S1107, the image to be deleted is deleted from the recording medium 219 or the nonvolatile memory 214, and the operation of this flow ends.

なお、Ｓ１１０２において学習パラメータの更新がされていなければ、Ｓ１１０８に進む。Ｓ１１０８おいては、記録媒体２１９内あるいは不揮発性メモリ２１４内に保存されている撮影画像のうち、前回のファイル自動削除処理よりも後に得られた撮影画像のみを対象にして、図１０のＳ１００７で付与されたスコアを参照する。そして、スコアが前述した第１の基準値よりも低い撮影画像を、削除候補画像として選択する。学習パラメータが更新されていない状態であれば、すでに判定済みの撮影画像に対して判定を行ったとしても、判定結果が変わらないためである。また、スコアを参照する撮影画像を減らすことにより、処理負荷を軽減することができる。 Note that if the learning parameters have not been updated in S1102, the process advances to S1108. In S1108, of the captured images stored in the recording medium 219 or the nonvolatile memory 214, only those obtained after the previous automatic file deletion process are targeted, and the score assigned to them in S1007 of FIG. 10 is referenced. Captured images whose scores are lower than the first reference value described above are then selected as deletion candidate images. This is because, as long as the learning parameters have not been updated, repeating the determination on captured images that have already been judged would not change the result. In addition, reducing the number of captured images whose scores are referenced reduces the processing load.
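
The branch between S1103 and S1108 amounts to choosing which images are examined before the same threshold test is applied. The sketch below assumes each image is a dictionary with hypothetical `name`, `captured_at`, and `score` keys and that the time of the previous deletion run is available as `last_run_time`.

```python
def images_to_evaluate(shots, params_updated, last_run_time):
    """S1102 branch: all images if the learning parameters were updated,
    otherwise only images captured after the previous deletion run."""
    if params_updated:
        return list(shots)
    return [shot for shot in shots if shot["captured_at"] > last_run_time]


def deletion_candidates(shots, params_updated, last_run_time, first_reference):
    """Apply the first reference value to the images in scope."""
    scope = images_to_evaluate(shots, params_updated, last_run_time)
    return [shot["name"] for shot in scope if shot["score"] < first_reference]
```

When the parameters are unchanged, images from before `last_run_time` are never touched, which is where the reduction in processing load comes from.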

そして、Ｓ１１０９で削除対象画像を記録媒体２１９あるいは不揮発性メモリ２１４から削除して、本フローの動作を終了する。 Then, in step S1109, the image to be deleted is deleted from the recording medium 219 or the nonvolatile memory 214, and the operation of this flow ends.

なお、削除対象画像を選択する際に、図１０のＳ１００７で付与されたスコアに加え、画像観閲記録や画像転送記録に基づいて、画像を選択して削除処理を実行してもよい。例えば、ユーザが長い期間観閲していない画像や、すでに転送している画像は、削除しても不利益はないと考えられるためである。 Note that when selecting images to be deleted, the images may be selected and deleted based on the image viewing record or image transfer record in addition to the score assigned in S1007 of FIG. 10. This is because, for example, there is no disadvantage to deleting images that the user has not viewed for a long time or images that have already been transferred.
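
One way to combine the score with viewing and transfer records is sketched below. The dictionary keys and the specific rule (an image is a candidate if its score is low, if it has already been transferred, or if it has not been viewed for longer than `stale_after`) are assumptions; the description only says that such records may be used in addition to the score.

```python
def select_with_records(images, first_reference, now, stale_after):
    """Return names of deletion candidates using score plus usage records.

    images: dicts with hypothetical keys "name", "score",
            "last_viewed" (timestamp), and "transferred" (bool).
    """
    candidates = []
    for image in images:
        low_score = image["score"] < first_reference
        not_viewed_recently = now - image["last_viewed"] > stale_after
        if low_score or image["transferred"] or not_viewed_recently:
            candidates.append(image["name"])
    return candidates
```

A stricter variant could require a low score together with one of the record conditions; which combination is appropriate is a design decision the description leaves open.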

また、Ｓ１１０６において削除候補画像をユーザに報知するステップを設けたが、報知せずに自動削除を行う構成としてもよい。また、Ｓ１１０３に進む条件として、学習パラメータが更新された場合だけでなく、記録媒体２１９あるいは不揮発性メモリ２１４の残容量が閾値を下回っている場合を含めるようにしてもよい。 Further, although the step of notifying the user of deletion candidate images in S1106 is provided, a configuration may be adopted in which automatic deletion is performed without notification. Further, the conditions for proceeding to S1103 may include not only the case where the learning parameters have been updated but also the case where the remaining capacity of the recording medium 219 or the nonvolatile memory 214 is below a threshold value.

このように、本実施形態では、学習パラメータが更新されている場合のほうが、学習パラメータが更新されていない場合よりも、ファイル自動削除処理の判定の対象となる画像の範囲が広い。すなわち、記録媒体２１９内あるいは不揮発性メモリ２１４内に保存されている撮影画像が同じ条件であっても、ファイル自動削除処理を行う際に学習パラメータが更新されていれば、学習パラメータが更新されていない場合に比べて、ファイル自動削除処理の判定の対象となる画像の数が多くなる。なお、本実施形態では、学習パラメータが更新されていない状態であれば、前回のファイル自動削除処理よりも後に得られた撮影画像のみから、削除候補画像を選択する例をあげて説明を行ったが、これに限られるものではない。 As described above, in this embodiment, the range of images subject to determination in the automatic file deletion process is wider when the learning parameters have been updated than when they have not. That is, even when the captured images stored in the recording medium 219 or the nonvolatile memory 214 are the same, more images are subject to determination in the automatic file deletion process if the learning parameters have been updated at the time of the process than if they have not. Note that this embodiment has been described using an example in which, if the learning parameters have not been updated, deletion candidate images are selected only from captured images obtained after the previous automatic file deletion process, but the invention is not limited to this.

例えば、ファイル自動削除処理において、スコアが高いものから所定数の画像を残し、それ以外の画像を削除する仕様とすることが考えられる。このような仕様であれば、学習パラメータが更新されてなくとも、撮影画像が増えることで、前回のファイル自動削除処理で削除されなかった画像が、次のファイル自動削除処理では削除される可能性がある。このような場合は、学習パラメータが更新されていれば、次のファイル自動削除処理では、記録媒体２１９内に保存されている全ての撮影画像を対象にする。これに対して、学習パラメータが更新されていなければ、次のファイル自動削除処理では、前回のファイル自動削除処理よりも後に得られた撮影画像と、前回のファイル自動削除処理よりも前に得られた撮影画像のうちの一部のみを対象にする。前回のファイル自動削除処理よりも後に撮影画像が得られることによって削除する対象となる撮影画像は、前回のファイル自動削除処理よりも前に得られた撮影画像のうち、相対的にスコアが低い撮影画像である。よって、前回のファイル自動削除処理よりも前に得られた撮影画像においては、これらの相対的にスコアが低い撮影画像のみを次のファイル自動削除処理の対象としてもよい。 For example, the automatic file deletion process may be specified so that a predetermined number of images are kept in descending order of score and the remaining images are deleted. With such a specification, even if the learning parameters have not been updated, an increase in the number of captured images may cause an image that was not deleted in the previous automatic file deletion process to be deleted in the next one. In such a case, if the learning parameters have been updated, the next automatic file deletion process targets all captured images stored in the recording medium 219. If the learning parameters have not been updated, on the other hand, the next automatic file deletion process targets the captured images obtained after the previous automatic file deletion process and only some of the captured images obtained before it. The captured images that become deletion targets because new captured images were obtained after the previous automatic file deletion process are, among the captured images obtained before that process, those with relatively low scores. Therefore, of the captured images obtained before the previous automatic file deletion process, only these relatively low-scoring captured images may be made targets of the next automatic file deletion process.
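
The keep-the-top-N variant can be sketched as below to show why an older image may be displaced even when the learning parameters are unchanged. The function name and the mapping from image name to score are hypothetical.

```python
def keep_top_n(scores, n):
    """Split images into (kept, deleted) by keeping the n highest scores.

    scores: dict mapping a hypothetical image name to its score.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    kept = [name for name, _ in ranked[:n]]
    deleted = [name for name, _ in ranked[n:]]
    return kept, deleted
```

With `n = 3`, three stored images scored 90, 60, and 50 all survive; once a new image scored 70 arrives, the image scored 50 falls out of the top three. Only the lowest-scoring older images can be displaced this way, which is why the older images need not all be re-examined.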

なお、本実施形態では、撮像装置１０１内でファイルの自動削除を行う構成を前提として説明したが、これに限られるものではない。撮像装置１０１から撮影画像が転送された外部機器３０１や、外部機器３０１を介して転送されたサーバにおいて、ファイルの自動削除を行う構成としてもよい。この場合、撮像装置１０１は画像の自動削除を行う必要はなく、全ての撮影画像を外部機器３０１やサーバに転送する構成とすることができる。 Note that although the present embodiment has been described assuming a configuration in which files are automatically deleted within the imaging apparatus 101, the present invention is not limited to this. The file may be automatically deleted in the external device 301 to which the captured image is transferred from the imaging device 101 or in the server to which the captured image is transferred via the external device 301. In this case, the imaging apparatus 101 does not need to automatically delete images, and can be configured to transfer all captured images to the external device 301 or the server.

また、外部機器３０１やサーバに、撮影画像を記録するための十分な容量がある場合には、ファイルを自動削除する代わりに、ユーザが閲覧するための撮影画像を自動選択する構成としてもよい。すなわち、撮影画像を自動削除することはせずに、ユーザが撮影画像を閲覧しようとした場合に、学習パラメータに基づくスコアの低い画像を表示対象とはせずに、スコアの高い画像のみをユーザが閲覧可能なように選択する仕様とすることができる。この場合は、Ｓ１１０３、Ｓ１１０５、Ｓ１１０８において、削除候補画像を選択する代わりに、非表示候補画像を選択することになる。この場合、撮影画像は削除されずに残るため、過去に低いスコアが付いた画像であっても、新たな学習パラメータに基づく評価では高いスコアとなる画像を救済することが可能となる。 Further, if the external device 301 or the server has sufficient capacity to record captured images, captured images for the user to view may be selected automatically instead of files being deleted automatically. That is, captured images are not deleted automatically; instead, when the user tries to view captured images, images with low scores based on the learning parameters are excluded from display and only images with high scores are selected so that the user can view them. In this case, in S1103, S1105, and S1108, non-display candidate images are selected instead of deletion candidate images. Because the captured images remain without being deleted, an image that received a low score in the past can be recovered if it receives a high score in an evaluation based on new learning parameters.
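
The hide-instead-of-delete variant can be illustrated by filtering the display list while leaving storage untouched. The function name and the score dictionaries are assumptions for illustration.

```python
def visible_images(scores, first_reference):
    """Return the names of images shown to the user; nothing is deleted.

    scores: dict mapping a hypothetical image name to its current score.
    """
    return [name for name, score in scores.items() if score >= first_reference]
```

Because hidden images stay in storage, an image hidden under the old learning parameters becomes visible again as soon as a re-evaluation with new parameters raises its score to the reference value or above.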

以上説明したように、本実施形態によれば、学習データの更新に応じて、記録あるいは表示対象とする画像を適切に選択できるようになるため、ユーザの好みが変わった場合に、それに応じた画像の選択が行えるようになる。これにより、ユーザの好みの画像を残し、そうでない画像を自動的に削除することができ、メディアの空き容量不足によりユーザの意図した撮影を行うことができなかったり、自動撮影において狙ったシーンを撮影できなかったりする不都合を解消することができる。 As explained above, according to this embodiment, the images to be recorded or displayed can be selected appropriately in response to updates of the learning data, so that when the user's preferences change, images can be selected accordingly. This makes it possible to keep images that match the user's preferences and automatically delete those that do not, eliminating the inconvenience of being unable to perform the shooting the user intended, or to capture a targeted scene in automatic shooting, because of insufficient free space on the media.

（他の実施形態） (Other embodiments)
また本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読み出し実行する処理でも実現できる。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現できる。 The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

発明は上記実施形態に制限されるものではなく、発明の精神及び範囲から離脱することなく、様々な変更及び変形が可能である。従って、発明の範囲を公にするために請求項を添付する。 The invention is not limited to the embodiments described above, and various changes and modifications can be made without departing from the spirit and scope of the invention. Therefore, the following claims are hereby appended to disclose the scope of the invention.

１０１：撮像装置、１０２：鏡筒、１０４：チルト回転ユニット、１０５：パン回転ユニット、２１７：学習処理回路、２１９：記録媒体、２２１：制御回路、３０１：外部機器 101: Imaging device, 102: Lens barrel, 104: Tilt rotation unit, 105: Pan rotation unit, 217: Learning processing circuit, 219: Recording medium, 221: Control circuit, 301: External device

Claims

An image processing device comprising:
a storage means for storing an image obtained by imaging a subject;
a learning means for learning a user's favorite images using machine learning;
an evaluation means for acquiring a score, which is an evaluation value of an image stored in the storage means, based on learning parameters of the learning means;
a selection means for selecting deletion candidates from the images stored in the storage means with reference to the score; and
a deletion means for deleting the image selected by the selection means from the storage means,
wherein the image processing device has a learning mode in which the learning means learns the user's favorite images, and an automatic deletion mode in which the deletion means automatically deletes images stored in the storage means,
wherein, in the learning mode, the evaluation means sets a learning parameter when there is an instruction from an external device to set that learning parameter of the learning means, and, when there is no such instruction, sets the learned learning parameter and then re-evaluates the scores for all images, and
wherein, in the automatic deletion mode, the selection means selects deletion candidates by referring to the scores for all images when the learning parameters of the learning means have been updated, and selects deletion candidates by referring to the scores for only images obtained after a previous automatic file deletion process when the learning parameters have not been updated.

The image processing device according to claim 1, wherein the selection means selects the deletion candidates based on, in addition to the score, at least one of a viewing record of the images stored in the storage means and a record of transfer to an external device.

The image processing apparatus according to claim 1, wherein the learning means sets learning parameters based on an image photographed according to a manual photographing instruction from a user.

The image processing apparatus according to claim 1, wherein the learning means sets learning parameters based on an image requested to be transmitted from an external device.

The image processing device, wherein the learning means learns based on a score, which is an evaluation value of each image stored in the storage means and is input by a user, and sets learning parameters.

The image processing apparatus according to any one of claims 1 to 5, wherein, when the number of images stored in the storage means other than those selected by the selection means is equal to or greater than a predetermined threshold, the selection means identifies similar images among those other images and further selects at least one of the similar images.

The image processing apparatus according to claim 6, further comprising a notification unit that visibly notifies a user of the image further selected by the selection unit.

The image processing apparatus according to any one of claims 1 to 7, wherein the selection means determines, based on the amount of data of the images stored in the storage means, whether or not to select the deletion candidates from the images stored in the storage means.

An imaging device comprising:
an imaging means for imaging a subject; and
the image processing device according to any one of claims 1 to 8.

The imaging device according to claim 9, further comprising a control unit that searches for a subject based on at least one of images, sounds, time, vibrations, changes in the body, and past photographic information, and causes the imaging unit to automatically perform photographing.

An image processing method comprising:
a storage step of storing, in a storage means, an image obtained by imaging a subject;
a learning step of learning a user's favorite images using machine learning;
an evaluation step of acquiring a score, which is an evaluation value of an image stored in the storage means, based on the learning parameters of the learning step;
a selection step of selecting deletion candidates from the images stored in the storage means with reference to the score; and
a deletion step of deleting the image selected in the selection step from the storage means,
wherein the image processing method has a learning mode in which the user's favorite images are learned in the learning step, and an automatic deletion mode in which the images stored in the storage means are automatically deleted in the deletion step,
wherein, in the evaluation step, in the learning mode, a learning parameter is set when there is an instruction from an external device to set that learning parameter for the learning step, and, when there is no such instruction, the learned learning parameter is set and the scores for all images are then re-evaluated, and
wherein, in the selection step, in the automatic deletion mode, deletion candidates are selected by referring to the scores for all images when the learning parameters of the learning step have been updated, and deletion candidates are selected by referring to the scores for only images obtained after a previous automatic file deletion process when the learning parameters have not been updated.

A program for causing a computer to function as each means of the image processing apparatus according to claim 1 .

A computer-readable storage medium storing a program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 8 .