JP2019185541A

JP2019185541A - Image processing apparatus, image processing method, and program

Info

Publication number: JP2019185541A
Application number: JP2018077730A
Authority: JP
Inventors: Satoshi Kubota (窪田 聡)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-04-13
Filing date: 2018-04-13
Publication date: 2019-10-24

Abstract

PROBLEM TO BE SOLVED: To extract a subject from a captured image accurately and in a short time so that it can be recognized. SOLUTION: An image processing apparatus (40) detects a subject from a captured image, extracts its feature information, recognizes the individual subject from the extracted feature information, and registers the feature information in a database. The image processing apparatus (40) also temporarily stores the extracted feature information (46). After the subject disappears from the image, the apparatus collates feature information extracted from a subsequently detected subject against the temporarily stored feature information to confirm whether the two are the same subject. When they are the same subject and the angle of view at which the subsequently detected subject was captured differs from the angle of view at which the subject of the temporarily stored feature information was captured, the apparatus registers the feature information of the subject in the database. SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for detecting and identifying a subject in a captured image.

Recent digital cameras commonly have a function that detects and tracks a subject in image data obtained from an image sensor and shoots with focus, brightness, and color adjusted to suit that subject. Recognition technology is one way to identify a particular individual among the detected subjects. A specific individual is recognized by extracting the feature information of each detected subject, comparing it with reference feature information prepared in advance, and checking whether the two match. The reference feature information is prepared, for example, by photographing a subject expected to be a recognition target with a camera in advance, extracting its feature information, and storing (that is, registering) that feature information in a nonvolatile memory or the like. This approach, however, requires the cumbersome work of preparing and registering reference feature information beforehand. In some cases, the subject expected to be a recognition target, or its feature information, cannot be prepared in advance at all.

In response, Patent Document 1 discloses a technique that makes advance registration largely unnecessary. In that technique, the feature information of a subject detected in a captured image is compared with dictionary data, and if the subject is unregistered and satisfies an important-subject condition, the feature information is first stored as temporary dictionary data. The conditions for judging a subject important include the length of time the subject has been tracked and the number of times it has been selected as the main subject. When the feature information of a subject detected in an image captured in a later shooting operation matches the feature information stored in the temporary dictionary data, that feature information is formally registered in the dictionary data.

Patent Document 1: JP 2014-44525 A

As described above, the technique of Patent Document 1 uses the length of time a subject has been tracked and the number of times it has been selected as the main subject as the conditions for judging whether it is an important subject. However, when the captured image contains multiple subjects, or when the important subject disappears from the image or is occluded by another subject, the subject the photographer intended often fails to be judged important in the first place. The technique of Patent Document 1 therefore has the problem that important-subject determination based on tracking time and main-subject selection count succeeds only infrequently.

Accordingly, an object of the present invention is to make it possible to extract a subject from a captured image accurately and in a short time so that it can be recognized.

An image processing apparatus according to the present invention includes: detection means for detecting a subject from a captured image; extraction means for extracting feature information from the detected subject; recognition means for recognizing the individual subject based on the feature information; registration means for registering the feature information in a database; and storage means for temporarily storing the feature information. The recognition means collates feature information extracted from a subject detected after the subject has disappeared from the image against the temporarily stored feature information, and confirms whether the subject detected after the disappearance is the subject of the temporarily stored feature information. The registration means registers the feature information of the subject in the database when the recognition means confirms that the subject detected after the disappearance is the subject of the temporarily stored feature information, and the angle of view at which the subject of the temporarily stored feature information was captured differs from the angle of view at which the subject detected after the disappearance was captured.

According to the present invention, a subject can be extracted from a captured image accurately and in a short time and made recognizable.

A diagram showing a configuration example of a digital camera, which is an example of the image processing apparatus.
A schematic external view of the digital camera.
A diagram used to explain advance registration.
An explanatory diagram of automatic registration during panning.
An explanatory diagram of automatic registration during zooming.
An explanatory diagram of face ID numbering.
A diagram showing an example of a management table of face detection results.
An explanatory diagram of the dictionary database.
An explanatory diagram of primary storage.
A flowchart showing the flow of processing in the camera.
A flowchart of the face ID numbering process.
A flowchart of the recognition status determination process.
A flowchart of the recognition process and the important subject registration process.
An explanatory diagram of reliability.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
FIG. 1 is a diagram showing a schematic configuration of an imaging apparatus (a digital camera, hereinafter referred to as the camera 100), which is one application example of the image processing apparatus of the present embodiment.
The lens 10 collects external light and forms an optical image on the imaging unit 20.
The mechanical drive circuit 16 drives the lens 10 along the optical axis to adjust focus and angle of view (zoom operation). The mechanical drive circuit 16 can also perform camera shake correction by driving the lens in directions other than the optical axis direction in response to camera shake. Camera shake correction can likewise be achieved by moving the imaging unit 20. When performing camera shake correction, the system control unit 42 detects shake of the camera 100 based on the detection outputs of, for example, an acceleration sensor and an angular velocity sensor (not shown), and performs known correction control that cancels the shake.

The aperture 13 changes its opening diameter. The ND filter 14 adjusts the amount of light transmitted. The mechanical shutter 12 blocks light when fully closed. The aperture 13, the ND filter 14, and the mechanical shutter 12 are provided as a light amount adjustment mechanism; they are used selectively depending on the application to adjust the amount of light that has passed through the lens 10.
The light emission control circuit 32 drives and controls the light emission of the strobe unit 30 under the control of the system control unit 42.

Light that has passed through the lens 10 and the light amount adjustment mechanism (12, 13, 14) is received by the imaging unit 20. The imaging unit 20 operates according to drive instructions from the imaging drive circuit 22 and performs exposure of the light receiving elements, adjustment of the exposure time, readout of the exposed imaging signal, amplification or attenuation of the read imaging signal, A/D conversion of the imaging signal, and so on. The image data output from the imaging unit 20 is either input to the image processing circuit 40 or temporarily stored in the RAM 46.

The image processing circuit 40 performs various processes, such as image processing and image analysis, on image data input directly from the imaging unit 20 or via the RAM 46. For exposure adjustment (AE: Auto Exposure) and focus adjustment (AF: Auto Focus) during shooting, the image processing circuit 40 extracts luminance and frequency components from the image data sequentially output from the imaging unit 20 and outputs them to the system control unit 42. The system control unit 42 uses these luminance and frequency components as evaluation values and controls the AE and AF operations via the mechanical drive circuit 16 and the imaging drive circuit 22.

The image processing circuit 40 can also develop the image data acquired from the imaging unit 20 and adjust its image quality, setting hue, gradation, brightness, and so on appropriately to generate photographic image data suitable for viewing. It can also perform various kinds of image processing, such as cropping part of an input image, rotating an image, and compositing images. In addition, the image processing circuit 40 can detect a subject area such as a person's face in an input image and obtain the position, size, and tilt of the face in the image, face likelihood information, and the like. The image processing circuit 40 can further extract feature information from the detected face and recognize whether the person is a specific individual. For such recognition, the image processing circuit 40 reads the feature information of individuals stored in advance in the ROM 48 and compares it with the facial feature information extracted from the image to determine whether the face matches a registered individual. The image processing circuit 40 can also analyze facial features in detail; for example, it can perform gaze detection by analyzing the features of the person's eyes to detect the gaze direction.

The display device 50 consists of a liquid crystal display (LCD) or the like and can display, for example, images developed by the image processing circuit 40 as well as text and icons. Displaying text and icons allows various kinds of information to be conveyed to the user of the camera 100.
The operation unit 44 is operated by the user of the camera 100, and the resulting user operation information is input to the system control unit 42. Based on this information, the system control unit 42 controls the operation and signal processing of each unit of the camera 100, for example powering on each unit, switching the shooting mode, making various settings, executing shooting, and playing back images.

The external memory I/F 52 accepts the external memory 90 through a memory socket or the like (not shown) and connects the external memory 90 to the camera 100. By connecting to the external memory 90 via the external memory I/F 52, the camera 100 can exchange images, obtain programs, and so on.
The external device I/F 54 connects the external device 92 to the camera 100 via a connector (not shown), wireless communication, or the like. By connecting to the external device 92 via the external device I/F 54, the camera 100 can exchange images, exchange command information for operating each other's devices, obtain programs, and so on.

The ROM 48 is a rewritable nonvolatile memory that stores various setting values and programs of the camera 100. The RAM 46 temporarily stores image data captured by the imaging unit 20 and image data during and after processing by the image processing circuit 40. Programs read from the ROM 48 are also loaded into the RAM 46.
The system control unit 42 executes the programs read from the ROM 48 and loaded into the RAM 46, and performs control of the above-described units of the camera 100, various calculations, and so on.

FIGS. 2(a) and 2(b) are schematic external views of the camera 100 of the present embodiment. FIG. 2(a) shows the front of the camera, and FIG. 2(b) shows the back. As shown in FIG. 2(a), the lens 10 is disposed on the front of the camera, which allows the camera 100 to capture a subject image. The strobe unit 30 is also disposed on the front of the camera; when the main subject is dark, the camera 100 can fire the strobe unit 30 to obtain sufficient light, maintain a fast shutter speed even in dark conditions, and obtain a suitable image. The camera 100 is also provided with the operation members 200, 202, 210, 220, 222, 224, 226, and 228 of the operation unit 44 shown in FIG. 1. Although they are not described individually, the operation members consist of operation buttons, operation switches, operation levers, and the like, and are used for user operations such as powering on the camera, switching the shooting mode, making various settings, executing shooting, and playing back images. As an example, the operation member 200 is a power button and the operation member 202 is a shutter button. The operation members may also include a touch panel arranged on the LCD screen of the display device 50.

The camera 100 of the present embodiment has a function that detects a subject based on the image data obtained from the imaging unit 20, tracks it, and shoots with focus, brightness, and color adjusted to suit that subject. Examples of subjects to be detected include human faces, human bodies, and specific animals such as dogs and cats. The camera 100 can also recognize and identify an individual among the detected subjects. That is, the camera 100 can identify an individual by extracting the feature information of a detected subject and comparing and collating it with the feature information registered in a dictionary database.

One known method of preparing the feature information of a subject to be recognized is shown in FIG. 3. FIG. 3 assumes that person A is the important subject to be recognized. In this example, a registration process 310 is performed in which a subject area 315 of person A is detected in an image captured by a camera 313, feature information is extracted from the subject area 315, and that feature information is registered in a dictionary database 330 in association with person A. Later, when persons A, B, and C are photographed by, for example, a camera 353, a subject area 355 of person A, a subject area 357 of person B, and a subject area 359 of person C are detected in the captured image and feature information is extracted from each. A recognition process 350 then determines whether person A, the important subject, appears in the image by checking whether the feature information extracted from each of the subject areas 355, 357, and 359 matches the feature information registered in the dictionary database 330.
The camera 313 used in the registration process 310 and the camera 353 used in the subsequent recognition process 350 may be the same camera or different cameras. The processing from detecting a subject in a captured image and extracting its feature information through registering that feature information in the dictionary database 330 may be performed by an image processing apparatus other than a camera. The dictionary database 330 that the camera 353 uses for the recognition process 350 may be obtained, for example, via a removable storage medium or via a network. The dictionary database 330 may be held in the ROM 48 of the camera, in the external memory 90, or in an external storage device serving as an external device.
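The match determination between extracted feature information and a dictionary database can be illustrated with a minimal sketch. The publication does not specify how feature information is represented or compared, so the sketch assumes that features are numeric vectors, that similarity is measured by cosine similarity, and that a match requires a similarity of at least 0.8. The names `cosine_similarity`, `recognize`, and `threshold` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (assumed feature format)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(features, dictionary, threshold=0.8):
    """Return the label of the best-matching registered individual, or None.

    dictionary maps a label (for example "A") to a list of registered
    feature vectors. The threshold value is an illustrative assumption.
    """
    best_label, best_score = None, threshold
    for label, registered in dictionary.items():
        for reference in registered:
            score = cosine_similarity(features, reference)
            if score >= best_score:
                best_label, best_score = label, score
    return best_label


# Usage: person A is registered; two detected subjects are checked.
dictionary = {"A": [[1.0, 0.0, 0.0]]}
match = recognize([0.9, 0.1, 0.0], dictionary)
no_match = recognize([0.0, 1.0, 0.0], dictionary)
```

A real recognizer would use a learned face descriptor and a tuned threshold; the structure of the lookup is the only point illustrated here.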

As described above, there are various ways of preparing the feature information of a subject to be recognized. The present embodiment is described using an example in which the camera 100 detects a subject in a captured image, extracts its feature information, and registers it in a dictionary database.
As described in detail later, the camera 100 of the present embodiment has a function that detects a subject in a captured image and, when the detected subject is important to the photographer, automatically registers the feature information extracted from that important subject in the dictionary database.

The following outlines, with reference to FIGS. 4(a) and 4(b) and FIGS. 5(a) and 5(b), how it is determined whether a subject is important to the photographer and how the feature information of a subject determined to be important is automatically registered in the dictionary database.
FIGS. 4(a) and 4(b) illustrate a case in which the photographer, as one example of an angle-of-view operation on the camera 100, pans the camera 100 to move the angle of view.

FIG. 4(a) shows an example in which a person appears as a subject 403 within the angle of view 401 at the time of shooting by the camera 100. Suppose the subject 403 is located near the center of the angle of view 401, the face occupies at least a certain size within the angle of view 401, and the face is facing the camera 100 directly. In this case, the person who is the subject 403 is likely to be someone the photographer (user) has chosen to photograph, and the subject 403 is therefore likely to be an important subject. The degree to which the subject 403 is near the center of the angle of view 401, the face occupies at least a certain size within the angle of view 401, and the face is facing the camera 100 directly can be calculated using known face detection and recognition techniques.
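The three cues described here (position near the center, face size above a certain level, frontal orientation) could be combined as in the following sketch. The publication gives no numeric criteria, so every threshold, the use of a yaw angle to express frontal orientation, and the function name `is_candidate_important` are assumptions made for illustration only.

```python
def is_candidate_important(face_cx, face_cy, face_w, face_h,
                           frame_w, frame_h, yaw_deg,
                           max_center_offset=0.25,
                           min_area_ratio=0.02,
                           max_yaw_deg=20.0):
    """Return True when a detected face looks like a likely important subject.

    All thresholds are hypothetical. Inputs are assumed to come from a
    face detector: face center and size in pixels, and head yaw in degrees
    (0 means facing the camera directly).
    """
    # Offset of the face center from the frame center, normalized to frame size.
    dx = abs(face_cx - frame_w / 2) / frame_w
    dy = abs(face_cy - frame_h / 2) / frame_h
    centered = dx <= max_center_offset and dy <= max_center_offset

    # Fraction of the frame area occupied by the face.
    large_enough = (face_w * face_h) / (frame_w * frame_h) >= min_area_ratio

    # Face roughly frontal.
    frontal = abs(yaw_deg) <= max_yaw_deg

    return centered and large_enough and frontal
```

A result of True only marks the subject as a candidate; in the flow described in this section, importance is confirmed later, after the subject is re-detected following an angle-of-view operation.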

The dictionary database 415 has a large-capacity storage medium and can register groups of feature information for individuals to be recognized, that is, multiple pieces of feature information for subjects important to the photographer. The dictionary database 415 may store not only the feature information group of a single person to be recognized but also the feature information groups of multiple individuals, each of whom is a recognition target. In the present embodiment, if, for example, the subject 403 is important to the photographer, its feature information is registered in the dictionary database 415 in association with information related to the subject 403.

In the present embodiment, to avoid registering the same feature information of the same subject 403 more than once, the camera 100 first performs a recognition process 413 for duplicate checking, using the feature information of the subject 403 and the feature information already registered in the dictionary database 415. If recognition fails in the recognition process 413, that is, if the subject is confirmed not to be registered in the dictionary database 415, the camera 100 performs a provisional registration process that temporarily stores the feature information of the subject 403 in the primary storage memory 419 (for example, the RAM 46). Along with the provisional registration of the feature information, the camera 100 also stores in the primary storage memory 419 the angle-of-view information at the time the subject 403 was photographed. The purposes of provisional registration thus include, in addition to enabling the automated main registration described later, avoiding wasteful consumption of the finite capacity of the dictionary database 415 through duplicate registration while it is still unknown whether the subject 403 is an important subject.
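The duplicate check followed by provisional registration can be sketched as follows. This is an illustration under assumptions, not the publication's implementation: the dictionary database and the primary storage memory are modeled as plain Python lists, the angle-of-view information is modeled as a hypothetical tuple, and the comparison function `matches` is supplied by the caller because the publication does not define the matching method.

```python
from dataclasses import dataclass


@dataclass
class ProvisionalEntry:
    """One provisionally registered subject (hypothetical record layout)."""
    features: list
    view_angle: tuple  # assumed form: (pan_deg, tilt_deg, focal_length_mm)


def provisional_register(features, view_angle, dictionary, temp_store, matches):
    """Temporarily store features unless they are already in the dictionary.

    Returns True when a provisional entry was stored and False when the
    duplicate check found the subject already registered.
    """
    # Duplicate check, corresponding to the recognition process 413.
    for registered in dictionary:
        if matches(features, registered):
            return False
    # Not registered: keep the features together with the angle-of-view information.
    temp_store.append(ProvisionalEntry(features, view_angle))
    return True


# Usage with an exact-equality matcher as a stand-in for real matching.
dictionary = [[1, 2]]
temp_store = []
stored = provisional_register([3, 4], (0.0, 0.0, 35.0), dictionary, temp_store,
                              lambda a, b: a == b)
```

Keeping provisional entries out of the dictionary is what protects its finite capacity while the importance of the subject is still unknown.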

FIG. 4(b) shows a situation in which the subject 403 has moved in a direction that takes it out of the angle of view 401 (the frame-out direction 405), and the photographer has moved the angle of view by a horizontal panning operation 409 of the camera 100 to bring the subject back within the angle of view. The subject 407 shown in FIG. 4(b) represents the subject 403 after it has moved in the frame-out direction 405. If the moved subject 407 is important to the photographer, the photographer can be expected to perform an angle-of-view operation that moves the angle of view by changing the orientation of the camera 100 so as to keep the subject 407 within it.

The angle of view 411 shown in FIG. 4(b) represents the state in which the photographer has brought the subject 407 back within the angle of view by performing an angle-of-view operation, namely the horizontal panning operation 409 of the camera 100. As in FIG. 4(b), a subject 407 that temporarily disappears from the angle of view 401 and is then detected again after the panning operation 409 has moved the angle of view is even more likely to be important to the photographer. At this point, however, it is still not certain that the subject 407 is an important subject.

For this reason, when the subject 407 is re-detected in a captured image acquired at the angle of view 411 after the photographer has moved the angle of view, the camera 100 performs a recognition process 433 to confirm whether the subject 407 is an important subject for the photographer. In the recognition process 433, the camera 100 first determines whether an operation for moving the angle of view has been performed, based on the angle-of-view information at the time the re-detected subject 407 was captured and the angle-of-view information at the time of capturing the subject 403 that is provisionally registered in the primary storage memory 439 (RAM 46). If the camera 100 determines that such an operation was performed and detects the subject 407 after the angle of view has moved, it extracts feature information of the subject 407. The camera 100 then compares the feature information extracted from the subject 407 with the feature information of the subject 403 provisionally registered in the primary storage memory 439 to confirm whether the subject 407 and the subject 403 are the same. If recognition succeeds in the recognition process 433, that is, if the re-detected subject 407 had been provisionally registered in the primary storage memory 439, the camera 100 determines that the subject 407 is an important subject. The camera 100 then associates either the feature information of the subject 407 or the primarily stored feature information of the subject 403 with the related information of that subject, and formally registers it in the dictionary database 435. As described in detail later, in the present embodiment the choice between the feature information of the subject 407 and that of the subject 403 for formal registration in the dictionary database 435 is made based on the reliability of the feature information.
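The confirmation flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Observation` fields, the use of cosine similarity as the matching measure, and the threshold values are all assumptions introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    features: list       # feature vector (the representation is an assumption)
    pan_deg: float       # camera direction at capture time (hypothetical field)
    reliability: float   # reliability attached to the feature information


def cosine_similarity(a, b):
    """Similarity measure chosen for this sketch; the source does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def confirm_and_register(provisional, redetected, dictionary,
                         min_pan_deg=5.0, match_threshold=0.9):
    """Return True and register when the re-detected subject is confirmed."""
    # Step 1: was the angle of view moved between the two captures?
    if abs(redetected.pan_deg - provisional.pan_deg) < min_pan_deg:
        return False
    # Step 2: is the re-detected subject the provisionally registered one?
    if cosine_similarity(provisional.features, redetected.features) < match_threshold:
        return False
    # Step 3: formally register whichever feature information is more reliable.
    dictionary.append(max((provisional, redetected), key=lambda o: o.reliability))
    return True
```

In this sketch the dictionary is a plain list; a subject seen again at the same angle of view, or one whose features do not match, is simply left unregistered.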

As described above, when a subject is re-detected after a provisionally registered subject has disappeared, the camera 100 of the present embodiment confirms whether it is an important subject by performing a recognition process that takes the angle-of-view information into account together with the feature information of the subject. The camera 100 can thereby infer with high probability that the photographer is intentionally trying to capture the subject, and can automatically register the feature information of that important subject in the dictionary database.

Note that when the amount or speed of movement of the angle of view is very large, the subject 407 can be considered unlikely to be an important subject, and in that case the recognition process 433 for confirmation may be skipped. A panning operation, which turns the camera 100 in the horizontal (left-right) direction, was given here as an example of an operation for moving the angle of view, but the operation is not limited to this example. It may be, for example, a tilting operation that turns the camera 100 in the vertical (up-down) direction, a roll operation that rotates the camera 100 about the optical axis of the lens 10, or a combination of panning, tilting, and roll. Whether panning, tilting, roll, or a similar operation has been performed on the camera 100 can be determined by, for example, a known detection technique based on the output of an acceleration sensor, angular velocity sensor, azimuth sensor, or the like. That is, the camera 100 calculates the direction and amount of camera movement before and after shooting from the outputs of such sensors and, from that information, can determine whether an operation for moving the angle of view has been performed. The determination can also be made using a known technique for detecting motion vectors from the captured images.
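As a rough illustration of the sensor-based determination, the following Python sketch integrates angular velocity samples to estimate how far the camera was panned, tilted, or rolled, and classifies the result. The sample format, the thresholds, and the function names are hypothetical; the source only states that known sensor-based techniques can be used.

```python
def integrate_rotation(samples, dt):
    """Accumulate (pan, tilt, roll) angles in degrees from rate samples in deg/s."""
    pan = tilt = roll = 0.0
    for pan_rate, tilt_rate, roll_rate in samples:
        pan += pan_rate * dt
        tilt += tilt_rate * dt
        roll += roll_rate * dt
    return pan, tilt, roll


def classify_view_move(samples, dt, min_angle=3.0, max_angle=90.0, max_rate=150.0):
    """Return 'none', 'moved', or 'too_large' (confirmation may then be skipped)."""
    amount = max(abs(a) for a in integrate_rotation(samples, dt))
    peak_rate = max((abs(r) for s in samples for r in s), default=0.0)
    if amount > max_angle or peak_rate > max_rate:
        return "too_large"   # very large or very fast movement
    if amount < min_angle:
        return "none"        # treated as no angle-of-view operation
    return "moved"
```

The `too_large` result corresponds to the case in which the movement is so large or fast that the re-detected subject is unlikely to be important.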

FIGS. 5(a) and 5(b) are explanatory diagrams showing, as another example of an angle-of-view operation on the camera 100 by the photographer, a case where the angle of view is changed by a zoom operation of the camera or by moving the camera toward or away from the subject.

FIG. 5(a), like FIG. 4(a), shows an example in which a person appears as a subject 503 within an angle of view 501 at the time of shooting. As with the subject 403 described above, the subject 503 is located near the center of the screen, its face occupies at least a certain size within the angle of view, and the face is turned toward the camera 100. In the example of FIG. 5(a), as in FIG. 4(a), the subject 503 is considered likely to be an important subject. Here too, as described above, the camera 100 performs a recognition process 513 for duplicate checking using the feature information of the subject 503 and the feature information already registered in the dictionary database 515, and if there is no duplicate, provisionally registers the feature information of the subject 503 by storing it in the primary storage memory 519. Together with this provisional registration, the camera 100 also stores in the primary storage memory 519 the angle-of-view information at the time the subject 503 was captured.

FIG. 5(b), like the example of FIG. 4(b), shows the person who is the subject 503 having moved in a frame-out direction 505 in which the person disappears from the angle of view 501; the subject 507 represents the subject after that movement. The angle of view 511 represents an angle of view widened by a zoom operation in the wide direction after the subject 507 framed out, or an angle of view in which the subject 507 fits because the photographer moved the camera away from the subject. In other words, the angle of view 511 covers a wider range than the angle of view 501. As shown in FIG. 5(b), a subject 507 that is detected again after it framed out and the angle of view was then changed by zooming or moving the camera is considered even more likely to be an important subject for the photographer. Even at this point, however, it has not yet been established whether the subject 503 is an important subject.

For this reason, when the subject 507 is re-detected in a captured image acquired at the angle of view 511 after the photographer has changed the angle of view, the camera 100 performs a recognition process 533 to confirm whether it is an important subject for the photographer. In the recognition process 533, the camera 100 first determines whether an operation for changing the angle of view has been performed, based on the angle-of-view information at the time the re-detected subject 507 was captured and the angle-of-view information at the time of capturing the subject 503 that is provisionally registered in the primary storage memory 539. If the camera 100 determines that such an operation was performed and detects the subject 507 after the angle of view has changed, it extracts feature information of the subject 507. The camera 100 then compares the feature information extracted from the subject 507 with the feature information of the subject 503 provisionally registered in the primary storage memory 539 to confirm whether the subject 507 and the subject 503 are the same. If recognition succeeds in the recognition process 533, that is, if the re-detected subject 507 had been provisionally registered in the primary storage memory 539, the camera 100 determines that the subject 507 is an important subject. The camera 100 then associates the feature information of the subject 507 or the subject 503 with the related information of that subject, and formally registers it in the dictionary database 535. In this example as well, as described later, the choice between the feature information of the subject 507 and that of the subject 503 for formal registration in the dictionary database 535 is made based on the reliability of the feature information.

In the examples of FIGS. 5(a) and 5(b) as well, when a subject is re-detected after a provisionally registered subject has disappeared, the camera 100 confirms whether it is an important subject by performing a recognition process that uses the angle-of-view information together with the feature information of the re-detected subject. The camera 100 can thereby infer with high probability that the photographer is intentionally trying to capture the subject, and can automatically register the feature information of that subject in the dictionary database.

Note that when the amount or speed of the change in the angle of view is very large, the subject 507 can be considered unlikely to be an important subject, and in that case the recognition process 533 for confirmation may be skipped. Whether an angle-of-view operation that changes the angle of view by zooming has been performed can be determined from the control information of the mechanical drive circuit 16 that drives the lens 10. That is, the camera 100 can determine whether the photographer has intentionally changed the angle of view of the camera 100 based on, for example, the control value used when the system control unit 42 controls the mechanical drive circuit 16 to change the focal length (zoom magnification) of the lens 10. Whether the angle of view has been changed by moving the camera toward or away from the subject can be determined by, for example, a known detection technique based on the output of an acceleration sensor, angular velocity sensor, azimuth sensor, or the like. That is, the camera 100 calculates the direction and amount of camera movement before and after shooting from the outputs of such sensors and, from that direction and amount, can determine whether the camera was moved toward or away from the subject to change the angle of view. The determination can also be made using a known technique for detecting motion vectors from the captured images.
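To make the zoom-based determination concrete, the sketch below converts a focal length to a horizontal angle of view with the standard thin-lens relation, angle = 2·atan(sensor width / (2 × focal length)), and compares the values before and after the zoom operation. The sensor width, the threshold, and the function names are assumptions for illustration; the source only says that the control value for the focal length is used.

```python
import math


def horizontal_view_angle_deg(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal angle of view for a given focal length (rectilinear lens)."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


def zoom_view_change(focal_before_mm, focal_after_mm,
                     min_change_deg=2.0, sensor_width_mm=36.0):
    """Classify a zoom operation as 'wider', 'narrower', or 'unchanged'."""
    before = horizontal_view_angle_deg(focal_before_mm, sensor_width_mm)
    after = horizontal_view_angle_deg(focal_after_mm, sensor_width_mm)
    if abs(after - before) < min_change_deg:
        return "unchanged"   # too small to count as an intentional operation
    return "wider" if after > before else "narrower"
```

Zooming from 50 mm to 24 mm on a 36 mm wide sensor, for example, is classified as `wider`, which matches the case of FIG. 5(b) where the angle of view 511 covers a wider range than the angle of view 501.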

The examples above described separately an angle-of-view operation that moves the angle of view and one that changes the range captured by the angle of view, but the present embodiment is also applicable when these two kinds of operation are combined and performed substantially simultaneously.

To realize the processing described with reference to FIGS. 4(a), 4(b), 5(a), and 5(b), the feature information of a subject must be extracted and a recognition process performed, but before that the subject must first be detected in the captured image. Moreover, many situations can arise in a captured image, such as multiple subjects being present, subjects crossing each other, and subjects disappearing and returning, so the recognition process must be carried out efficiently and its results managed.

FIGS. 6(a) to 6(c) show how multiple subjects are managed by a numbering process that assigns identification information (hereinafter, a face ID) to each subject detected in an image acquired by shooting. The images are assumed to carry frame numbers n, n+1, n+2, ... in chronological order.
FIG. 6(a) shows a case in which three subjects are detected in the image 600 of frame number n; the camera 100 issues (assigns) a face ID (ID0, ID1, ID2) to each of the three subjects and manages each subject by its face ID. In the example of FIG. 6(a), the subjects are assumed to move between the image 601 of frame number n+1 and the image 602 of frame number n+2. In this case, the camera 100 determines whether a subject before and after the movement is the same subject (the same person) based on changes between the previous and current frames in the relative positions and relative sizes of the subjects, and issues the same face ID if it is the same subject. For example, if at frame number n+1 the feature information of the person with face ID2 matches feature information stored in the dictionary database, the camera 100 recognizes the person with face ID2 as a specific registered individual. Keeping that person identified as the specific individual would otherwise require running the recognition process continuously, but issuing and managing face IDs removes that need. If the recognition result established at the image 601 of frame number n+1 is managed in association with face ID2, then even at the image 603 of frame number n+3 it is known that the person with face ID2 was recognized earlier.

FIG. 6(b) shows a state in which two persons (face ID3 and face ID4) are detected in the image 630 of frame number n, followed by a case in which the two persons cross between the image 631 of frame number n+1 and the image 632 of frame number n+2. At the moment the persons with face ID3 and face ID4 approach and cross, the subject in front can still be detected continuously, but the subject behind is occluded by the subject in front. In the example of FIG. 6(b), the person with face ID3 is in front and the person with face ID4 is behind. The person with face ID3 is detected without occlusion and therefore keeps face ID3, whereas the person with face ID4 loses ID continuity because of the occlusion, so the person detected after the occlusion is issued a different face ID5. Face ID4 can be carried over when predetermined conditions are met, such as the occlusion being short and the person reappearing near where they were occluded, but in some cases a different face ID is issued, as in the image 632 of frame number n+2. In that case the same person is treated as a different person because a different face ID has been issued.

FIG. 6(c) shows a state in which one person (face ID6) is detected in the image 650 of frame number n, followed by a case in which the person with face ID6 frames out of the angle of view in the image 651 of frame number n+1 and a person then reappears in the image 652 of frame number n+2. The person who had been assigned face ID6 is issued face ID7 upon framing in again in the image 652 of frame number n+2. When a frame-in follows a frame-out as in FIG. 6(c), it may be that the person with face ID6 has reappeared or that a different person has appeared, so issuing the new face ID7 may be either correct or incorrect.

Issuing face IDs as shown in FIGS. 6(a) to 6(c) has the advantage of making it easier to manage information for each individual.
In any of the cases of FIGS. 6(a), 6(b), and 6(c), however, face ID numbering is not infallible, and a different face ID may be issued to the same person.
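The numbering behavior of FIGS. 6(a) to 6(c) can be approximated with a small tracker that carries a face ID over to the next frame when a detection is close in position and similar in size to a face in the previous frame, and issues a new ID otherwise. This is only a sketch under assumed matching rules (`max_shift`, `max_size_ratio`, greedy nearest matching); the source does not specify the actual criteria.

```python
import itertools
import math


class FaceIdTracker:
    """Carries face IDs from the previous frame to the current one."""

    def __init__(self, max_shift=50.0, max_size_ratio=1.5):
        self.max_shift = max_shift            # allowed movement in pixels
        self.max_size_ratio = max_size_ratio  # allowed size change between frames
        self._next_id = itertools.count()
        self.tracks = {}                      # face_id -> (x, y, size)

    def update(self, detections):
        """detections: list of (x, y, size) with size > 0. Returns face IDs."""
        unmatched = dict(self.tracks)
        new_tracks, ids = {}, []
        for x, y, size in detections:
            best_id, best_dist = None, self.max_shift
            for face_id, (px, py, psize) in unmatched.items():
                dist = math.hypot(x - px, y - py)
                ratio = max(size, psize) / min(size, psize)
                if dist <= best_dist and ratio <= self.max_size_ratio:
                    best_id, best_dist = face_id, dist
            if best_id is None:
                best_id = next(self._next_id)   # no continuity: issue a new ID
            else:
                del unmatched[best_id]          # each previous face matches once
            new_tracks[best_id] = (x, y, size)
            ids.append(best_id)
        self.tracks = new_tracks                # faces not seen now are dropped
        return ids
```

Because faces that are not detected in a frame are dropped, a person who frames out and returns receives a new ID in this sketch, which reproduces the limitation noted above.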

FIG. 7 shows a management table for information related to the subjects detected in an image; the camera 100 holds this table internally. The management table provides a face ID information column 703, a size information column 705, a position information column 707, a personal recognition information column 709, and a priority information column 701 as areas for storing subject-related information. The face ID information column 703 stores the face ID described above. The size information column 705 stores size information expressing the size of each subject in pixels. The position information column 707 stores coordinate information indicating the position of each subject in the image. The personal recognition information column 709 stores the recognition information identifying the individual for each subject. The priority information column 701 stores information indicating the importance of the subject with each face ID. By sorting on the importance in the priority information column 701, the management table can manage the size information, position information, and personal recognition information for each face ID in rows 730, 731, 732, 733, ... in order of importance. Importance is ideally highest for the subject that matches the photographer's intent, and that intent can be set based on factors such as the size and position of the subject and whether personal recognition succeeded. Of the information in the management table, the face ID, size information, and position information indicate the presence of a subject in the image and are used, for example, to control focusing during AF and exposure during AE. The personal recognition information indicates the individual identified when recognition against the dictionary database succeeded, and is managed in the table together with the face ID. The table therefore makes it easy to see for which subjects recognition has or has not succeeded.
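A management table of this kind could be modeled as a list of rows sorted by an importance score. The sketch below is illustrative only: the field names, the frame center, and especially the scoring formula (larger, more central, and recognized subjects rank higher) are assumptions, since the source names the factors but not how they are combined.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubjectRow:
    face_id: int               # face ID information (column 703)
    size_px: int               # size information in pixels (column 705)
    position: tuple            # (x, y) position information (column 707)
    person: Optional[str]      # personal recognition information (column 709)


def importance(row, frame_center=(960, 540)):
    """Hypothetical score built from size, position, and recognition success."""
    distance = math.hypot(row.position[0] - frame_center[0],
                          row.position[1] - frame_center[1])
    recognized_bonus = 100.0 if row.person is not None else 0.0
    return row.size_px - 0.1 * distance + recognized_bonus


def build_table(rows):
    """Return rows ordered by importance, highest first (priority column 701)."""
    return sorted(rows, key=importance, reverse=True)
```

Rows whose `person` is `None` are the subjects for which recognition has not succeeded, so the same table answers both the priority and the recognition-status questions.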

FIG. 8 is a conceptual diagram showing an example of the structure of the dictionary database.
In FIG. 8, the dictionary database 801 uses as its storage medium, for example, a nonvolatile memory that retains its contents even when the power supply is stopped; in the example of the camera 100 in FIG. 1, an area for the dictionary database is provided in the ROM 48. The dictionary database 801 may instead reside in the external memory 90.

The dictionary database 801 stores management information 803. The management information 803 holds information describing the overall structure, such as the number of registered subjects (persons) and where they are stored, and the number of pieces of feature information registered for each subject and where they are stored. Conceptually, the pieces of personal information 811, 821, 831, ... are stored beneath the management information 803 in a tree, and the pieces of feature information 813, 815, 817, ..., 823, 825, ..., 833, 835, ... are stored beneath them.

Each piece of personal information can store, as its content information 861, arbitrary information such as name information 863, birthday information 865, group information 867, and other information 869. Uses of the personal information include, for example, giving persons registered in the dictionary database priority as AF targets, applying shooting parameters best suited to an individual, and assigning different AF priorities to, say, individual A and individual B. Other uses include building various shooting applications based on the personal information, and building playback applications that narrow down images using a person's name as a search key when browsing captured images.

Each piece of feature information can store, as its content information 881, arbitrary information such as the actual feature information 833 of the subject, image information 885 used to generate the feature information, information 887 such as the shooting time of the image from which the feature information was generated, and other information 889. With some matching algorithms for personal recognition, recognition performance improves when the same person is captured and registered under different conditions, so the structure allows multiple pieces of feature information to be registered per person. Storing the image information 885 used to generate the feature information lets the user visually check the registered feature information. A person's feature information also changes with growth, aging, and so on, and is therefore updated periodically. Storing the information 887 indicating when the image used to generate the feature information was obtained gives a guide for when to update.
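The tree structure of the dictionary database, with persons under the management information and several pieces of feature information under each person, might be sketched in Python as follows. The class and field names and the one-year staleness rule are hypothetical; they only illustrate how the stored capture time could serve as a guide for updating.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class FeatureEntry:
    features: list          # actual feature information
    source_image: str       # image used to generate the feature information
    captured_at: datetime   # shooting time of that image


@dataclass
class PersonEntry:
    name: str
    birthday: str = ""
    group: str = ""
    features: list = field(default_factory=list)   # several entries per person


@dataclass
class DictionaryDatabase:
    persons: list = field(default_factory=list)

    def stale_features(self, now, max_age=timedelta(days=365)):
        """List (person name, entry) pairs whose source image is older than max_age."""
        return [(p.name, f) for p in self.persons for f in p.features
                if now - f.captured_at > max_age]
```

Keeping a list of entries per person reflects the point that registering the same person under different conditions can improve recognition with some matching algorithms.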

FIG. 9 is a conceptual diagram showing an example of the structure of the primary storage memory described above. The primary storage memory 901 uses as its medium, for example, a volatile memory that depends on power supplied to the apparatus, and in the present embodiment corresponds to the RAM 46 in FIG. 1.
As shown in FIG. 9, the primary storage memory 901 stores management information 903. The management information 903 holds the number of pieces of primarily stored feature information and where they are stored; conceptually, the pieces of feature information 911, 913, 915, 917, ... are stored beneath it in a tree.

Each piece of feature information in the primary storage memory 901 stores, as its content information 931, the actual feature information 933, a face ID 932, image information 935 used to generate the feature information, and the angle-of-view information 937 and time information 939 at the time of capturing the image used to generate the feature information. The angle-of-view information can include information obtained from mechanical mechanisms, such as the zoom position of the lens 10; information obtained from electronic cropping, such as electronic zoom and aspect-ratio settings; and information expressing changes in panning in relative terms based on the output of an acceleration sensor or the like.

FIGS. 10, 11, 12, and 13 are flowcharts of the processing of the camera 100 of the present embodiment: system startup, detection of subjects in captured images, determination of the recognition status, and the steps from recognition through registration. The processing of each flowchart may be executed by a hardware configuration, realized by a software configuration based on a program executed by a CPU or the like, or realized partly in hardware and partly in software. The program executed by the CPU or the like may be stored in, for example, the ROM 48, acquired from a recording medium such as the external memory 90, or acquired via a network (not shown). In the following description, step S101 to step S465 of the processing are written simply as S101 to S465.

FIG. 10 is a flowchart showing the overall flow of processing after system startup in the camera 100 of the present embodiment.
When the power button (200) is pressed and the main power is turned on, the camera 100 starts the processing of the flowchart of FIG. 10 at S101. In S103, system startup processing is performed in the camera 100. Here the basic system is started, including supplying power and clocks to the CPU, LSIs, and other components needed for the camera system to operate, and initializing the memory and OS.

Next, in S105, the system control unit 42 starts the image sensor of the imaging unit 20, such as a CCD or CMOS sensor, and, via the mechanical drive circuit 16, starts the lens-barrel devices of the lens 10 such as the focus lens and zoom lens. The mechanical shutter 12 and the diaphragm 13 thereby operate, and external light is guided to the image sensor of the imaging unit 20. The system control unit 42 also starts driving the imaging devices of the imaging unit 20, such as the image sensor, amplifier, and A/D converter, via the imaging drive circuit 22.

In this state, the system control unit 42 starts AE driving in S113 and AF driving in S111, and continues to control them so that the shooting target has appropriate brightness and focus. At this time, in S201, the system control unit 42 also starts the subject detection and recognition processing by the image processing circuit 40. By using the subject information detected by the image processing circuit 40 for AF and AE, the system control unit 42 can adjust the focus and brightness of the subject more appropriately. At about the same time, the image processing circuit 40 also starts developing the live image output to the display device 50, so that the photographer can check the live view on the screen of the display device 50. This state satisfies the requirements for use as a viewfinder, and the photographer can frame the shot, for example by capturing the shooting target and adjusting the angle of view.

Next, in S121, the system control unit 42 determines whether the photographer has performed a so-called half-press (SW1 on) of the shutter button (202) of the operation unit 44. If SW1 is not on (off), the system control unit 42 returns the processing to S111, S113, and S201; if SW1 is on, it performs, in S123, AE control for AF to obtain an exposure suitable for AF. After the exposure suitable for AF has been set, the system control unit 42 performs, in S125, so-called One Shot AF control, for example for still images.

Next, after the shooting preparation triggered by SW1 being turned on is complete, the system control unit 42 determines in S131 whether the shutter button has been fully pressed (SW2 on). If SW2 is not on and SW1 is also off, the system control unit 42 returns the processing to S111, S113, and S201; if SW2 is on, it causes the camera 100 to execute still image shooting in S151.

FIGS. 11, 12, and 13 are flowcharts showing the details of the subject detection and recognition processing in S201 of FIG. 10.
FIG. 11 is a flowchart showing the flow of the face ID numbering process described with reference to FIGS. 6(a) to 6(c). When an image is captured with the camera's face detection and face recognition functions enabled, the system control unit 42 causes the image processing circuit 40, in S201, to start the face ID numbering process shown in the flowchart of FIG. 11.

When the face ID numbering process starts, the image processing circuit 40 first performs face detection in S203, taking as input an image generated at the drive cycle of the image sensor. The image processing circuit 40 can detect a plurality of faces in an image with a single face detection pass. The image processing circuit 40 then repeats the subsequent face ID numbering steps once for each face detected in S203. To this end, in S205, the image processing circuit 40 compares the number of detected faces with the number count of face IDs issued so far, and determines whether count is less than the number of detected faces. If count is less than the number of detected faces, the image processing circuit 40 proceeds to S207; if count is equal to or greater than the number of detected faces, it proceeds to S215.

In S207, the image processing circuit 40 compares the previous face detection result old with the target face of the current face detection result, and evaluates their continuity using the previous face position and face size information. Then, in S209, the image processing circuit 40 determines whether the target face of the current detection result is a face contained in the previous detection result old or a face newly found this time. If the image processing circuit 40 determines in S209 that the face is a newly found face, it proceeds to S211; if it determines that a known face, not a new one, has been found again, it proceeds to S213.

In S211, the image processing circuit 40 issues a new face ID for the new face, then increments the value of count and returns the processing to S203. The face ID issued here is stored in the face ID information column 703 of the management table shown in FIG. 7. In S213, on the other hand, the image processing circuit 40 carries over the face ID issued in the previous face detection, increments the value of count, and returns the processing to S203.
The above processing is repeated for each detected face. When count becomes equal to or greater than the number of detected faces and the processing proceeds from S203 to S215, the image processing circuit 40 saves the current face detection result in the face detection result old in preparation for the next face detection. The face ID numbering process then ends for the time being; when face detection and face ID numbering are performed for the next captured image, the processing of the flowchart of FIG. 11 starts again.
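The numbering loop of FIG. 11 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation described in the publication: the names `Face`, `FaceIdTracker`, and `is_continuation`, the pixel-based continuity test, and its thresholds `max_shift` and `max_scale` are all hypothetical. The text says only that continuity is evaluated from the previous face position and face size.

```python
from dataclasses import dataclass


@dataclass
class Face:
    x: float           # face centre x in pixels (assumed representation)
    y: float           # face centre y in pixels
    size: float        # face width in pixels, assumed > 0
    face_id: int = -1  # -1 means no ID has been issued yet


def is_continuation(prev: Face, cur: Face,
                    max_shift: float = 0.5, max_scale: float = 1.5) -> bool:
    """Hypothetical S207 continuity test: small displacement and size change."""
    shift = ((cur.x - prev.x) ** 2 + (cur.y - prev.y) ** 2) ** 0.5
    ratio = max(cur.size, prev.size) / min(cur.size, prev.size)
    return shift <= max_shift * prev.size and ratio <= max_scale


class FaceIdTracker:
    def __init__(self) -> None:
        self.old: list[Face] = []  # previous detection result ("old")
        self._next_id = 1

    def update(self, detected: list[Face]) -> list[Face]:
        unmatched = list(self.old)
        for face in detected:  # one pass per detected face (the S205 loop)
            match = next((p for p in unmatched if is_continuation(p, face)), None)
            if match is None:  # S209: new face, so S211 issues a new ID
                face.face_id = self._next_id
                self._next_id += 1
            else:              # S209: known face, so S213 carries the ID over
                face.face_id = match.face_id
                unmatched.remove(match)
        self.old = detected    # S215: keep this result for the next frame
        return detected
```

In this sketch a face that leaves the frame and later returns receives a fresh ID, which is the behaviour the later important subject determination relies on.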

FIG. 12 is a flowchart showing the flow of the recognition status determination process, which determines whether recognition processing should be performed on a detected face. When an image is captured with the camera's face detection and face recognition functions enabled, the system control unit 42 causes the image processing circuit 40, in S301, to start the recognition status determination process shown in the flowchart of FIG. 12.

As in the face ID numbering process described above, when a plurality of faces are detected, the recognition status determination process is repeated once for each detected face. Accordingly, when the recognition status determination process starts, the image processing circuit 40 first, in S303, compares the number of detected faces with the number count of issued face IDs in the same manner as in S205, and determines whether count is less than the number of detected faces. If count is less than the number of detected faces, the image processing circuit 40 proceeds to S305.

In S305, the image processing circuit 40 determines, based on the management table shown in FIG. 7 and the face ID, whether the detected face is a newly found face or a known face that has already been found. If the face is determined to be new, the image processing circuit 40, in S307, treats the new face as one for which recognition processing has not yet been performed, makes it a target of the recognition processing of FIG. 13 described later, increments the value of count, and returns the processing to S303. If the face is determined to be known, the image processing circuit 40, in S309, treats it as one for which recognition processing has already been performed, and proceeds to S311.

For a known face, there are cases where it is better to redo the recognition processing. In S311, therefore, the image processing circuit 40 performs a re-execution determination to decide whether recognition should be redone for the known face. For example, when persons approach and cross each other, as in the case of FIG. 6(b) described above, the issued face IDs may be wrong, and if a face ID is wrong, the recognition information managed in association with it is wrong as well. Therefore, when persons have approached or crossed each other, the image processing circuit 40 determines in S311 that recognition should be redone for the known face, even though the face ID indicates a known person, and proceeds to S313.

In S313, the image processing circuit 40 makes the known face a target for re-execution of the recognition processing of FIG. 13 described later, then increments the value of count and returns the processing to S303.
If, on the other hand, the image processing circuit 40 determines in S311 that the face is not a target for re-execution of the recognition processing, it increments the value of count and returns the processing to S303.
When the number count of issued face IDs becomes equal to or greater than the number of detected faces, the image processing circuit 40 ends the recognition status determination process of FIG. 12 for the time being; when recognition status determination is performed for the next captured image, the processing of the flowchart of FIG. 12 starts again.
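The selection made by FIG. 12 can be summarised in a few lines. The following is a minimal sketch under assumptions: the function name `select_recognition_targets` is hypothetical, the management table of FIG. 7 is reduced to a set of known face IDs, and the set `crossed_ids` stands in for whatever approach or crossing detection the camera actually performs, which the text does not specify.

```python
def select_recognition_targets(face_ids, known_ids, crossed_ids):
    """Return the face IDs that should go through recognition (FIG. 13).

    face_ids    -- IDs of the faces detected in the current frame
    known_ids   -- IDs already present in the management table (assumed a set)
    crossed_ids -- IDs flagged as having approached or crossed another person
                   (hypothetical input; how this is detected is not specified)
    """
    targets = []
    for face_id in face_ids:           # one pass per detected face (S303 loop)
        if face_id not in known_ids:   # S305: new face, S307 marks it as a target
            targets.append(face_id)
        elif face_id in crossed_ids:   # S309, S311: known face needing a redo, S313
            targets.append(face_id)
        # otherwise: known face, recognition already done, nothing to do
    return targets
```

For example, with detected IDs 1, 2, and 3, known IDs 1 and 2, and ID 2 flagged as crossed, the faces with IDs 2 and 3 would be passed on to recognition.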

FIG. 13 is a flowchart showing the flow of the recognition processing for faces selected as recognition targets in the flowchart of FIG. 12, and of the important subject registration process, which uses the information provisionally registered in the primary storage memory. When the camera's face detection and face recognition functions are enabled and a recognition target exists, the system control unit 42 causes the image processing circuit 40, in S401 and S451, to start the recognition processing and the important subject registration process shown in the flowchart of FIG. 13. The recognition processing and the important subject registration process are performed in parallel.

The flowchart of the recognition processing is described first.
In S403, the image processing circuit 40 generates feature information for each detected face. Feature information is generated in three steps: face detection, detection of organs such as the eyes, nose, and mouth, and generation of feature information based on those organs. However, if the detection accuracy at an intermediate step is low, either no feature information can be generated or only low-accuracy feature information can be generated. It is also desirable to avoid wasting the capacity of the dictionary database through duplicate registration.

For this reason, in S405, the image processing circuit 40 collates the generated feature information by comparing it with the feature information already registered in the dictionary database. For example, as shown in FIG. 8 described above, the dictionary database can store feature information for a plurality of persons, and a plurality of pieces of feature information per person, so in S405 the image processing circuit 40 collates the generated feature information against these entries one after another.

The image processing circuit 40 expresses the collation result numerically as a similarity, and in S407 determines that recognition is established if the similarity exceeds a predetermined threshold. If recognition is determined to be established in S407, the image processing circuit 40 finalizes the recognition result in S409. If recognition is established for a plurality of persons, the image processing circuit 40 adopts the person with the highest similarity.

If, on the other hand, the similarity is determined in S407 to be equal to or less than the threshold, that is, if collation did not succeed against any information registered in the dictionary database, the image processing circuit 40 determines that recognition is not established and proceeds to S411. In S411, the image processing circuit 40 provisionally registers the feature information of the face subjected to recognition in the primary storage memory (RAM 46). At this time, together with the feature information, the image processing circuit 40 also provisionally registers the face ID, the angle-of-view information at the time the source image used to generate the feature information was captured, and the time information.
After S409 and S413, the image processing circuit 40 ends the recognition processing of FIG. 13 for the time being; when recognition processing is performed for the next captured image, the recognition processing of FIG. 13 starts again.
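The collation and provisional registration steps might look like the sketch below. Several things here are assumptions rather than details from the publication: feature information is modelled as a plain list of floats, similarity is computed as cosine similarity (the text says only that the collation result is expressed as a numerical similarity), the threshold value 0.8 is arbitrary, and the names `recognize`, `recognize_or_stash`, and `ProvisionalEntry` are hypothetical.

```python
import math
from dataclasses import dataclass


def cosine_similarity(a, b):
    """Assumed similarity measure; the publication does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(feature, dictionary, threshold=0.8):
    """S405 to S409: collate against every entry and keep the best match.

    dictionary maps a person to a list of feature vectors (several per person).
    Returns (person, similarity), or None if nothing exceeds the threshold.
    """
    best = None
    for person, entries in dictionary.items():
        for entry in entries:
            score = cosine_similarity(feature, entry)
            if score > threshold and (best is None or score > best[1]):
                best = (person, score)
    return best


@dataclass
class ProvisionalEntry:
    face_id: int
    feature: list
    view_angle: float  # angle-of-view information (assumed to be one number)
    timestamp: float   # time information, in seconds


def recognize_or_stash(face_id, feature, view_angle, timestamp,
                       dictionary, provisional):
    """On failure (S411), provisionally register the face in primary storage."""
    result = recognize(feature, dictionary)
    if result is None:
        provisional.append(ProvisionalEntry(face_id, feature, view_angle, timestamp))
    return result
```

Here the list `provisional` plays the role of the primary storage memory (RAM 46); a real device would also need to bound its size and expire old entries.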

Next, the flowchart of the important subject registration process is described. The important subject registration process determines whether the feature information provisionally registered in the primary storage memory is that of an important subject and, if so, performs main registration in the dictionary database.
The image processing circuit 40 first generates feature information from the subject in S453. In S453, together with the feature information, a face ID is issued, and the angle-of-view information at the time the source image used to generate the feature information was captured and the time information are also acquired.

Next, using the information generated in S453 and the information generated in the recognition processing described above and provisionally registered in the primary storage memory, the image processing circuit 40 sequentially performs determinations based on the comparisons in S455, S457, S459, S461, and S463.

In S455, the image processing circuit 40 compares the face ID issued in S453 with the provisionally registered face ID. For example, as shown in FIGS. 6(b) and 6(c), when a face temporarily disappears because of occlusion or frame-out and then reappears, different face IDs are issued before and after the disappearance. Making use of this face ID numbering mechanism, a face whose face ID is the same is regarded as the same person and is excluded from the important subject determination. That is, the image processing circuit 40 proceeds to S457 if the face IDs differ, and ends the important subject registration process if the face IDs are the same.

In S457, the image processing circuit 40 compares the angle-of-view information acquired in S453 with the provisionally registered angle-of-view information, and performs a determination using the difference between them. If the difference is less than a predetermined difference, the person whose face yielded the feature information in S453 is presumed to be a new person who entered the angle of view even though no angle-of-view operation, such as the movement or change of the angle of view described above, was performed. Therefore, if the image processing circuit 40 determines in S457 that the difference in angle-of-view information is less than the predetermined difference, it excludes the face from the important subject determination. On the other hand, if the face IDs are determined to differ in S455 and the difference in angle-of-view information is determined in S457 to be equal to or greater than the predetermined difference and within a predetermined range, it can be judged highly likely that the photographer performed an angle-of-view operation to try to bring the target person into the angle of view. In that case, the image processing circuit 40 proceeds to the next determination, the time-difference determination. That is, the image processing circuit 40 ends the important subject registration process if it determines in S457 that there is no difference in the angle-of-view information, and proceeds to S459 if there is a difference.

In S459, the image processing circuit 40 compares the time information acquired in S453 with the provisionally registered time information, and performs a determination using the difference between them. Even when the processing has passed the determinations of S455 and S457 and reached S459, if a long time has elapsed since the provisional registration in the primary storage memory, it is considered highly likely that the photographer is not trying to find the subject. Therefore, if the image processing circuit 40 determines in S459 that the time difference is outside a predetermined time range, it ends the important subject registration process. If it determines in S459 that the time difference is within the predetermined time range, it proceeds to S461.

In S461, the image processing circuit 40 collates the feature information generated in S453 with the provisionally registered feature information and determines whether recognition is established. If recognition is not established, the image processing circuit 40 ends the important subject registration process; if recognition is established, it proceeds to S463.

In S463, the image processing circuit 40 determines the reliability of the feature information and, based on the result, performs main registration in the dictionary database. That is, two pieces of feature information are candidates for registration, the latest one and the one provisionally registered earlier in the primary storage memory, so the image processing circuit 40 compares the two and performs main registration of the one with the higher reliability.
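The chain of checks from S455 to S463 can be written as a sequence of early exits. The sketch below is illustrative only and rests on several assumptions: angle-of-view information is modelled as a single number, the thresholds `min_angle_diff`, `max_angle_diff`, and `max_time_gap` are placeholder values, the feature collation of S461 is passed in as a caller-supplied `matches` function, and each observation carries a precomputed `reliability` score. None of these names or values come from the publication.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Observation:
    face_id: int
    view_angle: float   # angle-of-view information (assumed scalar)
    timestamp: float    # seconds
    feature: list
    reliability: float  # higher is better (assumed precomputed)


def select_for_registration(
    current: Observation,
    provisional: Observation,
    matches: Callable[[list, list], bool],
    min_angle_diff: float = 5.0,
    max_angle_diff: float = 60.0,
    max_time_gap: float = 10.0,
) -> Optional[Observation]:
    """Return the observation to register in the dictionary, or None."""
    if current.face_id == provisional.face_id:
        return None  # S455: same face ID, excluded from the determination
    angle_diff = abs(current.view_angle - provisional.view_angle)
    if not (min_angle_diff <= angle_diff <= max_angle_diff):
        return None  # S457: no meaningful angle-of-view operation
    if current.timestamp - provisional.timestamp > max_time_gap:
        return None  # S459: too long since provisional registration
    if not matches(current.feature, provisional.feature):
        return None  # S461: not the same person
    # S463: register whichever observation is more reliable.
    # On a tie, max() returns its first argument, so the newer one wins.
    return max(current, provisional, key=lambda o: o.reliability)
```

Ordering the checks from cheapest (ID comparison) to most expensive (feature collation) follows the order given in the text and avoids running the collation for faces that are excluded anyway.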

FIGS. 14(a) to 14(c) are diagrams for explaining how reliability tends to vary with conditions. FIG. 14(a) shows an example in which the faces of subject A and subject B differ in size in the image: the larger the face, the higher the reliability, and the smaller the face, the lower the reliability. FIG. 14(b) shows an example in which the faces of subject A and subject B are oriented differently in the image: the more the face is turned toward the front, the higher the reliability, and the more it is turned to the side, the lower the reliability. FIG. 14(c) shows an example of whether the eye, nose, and mouth organs of the faces of subject A and subject B are occluded in the image: the less the organs are occluded, the higher the reliability, and the more any of them is occluded, the lower the reliability. For example, in the panning case shown in FIG. 4 described above, the subject itself does not differ greatly before and after panning, so it is expected to be better to register the feature information generated after panning, which represents the more recent subject. In the zoom operation case shown in FIG. 5 described above, the size of the subject differs before and after the zoom operation, and the feature information from before the zoom operation, when the subject is larger, is more reliable and better suited to registration. The image processing circuit 40 of the present embodiment determines the reliability in this way and performs main registration, in the dictionary database, of the feature information of the subject with the higher reliability.
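One way to turn the three tendencies of FIGS. 14(a) to 14(c) into a single score is sketched below. The publication states only the direction of each tendency; the product form, the normalisation of each factor to the range 0 to 1, and the names `reliability`, `face_size_ratio`, `frontalness`, and `visible_organs` are assumptions made for illustration.

```python
def reliability(face_size_ratio: float, frontalness: float,
                visible_organs: int, total_organs: int = 3) -> float:
    """Hypothetical reliability score in [0, 1].

    face_size_ratio -- face width divided by image width (larger is better)
    frontalness     -- 1.0 for a fully frontal face, 0.0 for a full profile
    visible_organs  -- how many of the eye, nose, and mouth organs were detected
    """
    def clamp(v: float) -> float:
        return max(0.0, min(1.0, v))

    visibility = clamp(visible_organs / total_organs)
    return clamp(face_size_ratio) * clamp(frontalness) * visibility
```

A product is used here so that any one poor factor (a tiny face, a profile view, or a heavily occluded face) pulls the score down; a weighted sum would be an equally plausible choice.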

As described above, in the present embodiment, an important subject is determined based on the photographer's operation of trying to keep the subject within the angle of view. Therefore, according to the present embodiment, important subjects can be extracted quickly and with high certainty even under difficult conditions, such as when a plurality of subjects are present in the frame or when a subject is occluded or disappears. That is, by determining whether a subject is important based on the photographer's angle-of-view operations, the present embodiment makes it possible to extract important subjects accurately and register them automatically.

In the above description, subject detection and recognition processing are performed by the image processing circuit 40, but these processes may instead be realized by the system control unit 42 executing a program according to the present embodiment. Alternatively, part of the processing may be executed by the image processing circuit 40 and the rest by the system control unit 42 based on a program.

In the embodiment described above, a digital camera or the like was given as an application example of the image processing apparatus, but the embodiment is not limited to this example and is applicable to other imaging apparatuses. For example, the present embodiment is applicable not only to digital cameras but also to various portable terminals with camera functions such as smartphones and tablet terminals, various surveillance cameras, industrial cameras, in-vehicle cameras, medical cameras, and the like.

Although the present invention has been described in detail based on its preferred embodiments, the present invention is not limited to these specific embodiments, and various forms that do not depart from the gist of the invention are also included in the present invention. Parts of the embodiments described above may be combined as appropriate. The present invention also includes the case where a software program that realizes the functions of the embodiments described above is supplied, either directly from a recording medium or via wired or wireless communication, to a system or apparatus having a computer capable of executing the program, and the program is executed. Accordingly, the program code itself that is supplied to and installed on a computer in order to realize the functional processing of the present invention on that computer also realizes the present invention. In other words, the computer program itself for realizing the functional processing of the present invention is also included in the present invention. In that case, the program may take any form as long as it has the functions of the program, such as object code, a program executed by an interpreter, or script data supplied to an OS. The recording medium for supplying the program may be, for example, a magnetic recording medium such as a hard disk or magnetic tape, an optical or magneto-optical storage medium, or a nonvolatile semiconductor memory. As a method of supplying the program, the computer program forming the present invention may be stored on a server on a computer network, and a connected client computer may download the computer program and run it.

10: lens, 20: imaging unit, 40: image processing unit, 42: system control unit, 44: operation unit, 46: RAM, 48: ROM, 50: display device

Claims

An image processing apparatus comprising:
detection means for detecting a subject from a captured image;
extraction means for extracting feature information from the detected subject;
recognition means for recognizing an individual of the subject based on the feature information;
registration means for registering the feature information in a database; and
storage means for primarily storing the feature information,
wherein the recognition means collates feature information extracted from a subject detected after the subject has disappeared from the image with the primarily stored feature information, and confirms whether the subject detected after disappearing from the image is the subject of the primarily stored feature information, and
wherein the registration means registers the feature information of the subject in the database when the recognition means has confirmed that the subject detected after disappearing from the image is the subject of the primarily stored feature information and the angle of view at which the subject of the primarily stored feature information was captured differs from the angle of view at which the subject detected after disappearing from the image was captured.

The image processing apparatus according to claim 1, wherein the storage means performs the primary storage when the feature information extracted from the subject is not registered in the database.

The image processing apparatus according to claim 1, wherein the detection means attaches identification information to the detected subject, and
the recognition means performs the collation and the confirmation when the identification information of the subject before disappearing from the image differs from the identification information of the subject detected after disappearing from the image.

The image processing apparatus according to claim 1, wherein the recognition means performs the collation and the confirmation between the feature information of a subject detected from an image captured with the angle of view moved after the subject disappeared from the image and the primarily stored feature information.

The image processing apparatus according to claim 4, wherein the movement of the angle of view includes movement by moving the imaging apparatus in at least one of a horizontal direction and a vertical direction.

The image processing apparatus according to claim 1, wherein the recognition means performs the collation and the confirmation between the feature information of a subject detected from an image captured with the angle of view changed after the subject disappeared from the image and the primarily stored feature information.

The image processing apparatus according to claim 6, wherein the change in the angle of view includes a change by a zoom operation of the imaging apparatus.

The image processing apparatus according to claim 6, wherein the change in the angle of view includes a change caused by moving the imaging apparatus toward or away from the subject.

The image processing apparatus according to claim 1, wherein the registration means performs the registration when the difference between the angle of view at which the subject detected after disappearing from the image was captured and the angle of view at which the subject of the feature information stored in the primary storage was captured is equal to or greater than a predetermined difference and within a predetermined range.
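The angle-of-view condition in this claim can be read as a simple threshold test. The following is a minimal sketch, assuming the angle of view is summarized as a single value in degrees; the function name and the two thresholds `MIN_DIFF_DEG` and `MAX_DIFF_DEG` are hypothetical, since the claim gives neither a unit nor concrete values.

```python
# Hypothetical thresholds; the claim only says "predetermined".
MIN_DIFF_DEG = 5.0    # assumed minimum angle-of-view difference
MAX_DIFF_DEG = 60.0   # assumed upper end of the permitted range


def angle_difference_allows_registration(stored_deg: float, current_deg: float) -> bool:
    """Return True when the two angles of view differ enough, but not too much."""
    diff = abs(current_deg - stored_deg)
    return MIN_DIFF_DEG <= diff <= MAX_DIFF_DEG
```

Under these assumed values, a 10 degree change qualifies for registration, while a 2 degree or 90 degree change does not.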

The image processing apparatus according to claim 1, wherein the recognition means performs the collation and the confirmation when the time from when the subject disappears from the image until a subject is detected is within a predetermined time range.
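The time condition in this claim could be checked as below. This is only a sketch: timestamps are assumed to be seconds on a common clock, and `MAX_GAP_SEC` is a hypothetical stand-in for the "predetermined time range", which the claim does not quantify.

```python
MAX_GAP_SEC = 3.0  # hypothetical limit; the claim only says "predetermined"


def within_reappearance_window(lost_at: float, found_at: float,
                               max_gap: float = MAX_GAP_SEC) -> bool:
    """Return True when a subject is detected soon enough after the loss."""
    gap = found_at - lost_at
    # A negative gap would mean the detection preceded the loss; reject it.
    return 0.0 <= gap <= max_gap
```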

The image processing apparatus according to any one of claims 1 to 10, wherein the registration means registers whichever has the higher reliability of the feature information extracted from the subject detected after disappearing from the image and the feature information stored in the primary storage.

The image processing apparatus according to claim 11, wherein the registration means determines the reliability based on at least one of the size of the subject in the image, the degree to which the subject faces the front, and a detection result of an organ included in the subject.
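One way to picture the reliability comparison is an equally weighted score over the three listed cues. The sketch below is illustrative only: the `Candidate` fields, the normalization to the range 0 to 1, and the equal weighting are all assumptions, since the claims do not specify how reliability is computed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Candidate:
    features: Sequence[float]
    size_ratio: float         # assumed: subject area / image area, 0..1
    frontalness: float        # assumed: 0..1, where 1 means facing the camera
    organs_detected: int      # assumed: number of organs (e.g. eyes, mouth) found
    organs_expected: int = 3  # hypothetical number of organs searched for


def reliability(c: Candidate) -> float:
    """Average the three cues; equal weighting is an assumption."""
    organ_score = c.organs_detected / c.organs_expected if c.organs_expected else 0.0
    return (min(c.size_ratio, 1.0) + c.frontalness + min(organ_score, 1.0)) / 3.0


def pick_more_reliable(stored: Candidate, new: Candidate) -> Candidate:
    """Keep the stored candidate unless the new one scores strictly higher."""
    return new if reliability(new) > reliability(stored) else stored
```

Keeping the stored candidate on ties is a design choice made here for determinism, not something the claims state.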

An image processing method executed by an image processing apparatus,
A detection step of detecting a subject from the captured image;
An extraction step of extracting feature information from the detected subject;
A recognition step of recognizing an individual of the subject based on the feature information;
A registration step of registering the feature information in a database;
A storage step of temporarily storing the feature information in primary storage,
In the recognition step, the feature information extracted from a subject detected after the subject disappears from the image is collated with the feature information stored in the primary storage to confirm whether the subject detected after the disappearance is the subject of the feature information stored in the primary storage,
In the registration step, the feature information of the subject is registered in the database when the recognition step has confirmed that the subject detected after disappearing from the image is the subject of the feature information stored in the primary storage, and the angle of view at which the subject of the feature information stored in the primary storage was captured differs from the angle of view at which the subject detected after disappearing from the image was captured.
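The flow of the storage, recognition, and registration steps can be sketched as a small state holder. This is a hedged illustration, not the claimed implementation: cosine similarity with `MATCH_THRESHOLD` stands in for the unspecified collation, the angle of view is assumed to be a single number, and the `Observation` and `Registrar` names, as well as registering the newly extracted features, are choices made for this example.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Sequence

MATCH_THRESHOLD = 0.9  # hypothetical similarity needed to call two subjects the same


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Observation:
    features: Sequence[float]
    angle_of_view: float  # assumed scalar summary of the angle of view


@dataclass
class Registrar:
    database: List[Sequence[float]] = field(default_factory=list)
    pending: Optional[Observation] = None  # the temporarily stored features

    def on_subject_lost(self, last_seen: Observation) -> None:
        """Storage step: hold the features of the subject that left the image."""
        self.pending = last_seen

    def on_subject_detected(self, obs: Observation) -> bool:
        """Recognition and registration steps; returns True if registered."""
        if self.pending is None:
            return False
        same_subject = cosine_similarity(obs.features, self.pending.features) >= MATCH_THRESHOLD
        view_differs = obs.angle_of_view != self.pending.angle_of_view
        if same_subject and view_differs:
            self.database.append(obs.features)  # assumption: register the new features
            self.pending = None
            return True
        return False
```

For example, after `on_subject_lost(Observation((1.0, 0.0), 30.0))`, a later detection with features `(0.99, 0.05)` at an angle of view of `45.0` would be registered, while the same features at `30.0` would not.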

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 12.