JP2012133573A

JP2012133573A - Image recognition device, image recognition method and program

Info

Publication number: JP2012133573A
Application number: JP2010284873A
Authority: JP
Inventors: Yasunobu Kodama; 泰伸兒玉
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-12-21
Filing date: 2010-12-21
Publication date: 2012-07-12
Anticipated expiration: 2030-12-21
Also published as: JP5761989B2

Abstract

PROBLEM TO BE SOLVED: To provide an image recognition device capable of accurately following a target in an image. SOLUTION: An image recognition device specifies a target region from an input image, and follows the target region by calculating an attention point representing the position of the specified target region. When the region that contains the attention point is different from the target region, the image recognition device corrects the position of the attention point to a position inside the target region and follows the target region based on the corrected attention point. The target can therefore be followed accurately even when it is, for example, doughnut-shaped.

Description

The present invention relates to an image recognition apparatus, an image recognition method, and a program, and more particularly to a technique suitable for tracking a specific subject.

Conventionally, a subject detection function is known that divides an image into regions sharing the same feature amount, such as the same color or luminance, based on the color and luminance information in the image, and automatically detects a subject area from among those regions. A subject tracking function is also known in which the user designates a point in a moving image, for example through a touch panel, and the designated area is tracked using its color and luminance information. A method that combines the two is also known: a point inside the subject area found by the subject detection function is designated, and the subject is then tracked by the subject tracking function.

For example, Patent Document 1 discloses a method for tracking a designated object. Specifically, a detection area is set for a tracking point on a moving object based on a direction specified by the user, and the center-of-gravity position of that area is estimated as the tracking point. This allows the user to designate a tracking point on a moving object with high accuracy and track the object.

Patent Document 1: JP 2007-272731 A

However, in the conventional technique disclosed in Patent Document 1, when the subject has a shape such as a doughnut, the center-of-gravity position may fall in a region that is not part of the subject. Tracking the center-of-gravity position may then result in tracking something other than the subject.

The present invention has been made in view of the above problem, and its object is to enable a subject in an image to be tracked accurately.

The image recognition apparatus of the present invention includes specifying means that specifies a subject area from an input image, calculating means that calculates an attention point representing the position of the subject area specified by the specifying means, and tracking means that tracks the subject area based on the attention point calculated by the calculating means. When the area that contains the attention point is different from the subject area, the calculating means corrects the position of the attention point to the inside of the subject area, and the tracking means tracks the subject area based on the attention point at the corrected position.

According to the present invention, a subject can be tracked accurately even if the center-of-gravity position of the detected subject area lies in a region that is not part of the subject.

FIG. 1 is a block diagram illustrating an example of the functional configuration of an image recognition apparatus according to an embodiment.
FIG. 2 is a flowchart illustrating an example of a procedure for switching between subject detection and subject tracking.
FIG. 3 is a flowchart illustrating an example of a processing procedure for detecting a subject area.
FIG. 4 is a diagram explaining a procedure for detecting a subject area by region division.
FIG. 5 is a flowchart illustrating an example of a processing procedure for tracking a subject.
FIG. 6 is a flowchart illustrating an example of a processing procedure for calculating and correcting an attention point.
FIG. 7 is a diagram explaining a procedure for correcting an attention point.

<Configuration of imaging device>
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the present embodiment, a digital camera is used as an example of the imaging apparatus.
FIG. 1 is a block diagram illustrating an example of the functional configuration of an image recognition apparatus 100 that performs subject detection and tracking in the digital camera according to the present embodiment.
In FIG. 1, an image signal input unit 101 inputs a digital image signal. In an imaging apparatus, for example, light entering through an optical unit (not shown) composed of a lens and the like is received, and a CCD unit (not shown) outputs a charge corresponding to the amount of light. An A/D conversion unit (not shown) then applies sampling, gain adjustment, A/D conversion, and the like to the analog image signal output from the CCD unit, and the resulting digital image signal is input. An image processing unit (not shown) further performs various kinds of image processing on the digital image signal input from the image signal input unit 101 and outputs the processed digital image signal. For example, the digital image signal is converted into a YUV image signal and output.

The subject area specifying unit 102 specifies a subject area in the image. Here, the subject is, for example, a human face, an animal, or a building. One way to specify the subject area is subject detection, which uses color information, luminance information, contrast information, and the like in the image. Alternatively, the subject area may be designated by a user operation, for example through a touch panel (not shown) arranged on the display surface of the imaging apparatus.

The subject area attention point calculation unit 103 calculates an attention point from the subject area specified by the subject area specifying unit 102. The attention point is, for example, the center of gravity of the subject area. The subject area attention point correction unit 104 corrects the position of the attention point calculated by the subject area attention point calculation unit 103. Whether to correct is decided, for example, by comparing the color information of the subject area with the color information around the attention point and checking whether they differ. Correction is performed when the two differ. As an example of the correction method, the direction in which to move the attention point is chosen from up, down, left, and right in the image, and the attention point is moved into an area whose color information matches that of the subject area.
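The correction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the embodiment: the subject area is given as a boolean block mask (a stand-in for the color comparison in the text), and the direction is chosen as whichever of up, down, left, and right reaches the subject area in the fewest steps. The names `centroid` and `correct_attention_point` are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[bool]]  # True where a block belongs to the subject area (assumed input)


def centroid(mask: Mask) -> Tuple[int, int]:
    """Return the (row, col) block nearest the center of gravity of the True blocks."""
    cells = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not cells:
        raise ValueError("mask contains no subject blocks")
    row = round(sum(r for r, _ in cells) / len(cells))
    col = round(sum(c for _, c in cells) / len(cells))
    return row, col


def correct_attention_point(mask: Mask, point: Tuple[int, int]) -> Tuple[int, int]:
    """Move the attention point up, down, left, or right until it lies inside the subject.

    Assumption: the direction needing the fewest steps wins; ties keep the
    first direction tried (up, down, left, right).
    """
    r0, c0 = point
    if mask[r0][c0]:
        return point  # already inside the subject area, no correction needed
    rows, cols = len(mask), len(mask[0])
    best = None
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c, steps = r0 + dr, c0 + dc, 1
        while 0 <= r < rows and 0 <= c < cols:
            if mask[r][c]:
                if best is None or steps < best[0]:
                    best = (steps, (r, c))
                break
            r, c, steps = r + dr, c + dc, steps + 1
    return best[1] if best else point


# A doughnut-shaped subject: the center of gravity falls in the hole.
donut = [
    [True, True, True, True, True],
    [True, False, False, False, True],
    [True, False, False, False, True],
    [True, False, False, False, True],
    [True, True, True, True, True],
]
start = centroid(donut)
fixed = correct_attention_point(donut, start)
```

For the doughnut mask above, the uncorrected attention point lands in the hole, and the corrected point lies on the ring, which is the situation the invention is meant to handle.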

The subject tracking unit 105 performs subject tracking based on the position of the attention point as corrected and determined by the subject area attention point correction unit 104. As a tracking method, for example, the distribution of color and luminance information around the attention point is stored when tracking starts, and tracking by pattern matching is then performed so that the position of the attention point keeps being updated as the subject moves.

The subject information display unit 106 performs display control so that information indicating the subject is displayed on a display unit (not shown) at the attention point position in the image output from the subject tracking unit 105. The information indicating the subject is, for example, a rectangular frame displayed around the attention point. With the above configuration, the attention point used for subject tracking can be set according to the shape of the subject area, and the subject can be tracked accurately.

<Switching between subject detection and subject tracking>
Next, a method for switching between subject detection and subject tracking will be described. FIG. 2 is a flowchart illustrating an example of a processing procedure for switching between subject detection and subject tracking.

First, in step S201, the subject tracking unit 105 determines whether subject tracking can be executed. When a subject area has been detected by the subject detection described later, subject tracking is regarded as executable. When step S201 is performed for the first time, the area to be tracked has not yet been determined, so subject tracking is regarded as not executable. If subject tracking is executable, the process proceeds to step S203; otherwise, the process proceeds to step S202.

In step S202, the subject area specifying unit 102 performs subject detection and detects a subject area in the image. Then, in step S204, it is determined whether the subject detection executed in step S202 found a subject area. This is determined, for example, by calculating an evaluation value representing subject likelihood during subject detection and checking whether that value is equal to or greater than a threshold. If a subject area was detected, the process proceeds to step S206; otherwise, the process proceeds to step S207.

In step S203, on the other hand, the subject tracking unit 105 tracks the subject by tracking the subject area found by subject detection. Then, in step S205, it is determined whether the subject tracking executed in step S203 succeeded. Success is determined, for example, by calculating an evaluation value of the tracking area from the tracking result and checking whether that value is equal to or greater than a threshold. If tracking succeeded, the process proceeds to step S208; otherwise, the process proceeds to step S207.

In step S206, because a subject area has been detected, the subject tracking unit 105 regards subject tracking as executable, and the process proceeds to step S209, where the attention point is calculated. In step S209, the subject area attention point calculation unit 103 calculates, from the subject area, the attention point to be used for subject tracking. The attention point is described in detail later.

In step S207, on the other hand, the subject tracking unit 105 regards subject tracking as not executable because subject detection or subject tracking has failed. In step S208, the subject tracking unit 105 regards subject tracking as executable because subject tracking has succeeded.

As described above, repeating the processing shown in FIG. 2 makes it possible to switch between subject detection and subject tracking. The present embodiment describes an example in which only one of subject detection and subject tracking is executed at a time, but both may be executed simultaneously and continuously. For example, when subject detection finds a new subject area while tracking is in progress, the subject in the newly detected area is tracked, depending on the evaluation values of subject detection and subject tracking.
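The flow of FIG. 2 can be read as a small state machine. The sketch below is one possible rendering under assumptions: the detector, tracker, and attention point calculator are passed in as callables that return an evaluation value, and a single threshold is shared by detection and tracking. The class name `DetectTrackSwitcher` and all signatures are hypothetical.

```python
from typing import Any, Callable, Optional, Tuple

Point = Tuple[int, int]


class DetectTrackSwitcher:
    """Switches between subject detection and subject tracking, one frame per step."""

    def __init__(
        self,
        detect: Callable[[Any], Tuple[float, Any]],          # frame -> (evaluation value, region)
        track: Callable[[Any, Point], Tuple[float, Point]],  # frame, point -> (evaluation value, new point)
        attention_point: Callable[[Any], Point],             # region -> attention point
        threshold: float = 0.5,                              # assumed shared threshold
    ) -> None:
        self.detect = detect
        self.track = track
        self.attention_point = attention_point
        self.threshold = threshold
        self.can_track = False          # first pass: nothing to track yet
        self.point: Optional[Point] = None

    def step(self, frame: Any) -> str:
        if self.can_track:                                   # S201 -> S203
            score, new_point = self.track(frame, self.point)
            if score >= self.threshold:                      # S205 -> S208
                self.point = new_point
                return "tracked"
            self.can_track = False                           # S207
            return "lost"
        score, region = self.detect(frame)                   # S202
        if score >= self.threshold:                          # S204 -> S206
            self.can_track = True
            self.point = self.attention_point(region)        # S209
            return "detected"
        self.can_track = False                               # S207
        return "none"
```

Calling `step` once per frame reproduces the loop: detection runs until it succeeds, tracking runs until it fails, and a tracking failure sends the next frame back to detection.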

<Execution of subject detection>
Next, the method used in step S202 of FIG. 2 to detect a subject area in the image will be described. FIG. 3 is a flowchart illustrating an example of a processing procedure by which the subject area specifying unit 102 detects a subject area.
First, in step S301, the image is divided into regions with differing feature amounts, based on feature amounts such as color and luminance in the image. An example of how to divide the image into a plurality of regions will be described with reference to FIG. 4.

For example, to divide the image shown in FIG. 4(a) into regions whose feature amounts, such as color and luminance, are regarded as identical, the image is first divided into a plurality of blocks. As shown in FIG. 4, the image is divided into, for example, 16 × 12 blocks. The color and luminance information of adjacent blocks is then compared to determine whether those blocks share the same features. Whether adjacent blocks share the same features is determined from the magnitude of the difference in luminance value or hue value between them. Alternatively, histograms of luminance and hue values are created for the entire image, ranges of values regarded as the same luminance or the same color are set from the shapes of the histogram peaks and valleys, and the determination is made according to whether adjacent blocks have luminance or hue values within the same range. This region division makes it possible to divide the image into regions having the same feature amount, as shown in FIG. 4(b).
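The difference-based variant of this region division can be sketched as a flood fill over the block grid. The sketch assumes each block is summarized by a single scalar (for instance its mean luminance) and that two adjacent blocks belong together when their values differ by at most `max_diff`; both the function name `segment_blocks` and that parameter are hypothetical, and hue wrap-around is not handled.

```python
from collections import deque
from typing import List


def segment_blocks(values: List[List[float]], max_diff: float) -> List[List[int]]:
    """Label 4-connected blocks whose neighbouring values differ by at most max_diff."""
    rows, cols = len(values), len(values[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # block already assigned to a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (
                        0 <= nr < rows
                        and 0 <= nc < cols
                        and labels[nr][nc] == -1
                        and abs(values[nr][nc] - values[cr][cc]) <= max_diff
                    ):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels


# Toy 3 x 3 grid of per-block luminance values (the text uses 16 x 12 blocks).
luminance = [
    [10, 11, 200],
    [10, 12, 201],
    [90, 90, 199],
]
regions = segment_blocks(luminance, max_diff=5)
```

Because neighbours are compared pairwise, a region can drift gradually in value across many blocks; the histogram-based alternative in the text avoids that by fixing value ranges for the whole image.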

Next, in step S302, background areas are determined among the regions obtained in step S301, and those background areas are removed from the subject area candidates. As an example of how to determine whether a region is a background area, an evaluation value of background likelihood (hereinafter, the first evaluation value) is calculated.

Here, the subject is assumed to be likely to lie near the center of the image, and the first evaluation value is determined, for example, by the ratio of the number of blocks touching the screen edge to the area of the region. A region with many blocks touching the screen edge has a large ratio of edge-touching blocks to area. Such a region is unlikely to be the subject and is regarded as background-like, so its first evaluation value is set high. A region with few blocks touching the screen edge has a small ratio of edge-touching blocks to area. Such a region is unlikely to be background and is regarded as subject-like, so its first evaluation value is set low.

In this way, as shown in FIG. 4(c) for example, regions (1), (7), (8), and (9), which have many blocks touching the screen edge and a high first evaluation value, can be regarded as background and removed from the subject area candidates. The method above does not distinguish among the top, bottom, left, and right edges of the screen when deciding what counts as background, but different conditions may be applied to each edge. For example, a region that touches the bottom edge but not the top, left, or right edge may be excluded from the background, or a region that touches two or more of the top, left, and right edges may be regarded as more background-like and given a higher first evaluation value.
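The edge-contact version of the first evaluation value can be sketched as below. It is an illustration under assumptions: the region map is a grid of integer labels such as a segmentation step would produce, the evaluation value is taken to be the raw ratio itself, and the cut-off `threshold` is a free parameter not given in the text. The function names are hypothetical.

```python
from typing import List, Set


def edge_contact_ratio(labels: List[List[int]], label: int) -> float:
    """Ratio of blocks touching the screen edge to the total blocks of one region."""
    rows, cols = len(labels), len(labels[0])
    area = 0
    on_edge = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != label:
                continue
            area += 1
            if r in (0, rows - 1) or c in (0, cols - 1):
                on_edge += 1
    if area == 0:
        raise ValueError(f"label {label} does not occur in the region map")
    return on_edge / area


def background_labels(labels: List[List[int]], threshold: float) -> Set[int]:
    """Regions whose first evaluation value (here: the ratio) reaches the threshold."""
    all_labels = {v for row in labels for v in row}
    return {lab for lab in all_labels if edge_contact_ratio(labels, lab) >= threshold}


# Region 1 sits in the middle of the frame; region 0 surrounds it.
region_map = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
background = background_labels(region_map, threshold=0.5)
```

Per-edge conditions like those mentioned in the text (for example, ignoring contact with the bottom edge) would replace the single `on_edge` count with one count per side.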

With the method above, it is difficult to classify as background a region whose blocks do not touch the screen edge but that lies far from the screen center. As another way to calculate the first evaluation value, therefore, the value may be based on whether each region lies toward the screen edge. In this case, the arrangement of the blocks making up the region and the center-of-gravity position of the region are calculated, and the result is used to determine whether the region lies toward the screen edge. If it does, its first evaluation value is set high. In this way, as shown in FIG. 4(d), regions (2) and (5), whose blocks do not touch the screen edge but that lie far from the screen center, can be regarded as background and removed from the subject area candidates.

These methods yield a first evaluation value for each region, and regions with a high first evaluation value can be regarded as background areas. In addition, regions whose color and luminance information resemble those of a region already removed as background are also regarded as background. For example, the average luminance, hue, and saturation of each region are calculated, and the differences between the averages of a removed region and those of a region not yet removed are computed. If these differences are small, the region is regarded as background even though it has not been removed as background, and it is removed from the subject area candidates.

Furthermore, the threshold below which a difference counts as small may be changed according to color. For example, green, as found in grass and trees, is assumed more likely to be background than subject, so the threshold is raised for green regions so that they are regarded as background even when the difference is relatively large. In this way, as shown in FIG. 4(e), background-like regions (3) and (6) can be regarded as background even though they lie near the screen center.

Next, in step S303, the region to be regarded as the subject is determined from among the regions not removed as background in step S302. As an example, a method is described in which an evaluation value representing subject likelihood (hereinafter, the second evaluation value) is calculated for each region and the region regarded as the subject is determined according to that value.

The second evaluation value uses the following weights Weight_1 to Weight_4. Weight_1 depends on the center-of-gravity position of the region within the image, and Weight_2 depends on the size of the region. Weight_3 depends on whether the shape of the region is close to an aspect ratio of 1:1, and Weight_4 depends on whether the shape of the region is close to a circle.

For example, the closer the center of gravity of a region is to the screen center, the more subject-like the region is considered, and the larger Weight_1 is made. Likewise, the larger the region, the more subject-like it is considered, and the larger Weight_2 is made. Whether the shape of a region is close to an aspect ratio of 1:1 is determined as follows. First, the distance from the center-of-gravity position of the region to its farthest point is calculated, and a circumscribed circle is drawn that is centered on the center of gravity and has the line segment from the center of gravity to the farthest point as its radius. The larger the ratio of the region's area to the area of this circumscribed circle, the closer the aspect ratio of the subject is considered to be to 1:1, and the larger Weight_3 is made. Whether the shape of a region is close to a circle is determined by calculating the length of the region's outer periphery and the ratio of the region's area to that length. When this ratio is large, the region is regarded as close to a circle, and Weight_4 is made larger.

The weights Weight_1 to Weight_4 are calculated in this way for each region not removed as background. The product of Weight_1 to Weight_4 is taken as the second evaluation value, and the region with the largest second evaluation value is determined to be the subject area. In the example shown in FIG. 4(e), regions (4), (10), and (11) are the regions that have not been removed from the candidates as background. Compared with regions (10) and (11), region (4) has a center of gravity closer to the screen center, is larger, has a shape closer to 1:1, and is closer to a circle. Region (4) therefore has the highest subject-likelihood evaluation value and is determined to be the subject area.
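The text fixes only the direction of each weight, not its formula, so the sketch below picks concrete mappings purely for illustration. Assumptions: Weight_1 falls linearly with distance from the screen center, Weight_2 is the fraction of the frame covered, Weight_3 is region area over circumscribed circle area, and Weight_4 swaps the plain area-to-perimeter ratio for the scale-independent isoperimetric quotient 4πA/P². The function name `second_evaluation_value` is hypothetical.

```python
import math
from typing import Set, Tuple

Cell = Tuple[int, int]  # (row, col) of one block


def second_evaluation_value(cells: Set[Cell], rows: int, cols: int) -> float:
    """Product of four illustrative weights for one candidate region."""
    area = len(cells)
    # Center of gravity in block units, using block centers.
    cy = sum(r + 0.5 for r, _ in cells) / area
    cx = sum(c + 0.5 for _, c in cells) / area

    # Weight_1: larger when the center of gravity is near the screen center.
    max_dist = math.hypot(rows / 2, cols / 2)
    weight_1 = 1.0 - math.hypot(cy - rows / 2, cx - cols / 2) / max_dist

    # Weight_2: larger for larger regions.
    weight_2 = area / (rows * cols)

    # Weight_3: region area over the area of the circle centered on the
    # center of gravity that reaches the farthest block corner.
    radius = max(
        math.hypot(r + dr - cy, c + dc - cx)
        for r, c in cells
        for dr in (0, 1)
        for dc in (0, 1)
    )
    weight_3 = area / (math.pi * radius ** 2)

    # Weight_4: isoperimetric quotient, with the perimeter counted as the
    # number of block sides not shared with another block of the region.
    perimeter = sum(
        (r + dr, c + dc) not in cells
        for r, c in cells
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
    )
    weight_4 = 4 * math.pi * area / perimeter ** 2

    return weight_1 * weight_2 * weight_3 * weight_4


# A compact square at the center of a 16 x 12 block frame versus a thin strip in a corner.
square = {(r, c) for r in range(4, 8) for c in range(6, 10)}
strip = {(0, c) for c in range(4)}
candidates = {"square": square, "strip": strip}
subject = max(candidates, key=lambda k: second_evaluation_value(candidates[k], 12, 16))
```

Since the weights are multiplied, a region that scores near zero on any one criterion is effectively ruled out, whatever its other scores.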

As a condition for determining the subject area, the requirement that the region have the highest second evaluation value may be supplemented by the requirement that this value be equal to or greater than a preset threshold. In addition, the second evaluation values of the regions may be compared with one another: when the largest second evaluation value differs greatly from those of the other regions, the region is likely to be a subject that stands out prominently from the rest. In such a case, the region with the largest second evaluation value is made easier to accept as the subject area.

The method above determines the subject area from a single image, but the subject area may instead be determined from a plurality of images that are consecutive in time. For example, a region is set as the subject area only when the same region has the largest second evaluation value in consecutive images. Alternatively, each time the same region is detected in consecutive images, a weight that raises its second evaluation value is added, and the region is determined to be the subject area once the value reaches or exceeds a set threshold. The subject area can be detected in the image by the methods above.

<Execution of subject tracking>
Next, the method used in step S203 of FIG. 2 to track the subject will be described. FIG. 5 is a flowchart illustrating an example of a processing procedure by which the subject tracking unit 105 tracks the subject.
First, in step S501, it is determined whether subject tracking is just starting. If tracking is just starting, the process proceeds to step S502; if tracking has already started, the process proceeds to step S503.

In step S502, a subject tracking area is set based on the attention point calculated by the subject area attention point calculation unit 103. As an example, the case where the image is divided into 16 × 12 blocks is described. First, a characteristic color is searched for using color information, centered on the block that contains the attention point. A characteristic color must satisfy two conditions: it forms a cluster of color covering at least a threshold area, and few blocks of the same color exist in the image. In other words, the characteristic color is color information that is abundant only in the subject area and absent from the background area. This suppresses erroneous tracking of areas other than the subject and makes it possible to track only the subject accurately. Once a characteristic color is found, the color information of the blocks surrounding the block that has the characteristic color is stored in a memory or the like (not shown) and used for subject tracking. Here, the surrounding blocks are, for example, the four blocks adjacent to the block with the characteristic color above, below, to the left, and to the right.

Next, in step S503, subject tracking is executed based on the subject area to be tracked. As in step S502, the case where the image is divided into a plurality of blocks is described. First, within a specific area of the image, blocks matching the characteristic color calculated in step S502 are searched for. Whether a block matches the characteristic color is determined, for example, by calculating the differences between the characteristic color and the target block for each of R, G, and B; if all of these differences are within a threshold, the block is regarded as matching.

If a block matches the feature color, the blocks surrounding the target block are also compared with the stored surrounding blocks, using their R, G, and B differences, to determine whether they match. An evaluation value is then calculated so that it becomes larger as the R, G, and B differences of the target block become smaller and as the number of matching surrounding blocks becomes larger. After all blocks whose evaluation value is at or above a threshold have been found within the specific region of the image, the distance between the subject region detected in the previous tracking and each such block is calculated, and the evaluation value is multiplied by a weight that becomes larger as the distance becomes shorter. The block with the highest weighted evaluation value (hereinafter, the third evaluation value) is regarded as the tracking destination of the subject region.
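One possible way to compute the third evaluation value is sketched below. The text only fixes the direction of each dependency (smaller color difference, more matching neighbors, and shorter distance all raise the score); the concrete formulas, the constants, and the names `base_score`, `distance_weight`, and `pick_tracking_block` are assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple

Block = Tuple[int, int]  # (column, row) index of a block


def base_score(color_diff: float, matching_neighbors: int,
               max_diff: float = 765.0, max_neighbors: int = 4) -> float:
    """Larger for a smaller summed RGB difference and for more matching neighbors."""
    color_term = 1.0 - min(color_diff, max_diff) / max_diff
    neighbor_term = matching_neighbors / max_neighbors
    return color_term + neighbor_term


def distance_weight(block: Block, previous: Block, scale: float = 4.0) -> float:
    """Weight in (0, 1] that grows as the block gets closer to the previous position."""
    d = math.hypot(block[0] - previous[0], block[1] - previous[1])
    return 1.0 / (1.0 + d / scale)


def pick_tracking_block(candidates: List[Tuple[Block, float, int]], previous: Block,
                        score_threshold: float = 1.0) -> Optional[Block]:
    """Return the block with the highest weighted score, or None if none qualifies."""
    best, best_value = None, -1.0
    for pos, color_diff, neighbors in candidates:
        score = base_score(color_diff, neighbors)
        if score < score_threshold:
            continue  # only blocks at or above the threshold are weighted
        value = score * distance_weight(pos, previous)
        if value > best_value:
            best, best_value = pos, value
    return best
```

With these assumed formulas, a near block with a slightly worse color match can still beat a distant block with a perfect match, which is the behavior the distance weighting is meant to produce.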

Subject tracking can be performed by the method described above. To follow changes in the subject and keep tracking accurate, information such as the color distribution of the subject region that was set when tracking started may be updated while tracking is in progress. In addition, although tracking here uses information from an image divided into a plurality of blocks, after the block with the highest third evaluation value has been determined, differences in color information and the like from the region at the start of tracking may be compared further using a higher-resolution image. This can make subject tracking more accurate.

Next, in step S504, it is determined whether subject tracking is to be performed again for the tracking destination region determined in step S503. For example, if the tracking destination region calculated by subject tracking is judged likely not to be the subject, subject tracking is not continued.

As an example of how to determine whether to continue subject tracking, an evaluation value (hereinafter, the fourth evaluation value) is calculated from a plurality of items, and the determination is made according to that value. To calculate the fourth evaluation value, a weight is first calculated for each of four items: the R, G, and B differences between the subject region at the start of tracking and the subject region currently being tracked, the size of the subject region, the distance of the subject region from the screen center, and the time elapsed since subject tracking started. The weights for these four items are then added together to give the fourth evaluation value.

For example, for the color information, the R, G, and B differences of each block are compared between the subject region at the start of tracking and the subject region being tracked. The larger the difference, the more the subject has changed and the more likely it is that a region other than the subject is being tracked, so the weight is made smaller. For the size of the subject region in the image, the maximum size since subject tracking started is compared with the current size. The smaller the current size is relative to the maximum size, the more likely it is that the subject region is no longer the main subject in the image compared with when tracking started, so the weight is made smaller.

Likewise, the farther the subject region is from the screen center, the more likely it is that it is no longer the main subject, so the weight is made smaller. As for the time elapsed since subject tracking started, the longer the time, the more likely it is that the subject can no longer be tracked accurately, so the weight is made smaller.

As described above, the fourth evaluation value is calculated to determine whether to perform subject tracking again. When the fourth evaluation value is large, the tracking accuracy is regarded as high and the tracked region as the main subject, so tracking continues. When the fourth evaluation value is small, the tracking accuracy is low and the region may no longer be the main subject, so tracking is stopped. The subject can be tracked by the above method.
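The continuation decision of step S504 can be sketched as a sum of four weights compared against a threshold. The text specifies only that each weight shrinks as its item worsens; the linear weight shapes, the normalization constants, the decision threshold, and the function names below are all assumptions.

```python
def clamp01(x: float) -> float:
    """Clamp a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def fourth_evaluation(color_diff: float, current_size: float, max_size: float,
                      center_distance: float, elapsed_frames: float,
                      max_color_diff: float = 765.0,
                      max_center_distance: float = 10.0,
                      max_frames: float = 300.0) -> float:
    """Sum of four weights; each weight gets smaller as its item gets worse."""
    w_color = 1.0 - clamp01(color_diff / max_color_diff)         # larger RGB change
    w_size = clamp01(current_size / max_size) if max_size > 0 else 0.0  # shrinking region
    w_center = 1.0 - clamp01(center_distance / max_center_distance)  # farther from center
    w_time = 1.0 - clamp01(elapsed_frames / max_frames)          # longer tracking time
    return w_color + w_size + w_center + w_time


def should_continue_tracking(value: float, threshold: float = 2.0) -> bool:
    """Continue tracking only while the fourth evaluation value stays high."""
    return value >= threshold
```

For instance, a region that has not changed color, is at its maximum size, sits at the screen center, and has just started being tracked scores 4.0 under these assumptions and tracking continues.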

<Correction of attention point position>
Next, the method of calculating and correcting the attention point of the subject region detected by subject detection, performed in step S209 in FIG. 2, is described. FIG. 6 is a flowchart illustrating an example of a processing procedure for calculating and correcting the attention point.
First, in step S601, the subject region attention point calculation unit 103 calculates the attention point from the blocks constituting the subject region. The attention point representing the position of the subject region is, for example, the block containing the center of gravity of the subject region.

Next, in step S602, the subject region attention point correction unit 104 determines whether the block to which the attention point belongs is included in the subject region. For example, if the elliptical gray region shown in FIG. 7(a) is the subject region, the black dot at its center is the center of gravity. In contrast, when the area around the center of the subject region is hollow, as shown in FIG. 7(b), the center of gravity lies in the hollow region.
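The hollow-subject case can be reproduced with a small sketch. It assumes the subject region is available as a binary block mask (1 for subject blocks) and tests membership directly on that mask, which is a simplification; the grid `DONUT` and the function name `centroid_block` are made up for illustration.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = block belongs to the subject region, 0 = it does not


def centroid_block(mask: Mask) -> Tuple[int, int]:
    """Return (row, col) of the block containing the center of gravity of the mask."""
    cells = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not cells:
        raise ValueError("mask contains no subject blocks")
    mean_r = sum(r for r, _ in cells) / len(cells)
    mean_c = sum(c for _, c in cells) / len(cells)
    return int(mean_r + 0.5), int(mean_c + 0.5)  # round half up to a block index


# Hypothetical donut-shaped subject: a ring of subject blocks around a hollow middle.
DONUT: Mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

row, col = centroid_block(DONUT)
needs_correction = DONUT[row][col] == 0  # the attention point fell into the hollow
```

Here the center of gravity lands on a block inside the hole, so the attention point would have to be corrected before tracking.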

Whether the block containing the center of gravity is included in the subject region is determined by comparing the color information and luminance information of that block with those of the subject region. If the differences in luminance, saturation, and hue between the subject region and the block containing the center of gravity are within a threshold, the block is determined to be included in the subject region and the position of the attention point is not corrected. In this case, the processing ends. If the differences exceed the threshold, the block is determined not to be included in the subject region, and the attention point must be corrected so that it lies inside the subject region. In this case, the processing proceeds to step S603.

In step S603, the subject region attention point correction unit 104 starts acquiring block information above, below, to the left of, and to the right of the attention point. The processing in steps S603 to S605 is repeated until the block information for all four directions has been acquired. The block information for the four directions is then compared in turn to determine the position to which the attention point is corrected.

In step S604, the subject region attention point correction unit 104 calculates, for each of the four directions from the attention point, the width OutsideLength[n] of the region that is not the subject region. In the example shown in FIG. 7(c), as the thin arrows indicate, there is one non-subject block above and one to the left of the attention point, and none to the right or below. That is, OutsideLength[1] = OutsideLength[3] = 1 and OutsideLength[2] = OutsideLength[4] = 0. Here, the index n is 1 for up, 2 for down, 3 for left, and 4 for right.

Next, in step S605, for each of the four directions from the attention point, the width InsideLength[n] of the subject region lying beyond the non-subject region is calculated. In the example shown in FIG. 7(c), as the thick arrows indicate, there are two subject blocks in each of the up and down directions and three in each of the left and right directions. That is, InsideLength[1] = InsideLength[2] = 2 and InsideLength[3] = InsideLength[4] = 3.
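Steps S604 and S605 amount to two run-length scans per direction. The sketch below assumes a binary block mask and a scan that stops at the image edge; the grid `MASK` is a hypothetical layout chosen so that it yields the same values as the FIG. 7(c) example, not a reproduction of the figure itself.

```python
from typing import Dict, List, Tuple

# Direction index n as used in the text: 1 = up, 2 = down, 3 = left, 4 = right.
DIRECTIONS: Dict[int, Tuple[int, int]] = {1: (-1, 0), 2: (1, 0), 3: (0, -1), 4: (0, 1)}


def run_lengths(mask: List[List[int]], point: Tuple[int, int], n: int) -> Tuple[int, int]:
    """Return (OutsideLength[n], InsideLength[n]) scanning from the attention point."""
    dr, dc = DIRECTIONS[n]
    rows, cols = len(mask), len(mask[0])
    r, c = point[0] + dr, point[1] + dc  # start at the block next to the point
    outside = 0
    while 0 <= r < rows and 0 <= c < cols and not mask[r][c]:
        outside += 1  # consecutive non-subject blocks
        r, c = r + dr, c + dc
    inside = 0
    while 0 <= r < rows and 0 <= c < cols and mask[r][c]:
        inside += 1  # consecutive subject blocks beyond the gap
        r, c = r + dr, c + dc
    return outside, inside


# Hypothetical mask; the attention point sits at row 3, column 4 (a non-subject block).
MASK = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
POINT = (3, 4)

outside_length = {n: run_lengths(MASK, POINT, n)[0] for n in DIRECTIONS}
inside_length = {n: run_lengths(MASK, POINT, n)[1] for n in DIRECTIONS}
```

For this mask the scans give OutsideLength values of 1, 0, 1, 0 and InsideLength values of 2, 2, 3, 3 for n = 1 to 4, matching the numbers quoted for FIG. 7(c).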

Once the processing in steps S603 to S605 has been repeated and the block information for all four directions has been acquired, step S606 determines in which of the four directions the attention point is to be shifted. The direction is determined, for example, as follows.

First, only the directions in which the width InsideLength[n] is at or above a threshold are selected. If the attention point were set inside a narrow part of the subject region, then when tracking uses information around the attention point, most of the tracked region would be background rather than subject. Tracking could then rely on background information, making erroneous tracking of something other than the subject region more likely. For this reason, the attention point is placed where a wide part of the subject region surrounds it.

Next, among the selected directions in which InsideLength[n] is at or above the threshold, the direction with the smallest OutsideLength[n] is chosen. This keeps the position detected by subject detection and the position tracked by subject tracking as close together as possible while suppressing erroneous tracking. If two or more directions share the smallest OutsideLength[n], the one with the larger InsideLength[n] is chosen. If InsideLength[n] fails to reach the threshold in every direction, the shape of the subject region may be complex. Subject tracking is difficult in that case, so subject detection is executed again. In the example shown in FIG. 7(c), the right direction, which has a large InsideLength[n] and a small OutsideLength[n], is chosen as the direction in which to shift the attention point.
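The selection rule of step S606 can be written compactly as a filter followed by a two-key minimum. The function name `choose_shift_direction` and the threshold value of 2 are assumptions; returning `None` stands in for "execute subject detection again".

```python
from typing import Dict, Optional


def choose_shift_direction(outside: Dict[int, int], inside: Dict[int, int],
                           inside_threshold: int = 2) -> Optional[int]:
    """Pick the direction index n (1 = up, 2 = down, 3 = left, 4 = right) to shift toward.

    Returns None when no direction has enough subject width, which corresponds
    to running subject detection again.
    """
    candidates = [n for n in inside if inside[n] >= inside_threshold]
    if not candidates:
        return None
    # Smallest OutsideLength first; ties are broken by the larger InsideLength.
    return min(candidates, key=lambda n: (outside[n], -inside[n]))
```

With the FIG. 7(c) values, down and right tie on OutsideLength = 0, and right wins because its InsideLength of 3 exceeds 2, which agrees with the direction stated in the text.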

Next, in step S607, the attention point is shifted in the direction determined in step S606. In the example shown in FIG. 7(d), the attention point is corrected to the right. The corrected position of the attention point is the center block of the subject region that lies in the correction direction.
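The corrected position of step S607 can be derived from the two widths already measured. In this sketch the center block is taken as the middle of the InsideLength[n] run; for an even run, picking the block nearer to the original attention point is an assumption, as is the function name `corrected_point`.

```python
from typing import Dict, Tuple

# Direction index n: 1 = up, 2 = down, 3 = left, 4 = right, as (row, col) steps.
OFFSETS: Dict[int, Tuple[int, int]] = {1: (-1, 0), 2: (1, 0), 3: (0, -1), 4: (0, 1)}


def corrected_point(point: Tuple[int, int], n: int,
                    outside_len: int, inside_len: int) -> Tuple[int, int]:
    """Move the attention point to the center block of the subject run in direction n."""
    if inside_len <= 0:
        raise ValueError("no subject blocks in the chosen direction")
    # Skip the non-subject gap, then step to the middle of the subject run.
    step = outside_len + (inside_len + 1) // 2
    dr, dc = OFFSETS[n]
    return point[0] + dr * step, point[1] + dc * step
```

For an attention point at row 3, column 4 with OutsideLength = 0 and InsideLength = 3 to the right, the subject run covers columns 5 to 7 and the corrected point is column 6.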

According to the present embodiment, with the method described above, the attention point is corrected when its position falls outside the detected subject region, so the subject can be tracked accurately even when it is donut-shaped.

Furthermore, the subject information display unit 106 displays information indicating the subject in the image, for example a rectangular frame centered on the subject region, so that the region being tracked is presented visually to the user. When the correction amount of the attention point is large, the actual subject region and the display position of the subject information would differ considerably. Therefore, subject tracking is performed using the corrected attention point while, depending on the size of the correction amount, the information indicating the subject is displayed using the attention point before correction.

(Other embodiments)
The present invention can also be realized by executing the following processing. That is, software (a program) that realizes the functions of the above-described embodiment is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the program.

102 Subject region specifying unit
103 Subject region attention point calculation unit
104 Subject region attention point correction unit
105 Subject tracking unit

Claims

An image recognition apparatus comprising:
specifying means for specifying a subject region from an input image;
calculating means for calculating an attention point representing the position of the subject region specified by the specifying means; and
tracking means for tracking the subject region based on the attention point calculated by the calculating means,
wherein, when the region including the attention point is a region different from the subject region, the calculating means corrects the position of the attention point to a position inside the subject region, and
the tracking means tracks the subject region based on the attention point at the corrected position.

The image recognition apparatus according to claim 1, wherein the specifying means specifies the subject region using at least luminance information or color information in the image, or specifies the subject region in accordance with an operation by a user.

The image recognition apparatus according to claim 1, wherein the attention point is the center of gravity of the subject region.

The image recognition apparatus according to claim 1, wherein the tracking means tracks the subject region using at least luminance information or color information in the image.

The image recognition apparatus according to any one of the preceding claims, wherein the calculating means determines whether the regions are different based on a difference in at least luminance information or color information between the region including the attention point and the subject region.

The image recognition apparatus according to any one of claims 1 to 5, wherein the specifying means calculates an evaluation value representing the likelihood of being a subject from the position, size, and shape of a region in the image, and specifies the subject region based on the calculated evaluation value.

The image recognition apparatus according to any one of claims 1 to 6, further comprising display control means for displaying on display means, together with the image, an indication that a subject exists in the subject region tracked by the tracking means,
wherein the display control means displays the indication on the display means using the attention point either before or after the correction by the calculating means.

An image recognition method comprising:
a specifying step of specifying a subject region from an input image;
a calculation step of calculating an attention point representing the position of the subject region specified in the specifying step; and
a tracking step of tracking the subject region based on the attention point calculated in the calculation step,
wherein, in the calculation step, when the region including the attention point is a region different from the subject region, the position of the attention point is corrected to a position inside the subject region, and
in the tracking step, the subject region is tracked based on the attention point at the corrected position.

A program causing a computer to execute:
a specifying step of specifying a subject region from an input image;
a calculation step of calculating an attention point representing the position of the subject region specified in the specifying step; and
a tracking step of tracking the subject region based on the attention point calculated in the calculation step,
wherein, in the calculation step, when the region including the attention point is a region different from the subject region, the position of the attention point is corrected to a position inside the subject region, and
in the tracking step, the subject region is tracked based on the attention point at the corrected position.