JP2011023898A

JP2011023898A - Display device, display method, and integrated circuit

Info

Publication number: JP2011023898A
Application number: JP2009166156A
Authority: JP
Inventors: Yoichi Sugino; 陽一杉野
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2009-07-14
Filing date: 2009-07-14
Publication date: 2011-02-03

Abstract

PROBLEM TO BE SOLVED: To provide a display device with which a candidate area can be easily specified without being affected by the movement of a subject.
SOLUTION: A video camera 100 includes a detection part 130, a generation part 1620, a synthesis part 1640, an operation part 180, and a selection part 1650. The detection part 130 detects a candidate area from image data. The generation part 1620 generates candidate image data for identifying each candidate area detected by the detection part 130. The synthesis part 1640 generates synthetic image data by synthesizing the candidate image data and the image data. A display part 160 displays the synthetic image data as a video. The operation part 180 receives input of specification information which specifies a specific candidate image among the candidate images which appear in the video. Based on the specification information input to the operation part 180, the selection part 1650 selects, as a reference area, the candidate area corresponding to the candidate image data specified through the operation part 180.
COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a display device, a display method, and an integrated circuit for displaying video.

Imaging devices such as video cameras and digital cameras are often used to photograph moving people or objects. An imaging device is equipped with a display device for displaying video, and during shooting the video of the subject is shown on this display device in real time. The user shoots still images or moving images while viewing the video shown on the display device.
Meanwhile, techniques have been proposed for automatically detecting a subject appearing in the video, such as a person, as a candidate area. Face detection is a known example of this type of technique. In a display device using such a technique, the user can select, from among the automatically detected candidate areas, the area (hereinafter also referred to as the main subject area) corresponding to the subject of greatest interest to the user (hereinafter also referred to as the main subject). When the main subject area is selected, automatic exposure control (AE: Auto Exposure) and automatic focus control (AF: Auto Focus) are performed with reference to the selected main subject area (see, for example, Patent Document 1).

Patent Document 1: JP 2006-101186 A

However, in a conventional display device, the main subject area is determined by directly specifying an automatically detected candidate area, so it is difficult to specify the candidate area when the main subject is moving in the video. This problem is not limited to display devices provided in imaging devices; it exists in any device that has a function of selecting a specific area from video.
An object of the present invention is to provide a display device, a display method, and an integrated circuit with which a candidate area can be easily specified without being affected by the movement of a subject.

A display device according to the present invention is a device for displaying a plurality of pieces of image data as video, and includes a detection unit, a generation unit, a synthesis unit, an operation unit, and a selection unit. The detection unit detects at least one candidate area having a specific visual feature from the image data. The generation unit generates at least one piece of candidate identification information for identifying each candidate area detected by the detection unit. The synthesis unit generates synthesized image data in which the candidate identification information and the image data are combined. A display unit displays the synthesized image data as video. The operation unit receives input of designation information that designates specific candidate identification information among the candidate identification information appearing in the video. Based on the designation information input to the operation unit, the selection unit selects, as a reference area, the candidate area corresponding to the candidate identification information designated via the operation unit.
In this display device, a candidate area having a specific visual feature is detected from the image data by the detection unit, and candidate identification information corresponding to the detected candidate area is generated by the generation unit. Once the candidate identification information has been generated, the synthesis unit generates synthesized image data in which the generated candidate identification information and the image data are combined, and a plurality of pieces of synthesized image data are displayed as video on the display unit. Because the video displayed on the display unit shows not only the image data but also the candidate identification information, the user can see the candidate identification information in the displayed video.

Furthermore, when the user inputs designation information using the operation unit in order to designate specific candidate identification information among the candidate identification information appearing in the video, the selection unit selects, as the reference area, the candidate area corresponding to the designated candidate identification information, based on the input designation information.
Thus, in this display device, the candidate identification information corresponding to a candidate area is displayed separately from the candidate area, and the candidate area can be selected indirectly by using the candidate identification information. Therefore, even when a candidate area displayed on the display unit is moving, the candidate area can be selected easily.
A display method according to the present invention is a method for displaying a plurality of pieces of image data as video, and includes a detection step, a generation step, a synthesis step, a display step, an operation step, and a selection step. In the detection step, at least one candidate area having a specific visual feature is detected from the image data. In the generation step, at least one piece of candidate identification information for identifying each candidate area detected in the detection step is generated. In the synthesis step, synthesized image data in which the candidate identification information and the image data are combined is generated. In the display step, the synthesized image data is displayed as video. In the operation step, input of designation information that designates specific candidate identification information among the candidate identification information appearing in the video is received. In the selection step, based on the designation information input to the operation unit, the candidate area corresponding to the candidate identification information designated via the operation unit is selected as a reference area.

In this display method, a candidate area having a specific visual feature is detected from the image data, and candidate identification information corresponding to the detected candidate area is generated. Once the candidate identification information has been generated, synthesized image data in which the generated candidate identification information and the image data are combined is generated, and a plurality of pieces of synthesized image data are displayed as video. Because the displayed video shows not only the image data but also the candidate identification information, the user can see the candidate identification information in the displayed video.
Furthermore, when the user inputs designation information in the operation step in order to designate specific candidate identification information among the candidate identification information appearing in the video, the candidate area corresponding to the designated candidate identification information is selected as the reference area, based on the input designation information.
Thus, in this display method, the candidate identification information corresponding to a candidate area is displayed separately from the candidate area, and the candidate area can be selected indirectly by using the candidate identification information. Therefore, even when a displayed candidate area is moving, the candidate area can be selected easily.
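The generation, operation, and selection steps described above can be illustrated with a minimal Python sketch. It assumes, purely for brevity, that the candidate identification information is a numeric label and that the designation information is the label the user enters; the names `CandidateArea`, `generate_identifiers`, and `select_reference_area` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateArea:
    """A detected candidate area (hypothetical layout: top-left corner plus size)."""
    x: int
    y: int
    width: int
    height: int


def generate_identifiers(areas):
    """Generation step: attach one label ("1", "2", ...) to each candidate area."""
    return {str(i + 1): area for i, area in enumerate(areas)}


def select_reference_area(identifiers, designation):
    """Selection step: map the designated label back to its candidate area."""
    return identifiers.get(designation)


# Two candidate areas detected in one frame (coordinates invented for the example).
areas = [CandidateArea(40, 30, 64, 64), CandidateArea(200, 50, 48, 48)]
labels = generate_identifiers(areas)

# The user designates label "2" instead of pointing at the moving area itself.
reference_area = select_reference_area(labels, "2")
print(reference_area)
```

Because the user designates a label rather than a position in the video, the lookup does not depend on where the candidate area currently is in the frame.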

An integrated circuit according to the present invention is an integrated circuit for displaying a plurality of pieces of image data as video, and includes a detection unit, a generation unit, a synthesis unit, an operation unit, and a selection unit. The detection unit detects at least one candidate area having a specific visual feature from the image data. The generation unit generates at least one piece of candidate identification information for identifying each candidate area detected by the detection unit. The synthesis unit generates synthesized image data in which the candidate identification information and the image data are combined. A display unit displays the synthesized image data as video. The operation unit receives input of designation information that designates specific candidate identification information among the candidate identification information appearing in the video. Based on the designation information input to the operation unit, the selection unit selects, as a reference area, the candidate area corresponding to the candidate identification information designated via the operation unit.
In this integrated circuit, a candidate area having a specific visual feature is detected from the image data by the detection unit, and candidate identification information corresponding to the detected candidate area is generated by the generation unit. Once the candidate identification information has been generated, the synthesis unit generates synthesized image data in which the generated candidate identification information and the image data are combined, and a plurality of pieces of synthesized image data are displayed as video on the display unit. Because the video displayed on the display unit shows not only the image data but also the candidate identification information, the user can see the candidate identification information in the displayed video.

Furthermore, when the user inputs designation information using the operation unit in order to designate specific candidate identification information among the candidate identification information appearing in the video, the selection unit selects, as the reference area, the candidate area corresponding to the designated candidate identification information, based on the input designation information.
Thus, in this integrated circuit, the candidate identification information corresponding to a candidate area is displayed separately from the candidate area, and the candidate area can be selected indirectly by using the candidate identification information. Therefore, even when a candidate area displayed on the display unit is moving, the candidate area can be selected easily.
Here, the display device may be any device that displays a plurality of pieces of image data as video and has a function of detecting a specific area appearing in the video. Specific examples include display devices mounted on imaging devices such as video cameras, digital cameras, and mobile phones with a camera function, as well as display devices mounted on electronic devices such as televisions, car navigation systems, and game machines.

Examples of a "candidate area having a specific visual feature" include an area representing all or part of a human body (for example, a face area) and an area having a specific color.

As described above, with the display device, the display method, and the integrated circuit according to the present invention, a candidate area can be easily specified without being affected by the movement of a subject.

Block diagram of the video camera 100
Block diagram of the detection unit 130
Block diagram of the system control unit 157 and its surroundings
Display example of the display unit 160
Flowchart of face detection processing
Flowchart of face detection processing
Flowchart of face detection processing
Block diagram of the detection unit 230 and its surroundings
Block diagram of the system control unit 257 and its surroundings
Flowchart of registration mode (second embodiment)
Flowchart of face detection processing (second embodiment)
Flowchart of face detection processing (second embodiment)
Display example of the display unit 160 (second embodiment)
Example of an input screen for entering a character string (third embodiment)
Flowchart of registration mode (third embodiment)
Display example of the display unit 160 (third embodiment)
Display example of the display unit 160 (another embodiment)

[First Embodiment]
<Overview of video camera>
The video camera 100 will be described with reference to FIGS. 1 to 3.
FIG. 1 is a block diagram of the video camera 100. In FIG. 1, the range surrounded by a broken line represents the video camera 100 (an example of an imaging device).
As shown in FIG. 1, the video camera 100 mainly includes an optical system 105, a video signal generation unit 70, a video signal analysis unit 71, a system control unit 157, an optical mechanism control unit 72, a recording processing unit 73, a display unit 160, an operation unit 180, a buffer memory 135, and a system bus 134. For example, the video signal analysis unit 71, the system control unit 157, the optical mechanism control unit 72, the recording processing unit 73, the display unit 160, and the operation unit 180 constitute a display device that displays a plurality of pieces of image data as video.

<Optical system>
The optical system 105 forms an optical image of the subject and guides the optical image to the image sensor 110. Specifically, the optical system 105 includes a zoom lens group (not shown), a focus lens (not shown), an aperture unit (not shown), and a shutter unit (not shown).
The zoom lens group has a plurality of lenses, which are supported by a plurality of lens frames so as to be movable in the optical axis direction. When the magnification of the optical image of the subject is changed, each lens is driven along the optical axis by a zoom drive unit (not shown). The focus lens is a lens for focusing (adjusting the in-focus state) and is driven by a focus motor (not shown). The focus motor is controlled by a lens control unit 170 (described later). Driving the focus lens in the optical axis direction adjusts the distance from the video camera 100 to the subject that is imaged in focus on the image sensor 110, that is, it adjusts the in-focus state. The aperture unit is a unit for exposure control and is controlled by an exposure control unit 165 (described later).

<Video signal generation unit>
The video signal generation unit 70 is a unit that generates a digital video signal based on the optical image of the subject, and includes the image sensor 110, an A/D conversion unit 115, a video signal processing unit 120, and a Y/C conversion unit 125.
The image sensor 110 converts the optical image of the subject formed by the optical system 105 into an electric signal (video signal). For example, a CCD (Charge Coupled Device) image sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor is used as the image sensor 110. The A/D conversion unit 115 converts the analog video signal output from the image sensor 110 into a digital video signal in RGB format.

The video signal processing unit 120 applies video signal processing such as gain adjustment, noise removal, gamma correction, aperture processing, and knee processing to the digital video signal output from the A/D conversion unit 115. The video signal processing unit 120 outputs the processed digital video signal to the Y/C conversion unit 125, the detection unit 130, and the image information extraction unit 132 (described later). The Y/C conversion unit 125 converts the digital video signal processed by the video signal processing unit 120 from RGB format to Y/C format. The digital video signals output from the video signal processing unit 120 and the Y/C conversion unit 125 are information representing the video of the subject and are composed of a plurality of frames of image data.
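The specification does not state which conversion matrix the Y/C conversion unit 125 uses. As a rough illustration only, the following Python sketch converts one 8-bit RGB pixel to luminance (Y) and color-difference (Cb, Cr) values using the widely used ITU-R BT.601 luma weights in a full-range form; the choice of BT.601 and the function name `rgb_to_ycbcr` are assumptions.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range Y, Cb, Cr (BT.601 weights, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = 128 + 0.564 * (b - y)              # blue color difference, offset to mid-scale
    cr = 128 + 0.713 * (r - y)              # red color difference, offset to mid-scale

    def clamp(value):
        # Round to the nearest integer and keep the result in the 8-bit range.
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)


print(rgb_to_ycbcr(255, 255, 255))
print(rgb_to_ycbcr(0, 0, 0))
```

Neutral pixels (equal R, G, and B) map to Cb = Cr = 128, so all of their information is carried by the Y component.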
<Buffer memory>
The buffer memory 135 acquires, through the system bus 134, the digital video signal converted into Y/C format by the Y/C conversion unit 125, and temporarily stores it as a plurality of frames of image data (hereinafter also referred to as reference image data). The system control unit 157 (described later) reduces the digital video signal stored in the buffer memory 135 to a size suitable for display on the display unit 160 (described later). The reduced digital video signal is stored in the buffer memory 135 as a display video signal. The display video signal is composed of a plurality of frames of image data. The image data constituting the display video signal reduced by the system control unit 157 is hereinafter also referred to as display image data.

<Recording processing unit>
The recording processing unit 73 is a unit that saves the digital video signal generated by the video signal generation unit 70 to an external recording medium, and includes a CODEC 140, a recording I/F unit 145, and a socket 150. The CODEC 140 applies lossy (irreversible) compression to the Y/C format digital video signal stored in the buffer memory 135. The recording I/F unit 145 acquires the digital video signal compressed by the CODEC 140 via the system bus 134 and records it on a recording medium 155. The recording medium 155 can be attached to the socket 150, which electrically connects the recording medium 155 to the video camera 100. An example of the recording medium 155 is a memory card.
<Video signal analysis unit>
The video signal analysis unit 71 is a unit that acquires necessary information from the input video signal, and includes a detection unit 130 and an image information extraction unit 132.

(1) Detection unit
The detection unit 130 receives the digital video signal output from the video signal processing unit 120 and detects a candidate area having a specific visual feature from the input digital video signal. Specifically, as shown in FIG. 2, the detection unit 130 includes an image processing unit 200, a volatile memory 210, and a face detection unit 220. In this embodiment, the candidate area is an area representing a human face.
The image processing unit 200 applies reduction processing to the digital video signal input from the video signal processing unit 120. More specifically, the image processing unit 200 sequentially reduces the plurality of frames of image data constituting the digital video signal. The image data reduced by the image processing unit 200 is hereinafter also referred to as detection image data. Because the reduction ratio used here is the same as the reduction ratio of the display image data relative to the reference image data, coordinate information on the detection image data can be used directly as coordinate information on the display image data. The volatile memory 210 sequentially stores the detection image data output from the image processing unit 200.
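The consequence of using a common reduction ratio can be shown with a short Python sketch. The ratio of 1/4, the rectangle values, and the helper `to_reference_coords` are assumptions made for illustration; the specification only states that coordinates on the detection image data can be reused on the display image data, and the mapping back to the unreduced image is shown here as one plausible use.

```python
def to_reference_coords(x, y, width, height, reduction_ratio):
    """Scale a rectangle found on a reduced image back to the unreduced image."""
    scale = 1.0 / reduction_ratio
    return (round(x * scale), round(y * scale),
            round(width * scale), round(height * scale))


# Assumed example: detection and display images are both 1/4 the reference size.
RATIO = 0.25
face_on_detection = (40, 30, 16, 16)   # x, y, width, height on the detection image

# Same reduction ratio, so the rectangle can be drawn on the display image unchanged.
face_on_display = face_on_detection

# The unreduced image needs the rectangle scaled up by 1 / RATIO.
face_on_reference = to_reference_coords(*face_on_detection, RATIO)
print(face_on_display, face_on_reference)
```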

The face detection unit 220 detects a face area (an example of a candidate area) from one frame of detection image data stored in the volatile memory 210. A reference feature amount (an example of reference information) indicating general features of a human face is stored in advance in the ROM (described later) of the system control unit 157. The reference feature amount is data indicating the features of a face as it appears in an image. The face detection unit 220 acquires the reference feature amount from the ROM, compares the sequentially input detection image data with the reference feature amount by a technique such as pattern matching, and detects, from the detection image data of each frame, an area that satisfies the conditions of the reference feature amount as a candidate area.
Furthermore, the face detection unit 220 generates coordinate information indicating the position of the detected face area in the image and size information indicating the size of the face area, and transmits the coordinate information and the size information (hereinafter also referred to collectively as candidate information) to the system control unit 157. When one frame of detection image data contains a plurality of face areas, the face detection unit 220 transmits the coordinate information and size information corresponding to each face area to the system control unit 157. Because the face detection unit 220 detects face areas in the detection image data of every frame, the coordinate information and size information of the face areas are sent to the system control unit 157 frame by frame.
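The specification names pattern matching only as one possible technique and does not describe the reference feature amount in detail. The toy Python sketch below therefore substitutes a plain sum-of-absolute-differences (SAD) template match on a tiny grayscale grid, just to show how a match can be turned into candidate information (coordinates and size). The function `find_candidates`, the threshold `max_sad`, and the data are hypothetical, and a practical face detector would use far richer features.

```python
def find_candidates(image, template, max_sad):
    """Slide the template over the image and report windows whose SAD is small enough."""
    th, tw = len(template), len(template[0])
    hits = []
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            # Sum of absolute differences between the window and the template.
            sad = sum(
                abs(image[y + j][x + i] - template[j][i])
                for j in range(th)
                for i in range(tw)
            )
            if sad <= max_sad:
                # Candidate information: position and size of the matching window.
                hits.append({"x": x, "y": y, "width": tw, "height": th})
    return hits


template = [[9, 9],
            [9, 0]]
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
]
candidates = find_candidates(image, template, max_sad=0)
print(candidates)
```

Running the same search on every frame yields one list of candidate information per frame, which corresponds to the frame-by-frame transmission described above.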

(2) Image information extraction unit
The image information extraction unit 132 extracts various kinds of information contained in the digital video signal, for example, the luminance level value and the high frequency component in a specific area of the image. In consideration of the accuracy of automatic exposure control and automatic focus control, the image processing in the image information extraction unit 132 uses the unreduced image data constituting the digital video signal input from the video signal processing unit 120. In this embodiment, the main subject area (described later) selected via the operation unit 180 is used as the specific area of the image.
The extracted luminance level value is used for automatic exposure control. Specifically, the image information extraction unit 132 calculates the luminance level value of the main subject area from the sequentially input image data and outputs the luminance level value to the system control unit 157.
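How the luminance level value is computed is not specified. One simple possibility, assumed here, is the mean of the luma samples inside the main subject area. The Python sketch below uses a hypothetical `mean_luminance` helper and an invented 4x4 luma plane.

```python
def mean_luminance(luma, region):
    """Average the luma samples inside region = (x, y, width, height)."""
    x, y, w, h = region
    total = sum(sum(row[x:x + w]) for row in luma[y:y + h])
    return total / (w * h)


# Invented luma plane: a bright 2x2 subject on a dark background.
frame = [
    [10, 10, 10, 10],
    [10, 200, 220, 10],
    [10, 180, 200, 10],
    [10, 10, 10, 10],
]
main_subject_area = (1, 1, 2, 2)
level = mean_luminance(frame, main_subject_area)
print(level)
```

Measuring only the main subject area means the dark background in this example does not pull the exposure decision away from the subject.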

The system control unit 157 calculates an optimum aperture value based on the luminance level value calculated by the image information extraction unit 132 and transmits the calculated aperture value to the exposure control unit 165. Based on the aperture value calculated by the system control unit 157, the exposure control unit 165 adjusts the opening of the aperture unit of the optical system 105. The luminance level value is calculated for the image data of each frame.
The high frequency component is used for contrast-detection automatic focus control, known as the "hill-climbing method". The high frequency component contained in the digital video signal changes according to the degree of focus and takes its maximum value when the focus lens is at the in-focus position. By finding the maximum value of the high frequency component of a specific area while moving the focus lens of the optical system 105 in the optical axis direction, the focus lens position at which that area is in focus can be determined.

具体的には、レンズ制御部１７０によりフォーカスレンズが光軸方向に駆動されている間に、画像情報抽出部１３２は、順次入力される画像データから主被写体領域の高域周波数成分を算出し、高域周波数成分をシステム制御部１５７に出力する。システム制御部１５７は、画像情報抽出部１３２により算出された高域周波数成分を画像情報抽出部１３２から取得し、それと同時に、その高域周波数成分が算出された際のフォーカスレンズの位置情報をレンズ制御部１７０から取得する。
順次取得されるこれらの情報に基づいて、システム制御部１５７は、高域周波数成分の最大値およびその最大値に対応するフォーカスレンズの位置（合焦位置）を算出し、フォーカスレンズを合焦位置まで駆動するための制御信号を生成する。この制御信号に基づいて、レンズ制御部１７０はフォーカスレンズを合焦位置まで駆動し、上記の合焦動作が繰り返される。これにより、主被写体の顔領域に対して常時焦点を合わせることができる。 Specifically, while the focus lens is driven in the optical axis direction by the lens control unit 170, the image information extraction unit 132 calculates a high frequency component of the main subject region from sequentially input image data, The high frequency component is output to the system control unit 157. The system control unit 157 acquires the high frequency component calculated by the image information extraction unit 132 from the image information extraction unit 132, and at the same time, obtains the position information of the focus lens when the high frequency component is calculated as the lens. Obtained from the control unit 170.
Based on the sequentially acquired information, the system control unit 157 calculates the maximum value of the high-frequency component and the focus lens position corresponding to that maximum (the in-focus position), and generates a control signal for driving the focus lens to the in-focus position. Based on this control signal, the lens control unit 170 drives the focus lens to the in-focus position, and the above focusing operation is repeated. As a result, the face area of the main subject can be kept in focus at all times.
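The search for the in-focus position described above can be illustrated with a minimal Python sketch. It is not the implementation of the system control unit 157: the function names, the integer lens positions, and the synthetic `sharpness` response are all assumptions made for illustration. The sketch simply sweeps the lens positions, records the high-frequency component at each one, and keeps the position where it peaks.

```python
from typing import Callable, Tuple


def find_focus_position(
    measure: Callable[[int], float], start: int, stop: int, step: int = 1
) -> Tuple[int, float]:
    """Sweep lens positions from start to stop (inclusive) and return the
    (position, value) pair at which the measured high-frequency component
    is largest."""
    best_pos, best_val = start, measure(start)
    for pos in range(start + step, stop + 1, step):
        val = measure(pos)
        if val > best_val:
            best_pos, best_val = pos, val
    return best_pos, best_val


def sharpness(pos: int) -> float:
    """Hypothetical high-frequency response that peaks at lens position 37."""
    return 100.0 - (pos - 37) ** 2


peak_pos, peak_val = find_focus_position(sharpness, 0, 100)
print(peak_pos, peak_val)
```

A coarser `step` shortens the sweep at the cost of precision, since the peak can then only be located to the nearest sampled position.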

＜光学機構制御ユニット＞
光学機構制御ユニット７２は、フォーカスモータとシャッターユニットと絞りユニットとを制御するユニットであり、露出制御部１６５と、レンズ制御部１７０と、を有している。
露出制御部１６５は、システム制御部１５７で算出された絞り値に基づいて光学系１０５に備えられた絞りユニットを駆動する。レンズ制御部１７０は、システム制御部１５７で算出されたフォーカスレンズの目標位置に基づいてフォーカスモータを制御し、フォーカスモータにより光学系１０５に含まれるフォーカスレンズが目標位置まで駆動される。
＜表示部＞
表示部１６０は、デジタル映像信号を表示するデバイスであり、たとえば小型のＬＣＤパネルである。表示部１６０はバッファメモリ１３５上に格納されている表示用映像信号を映像として表示する。 <Optical mechanism control unit>
The optical mechanism control unit 72 is a unit that controls the focus motor, the shutter unit, and the aperture unit, and includes an exposure control unit 165 and a lens control unit 170.
The exposure control unit 165 drives the aperture unit provided in the optical system 105 based on the aperture value calculated by the system control unit 157. The lens control unit 170 controls the focus motor based on the target position of the focus lens calculated by the system control unit 157, and the focus lens included in the optical system 105 is driven to the target position by the focus motor.
<Display section>
The display unit 160 is a device that displays a digital video signal, and is, for example, a small LCD panel. The display unit 160 displays the display video signal stored on the buffer memory 135 as a video.

後述するように、検出部１３０により検出された顔領域が存在する場合、表示用映像信号は、検出された顔領域に基づいて生成された候補画像と合成され、さらに、現在の動作状況を示すアイコン、記録時間および残バッテリー時間といった各種情報を示す画像と合成される。これらの画像が合成された表示用映像信号は、システムバス１３４を通して表示部１６０に送信され、表示部１６０により映像として表示される。
ここで、表示部１６０の表示例について説明する。図４に示すように、例えば、表示部１６０には、２人の人間（被写体）が映し出されており、両被写体の顔がそれぞれ検出部１３０により顔領域として検出されている。右側の被写体の顔４００ａは第１検出枠４３０ａにより囲まれており、左側の被写体の顔４００ｂは第２検出枠４３０ｂにより囲まれている。第１検出枠４３０ａに囲まれている領域が顔４００ａの顔領域であり、第２検出枠４３０ｂに囲まれている領域が顔４００ｂの顔領域である。後述するように、顔検出処理は各フレームの検出用画像データに対して行われ、かつ、被写体が移動しても顔領域が追尾されるようになっているため、第１検出枠４３０ａおよび第２検出枠４３０ｂが被写体の動きに合わせて移動する。 As will be described later, when the face area detected by the detection unit 130 exists, the display video signal is combined with the candidate image generated based on the detected face area, and further indicates the current operation status. It is combined with an image showing various information such as an icon, recording time and remaining battery time. The display video signal in which these images are combined is transmitted to the display unit 160 through the system bus 134 and displayed as a video by the display unit 160.
Here, a display example of the display unit 160 will be described. As shown in FIG. 4, for example, two people (subjects) are shown on the display unit 160, and the face of each subject is detected as a face area by the detection unit 130. The face 400a of the right subject is surrounded by a first detection frame 430a, and the face 400b of the left subject is surrounded by a second detection frame 430b. The area surrounded by the first detection frame 430a is the face area of the face 400a, and the area surrounded by the second detection frame 430b is the face area of the face 400b. As will be described later, the face detection process is performed on the detection image data of each frame, and the face area is tracked even when the subject moves, so the first detection frame 430a and the second detection frame 430b move in accordance with the movement of the subjects.

さらに、画面内の下側には横長の候補表示領域４１０が表示されており、この候補表示領域４１０は画面内で静止している。候補表示領域４１０には４つの区画領域４１０ａ〜４１０ｄ、左側操作領域４２０ａおよび右側操作領域４２０ｂが表示されている。
区画領域４１０ａ〜４１０ｄには検出された顔領域の画像が表示される。例えば、区画領域４１０ａには第１検出枠４３０ａ内の顔画像（右側の被写体の顔４００ａの画像）が表示されており、区画領域４１０ｂには第２検出枠４３０ｂ内の顔画像（左側の被写体の顔４００ｂの画像）が表示されている。検出されている顔領域が５つ以上の場合は、４つの区画領域４１０ａ〜４１０ｄでは表示しきれないため、５つのうち４つの顔画像が区画領域４１０ａ〜４１０ｄに表示され、残り１つの顔画像は表示されないが、左側操作領域４２０ａおよび右側操作領域４２０ｂに触れることで、区画領域４１０ａ〜４１０ｄに表示されている顔画像を左右にスクロールさせることができ、表示されていない顔画像を表示させることができる。 Further, a horizontally long candidate display area 410 is displayed on the lower side of the screen, and this candidate display area 410 is stationary in the screen. In the candidate display area 410, four partition areas 410a to 410d, a left operation area 420a, and a right operation area 420b are displayed.
Images of the detected face areas are displayed in the partition areas 410a to 410d. For example, the face image in the first detection frame 430a (the image of the right subject's face 400a) is displayed in the partition area 410a, and the face image in the second detection frame 430b (the image of the left subject's face 400b) is displayed in the partition area 410b. When five or more face areas are detected, they cannot all be shown in the four partition areas 410a to 410d, so four of the five face images are displayed in the partition areas 410a to 410d and the remaining one is not displayed. By touching the left operation area 420a or the right operation area 420b, however, the face images displayed in the partition areas 410a to 410d can be scrolled left and right so that a face image that is not currently displayed is brought into view.
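The scrolling behavior of the four partition areas can be sketched as a sliding window over the list of detected faces. This is only an illustration under stated assumptions: the names `visible_candidates` and `scroll`, the use of an integer offset, and the mapping of the left and right operation areas to a direction of -1 and +1 are hypothetical and not taken from the source.

```python
from typing import List, Optional, Sequence

SLOT_COUNT = 4  # the four partition areas 410a to 410d


def visible_candidates(candidates: Sequence[str], offset: int) -> List[Optional[str]]:
    """Return what each partition area shows; None marks an empty area."""
    window = list(candidates[offset:offset + SLOT_COUNT])
    return window + [None] * (SLOT_COUNT - len(window))


def scroll(offset: int, direction: int, total: int) -> int:
    """Shift the window by one candidate (assumed: -1 for the left operation
    area, +1 for the right one), clamped so the window stays in range."""
    max_offset = max(0, total - SLOT_COUNT)
    return min(max(offset + direction, 0), max_offset)


faces = ["A", "B", "C", "D", "E"]  # hypothetical labels for five detected faces
offset = scroll(0, +1, len(faces))
print(visible_candidates(faces, offset))
```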

また、第１検出枠４３０ａおよび第２検出枠４３０ｂは、異なる視覚的特徴を有している。具体的には、第１検出枠４３０ａは、間隔が狭い破線で描かれており、第２検出枠４３０ｂは第１検出枠４３０ａよりも間隔が広い破線で描かれている。また、第１検出枠４３０ａの顔画像が表示されている区画領域４１０ａには、第１検出枠４３０ａと同じ視覚的特徴を有する第１装飾枠４１１ａが付加されており、第２検出枠４３０ｂの顔画像が表示されている区画領域４１０ｂには、第２検出枠４３０ｂと同じ視覚的特徴を有する第２装飾枠４１１ｂが付加されている。さらに、現在、主被写体として選択されている顔領域が明確になるように、主被写体として選択されている顔領域の検出枠および装飾枠は、例えば他の枠よりも太い線で表示されるようになっている。図４では、第２検出枠４３０ｂの顔領域が主被写体として選択されているため、第２検出枠４３０ｂおよび第２装飾枠４１１ｂが太線で表示されている。 Further, the first detection frame 430a and the second detection frame 430b have different visual characteristics. Specifically, the first detection frame 430a is drawn with a broken line with a narrow interval, and the second detection frame 430b is drawn with a broken line with a wider interval than the first detection frame 430a. In addition, a first decorative frame 411a having the same visual characteristics as the first detection frame 430a is added to the partitioned area 410a where the face image of the first detection frame 430a is displayed, and the second detection frame 430b A second decorative frame 411b having the same visual characteristics as the second detection frame 430b is added to the partitioned area 410b where the face image is displayed. Further, the detection frame and decoration frame of the face area selected as the main subject are displayed with thicker lines than other frames, for example, so that the face area currently selected as the main subject becomes clear. It has become. In FIG. 4, since the face area of the second detection frame 430b is selected as the main subject, the second detection frame 430b and the second decoration frame 411b are displayed in bold lines.

なお、本実施形態では、顔領域（第１検出枠４３０ａおよび第２検出枠４３０ｂ）および区画領域４１０ａ〜４１０ｄは正方形であり、大きさが異なっていても相似形となっている。このため、後述するように、顔領域の画像の大きさを区画領域の大きさに合わせて簡単に調整することができる。
＜操作部＞
操作部１８０は、システム制御部１５７に操作情報を入力するための各種操作部材を有しており、例えば、レリーズボタン（図示せず）などの種々のボタン類、操作ダイヤルおよびタッチパネルユニットを有している。ボタン類、操作ダイヤルおよびタッチパネルユニットは、システム制御部１５７に電気的に接続されている。レリーズボタンは撮影開始信号をシステム制御部１５７に出力する。 In the present embodiment, the face area (the first detection frame 430a and the second detection frame 430b) and the partition areas 410a to 410d are square, and have similar shapes even if the sizes are different. For this reason, as will be described later, the size of the image of the face region can be easily adjusted according to the size of the partition region.
<Operation unit>
The operation unit 180 has various operation members for inputting operation information to the system control unit 157, for example, various buttons such as a release button (not shown), an operation dial, and a touch panel unit. The buttons, operation dial, and touch panel unit are electrically connected to the system control unit 157. The release button outputs a shooting start signal to the system control unit 157.

操作部１８０のタッチパネルユニットは、表示部１６０の画面に装着されており、表示部１６０の画面上のタッチ位置（被接触位置の一例）を検知する。ユーザーが画面をタッチすると、タッチパネルユニットは、タッチ位置を示す指定情報を生成し、この指定情報をシステム制御部１５７に出力する。映像に表れている顔画像のうち特定の顔画像を指定する際、および表示されている顔画像をスクロールさせる際に、指定情報は用いられる。
＜システム制御部＞
システム制御部１５７は、ビデオカメラ１００の各部を制御する。システム制御部１５７にはＭＰＵ（ＭｉｃｒｏＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）が搭載されている。ＲＯＭにはプログラムが格納されており、例えば、電源ＯＮ時にＲＯＭに格納されたプログラムがＲＡＭに展開される。ＲＡＭに展開されたプログラムをＭＰＵが逐次実行することで、システム制御部１５７は様々な機能を実現することができる。 The touch panel unit of the operation unit 180 is mounted on the screen of the display unit 160 and detects a touch position (an example of a contacted position) on the screen of the display unit 160. When the user touches the screen, the touch panel unit generates designation information indicating the touch position, and outputs the designation information to the system control unit 157. The designation information is used when a specific face image is designated from among the face images appearing in the video and when the displayed face image is scrolled.
<System controller>
The system control unit 157 controls each unit of the video camera 100. The system control unit 157 includes an MPU (Micro Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory). A program is stored in the ROM. For example, when the power is turned on, the program stored in the ROM is expanded in the RAM. The system control unit 157 can realize various functions by causing the MPU to sequentially execute the program expanded in the RAM.

具体的には図３に示すように、システム制御部１５７は、ＲＯＭに記憶された所定のプログラムとＭＰＵとの協働により実現される機能ブロックとして、リスト管理部１６１０と、生成部１６２０と、合成部１６４０と、選択部１６５０と、を有している。
（１）リスト管理部
リスト管理部１６１０は検出部１３０の顔検出部２２０により検出された顔領域の各情報を管理する。リスト管理部１６１０は、顔検出部２２０から送信される座標情報およびサイズ情報に基づいて候補情報リストを作成および更新する。ここで、候補情報リストを作成するとは、座標情報およびサイズ情報を互いに関連付けて所定のアドレスに格納することを意味している。候補情報リストの各情報は、例えばＲＡＭ１６３０上に格納される。 Specifically, as illustrated in FIG. 3, the system control unit 157 includes a list management unit 1610, a generation unit 1620, a synthesis unit 1640, and a selection unit 1650 as functional blocks realized by the cooperation of the MPU and a predetermined program stored in the ROM.
(1) List Management Unit The list management unit 1610 manages each piece of information on the face area detected by the face detection unit 220 of the detection unit 130. The list management unit 1610 creates and updates a candidate information list based on the coordinate information and size information transmitted from the face detection unit 220. Here, creating the candidate information list means that the coordinate information and the size information are associated with each other and stored at a predetermined address. Each piece of information in the candidate information list is stored on the RAM 1630, for example.

また、リスト管理部１６１０は、ＲＡＭ１６３０上に候補情報リストが存在するかどうかを判断する。ＲＡＭ１６３０上に候補情報リストが存在しない場合、リスト管理部１６１０は候補情報リストを作成する。顔領域の座標情報およびサイズ情報は１フレームごとに顔検出部２２０からリスト管理部１６１０へ送られる。リスト管理部１６１０は候補情報リストとして格納されている座標情報およびサイズ情報を、顔検出部２２０から送られてきた座標情報およびサイズ情報を用いて最新の情報に更新する。
（２）生成部
生成部１６２０は、バッファメモリ１３５に格納されている表示用映像信号に基づいて、検出部１３０により検出された候補領域に関する候補識別情報を生成する。具体的には、生成部１６２０は、検出部１３０により検出された各顔領域の座標情報およびサイズ情報に基づいて、顔検出の対象となっている検出用画像データと取得されたタイミングが同じである表示用画像データから、各顔領域内の画像を候補画像データ（候補識別情報の一例）として抽出し、抽出した候補画像データを顔領域の座標情報およびサイズ情報と関連付けてバッファメモリ１３５上の所定のアドレスに格納する。各顔領域の候補画像データは１フレームごとに順次抽出される。本実施形態では、候補領域は顔領域であるため、候補画像データが表す画像（以下、候補画像という）は人間の顔の画像となっている。 The list management unit 1610 determines whether a candidate information list exists on the RAM 1630. When the candidate information list does not exist on the RAM 1630, the list management unit 1610 creates a candidate information list. The coordinate information and size information of the face area are sent from the face detection unit 220 to the list management unit 1610 for each frame. The list management unit 1610 updates the coordinate information and size information stored as the candidate information list to the latest information using the coordinate information and size information sent from the face detection unit 220.
(2) Generation Unit The generation unit 1620 generates candidate identification information related to the candidate areas detected by the detection unit 130, based on the display video signal stored in the buffer memory 135. Specifically, based on the coordinate information and size information of each face area detected by the detection unit 130, the generation unit 1620 extracts the image within each face area as candidate image data (an example of candidate identification information) from the display image data acquired at the same timing as the detection image data subjected to face detection, and stores the extracted candidate image data at a predetermined address on the buffer memory 135 in association with the coordinate information and size information of the face area. Candidate image data for each face area is extracted sequentially, frame by frame. In this embodiment, since the candidate area is a face area, the image represented by the candidate image data (hereinafter referred to as a candidate image) is an image of a human face.
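As a rough illustration of extracting a candidate image from a display frame, the following Python sketch crops a square region from an image held as a list of pixel rows. The function name `extract_candidate`, the list-of-rows image representation, and the assumption that the coordinate information gives the top-left corner of the face area are all hypothetical; the source does not specify how coordinates are defined.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values (hypothetical representation)


def extract_candidate(frame: Image, x: int, y: int, size: int) -> Image:
    """Crop the square face area whose top-left corner is assumed to be (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


# A 4x4 test frame in which each pixel encodes its position as row * 10 + column.
frame = [[r * 10 + c for c in range(4)] for r in range(4)]
candidate = extract_candidate(frame, x=1, y=2, size=2)
print(candidate)
```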

また、顔領域の大きさには通常ばらつきがあるため、抽出される候補画像のサイズは顔領域ごとに異なっている場合が多い。したがって、候補画像データを座標情報およびサイズ情報と関連付ける前に、生成部１６２０は、予め設定されている区画領域４１０ａ〜４１０ｄの大きさと候補画像の大きさとを比較し、比較結果に基づいて候補画像の大きさを調整する。
例えば、生成部１６２０は、候補画像が区画領域よりも小さい場合は区画領域に合わせて候補画像データに拡大処理を施し、拡大処理が施された候補画像データを座標情報およびサイズ情報と関連付けてバッファメモリ１３５上の所定のアドレスに格納する。また、生成部１６２０は、候補画像が区画領域よりも大きい場合は区画領域に合わせて候補画像データに縮小処理を施し、縮小処理が施された候補画像データを座標情報およびサイズ情報と関連付けてバッファメモリ１３５上の所定のアドレスに格納する。なお、生成部１６２０は、候補画像が区画領域と同じ大きさの場合は、候補画像に対して縮小処理および拡大処理は行われない。 In addition, since the size of the face area usually varies, the size of the extracted candidate image often differs from one face area to another. Therefore, before associating the candidate image data with the coordinate information and the size information, the generation unit 1620 compares the preset size of the partition areas 410a to 410d with the size of the candidate image and adjusts the size of the candidate image based on the comparison result.
For example, if the candidate image is smaller than the partition area, the generation unit 1620 enlarges the candidate image data to fit the partition area and stores the enlarged candidate image data at a predetermined address on the buffer memory 135 in association with the coordinate information and the size information. If the candidate image is larger than the partition area, the generation unit 1620 reduces the candidate image data to fit the partition area and stores the reduced candidate image data at a predetermined address on the buffer memory 135 in association with the coordinate information and the size information. If the candidate image has the same size as the partition area, the generation unit 1620 performs neither reduction nor enlargement on the candidate image.
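The size adjustment can be sketched as follows. Because both the face areas and the partition areas are square in this embodiment, a single side length describes each of them. The nearest-neighbor sampling used here is an assumption chosen for brevity; the source does not state which interpolation the generation unit 1620 uses, and the function name `resize_square` is hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values (hypothetical representation)


def resize_square(image: Image, slot_size: int) -> Image:
    """Enlarge or reduce a square image to slot_size x slot_size using
    nearest-neighbor sampling; an image already at slot_size is copied as is."""
    src = len(image)
    if src == slot_size:
        return [row[:] for row in image]
    return [
        [image[r * src // slot_size][c * src // slot_size] for c in range(slot_size)]
        for r in range(slot_size)
    ]


small = [[1, 2], [3, 4]]
enlarged = resize_square(small, 4)   # candidate smaller than the partition area
large = [[r * 10 + c for c in range(4)] for r in range(4)]
reduced = resize_square(large, 2)    # candidate larger than the partition area
print(enlarged, reduced)
```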

（３）合成部
合成部１６４０は、生成部１６２０により生成された候補画像データとバッファメモリ１３５上の表示用画像データとを合成し、両者が合成された合成画像データを生成する。より詳細には、合成部１６４０は、区画領域４１０ａ〜４１０ｄに候補画像を割り当て、表示部１６０の画面上で候補画像が特定の位置に配置されるように候補画像データと表示用画像データとを合成する。
本実施形態では、図４に示すように、表示部１６０の画面上で複数の候補画像が横方向に並んでいる。さらに、表示部１６０に表示される映像における鉛直方向下半分の領域内に候補画像が配置されている。ここで、「鉛直方向」とは、表示部１６０に表示されている画像内での鉛直方向を意味している。合成画像データはバッファメモリ１３５に格納される。このような配置により、被写体の顔が候補画像に隠れてしまうのを防止できる。 (3) Combining Unit The combining unit 1640 combines the candidate image data generated by the generating unit 1620 and the display image data on the buffer memory 135, and generates combined image data obtained by combining the two. More specifically, the composition unit 1640 assigns candidate images to the partition regions 410a to 410d, and sets the candidate image data and display image data so that the candidate images are arranged at specific positions on the screen of the display unit 160. Synthesize.
In the present embodiment, as shown in FIG. 4, a plurality of candidate images are arranged side by side in the horizontal direction on the screen of the display unit 160. The candidate images are placed within the lower half, in the vertical direction, of the video displayed on the display unit 160. Here, the “vertical direction” means the vertical direction within the image displayed on the display unit 160. The composite image data is stored in the buffer memory 135. This arrangement prevents the subject's face from being hidden by the candidate images.

また、合成部１６４０は、表示部１６０に映像が表示されている状態で候補画像が画像データをもとに生成される画像上に重畳するように候補画像データと表示用画像データとを合成する。候補画像が表示されていない区画領域４１０ｃおよび４１０ｄは、枠のみが表示されており、バックの画像がそのまま見えるようになっている。
また、図４に示すように、表示部１６０の画面上で、候補画像が表示される候補表示領域４１０（第１表示領域の一例）の面積は、候補表示領域４１０以外の領域４２０（第２表示領域の一例）の面積よりも小さく設定されている。
さらに、候補画像データと表示用画像データとを合成する際、合成部１６４０は顔領域であることを示す装飾枠（装飾情報の一例）を各区画領域に付加する。図４の例では、第１検出枠４３０ａに対応する第１装飾枠４１１ａ（第１装飾情報の一例）が区画領域４１０ａに付加され、第２検出枠４３０ｂに対応する第２装飾枠４１１ｂ（第２装飾情報の一例）が区画領域４１０ｂに付加されている。第１装飾枠４１１ａは第１検出枠４３０ａと同じ視覚的特徴を有しており、第２装飾枠４１１ｂは第２検出枠４３０ｂと同じ視覚的特徴を有している。具体的には、前述のように、第１検出枠４３０ａおよび第１装飾枠４１１ａは間隔が狭い破線で描かれており、第２検出枠４３０ｂおよび第２装飾枠４１１ｂは第１検出枠４３０ａおよび第１装飾枠４１１ａよりも間隔が広い破線で描かれている。 Further, the synthesis unit 1640 synthesizes the candidate image data and the display image data so that the candidate image is superimposed on an image generated based on the image data while the video is displayed on the display unit 160. . In the partitioned areas 410c and 410d where no candidate image is displayed, only the frame is displayed, so that the back image can be seen as it is.
As shown in FIG. 4, on the screen of the display unit 160, the area of the candidate display area 410 (an example of the first display area), in which the candidate images are displayed, is set smaller than the area of the area 420 other than the candidate display area 410 (an example of the second display area).
Further, when combining the candidate image data and the display image data, the combining unit 1640 adds a decoration frame (an example of decoration information) indicating a face area to each partition area. In the example of FIG. 4, a first decoration frame 411a (an example of first decoration information) corresponding to the first detection frame 430a is added to the partition area 410a, and a second decoration frame 411b (an example of second decoration information) corresponding to the second detection frame 430b is added to the partition area 410b. The first decoration frame 411a has the same visual features as the first detection frame 430a, and the second decoration frame 411b has the same visual features as the second detection frame 430b. Specifically, as described above, the first detection frame 430a and the first decoration frame 411a are drawn with narrowly spaced broken lines, and the second detection frame 430b and the second decoration frame 411b are drawn with broken lines spaced more widely than those of the first detection frame 430a and the first decoration frame 411a.

さらに、主被写体として選択されている顔領域（基準領域の一例）に付加されている検出枠および装飾枠は、他の検出枠および装飾枠と異なる視覚的特徴を有している。例えば、顔４００ｂが主被写体として選択されている場合、図４に示すように、第２検出枠４３０ｂおよび第２装飾枠４１１ｂが第１検出枠４３０ａおよび第１装飾枠４１１ａよりも太い線で描かれている。
（４）選択部
選択部１６５０は、操作部１８０を介して指定された候補画像に対応する顔領域を、操作部１８０に入力された指定情報に基づいて、自動露出制御および自動合焦制御の基準となる主被写体領域（基準領域の一例）として選択する。具体的には、選択部１６５０は、操作部１８０を介して入力された指定情報に基づいて、指定情報が示す位置に表示されている候補画像を特定し、特定された候補画像に対応する顔領域を特定し、さらに特定された顔領域を主被写体領域として選択する。選択部１６５０は、指定情報に基づいて特定された候補画像に対応する顔領域の座標情報およびサイズ情報を主被写体領域の情報としてＲＡＭ１６３０に格納する。操作部１８０を介して異なる候補画像が指定されると、ＲＡＭ１６３０に格納された主被写体領域の情報は、新しい指定情報に基づいて特定された顔領域の座標情報およびサイズ情報に置き換えられ、主被写体領域の情報が更新される。この主被写体領域の情報は、画像情報抽出部１３２での輝度レベル値および高域周波数成分の算出時に用いられる。 Furthermore, the detection frame and the decoration frame added to the face area (an example of the reference area) selected as the main subject have different visual characteristics from the other detection frames and the decoration frame. For example, when the face 400b is selected as the main subject, as shown in FIG. 4, the second detection frame 430b and the second decoration frame 411b are drawn with thicker lines than the first detection frame 430a and the first decoration frame 411a. It is.
(4) Selection Unit The selection unit 1650 selects the face area corresponding to the candidate image designated via the operation unit 180 as the main subject area (an example of a reference area) that serves as the reference for automatic exposure control and automatic focusing control, based on the designation information input to the operation unit 180. Specifically, based on the designation information input via the operation unit 180, the selection unit 1650 identifies the candidate image displayed at the position indicated by the designation information, identifies the face area corresponding to that candidate image, and selects the identified face area as the main subject area. The selection unit 1650 stores the coordinate information and size information of the face area corresponding to the identified candidate image in the RAM 1630 as main subject area information. When a different candidate image is designated via the operation unit 180, the main subject area information stored in the RAM 1630 is replaced with the coordinate information and size information of the face area identified from the new designation information, and the main subject area information is thereby updated. This main subject area information is used when the image information extraction unit 132 calculates the luminance level value and the high-frequency component.
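The mapping from a touch position to a main subject area can be sketched as a simple hit test against the partition areas. The `Slot` type, the `face_id` field, the pixel coordinates, and the rule that a touch outside any occupied partition area leaves the current selection unchanged are assumptions made for this illustration, not details given in the source.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Slot:
    """One square partition area on screen and the face it currently shows."""
    x: int
    y: int
    size: int
    face_id: Optional[int]  # None when no candidate image is displayed


def select_main_subject(
    touch: Tuple[int, int], slots: List[Slot], current: Optional[int]
) -> Optional[int]:
    """Return the face id shown at the touched position, or keep the current
    selection when the touch misses every occupied partition area (assumed)."""
    tx, ty = touch
    for slot in slots:
        inside = (slot.x <= tx < slot.x + slot.size
                  and slot.y <= ty < slot.y + slot.size)
        if inside and slot.face_id is not None:
            return slot.face_id
    return current


# Hypothetical layout: two occupied partition areas and one empty one.
slots = [Slot(10, 200, 50, 0), Slot(70, 200, 50, 1), Slot(130, 200, 50, None)]
selected = select_main_subject((80, 220), slots, current=0)
print(selected)
```

The returned identifier would then be used to look up the coordinate and size information that is passed on for exposure and focus control.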

また、前述のように、システム制御部１５７は、画像情報抽出部１３２と、露出制御部１６５と、レンズ制御部１７０と、を制御することにより、主被写体領域に対する自動露出制御および自動合焦制御を行なう。具体的には、システム制御部１５７は選択部１６５０により主被写体領域として選択された顔領域の座標情報およびサイズ情報を画像情報抽出部１３２に通知する。画像情報抽出部１３２は、座標情報およびサイズ情報に基づいて、主被写体領域の輝度レベル値および高域周波数成分を算出し、算出した輝度レベル値および高域周波数成分をシステム制御部１５７に通知する。システム制御部１５７は、輝度レベル値に基づいて最適な絞り値を計算し、露出制御部１６５に絞り値を通知する。また、システム制御部１５７は、高域周波数成分に基づいて、合焦状態に対応するフォーカスレンズの位置情報を算出し、レンズ制御部１７０にフォーカスレンズの合焦位置を通知する。これらの情報に基づいて、自動露出制御および自動合焦制御が行われる。 Further, as described above, the system control unit 157 performs automatic exposure control and automatic focusing control for the main subject area by controlling the image information extraction unit 132, the exposure control unit 165, and the lens control unit 170. Specifically, the system control unit 157 notifies the image information extraction unit 132 of the coordinate information and size information of the face area selected as the main subject area by the selection unit 1650. The image information extraction unit 132 calculates the luminance level value and the high-frequency component of the main subject area based on the coordinate information and the size information, and notifies the system control unit 157 of the calculated values. The system control unit 157 calculates an optimum aperture value based on the luminance level value and notifies the exposure control unit 165 of the aperture value. The system control unit 157 also calculates, based on the high-frequency component, the position information of the focus lens corresponding to the in-focus state and notifies the lens control unit 170 of the in-focus position of the focus lens. Automatic exposure control and automatic focusing control are performed based on these pieces of information.

＜ビデオカメラの動作＞
以上に述べたビデオカメラ１００の動作を図５〜図７に示すフローチャートを用いて説明する。なお、図５〜図７に示すフローチャートにおいては、すでにビデオカメラ１００の電源はＯＮされており、操作部１８０を用いて、映像の記録が可能な録画モードに動作モードが設定されているものとする。
（１）候補画像データの生成
図５に示すように、まずステップＳ３００では、デジタル映像信号を構成する画像データが順次検出部１３０および画像情報抽出部１３２に入力される。具体的には、光学系１０５を通して結像された被写体像が撮像素子１１０により電気信号に変換され、撮像素子１１０から出力されるアナログ映像信号がＡ／Ｄ変換部１１５によりデジタル映像信号に変換される。Ａ／Ｄ変換部１１５から出力されるデジタル映像信号に対して、映像信号処理部１２０およびＹ／Ｃ変換部１２５により前述の処理が施され、処理が施されたデジタル映像信号は、複数フレームの画像データの集合体であり、各フレームの画像データはバッファメモリ１３５に順次格納される。 <Operation of the video camera>
The operation of the video camera 100 described above will be described with reference to the flowcharts shown in FIGS. 5 to 7. In the flowcharts shown in FIGS. 5 to 7, it is assumed that the video camera 100 has already been powered on and that the operation mode has been set, using the operation unit 180, to the recording mode in which video can be recorded.
(1) Generation of Candidate Image Data As shown in FIG. 5, first, in step S300, image data constituting a digital video signal is sequentially input to the detection unit 130 and the image information extraction unit 132. Specifically, a subject image formed through the optical system 105 is converted into an electrical signal by the image sensor 110, and the analog video signal output from the image sensor 110 is converted into a digital video signal by the A/D conversion unit 115. The digital video signal output from the A/D conversion unit 115 is subjected to the above-described processing by the video signal processing unit 120 and the Y/C conversion unit 125. The processed digital video signal is a collection of image data for a plurality of frames, and the image data of each frame is sequentially stored in the buffer memory 135.

加えて、Ｙ／Ｃ変換部１２５から出力されるデジタル映像信号を構成する画像データは、システム制御部１５７により表示部１６０上に表示可能なサイズへと縮小処理が施され、同じくバッファメモリ１３５上に表示用画像データとして順次格納される。また、検出部１３０内の揮発性メモリ２１０にも、画像処理部２００により縮小処理が施された画像データが検出用画像データとして順次格納される。
ステップＳ３１０では、顔検出部２２０により、検出部１３０内の揮発性メモリ２１０に格納された検出用画像データに基づいて顔領域が検出される。具体的には、顔検出部２２０により、人間の顔の特徴を示す基準特徴量と検出用画像データとが比較され、基準特徴量の条件を満たす顔領域が候補領域として検出用画像データから検出される。検出された顔領域の座標情報およびサイズ情報が顔検出部２２０により算出され、座標情報およびサイズ情報が顔検出部２２０からシステム制御部１５７に出力される。 In addition, the image data constituting the digital video signal output from the Y / C conversion unit 125 is reduced to a size that can be displayed on the display unit 160 by the system control unit 157, and is also stored on the buffer memory 135. Are sequentially stored as display image data. Also, the image data subjected to the reduction process by the image processing unit 200 is sequentially stored in the volatile memory 210 in the detection unit 130 as detection image data.
In step S310, the face detection unit 220 detects face areas based on the detection image data stored in the volatile memory 210 in the detection unit 130. Specifically, the face detection unit 220 compares a reference feature amount indicating the features of a human face with the detection image data, and a face area that satisfies the condition of the reference feature amount is detected from the detection image data as a candidate area. The face detection unit 220 calculates the coordinate information and size information of each detected face area and outputs them to the system control unit 157.

ステップＳ３１１以降の処理はシステム制御部１５７にて実行される。具体的には、ステップＳ３１１では、リスト管理部１６１０によりＲＡＭ１６３０に格納された候補情報リスト上で顔領域が存在するかどうかが判断される。例えば、候補情報リストが存在するとリスト管理部１６１０により判断された場合は、処理がステップＳ３１３へ進む。
一方、例えば録画モードへの切り替え直後は候補情報リストが存在しないため、ステップＳ３１２において、検出部１３０により検出用画像データから顔領域が検出されると、リスト管理部１６１０により候補情報リストが作成され、座標情報およびサイズ情報がＲＡＭ１６３０上の所定のアドレスに記憶され、処理がステップＳ３１３に移行する。
ステップＳ３１３では、検出された顔領域を追尾するために、検出用画像データから検出された顔領域の座標情報がリスト管理部１６１０により候補情報リスト上の各座標情報と照合される。 The processing after step S311 is executed by the system control unit 157. Specifically, in step S311, the list management unit 1610 determines whether a face area exists on the candidate information list stored in the RAM 1630. For example, if the list management unit 1610 determines that there is a candidate information list, the process proceeds to step S313.
On the other hand, for example, since the candidate information list does not exist immediately after switching to the recording mode, when the face area is detected from the detection image data by the detection unit 130 in step S312, the list management unit 1610 creates the candidate information list. The coordinate information and the size information are stored at a predetermined address on the RAM 1630, and the process proceeds to step S313.
In step S313, in order to track the detected face area, the coordinate information of the face area detected from the detection image data is collated with each coordinate information on the candidate information list by the list management unit 1610.

ここで、検出された顔領域を追尾するための処理について説明する。検出された顔領域の追尾処理は、現在のフレームの検出用画像データから検出された顔領域が、前フレームの検出用画像データから検出された顔領域と合致しているかどうかを判定することで行われる。本判定の最も単純な方法は、顔領域の座標を比較する方法である。一般的に、１秒あたり３０フレームまたは６０フレームの画像データが入力されるため、前フレームの画像データにおいて検出された顔領域が、現在のフレームの画像データにおいて前の顔領域よりも大きく離れた位置で検出されることは考えにくい。したがって、前フレームの検出用画像データから検出された顔領域の座標情報と現在のフレームの検出用画像データから検出された顔領域の座標情報との差分がある定められた範囲内に収まっている場合、それら２つの顔領域は同じ被写体の顔領域であると判定することができる。 Here, the process for tracking a detected face area will be described. Tracking is performed by determining whether a face area detected from the detection image data of the current frame matches a face area detected from the detection image data of the previous frame. The simplest method for this determination is to compare the coordinates of the face areas. Because image data is generally input at 30 or 60 frames per second, a face area detected in the previous frame is unlikely to be detected in the current frame at a position far away from where it was. Therefore, if the difference between the coordinate information of a face area detected from the detection image data of the previous frame and that of a face area detected from the detection image data of the current frame falls within a predetermined range, the two face areas can be determined to be face areas of the same subject.

このように、時系列で隣り合う２つの検出用画像データから検出された顔領域の座標情報を照合することで、動いている被写体の顔が同じ顔領域であると識別しながら顔検出処理により顔領域を追尾することができる。また、前フレームの検出用画像データから検出された顔領域と合致しない顔領域が現在のフレームの検出用画像データから検出された場合、新しく画面内に登場した（フレームインした）被写体の顔を示す顔領域であると判定することができる。さらに、前フレームの検出用画像データから検出された顔領域が現在のフレームの検出用画像データで検出されない場合、画面内から姿を消した（フレームアウトした）被写体の顔を示す顔領域であると判定することができる。
ステップＳ３１３では、リスト管理部１６１０により時系列で隣り合う２つの検出用画像データから検出された顔領域の照合が行われる。具体的には、ステップＳ３１０で検出された各顔領域の座標情報が候補情報リスト上に存在する各顔領域の座標情報とそれぞれ比較される。比較した結果、座標情報の差分が所定の範囲内に収まっている２つの顔領域は、同じ被写体の顔から検出された領域であると判定できるため、ステップＳ３１０で検出された最新の顔領域の座標情報およびサイズ情報が候補情報リストに登録され、候補情報リストが更新される。 In this way, by collating the coordinate information of the face area detected from the two adjacent detection image data in time series, the face detection process is performed while identifying that the face of the moving subject is the same face area. The face area can be tracked. In addition, when a face area that does not match the face area detected from the detection image data of the previous frame is detected from the detection image data of the current frame, the face of the subject newly appearing in the screen (framed in) is displayed. It can be determined that the face area is shown. Furthermore, when the face area detected from the image data for detection in the previous frame is not detected in the image data for detection in the current frame, the face area indicates the face of the subject that has disappeared (out of frame) from the screen. Can be determined.
In step S313, the list management unit 1610 collates face areas detected from two detection image data adjacent in time series. Specifically, the coordinate information of each face area detected in step S310 is compared with the coordinate information of each face area existing on the candidate information list. As a result of the comparison, it can be determined that the two face areas in which the difference in the coordinate information is within a predetermined range are areas detected from the face of the same subject, and therefore, the latest face area detected in step S310. Coordinate information and size information are registered in the candidate information list, and the candidate information list is updated.

さらに、ステップＳ３２０において、候補情報リストに登録されているがステップＳ３１０で検出されていない顔領域が存在する場合、その顔領域はフレームアウトした被写体の顔領域であるとリスト管理部１６１０により判断される。フレームアウトした顔が存在する場合は、ステップＳ３３０において、フレームアウトした被写体の顔領域に対応する各情報がリスト管理部１６１０により候補情報リストから削除され、処理がステップＳ３４０に進む。また、ステップＳ３２０において、フレームアウトした被写体の顔領域が存在しないとリスト管理部１６１０により判断された場合は、処理がステップＳ３４０に進む。
ステップＳ３４０では、ステップＳ３１０において検出された顔領域に対応する情報が候補情報リスト上に存在しない場合は、その顔領域はフレームインした新しい被写体の顔領域であるとリスト管理部１６１０により判断される。新しい被写体の顔領域が存在する場合は、図６に示すように、ステップＳ３５０において、リスト管理部１６１０により新しい被写体の顔領域の座標情報およびサイズ情報が候補情報リストに追加され、その後、処理がステップＳ３５１に進む。新しい被写体の顔領域が存在しない場合は、処理がステップＳ３５１に進む。 Further, in step S320, if a face area is registered in the candidate information list but was not detected in step S310, the list management unit 1610 determines that it is the face area of a subject that has framed out. If such a framed-out face exists, in step S330 the list management unit 1610 deletes the information corresponding to that face area from the candidate information list, and the process proceeds to step S340. If the list management unit 1610 determines in step S320 that no face area of a framed-out subject exists, the process proceeds directly to step S340.
In step S340, if no information corresponding to a face area detected in step S310 exists on the candidate information list, the list management unit 1610 determines that the face area is that of a new subject that has framed in. If such a face area exists, as shown in FIG. 6, the list management unit 1610 adds its coordinate information and size information to the candidate information list in step S350, and the process then proceeds to step S351. If no face area of a new subject exists, the process proceeds directly to step S351.
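The list maintenance in steps S313 to S350 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Face` record, the function `update_candidates`, and the per-axis threshold `max_shift` are hypothetical names introduced here, and the "predetermined range" of the description is modeled as a simple pixel distance on each axis.

```python
from dataclasses import dataclass


@dataclass
class Face:
    x: int     # x coordinate of the face area
    y: int     # y coordinate of the face area
    size: int  # size information of the face area


def update_candidates(candidates, detected, max_shift=20):
    """Return the candidate list for the current frame.

    candidates: faces kept from the previous frame (list order is preserved)
    detected:   faces found in the current frame
    max_shift:  hypothetical threshold for treating two areas as the same face
    """
    unmatched = list(detected)
    updated = []
    for old in candidates:
        # Step S313: look for a detection whose coordinates are close enough.
        match = next(
            (d for d in unmatched
             if abs(d.x - old.x) <= max_shift and abs(d.y - old.y) <= max_shift),
            None,
        )
        if match is not None:
            unmatched.remove(match)
            updated.append(match)  # refresh coordinate and size information
        # Steps S320/S330: no match means the face framed out, so it is dropped.
    # Steps S340/S350: detections left over are new faces that framed in.
    updated.extend(unmatched)
    return updated
```

Keeping matched entries in their previous order means a tracked face keeps its place in the list, which matters later when candidate images are assigned to display positions in list order.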

ステップＳ３５１では、更新された候補情報リストから各顔領域の座標情報およびサイズ情報が生成部１６２０により取得される。続くステップＳ３６０では、取得した座標情報およびサイズ情報に基づいて、生成部１６２０により各顔領域に対応する候補画像データが生成される。
具体的には図６に示すように、バッファメモリ１３５に格納された表示用画像データから、座標情報およびサイズ情報により特定される領域内の画像が生成部１６２０により候補画像データとして抽出される。さらに、ステップＳ１５２０では、抽出された候補画像データにより生成される候補画像が区画領域４１０ａ〜４１０ｄのサイズと比較される。候補画像が区画領域４１０ａ〜４１０ｄよりも小さい場合および候補画像が区画領域４１０ａ〜４１０ｄよりも大きい場合は、ステップＳ１５３０において、生成部１６２０によりサイズ変更が必要であると判断される。サイズ変更が必要な場合は、ステップＳ１５４０において、抽出された候補画像のサイズ変更が行われる。具体的には、候補画像が区画領域４１０ａ〜４１０ｄに比べて大きい場合は、候補画像が区画領域４１０ａ〜４１０ｄ内に収まるように、サイズ情報および区画領域４１０ａ〜４１０ｄの大きさに基づいて算出された倍率で、生成部１６２０により候補画像データに対して縮小処理が施される。逆に、抽出された候補画像が区画領域４１０ａ〜４１０ｄに比べて小さい場合は、候補画像が区画領域４１０ａ〜４１０ｄの範囲内で最大限大きく表示されるように、サイズ情報および区画領域４１０ａ〜４１０ｄの大きさに基づいて倍率で、生成部１６２０により候補画像データに対して拡大処理が施される。生成部１６２０により生成された候補画像データは、それと対応する座標情報およびサイズ情報と関連付けられてバッファメモリ１３５上の所定のアドレスに記憶される。候補画像データは１フレームごとに生成され、候補画像データの更新が順次行われる。 In step S351, the generation unit 1620 acquires coordinate information and size information of each face area from the updated candidate information list. In subsequent step S360, based on the acquired coordinate information and size information, the generation unit 1620 generates candidate image data corresponding to each face area.
Specifically, as shown in FIG. 6, the generation unit 1620 extracts, as candidate image data, the image within the area specified by the coordinate information and the size information from the display image data stored in the buffer memory 135. Next, in step S1520, the size of the candidate image generated from the extracted candidate image data is compared with the size of the partitioned areas 410a to 410d. If the candidate image is smaller or larger than the partitioned areas 410a to 410d, the generation unit 1620 determines in step S1530 that a size change is necessary. When a size change is necessary, the extracted candidate image is resized in step S1540. Specifically, when the candidate image is larger than the partitioned areas 410a to 410d, the generation unit 1620 reduces the candidate image data at a magnification calculated from the size information and the size of the partitioned areas 410a to 410d so that the candidate image fits within the partitioned areas 410a to 410d. Conversely, when the extracted candidate image is smaller than the partitioned areas 410a to 410d, the generation unit 1620 enlarges the candidate image data at a magnification calculated from the size information and the size of the partitioned areas 410a to 410d so that the candidate image is displayed as large as possible within the partitioned areas 410a to 410d. The candidate image data generated by the generation unit 1620 is stored at a predetermined address in the buffer memory 135 in association with the corresponding coordinate information and size information. Candidate image data is generated for every frame, and the stored candidate image data is updated sequentially.
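The magnification used in step S1540 can be illustrated with a short calculation. The description only states that the magnification is derived from the size information and the size of the partitioned areas; the sketch below additionally assumes that the aspect ratio of the face area is preserved, and the function names `fit_scale` and `resized_dimensions` are hypothetical.

```python
def fit_scale(face_w, face_h, slot_w, slot_h):
    """Magnification that makes a face_w x face_h image as large as possible
    while still fitting inside a slot_w x slot_h partitioned area.

    A result below 1 corresponds to reduction, above 1 to enlargement,
    and exactly 1 means no size change is needed.
    """
    return min(slot_w / face_w, slot_h / face_h)


def resized_dimensions(face_w, face_h, slot_w, slot_h):
    """Pixel dimensions of the candidate image after resizing."""
    scale = fit_scale(face_w, face_h, slot_w, slot_h)
    return max(1, round(face_w * scale)), max(1, round(face_h * scale))
```

Taking the smaller of the two axis ratios covers both branches of the description with one formula: a face area larger than the partitioned area is reduced until it fits, and a smaller one is enlarged until one side reaches the boundary.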

（２）合成画像データの生成
図６に示すように、続くステップＳ３７０にて、システム制御部１５７により、候補画像データと、バッファメモリ１３５に格納された表示用画像データと、が合成された合成画像データが生成される。候補画像が画面上の特定の位置に配置されるように、合成部１６４０により候補画像データと画像データとが合成される。本実施形態では、候補画像が画像内の鉛直方向下側に配置されるように、合成部１６４０により候補画像データと画像データとが合成される。
ステップＳ３７０について詳細に説明すると、図６に示すように、ステップＳ１５５０において、合成部１６４０により、４つの区画領域４１０ａ〜４１０ｄに候補画像データが割り当てられる。割り当ての際、候補情報リストに登録されている順に、左の区画領域４１０ａから候補画像データが合成部１６４０により割り当てられる。図４の表示例で説明すると、顔４００ａおよび顔４００ｂの順で顔領域が候補情報リストに登録されている場合、顔４００ａの候補画像が区画領域４１０ａに表示され、顔４００ｂの候補画像が区画領域４１０ｂに表示される。合成部１６４０での割り当ての結果は、合成部１６４０からリスト管理部１６１０に送られる。具体的には、合成部１６４０からリスト管理部１６１０へ各候補画像データの表示位置情報が送られ、リスト管理部１６１０により表示位置情報が、対応する候補画像データと関連づけられた状態で候補情報リストに登録される。
(2) Generation of composite image data
As shown in FIG. 6, in the subsequent step S370, the system control unit 157 generates composite image data by combining the candidate image data with the display image data stored in the buffer memory 135. The combining unit 1640 combines the candidate image data and the image data so that the candidate images are arranged at specific positions on the screen. In the present embodiment, the combining unit 1640 combines them so that the candidate images are arranged on the vertically lower side of the image.
Step S370 is described in detail below. As shown in FIG. 6, in step S1550 the combining unit 1640 assigns candidate image data to the four partitioned areas 410a to 410d. The combining unit 1640 assigns the candidate image data in the order registered in the candidate information list, starting from the leftmost partitioned area 410a. In the display example of FIG. 4, when the face areas are registered in the candidate information list in the order of the face 400a and the face 400b, the candidate image of the face 400a is displayed in the partitioned area 410a and the candidate image of the face 400b is displayed in the partitioned area 410b. The result of the assignment is sent from the combining unit 1640 to the list management unit 1610. Specifically, the display position information of each candidate image data is sent from the combining unit 1640 to the list management unit 1610, which registers it in the candidate information list in association with the corresponding candidate image data.
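The assignment in step S1550 amounts to pairing list entries with partitioned areas from left to right. The following sketch is illustrative only: the function `assign_slots`, the string identifiers, and the rectangle format `(x, y, width, height)` are assumptions made here, and candidates beyond the number of partitioned areas are simply left unassigned.

```python
def assign_slots(candidate_ids, slot_rects):
    """Map candidates to partitioned areas in list order, leftmost area first.

    candidate_ids: identifiers in the order registered in the candidate list
    slot_rects:    rectangles of the partitioned areas, ordered left to right
    Returns a dict of candidate id -> display position (the slot rectangle).
    """
    # zip stops at the shorter sequence, so surplus candidates get no slot.
    return dict(zip(candidate_ids, slot_rects))
```

The returned mapping plays the role of the display position information that is registered in the candidate information list alongside each candidate image.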

次のステップＳ１５６０において、合成部１６４０により候補画像が表示用画像データに合成される。このとき、ステップＳ１５５０にて割り当てられた位置に候補画像が表示されるように、候補画像データと表示用画像データとが合成部１６４０により合成される。また、生成部１６２０により候補画像のサイズ調整が行われているため、検出された顔領域の大きさに関係なく、表示部１６０の画面上では、区画領域４１０ａ〜４１０ｄと同じサイズで候補画像が表示されている。
また、本実施形態では、ステップＳ１５７０において、表示部１６０の画面上で顔領域と候補画像との対応関係を明確にするために、合成部１６４０により表示用画像データに対して装飾情報が合成される。具体的には図４に示すような第１検出枠４３０ａ、第２検出枠４３０ｂ、第１装飾枠４１１ａおよび第２装飾枠４１１ｂを示す装飾データが表示用画像データに付加される。このとき、図４では省略されているが、ＯＳＤ（ＯｎＳｃｒｅｅｎＤｉｓｐｌａｙ）機能を実現するために、残記録可能時間およびバッテリー残量などの各アイコンも合成部１６４０により表示用画像データに重畳される。 In the next step S1560, the candidate image is combined with the display image data by the combining unit 1640. At this time, the candidate image data and the display image data are combined by the combining unit 1640 so that the candidate image is displayed at the position assigned in step S1550. In addition, since the size of the candidate image is adjusted by the generation unit 1620, the candidate image has the same size as the partition regions 410a to 410d on the screen of the display unit 160 regardless of the size of the detected face region. It is displayed.
In this embodiment, in step S1570, the combining unit 1640 also combines decoration information with the display image data in order to make the correspondence between the face areas and the candidate images clear on the screen of the display unit 160. Specifically, decoration data representing the first detection frame 430a, the second detection frame 430b, the first decoration frame 411a, and the second decoration frame 411b shown in FIG. 4 is added to the display image data. At this time, although omitted from FIG. 4, icons such as the remaining recordable time and the remaining battery level are also superimposed on the display image data by the combining unit 1640 to provide an OSD (On Screen Display) function.

次のステップＳ３８０では、合成画像データが表示部１６０に表示される。図４に示す表示例のように、表示部１６０には２人の人間が映し出されており、それぞれの顔が顔領域として検出されて、さらに候補画像が候補表示領域４１０に表示されている。
上記のステップＳ３００〜Ｓ３８０までの処理は、各フレームの表示用画像データに対して順次実行されるため、表示部１６０には合成画像データにより構成される合成映像が表示される。顔検出処理は被写体の顔を追尾するように実行されるため、被写体が動いていても同じ被写体の顔を検出し続けることができる。さらに、候補画像が各フレームの表示用画像データから順次抽出されているため、検出された顔領域内の動画が各区画領域４１０ａ〜４１０ｄに表示されることになる。
（３）主被写体領域の指定
次に、ＡＥおよびＡＦを行う際の基準領域として主被写体領域を選択する際の処理を図７に示すフローチャートを用いて説明する。主被写体領域の指定は、操作部１８０から入力された操作情報に基づき選択部１６５０により行われる。 In the next step S380, the composite image data is displayed on the display unit 160. As shown in the display example in FIG. 4, two people are displayed on the display unit 160, each face is detected as a face area, and a candidate image is further displayed in the candidate display area 410.
Since the processing from step S300 to step S380 is sequentially performed on the display image data of each frame, a composite video composed of the composite image data is displayed on the display unit 160. Since the face detection process is executed so as to track the face of the subject, the face of the same subject can be continuously detected even if the subject is moving. Furthermore, since the candidate images are sequentially extracted from the display image data of each frame, the moving image in the detected face area is displayed in each of the divided areas 410a to 410d.
(3) Designation of main subject area
Next, the processing for selecting a main subject area as the reference area for AE and AF will be described with reference to the flowchart shown in FIG. 7. The main subject area is designated by the selection unit 1650 based on the operation information input from the operation unit 180.

具体的には図７に示すように、ステップＳ６００では、表示部１６０に表示された候補画像がタッチされたかどうかが判定される。表示部１６０の画面がユーザーによりタッチされると、操作部１８０のタッチパネルユニットにより表示部１６０上のタッチ位置の座標情報（指定情報）が生成され、タッチ位置の座標情報が操作部１８０から選択部１６５０に通知される。ステップＳ６０１では、選択部１６５０により、候補情報リストに登録されている表示位置情報とタッチ位置の座標情報とが比較され、タッチ位置が含まれる表示位置情報が特定される。さらに、特定された表示位置情報と関連付けられている候補画像データが選択部１６５０により特定される。
続くステップＳ６０２では、特定された候補画像データに対応する顔領域が候補情報リストに基づいて選択部１６５０により特定される。具体的には、ステップＳ６０１において特定された候補画像データと関連付けられている顔領域が主被写体領域として選択される。続くステップＳ６１０では、例えば、候補情報リストにおいて主被写体領域として選択された顔領域にフラグが設定される。候補情報リストのフラグを参照することで、主被写体領域として選択されている顔領域を特定することができる。 Specifically, as shown in FIG. 7, in step S600, it is determined whether the candidate image displayed on the display unit 160 has been touched. When the screen of the display unit 160 is touched by the user, the touch panel unit of the operation unit 180 generates touch position coordinate information (designated information) on the display unit 160, and the touch position coordinate information is selected from the operation unit 180 by the selection unit. 1650 is notified. In step S601, the selection unit 1650 compares the display position information registered in the candidate information list with the coordinate information of the touch position, and specifies the display position information including the touch position. Further, candidate image data associated with the specified display position information is specified by the selection unit 1650.
In subsequent step S602, the face area corresponding to the identified candidate image data is identified by the selection unit 1650 based on the candidate information list. Specifically, the face area associated with the candidate image data identified in step S601 is selected as the main subject area. In the subsequent step S610, for example, a flag is set in the face area selected as the main subject area in the candidate information list. By referring to the flag in the candidate information list, the face area selected as the main subject area can be specified.
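Steps S601 to S610 can be sketched as a hit test of the touch position against the registered display positions, followed by setting a flag. The dictionary keys `slot` and `main` and the function `select_main_subject` are hypothetical; clearing the flag on the other entries is also an assumption, made so that at most one main subject area is flagged at a time.

```python
def select_main_subject(candidates, touch):
    """Flag the candidate whose display position contains the touch position.

    candidates: list of dicts with 'slot' = (x, y, width, height) and a
                boolean 'main' flag
    touch:      (x, y) coordinates reported by the touch panel
    Returns the index of the selected candidate, or None if no candidate
    image was touched (in which case existing flags are left unchanged).
    """
    tx, ty = touch
    for index, cand in enumerate(candidates):
        x, y, w, h = cand["slot"]
        if x <= tx < x + w and y <= ty < y + h:  # step S601: hit test
            for other in candidates:
                other["main"] = False            # assumption: single selection
            cand["main"] = True                  # step S610: set the flag
            return index
    return None
```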

一方、ステップＳ６００において表示部１６０に表示された候補画像がタッチされていない場合、ステップＳ６３０に進む。ステップＳ６３０では、すでに主被写体領域が決まっているかどうかが選択部１６５０により判定される。具体的には、候補情報リストに主被写体領域を示すフラグが設定されていれば、主被写体領域が決定済であると選択部１６５０により判定され、処理がステップＳ６４０へ進む。一方、候補情報リストにフラグが設定されていなければ、主被写体領域が未決定であると選択部１６５０により判定され、処理がステップＳ６５０に進む。
ここで、ステップＳ６３０ですでに決定されている主被写体領域は、少なくとも現在表示されている合成画像データより前の合成画像データで決定されていた主被写体領域である。このような場合、主被写体領域が選択されてから時間が経過しているため、主被写体領域が画像内に存在しないことが懸念される。例えば、主被写体がフレームアウトしたり遮蔽物に隠れて見えなくなってしまったりする場合などが想定される。 On the other hand, if the candidate image displayed on the display unit 160 is not touched in step S600, the process proceeds to step S630. In step S630, selection unit 1650 determines whether the main subject area has already been determined. Specifically, if the flag indicating the main subject area is set in the candidate information list, the selection unit 1650 determines that the main subject area has been determined, and the process proceeds to step S640. On the other hand, if the flag is not set in the candidate information list, the selection unit 1650 determines that the main subject area has not been determined, and the process proceeds to step S650.
Here, a main subject area found to be already determined in step S630 was determined on composite image data that is at least earlier than the composite image data currently displayed. In such a case, time has passed since the main subject area was selected, so there is a concern that the main subject area may no longer exist in the image. For example, the main subject may have framed out or may be hidden behind an obstruction.

しかし、図５に示すステップＳ３１３〜Ｓ３５０において、フレームインおよびフレームアウトした被写体を管理しているため、候補情報リストに設定されているフラグを確認すれば、画面内に主被写体領域が表示されているか否かを判定する必要はない。
ステップＳ６４０では、主被写体領域に対してシステム制御部１５７により自動合焦制御および自動露出制御が行われる。具体的には、主被写体領域の座標情報およびサイズ情報がシステム制御部１５７により画像情報抽出部１３２に送られ、主被写体領域内の画像データに基づいて自動露出制御および自動合焦制御が画像情報抽出部１３２、システム制御部１５７、レンズ制御部１７０および露出制御部１６５により実行される。
一方、ステップＳ６３０で主被写体領域が存在しないと判定されるか、あるいはステップＳ６２０で主被写体領域が画面内に存在しないと判定された場合、例えば、画面の中央の領域などの予め設定されている領域を対象として通常の自動合焦制御および自動露出制御が画像情報抽出部１３２、システム制御部１５７、レンズ制御部１７０および露出制御部１６５により実行される。 However, since subjects that frame in and frame out are managed in steps S313 to S350 shown in FIG. 5, checking the flag set in the candidate information list is sufficient, and there is no need to determine separately whether the main subject area is displayed on the screen.
In step S640, automatic focusing control and automatic exposure control are performed on the main subject area by the system control unit 157. Specifically, the coordinate information and size information of the main subject region are sent to the image information extraction unit 132 by the system control unit 157, and automatic exposure control and automatic focus control are performed based on the image data in the main subject region. It is executed by the extraction unit 132, the system control unit 157, the lens control unit 170, and the exposure control unit 165.
On the other hand, if it is determined in step S630 that no main subject area exists, or if it is determined in step S620 that the main subject area does not exist on the screen, normal automatic focusing control and automatic exposure control are executed by the image information extraction unit 132, the system control unit 157, the lens control unit 170, and the exposure control unit 165 on a preset area, for example the central area of the screen.
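The choice of the area used for automatic focusing and automatic exposure can be summarized in a few lines. This is a sketch under assumptions: the keys `face` and `main`, the function `metering_region`, and the particular fallback (a centered rectangle half the screen width and height) are illustrative, since the description only says that a preset area such as the center of the screen is used.

```python
def metering_region(candidates, screen_w, screen_h):
    """Return the (x, y, width, height) area to use for AF and AE.

    candidates: list of dicts with 'face' = (x, y, width, height) and an
                optional boolean 'main' flag
    """
    for cand in candidates:
        if cand.get("main"):
            # Step S640: a flagged main subject area exists, so use it.
            return cand["face"]
    # No flagged entry: fall back to a preset central area (assumed size).
    w, h = screen_w // 2, screen_h // 2
    return ((screen_w - w) // 2, (screen_h - h) // 2, w, h)
```

Because framed-out faces are deleted from the list together with their flag, looking up the flag is enough to decide between the two cases.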

なお、ステップＳ６５０において、録画モードの終了指示の有無がシステム制御部１５７により確認され、録画モードの終了指示があれば処理が終了し、終了指示がなければステップＳ３００に戻る。
また、フローでは示していないが、図４に示すように、候補表示領域４１０には、候補表示領域４１０内に表示されている候補画像を切り替えるための左側操作領域４２０ａおよび右側操作領域４２０ｂが設けられている。ユーザーが左側操作領域４２０ａまたは右側操作領域４２０ｂをタッチすることで、操作部１８０を構成するタッチパネルを通して得られたタッチ座標を元に、システム制御部１５７は切り替え指示領域４２０にあたる位置がタッチされたことを検出し、候補表示領域４１０に表示されている候補画像を右または左にスクロールさせることができる。この場合、左側操作領域４２０ａおよび右側操作領域４２０ｂの操作状況に応じて、候補画像データと表示位置情報との対応関係がリスト管理部１６１０により更新される。 In step S650, the presence or absence of a recording mode end instruction is confirmed by the system control unit 157. If there is a recording mode end instruction, the process ends. If there is no end instruction, the process returns to step S300.
Although not shown in the flowchart, as shown in FIG. 4, the candidate display area 410 is provided with a left operation area 420a and a right operation area 420b for switching the candidate images displayed in the candidate display area 410. When the user touches the left operation area 420a or the right operation area 420b, the system control unit 157 detects, from the touch coordinates obtained through the touch panel of the operation unit 180, that a position corresponding to the switching instruction area 420 has been touched, and scrolls the candidate images displayed in the candidate display area 410 to the right or left. In this case, the list management unit 1610 updates the correspondence between the candidate image data and the display position information according to how the left operation area 420a and the right operation area 420b have been operated.
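One way to model the scrolling is as a window of four entries that slides over the candidate information list. The sketch below is an assumption about how such scrolling could work, not a description of the embodiment: the names `scroll` and `visible_candidates`, the one-entry step, and the clamping at both ends are all hypothetical.

```python
def scroll(offset, direction, total, n_slots=4):
    """Return the new index of the first visible candidate.

    offset:    current index of the first visible candidate
    direction: -1 for the left operation area, +1 for the right one
    total:     number of entries in the candidate list
    """
    max_offset = max(0, total - n_slots)
    return min(max(offset + direction, 0), max_offset)


def visible_candidates(candidates, offset, n_slots=4):
    """Candidates currently assigned to the partitioned areas, left to right."""
    return candidates[offset:offset + n_slots]
```

After each scroll, the visible entries would be paired with the partitioned areas again so that the display position information stays consistent with what is shown.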

＜特徴＞
以上に説明したビデオカメラ１００の特徴を以下にまとめる。
（１）
このビデオカメラ１００では、顔領域（候補領域）が検出部１３０により検出用画像データから検出され、検出された候補領域に対応する候補画像データが生成部１６２０により生成される。候補画像データが生成されると、生成された候補画像データと表示用画像データとが合成された合成画像データが合成部１６４０により生成され、合成画像データが映像として表示部１６０に表示される。表示部１６０に表示された映像には、表示用画像データから生成される映像だけでなく候補画像データから生成される候補画像も映し出されているため、表示部１６０に表示されている映像を通して、ユーザーは各候補領域に対応する候補画像を視認することができる。 <Features>
The characteristics of the video camera 100 described above are summarized below.
(1)
In the video camera 100, a face area (candidate area) is detected from the detection image data by the detection unit 130, and candidate image data corresponding to the detected candidate area is generated by the generation unit 1620. When the candidate image data is generated, the combining unit 1640 generates composite image data by combining the generated candidate image data with the display image data, and the composite image data is displayed on the display unit 160 as a video. Because the video displayed on the display unit 160 shows not only the video generated from the display image data but also the candidate images generated from the candidate image data, the user can visually recognize the candidate image corresponding to each candidate area through the video displayed on the display unit 160.

さらに、映像に表れている候補画像のうち特定の候補画像を指定するために、ユーザーが操作部１８０を用いて候補画像を指定すると、操作部１８０により指定情報が生成され、入力された指定情報に基づいて、指定された候補画像に対応する顔領域が主被写体領域として選択部１６５０により選択される。
このように、このビデオカメラ１００では、候補領域に対応する候補画像を候補領域とは別個に表示し、表示されている候補画像を利用して間接的に候補領域を選択することができるため、表示部１６０に表示された被写体が動いている場合であっても、候補領域の選択操作を容易に行うことができる。
（２）
合成画像データが映像として表示部１６０に表示されている場合に、表示部１６０の画面上で候補画像が特定の位置に配置されるため、検出された候補領域とは異なり、被写体が動いている場合であっても画面上で候補画像が一定の位置で表示される。したがって、候補領域を直接指定する場合に比べて、被写体の動きの影響を受けることなく、候補画像を介して所望の候補領域を容易に指定することができる。 Further, when the user designates a specific candidate image among the candidate images appearing in the video by using the operation unit 180, the operation unit 180 generates designation information, and based on the input designation information the selection unit 1650 selects the face area corresponding to the designated candidate image as the main subject area.
Thus, in this video camera 100, candidate images corresponding to the candidate areas can be displayed separately from the candidate areas, and the candidate areas can be indirectly selected using the displayed candidate images. Even when the subject displayed on the display unit 160 is moving, the candidate region can be easily selected.
(2)
When the composite image data is displayed as a video on the display unit 160, the candidate images are arranged at specific positions on the screen of the display unit 160. Therefore, unlike the detected candidate areas, the candidate images are displayed at fixed positions on the screen even when the subject is moving. Consequently, compared with specifying a candidate area directly, a desired candidate area can be specified easily via its candidate image without being affected by the movement of the subject.

また、表示部１６０の画面上で複数の候補画像が並ぶように、合成部１６４０が候補画像データと表示用画像データとを合成するため、合成画像データが映像として表示部１６０に表示されている場合に、複数の候補画像が並んで表示されることになる。このため、複数の候補画像を確認しやすくなり、さらに操作性が高まる。
さらに、表示部１６０に表示されている映像内において鉛直方向下半分の領域内に候補画像が配置されているため、候補画像により映像の表示が妨げられにくくなる。
特に、人間の上半身あるいは全身を撮影する場合、人間の顔は画面上で鉛直方向の上半分に集中しやすいため、候補表示領域４１０を下半分の領域内に配置することで、候補画像により候補領域の表示が妨げられにくくなる。
（３）
候補画像が表示用画像データの画像と重なっているため、表示用画像データにより生成される画像を表示部１６０の画面上に大きく表示することができ、表示部１６０の画面を有効利用することができる。 Further, since the combining unit 1640 combines the candidate image data and the display image data so that a plurality of candidate images are arranged on the screen of the display unit 160, the combined image data is displayed on the display unit 160 as a video. In this case, a plurality of candidate images are displayed side by side. For this reason, it becomes easy to confirm a plurality of candidate images, and the operability is further improved.
Further, since the candidate images are arranged in the lower half region in the vertical direction in the video displayed on the display unit 160, the display of the video is hardly hindered by the candidate images.
In particular, when photographing the upper body or the whole body of a person, human faces tend to be concentrated in the vertically upper half of the screen. Placing the candidate display area 410 within the lower half therefore makes the candidate images less likely to obstruct the display of the candidate areas.
(3)
Since the candidate images overlap the image of the display image data, the image generated from the display image data can be displayed large on the screen of the display unit 160, and the screen of the display unit 160 can be used effectively.

また、表示部１６０の画面上で候補画像が表示される候補表示領域４１０の面積が候補表示領域４１０以外の領域４２０の面積よりも小さいため、表示部１６０に表示されている表示用画像データの映像を確認しやすくなる。
（４）
１つの画像データから検出された各顔領域内の画像データが生成部１６２０により抽出され、抽出された画像データが生成部１６２０により候補画像データとして設定される。このため、表示部１６０に表示される候補画像と顔領域との対応関係が視覚的に明確となり、顔領域の選択を容易に行うことができる。
また、区画領域４１０ａ〜４１０ｄの大きさと候補画像の大きさとが生成部１６２０により比較され、比較結果に基づいて候補画像の大きさが生成部１６２０により調整される。 In addition, since the candidate display area 410 in which the candidate images are displayed on the screen of the display unit 160 is smaller in area than the area 420 other than the candidate display area 410, the video of the display image data displayed on the display unit 160 is easier to check.
(4)
Image data in each face area detected from one image data is extracted by the generation unit 1620, and the extracted image data is set as candidate image data by the generation unit 1620. For this reason, the correspondence between the candidate image displayed on the display unit 160 and the face area is visually clarified, and the face area can be easily selected.
Further, the size of the partition areas 410a to 410d and the size of the candidate image are compared by the generation unit 1620, and the size of the candidate image is adjusted by the generation unit 1620 based on the comparison result.

具体的には、生成部１６２０は、候補画像が区画領域４１０ａ〜４１０ｄよりも小さい場合は区画領域４１０ａ〜４１０ｄに合わせて候補画像に拡大処理を施し、拡大処理が施された候補画像を候補識別情報として設定する。
また、生成部１６２０は、候補画像が区画領域４１０ａ〜４１０ｄよりも大きい場合は区画領域４１０ａ〜４１０ｄに合わせて候補画像に縮小処理を施し、縮小処理が施された候補画像を候補識別情報として設定する。
この場合、例えば、候補画像の大きさにばらつきがあっても、候補画像の大きさを区画領域４１０ａ〜４１０ｄの大きさに合わせることができる。候補画像が区画領域４１０ａ〜４１０ｄに比べて小さくても区画領域４１０ａ〜４１０ｄの大きさまで候補画像が拡大されるため、ユーザーが候補画像を確認しやすくなる。また、候補画像が区画領域４１０ａ〜４１０ｄに比べて大きくても区画領域４１０ａ〜４１０ｄの大きさまで候補画像が縮小されるため、ユーザーが候補画像を確認しやすくなる。 Specifically, when the candidate image is smaller than the partitioned areas 410a to 410d, the generation unit 1620 enlarges the candidate image to match the partitioned areas 410a to 410d and sets the enlarged candidate image as candidate identification information.
In addition, when the candidate image is larger than the partitioned areas 410a to 410d, the generation unit 1620 reduces the candidate image to match the partitioned areas 410a to 410d and sets the reduced candidate image as candidate identification information.
In this case, for example, even if there is a variation in the size of the candidate image, the size of the candidate image can be matched with the size of the partitioned areas 410a to 410d. Even if the candidate image is smaller than the divided areas 410a to 410d, the candidate image is enlarged to the size of the divided areas 410a to 410d, so that the user can easily confirm the candidate image. Further, even if the candidate image is larger than the partitioned areas 410a to 410d, the candidate image is reduced to the size of the partitioned areas 410a to 410d, so that the user can easily confirm the candidate image.

（５）
合成部１６４０が、候補画像と画像データとを合成する際、顔４００ａの候補領域であることを示す第１検出枠４３０ａを顔４００ａの候補領域に付加し、顔４００ａの候補領域に対応する候補画像であることを示す第１装飾枠４１１ａを候補画像が表示されている区画領域に付加する。このため、表示部１６０に表示される候補画像と候補領域との対応関係が視覚的にさらに明確となり、候補領域の選択を容易に行うことができる。
例えば、図４に示すように、第１検出枠４３０ａが第２検出枠４３０ｂとは異なる視覚的特徴を有しており、第１装飾枠４１１ａが第２装飾枠４１１ｂとは異なる視覚的特徴を有している。さらに、第１装飾枠４１１ａが第１検出枠４３０ａと実質的に同一の視覚的特徴を有しており、第２装飾枠４１１ｂが第２検出枠４３０ｂと実質的に同一の視覚的特徴を有している。これらの構成により、複数の候補領域を識別しやすくなり、さらに候補領域と候補画像との対応関係が視覚的に明確となる。したがって、表示されている候補画像を用いて候補領域を容易に選択することができる。 (5)
When combining the candidate images with the image data, the combining unit 1640 adds to the candidate area of the face 400a a first detection frame 430a indicating that it is the candidate area of the face 400a, and adds to the partitioned area in which the candidate image is displayed a first decoration frame 411a indicating that the image is the candidate image corresponding to the candidate area of the face 400a. This makes the correspondence between the candidate images displayed on the display unit 160 and the candidate areas even clearer visually, so a candidate area can be selected easily.
For example, as shown in FIG. 4, the first detection frame 430a has visual characteristics different from those of the second detection frame 430b, and the first decoration frame 411a has visual characteristics different from those of the second decoration frame 411b. Further, the first decoration frame 411a has substantially the same visual characteristics as the first detection frame 430a, and the second decoration frame 411b has substantially the same visual characteristics as the second detection frame 430b. These configurations make it easier to distinguish the candidate areas from one another and make the correspondence between candidate areas and candidate images visually clear. A candidate area can therefore be selected easily using the displayed candidate images.

［第２実施形態］
第１実施形態では、検出された顔領域をすべて候補画像としているが、その変形例として、あらかじめ登録しておいた顔と合致する顔のみを顔領域と判定してもよい。なお、以下の説明では、第１実施形態と実質的に同じ機能を有する構成には同じ符号を付し、その詳細な説明は省略する。
＜ビデオカメラの構成＞
第２実施形態に係るビデオカメラ５００では、検出された顔領域を登録しておき、登録された顔領域と同じ視覚的特徴を有する顔領域のみを検出するようになっている。
具体的には図８および図９に示すように、このビデオカメラ５００は、ビデオカメラ１００の構成のうち、検出部１３０の構成およびシステム制御部１５７の構成が第１実施形態と異なっており、検出部２３０と、システム制御部２５７と、を備えている。検出部２３０の顔検出部２２０は、検出用画像データから顔領域を検出する際、検出した顔領域の特徴量を算出する。この特徴量は、検出された顔領域を識別するためのデータとして用いられる。 [Second Embodiment]
In the first embodiment, all detected face areas are set as candidate images. However, as a modified example, only a face that matches a face registered in advance may be determined as a face area. In the following description, components having substantially the same functions as those in the first embodiment are denoted by the same reference numerals, and detailed description thereof is omitted.
<Configuration of video camera>
In the video camera 500 according to the second embodiment, the detected face area is registered, and only the face area having the same visual characteristics as the registered face area is detected.
Specifically, as shown in FIGS. 8 and 9, the video camera 500 differs from the video camera 100 of the first embodiment in the configurations of the detection unit 130 and the system control unit 157, and includes a detection unit 230 and a system control unit 257 instead. When detecting a face area from the detection image data, the face detection unit 220 of the detection unit 230 calculates the feature amount of the detected face area. This feature amount is used as data for identifying the detected face area.

また図８に示すように、検出部２３０は認識部７００を有している。認識部７００は、顔検出部２２０から顔領域の特徴量を取得し、あらかじめ揮発性メモリ２１０に登録済みの顔の特徴量（登録特徴量）と取得した顔の特徴量（検出特徴量）とを照合し、両者が示す顔が同一のものであるかどうかを判定する機能を有している。さらに、認識部７００は、検出特徴量と一致する登録特徴量を特定し、特定した登録特徴量についての情報をリスト管理部１６１０へ送る。リスト管理部１６１０は、登録特徴量に関連付けられている登録画像データをバッファメモリ１３５から取得する。認識部７００およびリスト管理部１６１０により登録情報取得部が構成されている。
図９は、第２実施形態におけるビデオカメラ５００の、システム制御部２５７の構成を示したものである。図９に示されるように、システム制御部２５７は、図３と同様の機能ブロックに加え、登録部１６６０をさらに有している。登録部１６６０は、登録特徴量を認識部７００を介して揮発性メモリ２１０に登録情報として格納し、登録画像データを登録特徴量と関連付けてバッファメモリ１３５上に格納する。また、登録部１６６０は、登録画像データとそれに対応する登録特徴量とを関連付けて、システムバス１３４および記録Ｉ／Ｆ部１４５を介して記録媒体１５５に登録情報として記録する。これにより、ビデオカメラ５００の電源がＯＦＦになっても登録情報が失われることはない。 As shown in FIG. 8, the detection unit 230 has a recognition unit 700. The recognition unit 700 acquires the feature amount of a face area from the face detection unit 220, collates the feature amount of a face registered in advance in the volatile memory 210 (registered feature amount) with the acquired feature amount of the face (detected feature amount), and determines whether the two represent the same face. Further, the recognition unit 700 identifies the registered feature amount that matches the detected feature amount and sends information about the identified registered feature amount to the list management unit 1610. The list management unit 1610 acquires the registered image data associated with the registered feature amount from the buffer memory 135. The recognition unit 700 and the list management unit 1610 constitute a registration information acquisition unit.
FIG. 9 shows the configuration of the system control unit 257 of the video camera 500 in the second embodiment. As illustrated in FIG. 9, the system control unit 257 further includes a registration unit 1660 in addition to the functional blocks similar to those in FIG. 3. The registration unit 1660 stores the registered feature amount as registration information in the volatile memory 210 via the recognition unit 700, and stores the registered image data on the buffer memory 135 in association with the registered feature amount. Also, the registration unit 1660 associates the registered image data with the registered feature amount corresponding thereto, and records it as registration information on the recording medium 155 via the system bus 134 and the recording I / F unit 145. As a result, registration information is not lost even when the power of the video camera 500 is turned off.
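The collation performed by the recognition unit 700 can be illustrated with a nearest-match search. The description does not specify the form of the feature amount or the matching criterion, so everything below is an assumption: feature amounts are modeled as numeric tuples, similarity as Euclidean distance, and `match_registered` and `threshold` are hypothetical names.

```python
import math


def match_registered(detected, registered, threshold=0.5):
    """Return the key of the registered feature closest to the detected one.

    detected:   detected feature amount, modeled as a tuple of floats
    registered: dict of key -> registered feature amount (same length tuples)
    threshold:  assumed maximum distance for two features to count as the
                same face; None is returned when nothing is close enough
    """
    best_key, best_dist = None, threshold
    for key, feature in registered.items():
        dist = math.dist(detected, feature)  # Euclidean distance
        if dist <= best_dist:
            best_key, best_dist = key, dist
    return best_key
```

The returned key stands in for the "information about the identified registered feature amount" that is passed on so that the associated registered image data can be looked up.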

＜ビデオカメラの動作＞
（１）顔領域の登録
被写体の登録方法について説明する。操作部１８０をなすモード変更ダイヤル（図示せず）を被写体の登録モードに合わせると、ビデオカメラ５００が被写体の登録モードに移行する。登録モードが開始されると、被写体の映像が表示部１６０にリアルタイムで映し出される。このとき、第１実施形態と同様に、検出された顔領域に対応する候補画像が候補表示領域４１０に表示されている。
図１０に示すように、ステップＳ８００において、操作部１８０は、タッチパネル上のタッチ位置を検出し、区画領域４１０ａ〜４１０ｄのいずれかがタッチされたかどうかを判定する。表示部１６０の画面がタッチされた場合は、操作部１８０によりタッチ位置を示す指定情報が生成され、処理がステップＳ８０１に進む。ステップＳ８０１では、候補情報リストに登録されている各候補画像データの表示位置情報とタッチ位置の座標情報とが選択部１６５０により比較され、タッチされた位置に表示されている候補画像データが特定され、処理がステップＳ８０２に進む。 <Operation of the video camera>
(1) Registration of face area
A method for registering a subject will be described. When a mode change dial (not shown) of the operation unit 180 is set to the subject registration mode, the video camera 500 shifts to the subject registration mode. When the registration mode starts, the video of the subject is displayed on the display unit 160 in real time. At this time, as in the first embodiment, candidate images corresponding to the detected face areas are displayed in the candidate display area 410.
As illustrated in FIG. 10, in step S800, the operation unit 180 detects a touch position on the touch panel and determines whether any of the partition areas 410a to 410d is touched. When the screen of the display unit 160 is touched, designation information indicating the touch position is generated by the operation unit 180, and the process proceeds to step S801. In step S801, the display position information of each candidate image data registered in the candidate information list and the coordinate information of the touch position are compared by the selection unit 1650, and the candidate image data displayed at the touched position is specified. The process proceeds to step S802.

In step S802, as in step S602 of the first embodiment, the selection unit 1650 identifies the face area corresponding to the identified candidate image data based on the candidate information list. In step S810, the generation unit 1620 extracts the image in the face area as registered image data, based on the coordinate information and size information of the face area identified in step S802. When generating the registered image data, as in the first embodiment, the generation unit 1620 compares the size of the partitioned areas 410a to 410d with the size of the registered image generated from the registered image data, and adjusts the size of the registered image based on the comparison result. In the subsequent step S820, the detection unit 230 acquires the feature amount of the face area identified in step S802 as a registered feature amount. The process then proceeds to step S830, where the registration unit 1660 stores the acquired registered feature amount as registration information in the volatile memory 210 via the recognition unit 700, and stores the registered image data generated by the generation unit 1620 in the buffer memory 135 in association with the registered feature amount.
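Step S810 can be pictured as a crop followed by a size adjustment to the partitioned area. The sketch below assumes a grayscale frame held as a list of rows and uses nearest-neighbor scaling; the specification does not state which scaling method is used, and all function names here are hypothetical.

```python
def crop(image, x, y, w, h):
    """Cut the w-by-h rectangle whose top-left corner is (x, y) out of the image."""
    return [row[x:x + w] for row in image[y:y + h]]


def resize_nearest(image, out_w, out_h):
    """Scale an image to out_w by out_h using nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def make_registered_image(frame, face_x, face_y, face_w, face_h, slot_w, slot_h):
    """Extract the face area and adjust it to the partitioned-area size if needed."""
    face = crop(frame, face_x, face_y, face_w, face_h)
    if (face_w, face_h) != (slot_w, slot_h):   # the size comparison described above
        face = resize_nearest(face, slot_w, slot_h)
    return face


# An 8x6 test frame whose pixel value encodes its position (10 * row + column).
frame = [[10 * r + c for c in range(8)] for r in range(6)]
thumb = make_registered_image(frame, 2, 1, 4, 4, 2, 2)
```

This sketch ignores aspect ratio; a real implementation would probably preserve it or pad the result.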

Finally, in step S840, the registration unit 1660 associates the registered image data with the corresponding registered feature amount and records them as registration information on the recording medium 155 via the system bus 134 and the recording I/F unit 145. In step S850, the system control unit 257 checks whether there is an end instruction from the user. Specifically, the system control unit 257 checks whether the mode has been changed with the mode change dial that forms part of the operation unit 180. If the mode change dial is set to a mode other than the registration mode, the registration mode ends; if it remains set to the registration mode, the processes in steps S800 to S840 are repeated.
Note that, owing to the processing in step S840, information such as the registered image data and registered feature amounts is not lost even when the power of the video camera 500 is turned off.

(2) Display processing
FIG. 11 is a flowchart showing the flow of creating and displaying main subject candidate images for designating a main subject. Because some steps have the same processing contents as in the first embodiment shown in FIG. 3, the description focuses on the steps that differ.
When the power of the video camera 500 is turned on, in step S1400, the system control unit 257 starts reading the registered feature amounts and registered image data recorded on the recording medium 155 via the system bus 134 and the recording I/F unit 145. The read registered feature amounts are sent from the system control unit 257 to the recognition unit 700, which temporarily stores them in the volatile memory 210. The read registered image data is stored in the buffer memory 135 by the system control unit 257 and is used as the candidate images displayed in the partitioned areas 410a to 410d.
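Recording the registration information in step S840 and reading it back in step S1400 form a save and restore pair. The sketch below illustrates that round trip with a JSON file standing in for the recording medium 155; the file format, the record layout, and the function names are assumptions made for illustration and are not described in the specification.

```python
import json
import os
import tempfile


def save_registrations(path, registrations):
    """Write every (feature amount, image) record to non-volatile storage."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(registrations, f)


def load_registrations(path):
    """Read the records back at power-on; an absent file means nothing is registered."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# One registered face: a feature vector and a tiny thumbnail (illustrative values).
registrations = [{"feature": [0.1, 0.9, 0.3], "image": [[12, 14], [32, 34]]}]
path = os.path.join(tempfile.mkdtemp(), "registrations.json")
save_registrations(path, registrations)
restored = load_registrations(path)
```

Keeping the feature amount and the image in the same record preserves the association between them that the registration unit 1660 is described as maintaining.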

For example, in the moving image shooting mode, video of the subject is displayed on the display unit 160 in real time. Specifically, as in the first embodiment, in steps S300 and S310, detection image data is input to the detection unit 230, and the face detection unit 220 detects face areas based on the detection image data stored in the volatile memory 210. At this time, as in the first embodiment, the coordinate information and size information of each face area detected by the detection unit 230 are acquired from the detection image data.
In addition to the face area detection in step S310, the present embodiment also calculates the feature amounts used for face recognition processing. Specifically, in step S1410, the feature amount of each face area detected in step S310 is acquired from the detection image data as a detected feature amount. Like the reference feature amount, the detected feature amount is data in which the features of a human face are digitized; unlike the reference feature amount, however, it is data specific to the face area. The identity of a face area detected by the detection unit 230 can therefore be determined using the detected feature amount. The detected feature amount obtained by the face detection unit 220 is temporarily stored in the volatile memory 210. A detected feature amount is calculated for each face area detected in one frame of detection image data.

In steps S311 to S350, the candidate information list is created and updated as described above. At this time, the list management unit 1610 stores each detected feature amount in the candidate information list in association with the coordinate information and the size information.
Thereafter, in step S1420, the recognition unit 700 compares each registered feature amount stored in the volatile memory 210 with the detected feature amounts in the candidate information list. If the recognition unit 700 determines in step S1430 that a registered feature amount matching a detected feature amount exists, then in step S1440 the recognition unit 700 identifies the matching registered feature amount and sends information about it to the list management unit 1610, and the list management unit 1610 acquires the registered image data associated with the identified registered feature amount from the buffer memory 135.
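The comparison in steps S1420 to S1440 can be sketched as a nearest-neighbor search with a distance threshold. The specification does not say how feature amounts are represented or compared, so the fixed-length vectors, the Euclidean distance, and the threshold value below are all assumptions for illustration.

```python
import math


def find_matching_registration(detected, registered, threshold=0.5):
    """Return the id of the closest registered feature within threshold, else None."""
    best_id, best_dist = None, threshold
    for reg_id, feature in registered.items():
        dist = math.dist(detected, feature)   # Euclidean distance between vectors
        if dist <= best_dist:
            best_id, best_dist = reg_id, dist
    return best_id


# Hypothetical registered feature amounts keyed by registration id.
registered_features = {
    "reg_a": [0.1, 0.9, 0.3],
    "reg_b": [0.8, 0.2, 0.5],
}
match = find_matching_registration([0.12, 0.88, 0.31], registered_features)
no_match = find_matching_registration([0.5, 0.5, 0.9], registered_features)
```

A result of `None` corresponds to the "no" branch of step S1430, in which no registered image is fetched for that face area.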

In step S1450, the list management unit 1610 associates the registered image data with the detected feature amount. In step S1370, as with the candidate image data of the first embodiment, the combining unit 1640 combines the registered image data with the display image data to generate composite image data.
Specifically, as shown in FIG. 12, in step S1371, the combining unit 1640 assigns candidate image data to the four partitioned areas 410a to 410d. In this assignment, the combining unit 1640 assigns the registered image data starting from the leftmost partitioned area 410a, in the order in which the face areas are registered in the candidate information list. In the display example of FIG. 13, when face areas are registered in the candidate information list in the order of the face 400a and the face 400b, the registered image of the face 400a is displayed in the partitioned area 410a, and the registered image of the face 400b is displayed in the partitioned area 410b. For the face 400c, for which no registered image data exists, neither a detection frame nor a registered image is displayed.
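The assignment rule of step S1371 (fill the partitioned areas from the left in list order, skipping faces without registered image data) can be sketched as follows. The string identifiers reuse the reference numerals only as labels; the data structures and the function name are hypothetical.

```python
SLOT_NAMES = ["410a", "410b", "410c", "410d"]   # partitioned areas, left to right


def assign_slots(candidate_list, registered_images):
    """Map partitioned areas to face ids in candidate-list order.

    Faces with no registered image are skipped, and faces beyond the
    number of available partitioned areas are left unassigned.
    """
    assignment = {}
    free_slots = iter(SLOT_NAMES)
    for face_id in candidate_list:
        if face_id not in registered_images:
            continue                      # e.g. face 400c in FIG. 13
        slot = next(free_slots, None)
        if slot is None:
            break                         # all four partitioned areas are in use
        assignment[slot] = face_id
    return assignment


layout = assign_slots(["400a", "400b", "400c"],
                      {"400a": "image_a", "400b": "image_b"})
```

The resulting mapping plays the role of the display position information that is subsequently written back to the candidate information list.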

The result of the assignment is sent from the combining unit 1640 to the list management unit 1610. Specifically, the combining unit 1640 sends the display position information of each registered image data to the list management unit 1610, which registers the display position information in the candidate information list in association with the corresponding registered image data.
In the next step S1372, the combining unit 1640 combines the registered images with the display image data. At this time, the combining unit 1640 combines the registered image data and the display image data so that each registered image is displayed as a candidate image at the position assigned in step S1371. Because the generation unit 1620 adjusts the size of the candidate image when the registered image data is generated, the candidate images are displayed on the screen of the display unit 160 at the same size as the partitioned areas 410a to 410d, regardless of the size of the detected face areas.

As in the first embodiment, in step S1373, the combining unit 1640 combines decoration information with the display image data in order to make clear the correspondence between the face areas and the registered images on the screen of the display unit 160. Specifically, decoration data representing the first detection frame 430a, the second detection frame 430b, the first decoration frame 411a, and the second decoration frame 411b shown in FIG. 13 is added to the display image data.
In the subsequent step S380, the generated composite image data is displayed on the display unit 160. Thus, as shown in FIG. 13, for the registered face areas (faces 400a and 400b) among the detected face areas, the registered images are displayed in the partitioned areas 410a and 410b using the registered image data, and nothing is displayed for the unregistered face area (face 400c). Because a registered image is a still image acquired at the time of registration, its orientation and expression differ from those of the current faces 400a and 400b, and the registered images displayed in the partitioned areas 410a and 410b do not change even if the orientation of the subject changes.

The subsequent processing is substantially the same as the flow shown in FIG. 7, so a detailed description is omitted.
In this case, the feature amount detected by the detection unit 230 is recorded on the recording medium 155 via the registration unit 1660 as a registered feature amount, in association with the corresponding candidate identification information. When the detection unit 230 generates a detected feature amount from the input image data, the recognition unit 700 collates it with the registered feature amounts, and the list management unit 1610 identifies the registered image data associated with the registered feature amount that substantially matches the detected feature amount.
Further, the combining unit 1640 combines the registered image data identified by the list management unit 1610 with the display image data to generate composite image data. In other words, if a face area whose feature amount substantially matches a registered feature amount exists, the candidate identification information associated with that registered feature amount is displayed on the display unit 160. Therefore, if a registered feature amount is registered in association with candidate identification information, the same registered image data is displayed on the display unit 160 for face areas having the same feature amount, which makes it easier to recognize the correspondence between a face area and its registered image data.

[Third Embodiment]
This embodiment is a modification of the second embodiment. In this embodiment, character information identifying a face area is registered instead of registered image data, and the registered character information is displayed in the candidate display area 410 of the display unit 160. The character information may be, for example, the name of the subject.
In the registration mode described above, when the registered image data is generated, character information indicating a registration name is input via the operation unit 180. FIG. 14 shows an example of an input screen for entering the character information.
In FIG. 14, a registered image 1000 is an image of the face to be registered. The soft keyboard 1010 is an area for entering character information and has, in addition to the character key areas, a character-type switching key area, a character deletion key area, and a confirm key area. When the user touches the touch panel that is provided on the display unit 160 and forms part of the operation unit 180, the system control unit 257 determines which key area of the soft keyboard 1010 the touched coordinates point to, and the character corresponding to that key is treated as having been input. The input characters are displayed in the input area 1020.
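Once a touch has been resolved to a key area, the contents of the input area 1020 can be updated by a small state transition per key press. The sketch below assumes the key has already been identified and models only the character, delete, and confirm keys; the key labels `"DELETE"` and `"ENTER"` and the function names are hypothetical, and the character-type switching key is omitted.

```python
def apply_key(text, key):
    """Apply one soft-keyboard key press; return (new_text, confirmed)."""
    if key == "DELETE":
        return text[:-1], False      # remove the last character, if any
    if key == "ENTER":
        return text, True            # the confirm key finishes the entry
    return text + key, False         # any other key appends its character


def enter_name(keys):
    """Feed a sequence of key presses and return the resulting registration name."""
    text, confirmed = "", False
    for key in keys:
        text, confirmed = apply_key(text, key)
        if confirmed:
            break
    return text, confirmed


name, done = enter_name(["T", "A", "R", "X", "DELETE", "O", "ENTER"])
```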

FIG. 15 shows a flowchart for registering a subject in this embodiment. Steps similar to those in FIG. 10 are denoted by the same reference signs, and their detailed description is omitted.
The difference from the flowchart of FIG. 10 in the second embodiment is step S1100. In step S1100, as described above, a character string identifying a main subject candidate is input. At this time, the registered image data extracted in step S810 is displayed on the display unit 160 as the registered image 1000.
The registration information recorded on the recording medium 155 in step S840 includes the character information input in step S1100 in addition to the registered image data and the registered feature amount described in the second embodiment. At this time, the registration unit 1660 associates the character information with the registered feature amount and records it in, for example, the buffer memory 135.

FIG. 16 shows a display example of this embodiment. Components similar to those in FIG. 13 are given the same reference numerals, and their description is omitted. In FIG. 16, the candidate display area 410 is provided with partitioned areas 1200a to 1200d corresponding to the partitioned areas 410a to 410d described above. Identification character strings 1202a and 1202b identifying the face areas are displayed; the identification character string 1202a corresponds to the face 400a on the right, and the identification character string 1202b corresponds to the face 400b on the left.
Because the identification character string 1202a corresponds to the face 400a, the first decoration frame 1201a surrounding the partitioned area 1200a is drawn with the same type of broken line as the first detection frame 430a. Because the identification character string 1202b corresponds to the face 400b, the second decoration frame 1201b surrounding the partitioned area 1200b is drawn with the same type of broken line as the second detection frame 430b. The detection frame and the decoration frame of the face area selected as the main subject area are drawn with bold lines. In the display example of FIG. 16, the area enclosed by the second detection frame 430b is selected as the main subject area, so the second detection frame 430b and the second decoration frame 1201b are drawn with bold lines.

As in the first and second embodiments described above, touching the identification character string 1202a selects the face area 430a corresponding to the identification character string 1202a as the main subject area.
In this case, basically the same effects as in the second embodiment are obtained; in addition, using character information makes it easy to tell whose face each face area shows.
[Other Embodiments]
The present invention is not limited to the embodiments described above, and various changes and modifications can be made without departing from the scope of the present invention.
(1)
In the embodiments described above, the display device has been described using the recording operation of a video camera as an example. However, the technique described above is also applicable to the live view mode during still image shooting and to playback of recorded video. For example, to enlarge a subject of interest during playback, the subject to be enlarged must be designated. The present invention is applicable in such a case as well.

The present invention is also applicable to imaging devices other than video cameras, such as digital cameras and camera-equipped mobile phones, and to any equipment with a display device that displays video, such as televisions, car navigation systems, and game machines.
(2)
The embodiments described above assume that one main subject area is selected, but a plurality of main subject areas may be selectable.
(3)
In the embodiments described above, the main subject area is selected using the touch panel, but another type of operation unit may be used. For example, the main subject area may be selected using cursor keys that can indicate a direction. Even in this case, the user selects from the candidate images lined up on the screen rather than directly selecting main subject candidates whose positional relationship constantly changes, so the desired main subject area can be selected reliably.

(4)
In the embodiments described above, the candidate images, registered images, and character information are displayed superimposed on the display image, but a different display method may be used. For example, the candidate images may be made translucent so that the display image shows through even in the area where the candidate images are displayed. Alternatively, the display image may be reduced so that it does not overlap the candidate images, and the candidate images and the display image may be displayed on the display unit 160 without being superimposed.
(5)
In the embodiments described above, human faces are detected as candidate areas, but the detection unit 130 may instead detect human bodies, pets, or the like as candidate areas. Alternatively, the detection is not limited to specific objects: the detection unit 130 may detect, as candidate areas, moving areas obtained by taking the difference between input images, or areas containing a specific color component. When candidate areas are detected by taking the difference between images, the detection unit 130 compares two sets of image data acquired at different times to recognize a moving subject, and detects candidate areas having a specific visual feature.
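The difference-based detection mentioned in (5) can be illustrated with a simple frame-differencing sketch: pixels whose brightness changes by more than a threshold between two frames are treated as moving, and their bounding box becomes the candidate area. The grayscale list-of-rows frames, the threshold value, and the single-bounding-box simplification are assumptions for illustration only.

```python
def moving_region(prev, curr, threshold=20):
    """Return the bounding box (x, y, w, h) of changed pixels, or None if none changed."""
    xs, ys = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_prev, row_curr)):
            if abs(c - p) > threshold:    # brightness changed between the two frames
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


# A 5x5 dark frame, then the same frame with a bright 2x2 object appearing.
prev_frame = [[0] * 5 for _ in range(5)]
curr_frame = [row[:] for row in prev_frame]
for y in (2, 3):
    for x in (1, 2):
        curr_frame[y][x] = 200

region = moving_region(prev_frame, curr_frame)
no_motion = moving_region(prev_frame, prev_frame)
```

A practical detector would likely separate multiple moving objects and suppress noise, which this sketch does not attempt.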

(6)
In the embodiments described above, the system control unit 157 and the detection unit 130 are separate components, but the processing performed by the detection unit 130 may be performed by the system control unit 157. In addition, although the program is stored in the ROM, it may instead be read from an external recording medium.
(7)
The area detected by the detection unit 130 is not limited to an area representing a human face. For example, the detection unit 130 may detect an area representing all or part of a human body, or an area containing a specific color component.
(8)
In the embodiments described above, the generation unit 1620 enlarges or reduces the candidate images and registered images to match the size of the partitioned areas 410a to 410d. However, if the unprocessed candidate images and registered images fit within the partitioned areas 410a to 410d, the enlargement processing may be omitted.

In addition, although the candidate display area 410 is arranged at the bottom of the screen, its arrangement is not limited to that of the embodiments described above.
(9)
In the first embodiment described above, the first decoration frame 411a and the second decoration frame 411b are added to the candidate images, but an embodiment without such decoration frames, as shown in FIG. 17, is also conceivable. To make the face area selected as the main subject area easy to identify, only the detection frame of the main subject area may be given a visual feature different from that of the other detection frames.
Further, although the first detection frame 430a and the first decoration frame 411a have the same visual features, they may have merely substantially the same visual features, as long as it can be recognized that the frames 430a and 411a correspond to each other. For example, in FIG. 4, the first decoration frame 411a and the second decoration frame 411b use different broken-line intervals, but different frame colors may be used instead.

(10)
In the embodiments described above, the function of tracking a face area is realized based on the coordinate information of the face area. However, the movement amount of the face area may also be stored as a motion vector to improve determination accuracy when a plurality of candidate images exist. In addition, because the size of a face is unlikely to change greatly between sets of image data, a condition may be added that the change in the size of the face area between sets of image data falls within a certain range.
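The two refinements in (10) can be sketched together: the stored motion vector predicts where the face should appear in the next frame, and detections whose size differs too much from the tracked face are rejected. The `Track` structure, the distance limit, and the size-ratio limit are hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """State kept for one tracked face area (all fields are hypothetical)."""
    x: float
    y: float
    size: float
    vx: float = 0.0   # motion vector: movement per frame in x
    vy: float = 0.0   # motion vector: movement per frame in y


def associate(track, detections, max_dist=30.0, max_size_ratio=1.5):
    """Pick the detection (x, y, size) that best continues the track, or None."""
    pred_x, pred_y = track.x + track.vx, track.y + track.vy
    best, best_dist = None, max_dist
    for det in detections:
        ratio = max(det[2], track.size) / min(det[2], track.size)
        if ratio > max_size_ratio:
            continue                       # size changed too much between frames
        dist = math.hypot(det[0] - pred_x, det[1] - pred_y)
        if dist <= best_dist:
            best, best_dist = det, dist
    return best


def update(track, det):
    """Store the new position and refresh the motion vector."""
    track.vx, track.vy = det[0] - track.x, det[1] - track.y
    track.x, track.y, track.size = det


track = Track(x=100, y=100, size=40, vx=10, vy=0)
detections = [(112, 101, 42), (105, 100, 90), (300, 300, 40)]
best = associate(track, detections)
if best is not None:
    update(track, best)
```

Here the second detection is close but is rejected by the size condition, and the third is too far from the predicted position.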
(11)
The display example shown in FIG. 16 is only one example of the present embodiment. For example, the registered image data may be displayed on the display unit 160 together with the character information. That is, an embodiment combining the second and third embodiments is also conceivable.

(12)
In the flowchart shown in FIG. 5, when it is determined in step S320 that a face area has disappeared from the screen, the information on that face area is immediately deleted from the candidate information list in step S330. However, this information may instead be deleted after a certain time has elapsed. In this case, even if a brief occlusion or frame-out of a face area occurs, the candidate images displayed in the partitioned areas 410a to 410d do not change. This avoids the situation in which the contents of the candidate display area 410 change at the moment the user tries to select the main subject area, causing the selection to fail. In addition, because a main subject area that is not currently in the screen can be selected, automatic focusing control and automatic exposure control are promptly performed on the main subject area when the main subject enters the screen.
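The delayed deletion described in (12) can be sketched by counting, for each entry, how many consecutive frames the face has been missing and deleting the entry only after a grace period. The class name, the frame-count unit, and the default grace period are assumptions; the specification only says "a certain time".

```python
class CandidateList:
    """Candidate information list that tolerates short disappearances."""

    def __init__(self, grace_frames=30):
        self.grace_frames = grace_frames
        self.missing = {}   # face id -> consecutive frames without detection

    def update(self, detected_ids):
        """Process one frame's detections and drop entries missing for too long."""
        for face_id in detected_ids:
            self.missing[face_id] = 0            # visible again: reset the counter
        for face_id in list(self.missing):
            if face_id not in detected_ids:
                self.missing[face_id] += 1
                if self.missing[face_id] > self.grace_frames:
                    del self.missing[face_id]    # gone for too long: delete entry

    def entries(self):
        return list(self.missing)


faces = CandidateList(grace_frames=2)
faces.update({"a", "b"})
faces.update({"a"})          # "b" missing for 1 frame
faces.update({"a"})          # "b" missing for 2 frames, still listed
still_listed = "b" in faces.entries()
faces.update({"a"})          # "b" missing for 3 frames, now deleted
removed = "b" not in faces.entries()
```

Because the entry survives the grace period, its candidate image can stay in the same partitioned area while the face is briefly hidden.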

(13)
In the flowchart shown in FIG. 7, when it is determined that the main subject is not in the screen, normal automatic focusing control and automatic exposure control are performed immediately. However, they may instead be performed after a certain time has elapsed. Various methods are conceivable for automatic focusing control and automatic exposure control during that period; for example, both may be stopped. Because automatic focusing control and automatic exposure control then do not follow a brief occlusion or frame-out of the main subject, blurring and abrupt changes in screen brightness are less likely to occur, and a good captured image can be obtained.
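One way to picture the behavior in (13) is a three-way mode decision based on how long the main subject has been absent. The mode names, the frame-count unit, and the hold length below are hypothetical; the "hold" mode corresponds to the example in which focusing and exposure control are stopped during the waiting period.

```python
def select_control_mode(frames_since_main_subject_seen, hold_frames=15):
    """Choose how automatic focusing and exposure control should behave this frame."""
    if frames_since_main_subject_seen == 0:
        return "track_main_subject"   # main subject visible: control on its area
    if frames_since_main_subject_seen <= hold_frames:
        return "hold"                 # brief absence: keep the current focus and exposure
    return "normal"                   # long absence: fall back to normal control


modes = [select_control_mode(n) for n in (0, 1, 15, 16)]
```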
(14)
In the embodiments described above, the candidate images are displayed in the order of the candidate information list, but the display position of each candidate image may instead be determined according to the position of the face area. For example, in the display example shown in FIG. 4, the candidate image of the face 400b on the left is displayed in the partitioned area 410b, and the candidate image of the face 400a on the right is displayed in the partitioned area 410a, which is to the left of the partitioned area 410b. Instead, matching the arrangement of the faces 400a and 400b, the candidate image of the face 400b may be displayed in the partitioned area 410a and the candidate image of the face 400a in the partitioned area 410b. In this case, the positional relationship of the face areas corresponds to that of the candidate images, so the user can more easily grasp the correspondence between candidate images and face areas.
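The position-based ordering in (14) reduces to sorting the face areas by their horizontal coordinate before filling the partitioned areas from the left. In this sketch the face identifiers reuse the reference numerals as labels, and the x coordinates and function name are hypothetical.

```python
def assign_slots_by_position(faces, slots=("410a", "410b", "410c", "410d")):
    """Assign partitioned areas left to right in order of each face's x coordinate.

    faces is a list of (face_id, x) pairs; faces beyond the number of
    partitioned areas are left unassigned.
    """
    ordered = sorted(faces, key=lambda face: face[1])
    return {slot: face_id for slot, (face_id, _x) in zip(slots, ordered)}


# Face 400a is on the right of the screen and face 400b on the left, as in FIG. 4.
layout_by_position = assign_slots_by_position([("400a", 420), ("400b", 150)])
```

With this ordering the left face 400b lands in the partitioned area 410a and the right face 400a in 410b, matching the alternative arrangement described above.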

(15)
In the second embodiment described above, candidate image data is not displayed in the candidate display area 410 when an unregistered face is detected. However, as in the first embodiment, candidate image data may be generated from the detection image data and displayed in the candidate display area 410 when an unregistered face is detected. That is, the first and second embodiments may be combined.
In addition, when registration information is registered, the registered feature amount for recognition processing may be stored not on the recording medium 155 but in a non-volatile memory (not shown) provided in the video camera 500. In this case, the registration information can be read from the non-volatile memory even if the recording medium 155 is replaced with another recording medium, which improves the convenience of the device. Examples of the non-volatile memory include a flash memory and a hard disk drive.

(16)
Each functional block may be individually implemented as a single chip by a semiconductor device such as an LSI, or some or all of the functional blocks may be integrated into a single chip.
Although the term LSI is used here, the device may also be called an IC, a system LSI, a super LSI, or an ultra LSI depending on the degree of integration.
Further, the method of circuit integration is not limited to LSI, and implementation with a dedicated circuit or a general-purpose processor is also possible. An FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may be used.
Furthermore, if integrated circuit technology that replaces LSI emerges from advances in semiconductor technology or from another derivative technology, the functional blocks may naturally be integrated using that technology. The application of biotechnology is one possibility.

(17)
Each process of the above-described embodiments may be realized by hardware, by software, or by a combination of software and hardware.

The display device according to the present invention makes it easy to specify a main subject area, and the present invention is therefore useful in the field of apparatuses that perform focus control or exposure control on a specific area of a video.

DESCRIPTION OF SYMBOLS
100 Video camera
130 Detection part
157 System control part
160 Display part
180 Operation part
220 Face detection part
700 Recognition part
1620 List management part
1620 Generation part
1640 Synthesis part
1650 Selection part
1660 Registration part

Claims

A display device for displaying a plurality of image data as video,
A detection unit for detecting at least one candidate region having a specific visual feature from the image data;
A generating unit that generates at least one candidate identification information for identifying each candidate area detected by the detecting unit;
A combining unit that generates composite image data in which the candidate identification information and the image data are combined;
A display unit for displaying the composite image data as the video;
An operation unit for receiving input of designation information for designating specific candidate identification information among the candidate identification information appearing in the video;
Based on the designation information input to the operation unit, a selection unit that selects the candidate region corresponding to the candidate identification information designated through the operation unit as a reference region;
A display device comprising:

The combining unit combines the candidate identification information and the image data so that the candidate identification information is arranged at a specific position on the screen of the display unit.
The display device according to claim 1.

The combining unit combines the candidate identification information and the image data so that a plurality of the candidate identification information are arranged on the screen of the display unit.
The display device according to claim 1.

The combining unit combines the candidate identification information and the image data so that the candidate identification information is arranged in a lower half region in the vertical direction of the video displayed on the display unit,
The display device according to claim 1.

The combining unit combines the candidate identification information and the image data so that, in a state where the video is displayed on the display unit, the candidate identification information is superimposed on an image generated based on the image data,
The display device according to claim 1.

The area of the first display area where the candidate identification information is displayed on the screen of the display unit is smaller than the area of the second display area other than the first display area,
The display device according to claim 1.

The generation unit extracts candidate image data in each candidate area detected from one image data, and sets the extracted candidate image data as the candidate identification information.
The display device according to any one of claims 1 to 6.

The generation unit compares the size of the partition area in which one candidate image is displayed as the candidate identification information with the size of the candidate image, and adjusts the size of the candidate image based on a comparison result.
The display device according to claim 7.

When the candidate image is smaller than the partition area, the generation unit performs an enlargement process on the candidate image according to the partition area, and sets the candidate image subjected to the enlargement process as the candidate identification information,
The display device according to claim 8.

When the candidate image is larger than the partition area, the generation unit performs a reduction process on the candidate image according to the partition area, and sets the candidate image subjected to the reduction process as the candidate identification information,
The display device according to claim 8 or 9.
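
The size adjustment recited in the preceding claims (enlarging a candidate image smaller than the partition area and reducing one larger than it) can be sketched as a single scale computation. Preserving the aspect ratio and the name `fit_to_partition` are assumptions of this sketch; the claims only require that the size be adjusted based on the comparison result.

```python
def fit_to_partition(img_w, img_h, area_w, area_h):
    """Return the (width, height) of a candidate image scaled to a partition area.

    Assumption: the aspect ratio is preserved, so the image is scaled by the
    largest factor that still fits inside the partition area. A factor above 1
    enlarges the image and a factor below 1 reduces it.
    """
    scale = min(area_w / img_w, area_h / img_h)
    return round(img_w * scale), round(img_h * scale)


print(fit_to_partition(40, 50, 80, 80))    # smaller image is enlarged: (64, 80)
print(fit_to_partition(200, 100, 80, 80))  # larger image is reduced: (80, 40)
```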

When combining the candidate identification information and the image data, the combining unit adds, to each candidate area, first decoration information indicating the candidate area, and adds, to each area where the candidate identification information is displayed, second decoration information indicating that the candidate identification information corresponds to the candidate area,
The display device according to claim 1.

The first decoration information added to the reference area has a visual feature different from the other first decoration information,
The second decoration information added to the candidate identification information corresponding to the reference area has a visual feature that is different from the other second decoration information and is substantially the same as the visual feature of the corresponding first decoration information,
The display device according to claim 11.

The detection unit compares reference information representing the specific visual feature with the image data, and detects a region that satisfies the reference information as the candidate region.
The display device according to claim 1.

The candidate region having the specific visual feature is a region representing all or part of a human body;
The display device according to claim 1.

The candidate area having the specific visual feature is an area representing a human face;
The display device according to claim 1.

The candidate region having the specific visual feature is a region including a specific color component;
The display device according to claim 1.

The detection unit detects the candidate region having the specific visual feature by comparing two image data acquired at different timings.
The display device according to claim 1.
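
Detection by comparing two image data acquired at different timings can be illustrated with simple frame differencing. This is a hedged sketch, not the detection method of the embodiment: the grayscale nested-list frame format, the threshold value, and the function name `changed_region` are all assumptions.

```python
def changed_region(prev, curr, threshold=30):
    """Return the bounding box (left, top, right, bottom) of changed pixels.

    `prev` and `curr` are equally sized grayscale frames given as lists of
    rows. A pixel counts as changed when its absolute difference exceeds
    `threshold` (an arbitrary value chosen for this sketch). Returns None
    when nothing changed.
    """
    xs, ys = [], []
    for y, (row_p, row_c) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(row_p, row_c)):
            if abs(p - c) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


prev = [[0] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
curr[1][1] = curr[1][2] = curr[2][2] = 200  # a small object has moved in
print(changed_region(prev, curr))  # (1, 1, 2, 2)
```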

The designation information indicates a position on the screen of the display unit,
The selection unit identifies, based on the designation information input via the operation unit, the candidate identification information displayed at the position indicated by the designation information, and selects the candidate region corresponding to the identified candidate identification information as the reference region,
The display device according to claim 1.

The operation unit includes a touch panel unit that detects a contact position on the screen of the display unit and generates the designation information indicating the contact position.
The display device according to claim 1.
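
The position-based selection in the two preceding claims amounts to a hit test: the contact position reported by the touch panel is checked against the screen rectangles in which the candidate identification information is displayed. The sketch below assumes hypothetical names (`Rect`, `select_reference`) and pixel coordinates that are not taken from the specification.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int  # left edge on the screen
    y: int  # top edge on the screen
    w: int  # width
    h: int  # height


def select_reference(touch, slots):
    """Return the candidate region whose identification info was touched.

    `touch` is the (x, y) contact position; `slots` is a list of
    (Rect, region_id) pairs, one per displayed candidate identification
    information. Returns None when the touch hits no slot.
    """
    tx, ty = touch
    for rect, region_id in slots:
        if rect.x <= tx < rect.x + rect.w and rect.y <= ty < rect.y + rect.h:
            return region_id
    return None


slots = [(Rect(0, 400, 80, 80), "face_b"), (Rect(80, 400, 80, 80), "face_a")]
print(select_reference((100, 420), slots))  # face_a
print(select_reference((10, 10), slots))    # None
```

Because the rectangles stay at fixed screen positions, the hit test is unaffected by movement of the subject in the video.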

The detection unit generates a detection feature amount indicating a visual feature of the candidate area for each detected candidate area.
The display device according to claim 1.

A registration unit that associates the detection feature amount generated by the detection unit, as a registered feature amount, with the candidate identification information corresponding to the candidate region from which the detection feature amount is generated;
A registration information acquisition unit that matches the detection feature amount against the registered feature amount and identifies the candidate identification information associated with the registered feature amount that matches the detection feature amount;
The combining unit combines the candidate identification information identified by the registration information acquisition unit and the image data to generate the composite image data,
The display device according to claim 20.

The operation unit is adapted to accept input of character information,
The registration unit associates the character information with the registered feature amount;
The display device according to claim 21.
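
Matching a detection feature amount against registered feature amounts, with character information as the associated label, might look like the following. The claims do not specify how a match is decided; treating feature amounts as numeric vectors and using Euclidean distance with a threshold are assumptions of this sketch, as is the name `match_registered`.

```python
import math


def match_registered(detected, registry, threshold=0.5):
    """Return the label of the closest registered feature amount, or None.

    `registry` maps character information (for example a name entered via
    the operation unit) to a registered feature vector. A registered entry
    matches only if its Euclidean distance to `detected` is at most
    `threshold`, an arbitrary value chosen for this sketch.
    """
    best_label, best_dist = None, threshold
    for label, registered in registry.items():
        dist = math.dist(detected, registered)
        if dist <= best_dist:
            best_label, best_dist = label, dist
    return best_label


registry = {"Alice": [0.1, 0.9], "Bob": [0.8, 0.2]}
print(match_registered([0.12, 0.88], registry))  # Alice
print(match_registered([5.0, 5.0], registry))    # None
```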

A display method for displaying a plurality of image data as video,
Detecting at least one candidate region having a specific visual feature from the image data;
Generating at least one candidate identification information for identifying each candidate area detected in the detecting step;
Generating composite image data in which the candidate identification information and the image data are combined;
A display step of displaying the composite image data as the video;
An operation step of receiving input of designation information for designating specific candidate identification information among the candidate identification information appearing in the video;
A selection step of selecting, as a reference area, the candidate area corresponding to the candidate identification information specified via the operation unit based on the designation information input to the operation unit;
A display method comprising:

An integrated circuit for displaying a plurality of image data as video,
A detection unit for detecting at least one candidate region having a specific visual feature from the image data;
A generating unit that generates at least one candidate identification information for identifying each candidate area detected by the detecting unit;
A combining unit that generates composite image data in which the candidate identification information and the image data are combined;
A display unit for displaying the composite image data as the video;
An operation unit for receiving input of designation information for designating specific candidate identification information among the candidate identification information appearing in the video;
Based on the designation information input to the operation unit, a selection unit that selects the candidate region corresponding to the candidate identification information designated through the operation unit as a reference region;
An integrated circuit comprising: