JP2010074400A

JP2010074400A - Device for determining composition, method of determining composition, and program

Info

Publication number: JP2010074400A
Application number: JP2008238213A
Authority: JP
Inventors: Kaoru Hatta; 薫八田; Toshiyuki Katsurada; 敏之桂田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2008-09-17
Filing date: 2008-09-17
Publication date: 2010-04-02
Anticipated expiration: 2028-09-17
Also published as: JP5098917B2

Abstract

PROBLEM TO BE SOLVED: To automatically search for a person as a subject and optimize the composition, obtaining a more appropriate composition according to the attributes of the subject.

SOLUTION: A composition determination device includes subject detection means for detecting subjects existing in the image content of input image data, and attribute detection means for detecting attributes, such as age and sex, of each detected subject. A face direction is also detected for each subject. A composition adapted to a specific relationship among the attributes of the subjects is then determined.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a composition determination apparatus that determines an optimum composition from the image content of image data obtained, for example, by imaging, and to a method thereof. The invention further relates to a program executed by the composition determination apparatus.

Composition setting is one of the technical elements of taking photographs that viewers find appealing. Composition here, also called framing, refers for example to the arrangement of a subject within the image frame of a photograph.
Although there are several general, basic techniques for achieving a good composition, it is by no means easy for an ordinary camera user to take a well-composed photograph without sufficient knowledge of and skill in photography. There is therefore a demand for a technical configuration that makes it quick and easy to obtain photographic images with good composition.

For example, Patent Document 1 discloses an automatic tracking device that detects the difference between images taken at a fixed time interval, calculates the center of gravity of that difference, detects from the amount and direction of movement of this center of gravity the amount and direction of movement of the subject image relative to the imaging screen, and controls the imaging apparatus so that the subject image is set within a reference area of the imaging screen.
Patent Document 2 discloses an automatic tracking device for tracking a person. So that the person's face comes to the center of the screen, tracking is performed with the screen center placed at the position that marks off the top 20% of the area of the whole person image on the screen, which allows the person's face to be captured reliably while tracking.
Viewed from the standpoint of composition determination, these technical configurations make it possible to automatically search for a person as a subject and place that subject in a certain fixed composition on the shooting screen.

Patent Document 1: JP 59-208983 A
Patent Document 2: JP 2001-268425 A

For example, if a person is assumed as the subject, the subject has several elements that can be regarded as attributes, such as gender and age (age, generation). When a better composition is sought, the appropriate composition may differ depending on the attributes of the subject.
However, the techniques of the above patent documents can only place the tracked subject in a fixed composition. They therefore cannot change the composition to suit the situation of the subject when shooting.
The present invention therefore aims to propose a technique that makes it easy to obtain a good composition for an image such as a photograph, and in doing so to determine a more appropriate composition by also adapting to the attributes of the subject.

In view of the above problems, the present invention configures a composition determination apparatus as follows.
That is, the composition determination apparatus includes: subject detection means that receives image data and detects subjects existing in the image content of the image data; attribute detection means that detects a predetermined attribute for each subject detected by the subject detection means; and composition determination means that determines a composition based on a predetermined relationship among the attributes of the subjects detected by the attribute detection means.

In the above configuration, the subjects existing in the image content of the image data are detected, and the attributes of these detected subjects are also detected. A composition that matches a specific relationship among the attributes of the subjects can then be determined.
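The three means described above can be pictured as a small pipeline. The following is a minimal sketch only: the `Subject` fields, the function names, the composition labels, and in particular the `decide_by_age` rule are hypothetical placeholders chosen for illustration, not the method defined in this document.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Subject:
    """One detected subject (all fields are illustrative assumptions)."""
    x: float  # horizontal position in the image frame, 0.0 (left) to 1.0 (right)
    y: float  # vertical position in the image frame, 0.0 (top) to 1.0 (bottom)
    attributes: Dict[str, str] = field(default_factory=dict)


def determine_composition(
    image: Any,
    detect_subjects: Callable[[Any], List[Subject]],
    detect_attributes: Callable[[Any, Subject], Dict[str, str]],
    decide: Callable[[List[Subject]], str],
) -> str:
    """Run the three means in order and return a composition label."""
    subjects = detect_subjects(image)  # subject detection means
    for subject in subjects:
        # attribute detection means, applied to each detected subject
        subject.attributes = detect_attributes(image, subject)
    return decide(subjects)  # composition determination means


def decide_by_age(subjects: List[Subject]) -> str:
    """Placeholder rule (NOT from the patent): favour children if any are present."""
    if not subjects:
        return "no_subject"
    ages = {s.attributes.get("age_group") for s in subjects}
    return "child_priority" if "child" in ages else "default"
```

Passing the detectors and the decision rule as callables keeps the sketch independent of any particular detection algorithm, which the text has not yet specified at this point.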

With the above configuration, the present invention yields a composition that is more appropriate to the attributes of the subject.

Hereinafter, the best mode for carrying out the present invention (hereinafter referred to as the embodiment) will be described in the following order.

1. Configuration of imaging system
2. Basic composition determination processing example
3. Composition determination according to age (first example)
4. Composition determination according to age (second example)
5. Composition determination according to gender
6. Composition determination according to face direction
7. Algorithm
8. Modified example

1. Configuration of imaging system

Before describing the imaging system of the present embodiment, note that the terms composition, image frame, angle of view, and imaging viewing angle are used in the description that follows.
Composition, also called framing, here refers to the arrangement of a subject within the image frame, including the size setting of the subject.
The image frame is the area corresponding to one screen into which an image appears to be fitted, and generally has the outer shape of a vertically or horizontally long rectangle.
The angle of view, also called the zoom angle, expresses as an angle the range that fits within the image frame, as determined by the position of the zoom lens in the optical system of the imaging apparatus. In general it is determined by the focal length of the imaging optical system and the size of the image plane (image sensor or film); here, the element that can change with the focal length is called the angle of view.
The imaging viewing angle concerns the range that fits within the image frame of an image captured by an imaging apparatus placed at a fixed position. It is determined by the above angle of view together with the swing angle in the pan (horizontal) direction and the angle (elevation or depression) in the tilt (vertical) direction.
For example, composition refers to the arrangement of the subjects that fall within the image frame determined by the imaging viewing angle.
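The dependence of the angle of view on focal length and image-plane size can be illustrated with the standard relation for a rectilinear lens focused at infinity, angle = 2·atan(d / 2f). This formula and the function name below are assumptions brought in for illustration; the document itself gives no formula.

```python
import math


def angle_of_view_deg(image_plane_size_mm: float, focal_length_mm: float) -> float:
    """Angle of view along one dimension of the image plane.

    Assumes an ideal rectilinear lens focused at infinity
    (an assumption of this sketch, not stated in the source).
    """
    if image_plane_size_mm <= 0 or focal_length_mm <= 0:
        raise ValueError("sizes must be positive")
    return math.degrees(2.0 * math.atan(image_plane_size_mm / (2.0 * focal_length_mm)))


# Zooming in (longer focal length) narrows the angle of view.
wide = angle_of_view_deg(36.0, 18.0)
tele = angle_of_view_deg(36.0, 72.0)
```

With a fixed image-plane size, only the focal length term varies, which matches the way the text uses "angle of view" for the element that changes with zoom.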

The present embodiment takes as an example a case in which the configuration based on the present invention is applied to an imaging system consisting of a digital still camera and a pan head to which the digital still camera is attached.

FIG. 1 is a front view showing an example of the external configuration of the imaging system according to the present embodiment.
As shown in this figure, the imaging system of the present embodiment consists of a digital still camera 1 and a pan head 10.
The digital still camera 1 generates still image data based on the imaging light obtained through the lens unit 3 provided on the front panel of the main body, and can store the data in a storage medium loaded inside it. That is, it has a function of storing an image taken as a photograph in the storage medium as still image data. To take such a photograph manually, the user presses the shutter (release) button provided on the top surface of the main body.

The digital still camera 1 can be attached to the pan head 10 so as to be fixed in place. That is, the pan head 10 and the digital still camera 1 each have a mechanism part that enables them to be attached to each other.

The pan head 10 includes a pan/tilt mechanism for moving the attached digital still camera 1 in both the pan (horizontal) direction and the tilt direction.
The movements of the digital still camera 1 in the pan and tilt directions given by the pan/tilt mechanism of the pan head 10 are, for example, as shown in FIGS. 2A and 2B. FIGS. 2A and 2B show the digital still camera 1, drawn by itself as it would be attached to the pan head 10, viewed from above and from the side, respectively.
For the pan direction, the reference is the position in which the lateral direction of the main body of the digital still camera 1 is aligned with the straight line X1 shown in FIG. 2A. Rotation along the rotation direction +α about the rotation axis Ct1 gives a panning movement to the right, and rotation along the rotation direction -α gives a panning movement to the left.
For the tilt direction, the reference is the position in which the longitudinal direction of the main body of the digital still camera 1 coincides with the vertical straight line Y1. Rotation in the rotation direction +β about the rotation axis Ct2 gives a downward tilting movement, and rotation in the rotation direction -β gives an upward tilting movement.
The maximum movable rotation angles in the ±α and ±β directions shown in FIGS. 2A and 2B are not specified here. However, if the opportunities to capture a subject are to be maximized, it is preferable to make the maximum movable rotation angles as large as possible.
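The sign convention above (+α right, -α left, +β down, -β up) can be sketched as a small state model. The limit values `MAX_PAN_DEG` and `MAX_TILT_DEG` are hypothetical, since the text deliberately leaves the maximum movable rotation angles unspecified; the class and function names are likewise illustrative.

```python
from dataclasses import dataclass

# Hypothetical limits: the document does not give maximum rotation angles.
MAX_PAN_DEG = 90.0
MAX_TILT_DEG = 30.0


@dataclass(frozen=True)
class PanTiltState:
    pan_deg: float = 0.0   # alpha: positive = right, negative = left
    tilt_deg: float = 0.0  # beta: positive = down, negative = up


def clamp(value: float, limit: float) -> float:
    """Restrict value to the symmetric range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def rotate(state: PanTiltState, d_alpha: float, d_beta: float) -> PanTiltState:
    """Apply a rotation request, stopping at the mechanical limits."""
    return PanTiltState(
        pan_deg=clamp(state.pan_deg + d_alpha, MAX_PAN_DEG),
        tilt_deg=clamp(state.tilt_deg + d_beta, MAX_TILT_DEG),
    )


def describe(state: PanTiltState) -> tuple:
    """Name the current directions relative to the reference position."""
    pan = "right" if state.pan_deg > 0 else "left" if state.pan_deg < 0 else "center"
    tilt = "down" if state.tilt_deg > 0 else "up" if state.tilt_deg < 0 else "level"
    return pan, tilt
```

A larger limit lets `rotate` honor more requests without clamping, which is the sense in which wider movable angles give more chances to capture a subject.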

FIG. 3 shows an example of the internal configuration of the digital still camera 1.
In this figure, the optical system unit 21 includes a predetermined number of imaging lenses, including for example a zoom lens and a focus lens, together with a diaphragm and the like, and forms an image of the incident light, as imaging light, on the light-receiving surface of the image sensor 22.
The optical system unit 21 also includes drive mechanisms for driving the zoom lens, the focus lens, the diaphragm, and the like. The operation of these drive mechanisms is controlled by so-called camera control, such as zoom (angle of view) control, automatic focus adjustment control, and automatic exposure control, executed for example by the control unit 27.

The image sensor 22 performs photoelectric conversion, converting the imaging light obtained by the optical system unit 21 into an electric signal. To do this, the image sensor 22 receives the imaging light from the optical system unit 21 on the light-receiving surface of its photoelectric conversion element and sequentially outputs, at predetermined timing, the signal charges accumulated according to the intensity of the received light. An electric signal (imaging signal) corresponding to the imaging light is thereby output. The photoelectric conversion element (imaging element) used as the image sensor 22 is not particularly limited; at present, examples include a CMOS sensor and a CCD (Charge Coupled Device). When a CMOS sensor is used, the device (component) corresponding to the image sensor 22 may be structured to include an analog-to-digital converter corresponding to the A/D converter 23 described next.

The imaging signal output from the image sensor 22 is input to the A/D converter 23, converted into a digital signal, and input to the signal processing unit 24.
The signal processing unit 24 takes in the digital imaging signal output from the A/D converter 23 in units corresponding, for example, to one still image (frame image). By applying the required signal processing to the imaging signal taken in for each still image, it can generate captured image data (captured still image data), which is image signal data corresponding to one still image.

When the captured image data generated by the signal processing unit 24 as described above is to be recorded as image information on the memory card 40, which is a storage medium, the captured image data corresponding to, for example, one still image is output from the signal processing unit 24 to the encode/decode unit 25.
The encode/decode unit 25 compression-encodes the captured image data output from the signal processing unit 24 in still image units using a predetermined still image compression encoding method, adds a header and the like under the control of, for example, the control unit 27, and thereby converts the data into captured image data compressed in a predetermined format. It then transfers the captured image data generated in this way to the media controller 26. Under the control of the control unit 27, the media controller 26 writes and records the transferred captured image data on the memory card 40. The memory card 40 in this case is a storage medium that has, for example, a card-shaped outer form conforming to a predetermined standard and contains a nonvolatile semiconductor storage element such as a flash memory. The storage medium for storing the image data may be of a type or format other than the memory card.

The signal processing unit 24 of the present embodiment can also execute image processing for subject detection using the captured image data. The subject detection processing of the present embodiment is described later.

The digital still camera 1 can also display a so-called through image, the image currently being captured, by having the display unit 33 display images using the captured image data obtained by the signal processing unit 24. As described above, the signal processing unit 24 takes in the imaging signal output from the A/D converter 23 and generates captured image data corresponding to one still image; by continuing this operation, it sequentially generates captured image data corresponding to the frame images of a moving image. The captured image data generated sequentially in this way is transferred to the display driver 32 under the control of the control unit 27. The through image is thereby displayed.

The display driver 32 generates a drive signal for driving the display unit 33 based on the captured image data input from the signal processing unit 24 as described above, and outputs the drive signal to the display unit 33. As a result, the display unit 33 sequentially displays images based on the captured image data in still image units. To the user, the image being captured at that moment appears on the display unit 33 as a moving image. That is, the through image is displayed.

The digital still camera 1 can also reproduce captured image data recorded on the memory card 40 and display the image on the display unit 33.
To do this, the control unit 27 designates the captured image data and instructs the media controller 26 to read the data from the memory card 40. In response to this command, the media controller 26 accesses the address on the memory card 40 at which the designated captured image data is recorded, reads the data, and transfers the read data to the encode/decode unit 25.

Under the control of, for example, the control unit 27, the encode/decode unit 25 extracts the actual data, in the form of compressed still image data, from the captured image data transferred from the media controller 26, and decodes the compression encoding of this compressed still image data to obtain captured image data corresponding to one still image. It then transfers this captured image data to the display driver 32. As a result, the image of the captured image data recorded on the memory card 40 is reproduced and displayed on the display unit 33.

The display unit 33 can also display a user interface image together with the through image or the reproduced image of captured image data. In this case, the control unit 27 generates display image data for the required user interface image according to, for example, the current operating state, and outputs it to the display driver 32. The user interface image is thereby displayed on the display unit 33. The user interface image can be displayed on the display screen of the display unit 33 separately from the through image or the reproduced image of captured image data, as with a specific menu screen, or it can be superimposed and composited on part of the through image or the reproduced image.

The control unit 27 in practice includes, for example, a CPU (Central Processing Unit), and constitutes a microcomputer together with the ROM 28, the RAM 29, and the like. The ROM 28 stores, for example, the program to be executed by the CPU serving as the control unit 27, as well as various setting information related to the operation of the digital still camera 1. The RAM 29 serves as the main storage device for the CPU.
The flash memory 30 in this case is provided as a nonvolatile storage area used to store various setting information that needs to be changed (rewritten) according to, for example, user operations or the operation history. If a nonvolatile memory such as a flash memory is adopted as the ROM 28, part of the storage area of the ROM 28 may be used in place of the flash memory 30.

The operation unit 31 collectively represents the various operating elements provided on the digital still camera 1 and the operation information signal output part that generates operation information signals according to the operations performed on these operating elements and outputs them to the CPU. The control unit 27 executes predetermined processing according to the operation information signal input from the operation unit 31. The digital still camera 1 thereby operates according to the user's operation.

The pan-head communication unit 34 is the part that performs communication between the pan head 10 side and the digital still camera 1 side according to a predetermined communication method. It has, for example, a physical layer configuration that enables wired or wireless communication signals to be exchanged with the communication unit on the pan head 10 side while the digital still camera 1 is attached to the pan head 10, and a configuration for realizing communication processing corresponding to a predetermined layer above the physical layer.

FIG. 4 is a block diagram showing a configuration example of the pan head 10.
As described above, the pan head 10 includes a pan/tilt mechanism, and as the corresponding parts it includes a pan mechanism unit 53, a pan motor 54, a tilt mechanism unit 56, and a tilt motor 57.
The pan mechanism unit 53 has a mechanism for giving the digital still camera 1 attached to the pan head 10 the movement in the pan (lateral) direction shown in FIG. 2A, and the movement of this mechanism is obtained by rotating the pan motor 54 in the forward and reverse directions. Similarly, the tilt mechanism unit 56 has a mechanism for giving the digital still camera 1 attached to the pan head 10 the movement in the tilt (longitudinal) direction shown in FIG. 2B, and the movement of this mechanism is obtained by rotating the tilt motor 57 in the forward and reverse directions.

The control unit 51 includes a microcomputer formed by combining, for example, a CPU, a ROM, and a RAM, and controls the movement of the pan mechanism unit 53 and the tilt mechanism unit 56. For example, when the control unit 51 controls the movement of the pan mechanism unit 53, it outputs a control signal corresponding to the amount and direction of movement required of the pan mechanism unit 53 to the pan drive unit 55. The pan drive unit 55 generates a motor drive signal corresponding to the input control signal and outputs it to the pan motor 54. This motor drive signal causes the pan motor 54 to rotate, for example, in the required rotation direction and by the required rotation angle, and as a result the pan mechanism unit 53 is driven to move by the corresponding amount and in the corresponding direction.
Similarly, when controlling the movement of the tilt mechanism unit 56, the control unit 51 outputs a control signal corresponding to the amount and direction of movement required of the tilt mechanism unit 56 to the tilt drive unit 58. The tilt drive unit 58 generates a motor drive signal corresponding to the input control signal and outputs it to the tilt motor 57. This motor drive signal causes the tilt motor 57 to rotate, for example, in the required rotation direction and by the required rotation angle, and as a result the tilt mechanism unit 56 is driven to move by the corresponding amount and in the corresponding direction.
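The chain from control unit to drive unit to motor can be sketched as two small conversion steps. Everything concrete here is an assumption made for illustration: the signal types, the function names, and the use of a stepping motor with `STEPS_PER_DEG` steps per degree are not specified in this document.

```python
from dataclasses import dataclass

STEPS_PER_DEG = 10  # hypothetical motor resolution, not given in the source


@dataclass(frozen=True)
class ControlSignal:
    amount_deg: float  # required movement amount (non-negative)
    direction: int     # +1 = forward rotation, -1 = reverse rotation


@dataclass(frozen=True)
class MotorDriveSignal:
    steps: int     # how far the motor should rotate
    forward: bool  # rotation direction of the motor


def control_unit(delta_deg: float) -> ControlSignal:
    """Control unit 51: express the needed movement as amount plus direction."""
    return ControlSignal(amount_deg=abs(delta_deg), direction=1 if delta_deg >= 0 else -1)


def drive_unit(signal: ControlSignal) -> MotorDriveSignal:
    """Drive unit 55 or 58: turn the control signal into a motor drive signal."""
    return MotorDriveSignal(
        steps=round(signal.amount_deg * STEPS_PER_DEG),
        forward=signal.direction > 0,
    )


# The same two steps serve both axes; only the motor that receives the signal differs.
pan_drive = drive_unit(control_unit(12.0))
tilt_drive = drive_unit(control_unit(-2.5))
```

Keeping amount and direction as separate fields mirrors the text, which describes the control signal in terms of a movement amount and a movement direction.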

The communication unit 52 is the part that performs communication according to a predetermined communication method with the pan-head communication unit 34 in the digital still camera 1 attached to the pan head 10. Like the pan-head communication unit 34, it has a physical layer configuration that enables wired or wireless communication signals to be exchanged with the counterpart communication unit, and a configuration for realizing communication processing corresponding to a predetermined layer above the physical layer.

Next, the block diagram of FIG. 5 shows an example of the functional configuration, realized by hardware and software (programs), of the digital still camera 1 and the pan head 10 that make up the imaging system of the present embodiment.
In this figure, the digital still camera 1 includes an imaging/recording block 61, a composition determination block 62, a pan/tilt/zoom control block 63, and a communication control processing block 64.

The imaging/recording block 61 is the part that obtains an image captured by imaging as image signal data (captured image data) and executes control processing for storing this captured image data in a storage medium. It includes, for example, an optical system for imaging, an imaging element (image sensor), a signal processing circuit that generates captured image data from the signal output by the imaging element, and a recording control and processing system for writing and recording (storing) the captured image data in the storage medium.
In this case, the recording of captured image data (imaging and recording) in the imaging/recording block 61 is executed according to the instruction and control of the composition determination block 62.

The composition determination block 62 takes in the captured image data output from the imaging/recording block 61, first performs subject detection based on this captured image data, and finally executes processing for composition determination.
In the present embodiment, at the time of this composition determination, the attributes described later are also detected for each subject found by subject detection. The composition determination processing then uses these detected attributes to determine the composition regarded as optimum. The block also executes composition adjustment control so that captured image data with image content in the determined composition is obtained.
The subject detection processing (including setting of the initial face frame) executed by the composition determination block 62 can be configured, in terms of FIG. 3, to be executed by the signal processing unit 24. This subject detection processing by the signal processing unit 24 can be realized as image signal processing by a DSP (Digital Signal Processor), that is, by a program or instructions given to the DSP.
The face frame correction, composition determination, and composition adjustment control executed by the composition determination block 62 can be realized as processing that the CPU serving as the control unit 27 executes according to a program.

The pan/tilt/zoom control block 63 executes pan/tilt/zoom control in response to instructions from the composition determination block 62 so that the composition and imaging view angle corresponding to the determined optimum composition are obtained. That is, as composition adjustment control, the composition determination block 62 instructs the pan/tilt/zoom control block 63 on the composition and imaging view angle to be obtained according to the determined optimum composition. The pan/tilt/zoom control block 63 computes the amount of movement of the pan/tilt mechanism of the camera platform 10 needed for the digital still camera 1 to face the imaging direction in which the instructed composition and imaging view angle are obtained, and generates a pan/tilt control signal that instructs movement by the computed amount.
It also obtains, for example, the zoom position that yields the determined appropriate angle of view, and controls the zoom mechanism provided in the imaging/recording block 61 so that this zoom position is reached.

The communication control block 64 executes communication with the communication control block 71 provided on the camera platform 10 side according to a predetermined communication protocol. The pan/tilt control signal generated by the pan/tilt/zoom control block 63 is transmitted to the communication control block 71 of the camera platform 10 through this communication.

The camera platform 10 has, for example as illustrated, a communication control block 71 and a pan/tilt control processing block 72.
The communication control block 71 executes communication with the communication control block 64 on the digital still camera 1 side. When it receives the pan/tilt control signal described above, it outputs this signal to the pan/tilt control processing block 72.

The pan/tilt control processing block 72 corresponds to the function that executes pan/tilt control among the control processing executed by, for example, a microcomputer (not shown) on the camera platform 10 side.
The pan/tilt control processing block 72 controls a pan drive mechanism and a tilt drive mechanism (neither shown) according to the input pan/tilt control signal. Panning and tilting are thereby performed to obtain the horizontal and vertical view angles required for the optimum composition.

The composition determination block 62 in this case executes subject detection processing as described later. When no subject is detected as a result, the pan/tilt/zoom control block 63 can perform pan/tilt/zoom control to search for a subject, for example in response to a command.

2. Basic composition determination processing example

This section describes the basic composition determination processing performed by the imaging system of the present embodiment when subject detection finds a plurality of subjects. For ease of explanation, the case in which two subjects are detected is taken as the example.

Here, the subject detection processing executed by the composition determination block 62 refers to processing that discriminates and detects subjects that are people from the image content of the captured image data taken in.
Face detection can be used as a concrete technique for this subject detection. Several face detection methods are known, and the present embodiment places no particular restriction on which one is adopted; a method judged suitable in view of detection accuracy, design difficulty, and so on may be used.
In addition, in the present embodiment, since subject detection is performed by applying face detection as described above, a center of gravity (subject center of gravity) is set for each subject on the basis of the detected subject, that is, the image area of the detected face. As the following description shows, this subject center of gravity is used effectively in the composition determination processing.
There are several conceivable ways to set the subject center of gravity. In the simplest example, a rectangular frame (face frame) is set over the face area of the detected subject, and the intersection of the diagonals of this rectangle is taken as the subject center of gravity.
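The simplest case above, taking the intersection of the face frame's diagonals, can be sketched as follows. The `FaceFrame` layout (left, top, width, height in pixels) and the function name are assumptions made for illustration; the source does not specify how a face frame is represented.

```python
from typing import NamedTuple, Tuple


class FaceFrame(NamedTuple):
    """Axis-aligned face frame in pixel coordinates (hypothetical layout)."""
    left: float
    top: float
    width: float
    height: float


def subject_center_of_gravity(frame: FaceFrame) -> Tuple[float, float]:
    """Return the intersection of the rectangle's diagonals, i.e. its center."""
    return (frame.left + frame.width / 2.0, frame.top + frame.height / 2.0)


# Example: a 60 x 60 face frame whose top-left corner is at (100, 80).
g1 = subject_center_of_gravity(FaceFrame(left=100, top=80, width=60, height=60))
print(g1)  # (130.0, 110.0)
```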

FIG. 6 shows a state in which subject detection has found two subjects SBJ1 and SBJ2 in the image frame 300. The figure also shows the subject centers of gravity G1 and G2 together with the two detected subjects. The subject center of gravity G1 is set for the subject SBJ1, and the subject center of gravity G2 is set for the subject SBJ2.

The image frame 300 in this figure is drawn as square cells arranged in a matrix. Each cell schematically represents a pixel of the frame image data corresponding to the image frame 300; that is, the figure schematically shows that the image frame is formed by an array of pixel data. The number of pixels actually adopted for the image frame (frame image area) is, for example, 640 × 480 (horizontal pixels × vertical pixels).

When a plurality of subjects are detected in the image frame 300 in this way, an overall center of gravity corresponding to all of these subjects (total subject center of gravity) can also be set.
There are also several conceivable ways to set the total subject center of gravity. In FIG. 6, corresponding to the case of two detected subjects, the midpoint of the line segment connecting the subject centers of gravity G1 and G2 is set as the total subject center of gravity Gt.

The total subject center of gravity can also be set by a predetermined algorithm when three or more subjects are detected. For example, a polygon can be formed by the lines connecting the subject centers of gravity of the detected subjects, and the total subject center of gravity can be set based on the position of the centroid of this polygon.
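A minimal sketch of setting the total subject center of gravity follows. For two subjects, the mean of the two centers is exactly the midpoint used in FIG. 6. For three or more, the sketch takes the mean of the polygon's vertices, which is only one possible reading of "centroid of the polygon" (an area centroid would be another) and is an assumption here, as are the function and type names.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def total_subject_center(centers: Sequence[Point]) -> Point:
    """Mean of the per-subject centers of gravity.

    Two subjects: the midpoint of the segment G1-G2.
    Three or more: the vertex centroid of the polygon (assumed simplification).
    """
    if not centers:
        raise ValueError("at least one subject center of gravity is required")
    count = len(centers)
    return (sum(x for x, _ in centers) / count,
            sum(y for _, y in centers) / count)


print(total_subject_center([(100.0, 200.0), (300.0, 240.0)]))  # (200.0, 220.0)
```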

As one example only, when subjects are detected as shown in FIG. 6, the composition determination processing sets the composition with respect to the placement of these subjects as follows. To simplify the explanation, the sizes of the subjects SBJ1 and SBJ2 in the image frame 300 are assumed here to be already appropriate as a composition element.

In FIG. 6, two virtual lines Lx and Ly are set on the image frame 300. The virtual line Lx is a vertical line passing through the midpoint of the horizontal size Cx (corresponding to the number of horizontal pixels) of the image frame 300. The virtual line Ly is the upper of the two horizontal lines that divide the vertical size Cy (corresponding to the number of vertical pixels) of the image frame 300 into three equal parts (Cy/3).

The intersection of the virtual lines Lx and Ly is then set as the target position TP. This target position TP relates to the composition element of subject placement: it is the position (coordinates) in the image frame 300 at which the total subject center of gravity Gt should lie to obtain the subject placement of the optimum composition. In other words, when two subjects are detected in the image frame 300, the composition determination processing in this case judges the optimum composition to be the state in which the total subject center of gravity Gt lies at the intersection of the virtual lines Lx and Ly.
Incidentally, the target position TP shown in this figure is set according to the so-called rule of thirds. This is only an example, however, and the concept or composition technique on which the composition determination processing bases the target position TP in the image frame 300 should not be limited.
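As an illustration of the target position for the side-by-side case, the following sketch computes the intersection of Lx and Ly for a given frame size. It assumes pixel coordinates with the origin at the top-left corner and y increasing downward, so that the "upper" third line is at Cy/3; that convention and the function name are not stated in the source.

```python
from typing import Tuple


def basic_target_position(cx: int, cy: int) -> Tuple[float, float]:
    """Intersection of Lx (vertical center line) and Ly (upper third line).

    Assumes the origin is at the top-left and y grows downward.
    """
    lx = cx / 2.0  # vertical line through the midpoint of the horizontal size Cx
    ly = cy / 3.0  # upper of the two lines dividing the vertical size Cy in three
    return (lx, ly)


print(basic_target_position(640, 480))  # (320.0, 160.0)
```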

FIG. 7 shows the image content obtained when the total subject center of gravity Gt is placed at the target position TP shown in FIG. 6. To get from the image content of FIG. 6 to that of FIG. 7, for example, the pan/tilt/zoom control block 63, on instruction from the composition determination block 62, executes control that drives the pan/tilt mechanism of the camera platform 10 so that the total subject center of gravity Gt comes to the target position TP.
Performing pan/tilt/zoom control so that the determined composition is obtained is also called "composition adjustment control". The composition determination processing and composition adjustment control together are also called "composition control".
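The source does not say how the pan/tilt movement amount is derived from the positions of Gt and TP. One possible approach, sketched below purely as an assumption, converts the pixel positions to viewing angles with a pinhole camera model and takes their difference. The field-of-view parameter, the sign conventions, and the function name are all hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def pan_tilt_offset_deg(gt: Point, tp: Point, cx: int, cy: int,
                        horizontal_fov_deg: float) -> Tuple[float, float]:
    """Approximate (pan, tilt) in degrees that brings Gt onto TP.

    Assumptions (not from the source): pinhole camera, square pixels,
    optical axis at the frame center, origin at the top-left with y growing
    downward, positive pan turns the camera right, positive tilt turns it down.
    Pan and tilt are treated independently, which is only approximate.
    """
    # Focal length in pixels derived from the assumed horizontal field of view.
    focal_px = (cx / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)

    def angle(pixel: float, center: float) -> float:
        # Viewing angle of a pixel coordinate relative to the optical axis.
        return math.degrees(math.atan2(pixel - center, focal_px))

    # Turning the camera toward Gt shifts the image content the opposite way.
    pan = angle(gt[0], cx / 2.0) - angle(tp[0], cx / 2.0)
    tilt = angle(gt[1], cy / 2.0) - angle(tp[1], cy / 2.0)
    return (pan, tilt)


# Gt at the right edge, TP on the center line, 60-degree horizontal view angle.
print(pan_tilt_offset_deg((640.0, 160.0), (320.0, 160.0), 640, 480, 60.0))
```

With Gt already at TP the function returns (0.0, 0.0), which corresponds to the state shown in FIG. 7.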

In FIG. 7, the faces of the subjects SBJ1 and SBJ2 lie almost exactly on the virtual line Ly in the image frame 300. In addition, because the two subjects SBJ1 and SBJ2 are roughly side by side, they are placed in the left-right direction of the image frame 300 without bias to either side, at approximately equal distances from the center (virtual line Lx).
Incidentally, this kind of subject placement is regarded as one of the typical good compositions under, for example, the rule of thirds.

3. Composition determination according to age group (first example)

Two subjects SBJa and SBJc are detected in the image frame 300 shown in FIG. 8, and they are positioned roughly side by side. In this respect the situation is the same as in FIG. 6.
In FIG. 8, however, regarding the age-group attribute of the human subjects, the subject SBJa is an adult and the subject SBJc is a child. That is, in terms of the attributes of the subjects in the image frame, the two stand in an adult-child relationship.

In general, when adults and children appear together as subjects, it is usually the child who matters more. In this case the subjects are not equally important, and the child has the higher importance as a subject. A better composition is then obtained by placing the child closer to the center than by performing composition control according to the basic composition determination of FIGS. 6 and 7, which places the two subjects at approximately equal distances from the center in the left-right direction.

In the present embodiment, therefore, when a plurality of subjects are detected and their age-group attributes include both adults and children, the composition is determined so that the child is placed closer to the center.
For the case of FIG. 8, this composition determination processing is as follows.
As in the basic composition determination processing, the subject centers of gravity G1 and G2 are set for the adult subject SBJa and the child subject SBJc, respectively. The total subject center of gravity Gt is then set on the line segment connecting G1 and G2, but, as illustrated, its position is shifted from the midpoint of the segment toward the center of gravity G2 of the child subject SBJc by a ratio corresponding to a predetermined weighting.

The above ratio sets the amount by which the total subject center of gravity Gt is shifted from the midpoint of the line segment. As FIG. 9, described next, also shows, its effect is to bring (bias) the child's subject center of gravity G2 closer to the target position TP. For this reason, the ratio is also called the bias rate g.
The bias rate g can be defined, for example, as taking values in the range -1 ≤ g ≤ 1. At g = 0, the total subject center of gravity Gt lies at the midpoint of the line segment connecting the subject centers of gravity G1 and G2. As g approaches -1, Gt moves closer to the subject center of gravity G1 of the adult subject SBJa, and at g = -1 it coincides with G1. As g approaches 1, Gt moves closer to the subject center of gravity G2 of the child subject SBJc, and at g = 1 it coincides with G2.
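The behavior of the bias rate g at -1, 0, and 1 can be reproduced with a linear blend of G1 and G2, as in the sketch below. The source fixes only those three reference points, so the linear interpolation in between, like the function name, is an assumption.

```python
from typing import Tuple

Point = Tuple[float, float]


def biased_total_center(g1: Point, g2: Point, bias_rate: float) -> Point:
    """Place Gt on the segment G1-G2 according to the bias rate g.

    g = -1 gives G1 (adult), g = 0 the midpoint, g = 1 gives G2 (child).
    Linear interpolation between these points is an assumed choice.
    """
    if not -1.0 <= bias_rate <= 1.0:
        raise ValueError("bias rate g must satisfy -1 <= g <= 1")
    w2 = (1.0 + bias_rate) / 2.0  # weight of the child's center G2
    w1 = 1.0 - w2                 # weight of the adult's center G1
    return (w1 * g1[0] + w2 * g2[0], w1 * g1[1] + w2 * g2[1])


adult, child = (100.0, 200.0), (300.0, 200.0)
print(biased_total_center(adult, child, 0.0))  # (200.0, 200.0), the midpoint
print(biased_total_center(adult, child, 0.5))  # (250.0, 200.0), nearer the child
```

Because composition adjustment then moves Gt onto the target position TP, a positive g ends up placing the child closer to TP than the adult.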

Also in this case, two subjects are detected and they can be regarded as roughly side by side, which is the same as in FIG. 6. The target position TP is therefore set at the same position as in FIG. 6, according to the same rule.

FIG. 9 shows the result of performing composition adjustment control according to the composition determined as shown in FIG. 8. As in the basic example of FIGS. 6 and 7, the composition adjustment control here performs pan/tilt control so that the total subject center of gravity Gt comes to the target position TP.

As a comparison of FIG. 9 with FIG. 7 shows, in FIG. 9 the child subject SBJc is placed closer to the center in the left-right direction than the adult subject SBJa. That is, a good composition suited to the case where the two subjects are an adult and a child is obtained.

4. Composition determination according to age group (second example)

FIG. 10 shows a state in which two subjects are detected in the image frame 300: in terms of the age-group attribute, one adult subject SBJa and one child subject SBJc. The adult-child relationship is the same as in FIG. 8, but in FIG. 10 the adult subject SBJa and the child subject SBJc can be regarded as lined up in the vertical direction. Here the adult subject SBJa is below and the child subject SBJc is above.

Even when a plurality of subjects are lined up vertically like this, if the age-group attributes include both an adult subject and a child subject, a better composition is obtained by placing the child subject closer to the center in the vertical direction as well, for the same reason as in FIGS. 8 and 9.

For the case of FIG. 10, therefore, composition determination is performed as follows.
Here too, the subject centers of gravity G1 and G2 are set for the adult subject SBJa and the child subject SBJc, respectively, by the same algorithm as in the basic example. Then, as in the case of FIG. 8, the total subject center of gravity Gt is set on the line segment connecting G1 and G2 according to a predetermined bias rate g.
The bias rate g used here need not be the same as in the first example of FIG. 9 (subjects side by side); a value appropriate to the vertical arrangement of the subjects may be used.

When the two subjects are vertically aligned as in FIG. 10, the virtual lines set are two vertical lines Lx1 and Lx2 that divide the horizontal size of the image frame 300 into three equal parts, and two horizontal lines Ly1 and Ly2 that divide the vertical size into three equal parts.

With these virtual lines there are four intersections within the image frame. In the case of FIG. 10, the target position TP is set at the lower-right one, the intersection of the virtual lines Lx2 and Ly2.
In FIG. 10, the child subject SBJc is above and the adult subject SBJa is below. For this positional relationship, the target position TP is first restricted to an intersection on the lower horizontal virtual line Ly2.
There are two intersections on the virtual line Ly2, one with the left virtual line Lx1 and one with the right virtual line Lx2; here the target position TP is set at the right one. As for which of the two to use, one option is simply to fix the choice in advance. Alternatively, the choice between the right and left intersections may be made based on some other specific condition, for example an attribute other than age group.

FIG. 11 shows the result of performing composition adjustment control according to the composition determination described with reference to FIG. 10.
As the figure shows, of the vertically arranged adult subject SBJa and child subject SBJc, the child subject SBJc is placed closer to the center in the vertical direction of the image frame 300. That is, a good composition is obtained in which the child is closer to the center than the adult.

For confirmation, in the positional relationship opposite to FIG. 10, where the adult subject SBJa is above and the child subject SBJc is below, the composition determination processing sets the target position TP at the intersection of the vertical virtual line Lx2 (or Lx1) and the upper horizontal virtual line Ly1. As a result of composition adjustment, a composition in which the child subject SBJc is placed closer to the center is obtained.
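The choice of target position for a vertically aligned adult and child can be summarized in a short sketch. It assumes the origin at the top-left with y increasing downward, so Ly1 is the upper third line and Ly2 the lower one. The boolean parameters and the default of using the right-hand line Lx2 are illustrative; the source leaves the left/right choice open.

```python
from typing import Tuple


def vertical_pair_target(cx: int, cy: int, child_is_upper: bool,
                         use_right_line: bool = True) -> Tuple[float, float]:
    """Target position TP for an adult and a child lined up vertically.

    Child above the adult -> lower horizontal line Ly2 (FIG. 10).
    Child below the adult -> upper horizontal line Ly1.
    Assumes the origin is at the top-left and y grows downward.
    """
    lx1, lx2 = cx / 3.0, 2.0 * cx / 3.0  # vertical third lines
    ly1, ly2 = cy / 3.0, 2.0 * cy / 3.0  # horizontal third lines
    x = lx2 if use_right_line else lx1
    y = ly2 if child_is_upper else ly1
    return (x, y)


print(vertical_pair_target(600, 480, child_is_upper=True))   # (400.0, 320.0)
print(vertical_pair_target(600, 480, child_is_upper=False))  # (400.0, 160.0)
```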

As FIG. 11 also shows, when the adult subject SBJa and the child subject SBJc are arranged one above the other, setting the bias rate g too high and bringing the total subject center of gravity Gt too close to the child's subject center of gravity G2 tends to produce, as a result of composition control, a composition in which the child subject SBJc lies below the center. The adult subject SBJa may also extend out of the image frame beyond the allowable range, resulting in a worse composition instead. For this case it is therefore preferable to set the bias rate g to a smaller value (closer to 0) than in the case of FIG. 8.
In practice, when there are three or more subjects, it can naturally happen that adult subjects are located both beside and above or below a child subject. With this in mind, the bias rate g may be obtained separately for its horizontal component and its vertical component.
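One way to read "separately for the horizontal and vertical components" is to apply an independent bias rate on each axis. The sketch below does this for a single adult/child pair only; the source does not say how the per-axis rates combine for three or more subjects, and the linear blend and all names are assumptions.

```python
from typing import Tuple

Point = Tuple[float, float]


def biased_center_per_axis(g1: Point, g2: Point,
                           bias_x: float, bias_y: float) -> Point:
    """Total subject center of gravity with a separate bias rate per axis.

    g1 is the adult's center, g2 the child's. Each rate lies in [-1, 1];
    0 keeps that axis at the midpoint and 1 moves it fully onto the child.
    """
    def blend(a: float, b: float, g: float) -> float:
        if not -1.0 <= g <= 1.0:
            raise ValueError("bias rate must satisfy -1 <= g <= 1")
        return a + (1.0 + g) / 2.0 * (b - a)

    return (blend(g1[0], g2[0], bias_x), blend(g1[1], g2[1], bias_y))


# Stronger horizontal bias, weaker vertical bias, as suggested for FIG. 10.
gt = biased_center_per_axis((300.0, 400.0), (340.0, 200.0), bias_x=0.6, bias_y=0.2)
print(gt)
```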

Although not illustrated, applying the composition determination processing of the basic example to the case of FIG. 10 gives the following.
The total subject center of gravity Gt is set at the midpoint of the line segment connecting the subject centers of gravity G1 and G2, and the target position TP is set, for example, at the midpoint within the image frame of the vertical virtual line Lx2 (or Lx1). The result is a composition in which the adult subject SBJa and the child subject SBJc are placed at approximately equal distances from the center in the vertical direction.

5. Composition determination according to gender

FIG. 12 shows a state in which two subjects are detected in the image frame 300 and, regarding the gender attribute, the subject on the left is a male subject SBJm and the subject on the right is a female subject SBJw. That is, in terms of the gender attribute, the two form a male-female pair.

One composition element is the size of the subject in the image frame 300. A subject that is neither too small nor too large within the image frame, that is, of appropriate size, is regarded as one condition for a good composition.

Suppose, for example, that the sizes of the subjects SBJm and SBJw at the time of detection shown in FIG. 12 are too small to satisfy the optimum composition. In this case, the composition determination processing sets an enlargement ratio for enlarging the subjects SBJm and SBJw to the proper size. Because this enlargement ratio applies to the image obtained by imaging, it corresponds to the zoom magnification set in the zoom mechanism of the imaging section.

The enlargement ratio (zoom magnification) is set, for example, as follows.
As described earlier, when subject detection is performed using face detection, a frame is set around the face of each detected subject as the detection result. This frame is called the face frame here, and is shown as the face frame FR in FIG. 12. The face frame FR in this case is a rectangle placed over the face portion of the image of the detected subject.

Next, FIG. 13 shows the image content obtained by performing zoom control, as composition adjustment control, according to the enlargement ratio set for the image content shown in FIG. 12.
In the image content shown in FIG. 12, the subjects can be regarded as roughly side by side. In such a case, the distance from the face frame FR of the leftmost subject to the face frame FR of the rightmost subject is treated as the horizontal size K of the subject group as a whole.
The appropriate subject size in this case is defined as the size at which the horizontal size K of the subject group occupies a fixed ratio (occupancy) n (0 < n < 1) of the horizontal size Cx of the image frame 300. That is, the enlargement ratio should be set so that K = Cx · n holds. For example, let K1 be the horizontal size of the subject group at the time of subject detection (FIG. 12), K2 the horizontal size of the subject group after composition control (FIG. 13), and Z the enlargement ratio (treated here as a ratio of lengths). Then K2 = Z · K1 = Cx · n, so the enlargement ratio is obtained as Z = Cx · n / K1.
FIG. 13 shows the result of performing zoom control from the state shown in FIG. 12 with the zoom magnification corresponding to the enlargement ratio Z obtained in this way.
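The relation Z = Cx · n / K1 can be checked with a small numeric sketch. Measuring K1 from the left edge of the leftmost face frame to the right edge of the rightmost one is an assumption (the source says only "from the face frame ... to the face frame"), and the (left, width) tuple layout and the occupancy value 0.5 are hypothetical.

```python
from typing import Sequence, Tuple


def subject_group_width(frames: Sequence[Tuple[float, float]]) -> float:
    """Horizontal size K1 of the subject group.

    Each face frame is given as (left, width) in pixels. K1 is measured
    between the outer edges of the outermost frames (assumed definition).
    """
    left = min(frame_left for frame_left, _ in frames)
    right = max(frame_left + width for frame_left, width in frames)
    return right - left


def enlargement_ratio(cx: float, occupancy: float, k1: float) -> float:
    """Z = Cx * n / K1, so that the zoomed size K2 = Z * K1 equals Cx * n."""
    if not 0.0 < occupancy < 1.0:
        raise ValueError("occupancy must satisfy 0 < n < 1")
    if k1 <= 0.0:
        raise ValueError("K1 must be positive")
    return cx * occupancy / k1


k1 = subject_group_width([(200.0, 40.0), (360.0, 40.0)])  # 200.0 pixels
z = enlargement_ratio(cx=640.0, occupancy=0.5, k1=k1)
print(k1, z)  # 200.0 1.6
```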

The enlargement ratio set for FIG. 13, however, is the one obtained by the normal enlargement ratio setting algorithm for the case where a plurality of subjects are detected.
The enlargement ratio obtained by this normal algorithm is reasonable and yields an appropriate composition when, for example, two subjects are detected and their gender attributes are the same, both male or both female.
When the two detected subjects are a man and a woman, a so-called couple, however, a subject size based on an enlargement ratio larger than the normal one gives the photograph a better atmosphere, that is, a better composition.

Therefore, when the number of detected subjects is two and the sex attributes of these subjects are male and female, respectively, an occupancy m larger than the occupancy n used in the normal case is used when setting the enlargement ratio in the composition determination process.

FIG. 14 shows the image content obtained by performing zoom control with the enlargement ratio set using the occupancy m.
Since the occupancies m and n satisfy m > n, the distance K (= Cx · m) shown in FIG. 14 is larger than the distance K (= Cx · n) in FIG. 13. That is, the male subject SBJm and the female subject SBJw appear larger in the image frame 300 in FIG. 14 than in FIG. 13. The composition is thus more appropriate for subjects that are a couple.

Although illustration is omitted, when the sex attributes of the two detected subjects are male and female, respectively, and the subjects are positioned one above the other in the image frame, the enlargement ratio may likewise be set larger than in the normal case, in accordance with the above description. However, the enlargement ratio (occupancy m) should be set to a value appropriate for the vertical positional relationship, and need not be the same as the enlargement ratio (occupancy m) set when the subjects are positioned horizontally.

6. Composition determination according to face direction

The composition determination block 62 of the present embodiment can be configured so that, in the subject detection process, the face direction of each detected subject can be detected. The face direction here refers to the direction in which the subject's face is detected as facing within the image frame.
For face direction detection, there is a method that outputs the detection result in discrete steps, for example front, left, and right, but here a method that detects the face direction as a vector is adopted. The face direction can be detected as a horizontal component (corresponding to the left, front, and right directions) and a vertical component (corresponding to the up, front, and down directions), but here, to simplify the description, only the horizontal component is assumed to be detected.

For the face direction vector corresponding to the horizontal direction, the coordinate axes shown in FIG. 17 are defined. That is, an X axis and a Y axis orthogonal to each other are defined. The X axis corresponds to the left-right direction, and the Y axis corresponds to the front-rear direction. However, since the face direction is not detected for a head facing backward, only the front side of the Y axis from the intersection is treated as valid.
The face direction is then expressed in these coordinates as a unit vector VT that starts at the intersection of the X axis and the Y axis and points in the direction the face is facing.
As illustrated, the angle of this unit vector VT is taken to be 0° when it coincides with the Y axis, 90° when it coincides with the X axis on the left side, and 270° when it coincides with the X axis on the right side. Depending on the face direction along the left-right direction, it takes an angle between 0° and 90° on the left side and between 0° (equivalently 360°) and 270° on the right side. The angle taken by the unit vector VT is denoted θ.
In this case, the quantity S (face direction vector amount) indicating how far the face is turned along the left-right direction can be expressed as
S = |VT| sin θ
Since VT is a unit vector, S = sin θ, and the face direction vector amount S lies in the range −1 ≤ S ≤ 1. It is 0 when the face is facing the front, approaches 1 as the face turns to the left, and approaches −1 as the face turns to the right.
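As a sanity check on the sign convention, the face direction vector amount can be evaluated for a few angles. This is a minimal sketch under stated assumptions: the function name `face_direction_amount` is hypothetical, angles are given in degrees, and backward-facing angles (strictly between 90° and 270°) are rejected because the description says such heads are not detected.

```python
import math


def face_direction_amount(theta_deg: float) -> float:
    """Return S = sin(theta) for a unit face direction vector.

    theta_deg is 0 for a frontal face, 90 for a face turned fully left,
    and 270 for a face turned fully right.
    """
    theta = theta_deg % 360.0
    if 90.0 < theta < 270.0:
        # Assumption: treat backward-facing directions as invalid input.
        raise ValueError("backward-facing directions are not detected")
    return math.sin(math.radians(theta))


s_front = face_direction_amount(0.0)         # facing the front
s_left = face_direction_amount(90.0)         # turned fully to the left
s_right = face_direction_amount(270.0)       # turned fully to the right
s_half_right = face_direction_amount(330.0)  # turned partly to the right
```

Left-facing angles give positive values and right-facing angles give negative values, matching the stated range −1 ≤ S ≤ 1.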

Regarding composition, when the subject is facing a direction other than the front, a good atmosphere is obtained by leaving a larger space in the image frame on the side the subject's face is facing, that is, the side toward which the subject's line of sight is directed. In other words, a better composition is obtained by placing the subject toward the side of the image frame opposite to the direction it is facing. Furthermore, an even better composition can be obtained by changing the amount by which the subject is shifted (its distance from the center) according to how far the subject is turned.

Therefore, in the present embodiment, the face direction is detected as a subject attribute, and the position of the subject in the image frame is shifted based on the detected face direction.
In addition, the face direction detection result is output as a vector amount, and the amount by which the subject is shifted in the image frame is varied according to this vector amount.

FIG. 15 shows a state in which two subjects SBJ1 and SBJ2 are detected in the image frame 300. Both subjects face the right side in the image frame 300 (screen). As the face direction detection result, a face direction vector amount S1 is detected for the subject SBJ1 and a face direction vector amount S2 is detected for the subject SBJ2.

These subjects SBJ1 and SBJ2 can be regarded as positioned substantially side by side in the image frame 300. In the composition determination process, therefore, subject gravity centers G1 and G2 are first set for the subjects SBJ1 and SBJ2, respectively, and the total subject gravity center Gt is set at the midpoint of the line segment connecting G1 and G2.
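For two subjects, setting the total subject gravity center Gt at the midpoint of G1 and G2 amounts to averaging the coordinates. The sketch below assumes points are (x, y) pixel tuples; the name `total_gravity_center` and the sample coordinates are illustrative and not from the specification.

```python
from typing import Tuple

Point = Tuple[float, float]


def total_gravity_center(g1: Point, g2: Point) -> Point:
    """Return the midpoint of the segment connecting g1 and g2."""
    return ((g1[0] + g2[0]) / 2.0, (g1[1] + g2[1]) / 2.0)


# Illustrative gravity centers of two side-by-side subjects (assumed values).
gt = total_gravity_center((100.0, 200.0), (300.0, 240.0))
```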

If the basic examples of FIGS. 7 and 8 were followed here, the target position TP would be set at the intersection of the virtual lines Lx and Ly.
In this case, however, a movement amount Δx corresponding to the sum of the face direction vector amounts S1 and S2 is obtained using a predetermined function or the like. The position shifted horizontally by the movement amount Δx from the intersection of the virtual lines Lx and Ly is then taken as the target position TP.
The sum S1 + S2 of the face direction vector amounts represents the overall face direction vector amount (S_total) obtained when all detected subjects are regarded together as a single subject. This face direction vector amount S_total of the subject overall can be regarded as indicating the relationship among the individual subjects with respect to the face direction attribute.
In the present embodiment, the amount by which the subjects are shifted in the image frame opposite to the direction the faces are facing, that is, the movement amount Δx, is determined according to this face direction vector amount S_total of the subject overall.

Specifically, in FIG. 15, the subjects SBJ1 and SBJ2 both face the right side in the image frame 300 (screen). That is, the face direction of the subjects as a whole also tends to the right. Consequently, the target position TP set by the movement amount Δx obtained from the face direction vector amounts S1 and S2 is located, as illustrated, to the left of the intersection of the virtual lines Lx and Ly, that is, to the left of the horizontal center. The distance from the intersection of the virtual lines Lx and Ly to the target position TP corresponds to S1 + S2.

As composition adjustment control, pan/tilt control is then performed so that the total subject gravity center Gt moves to the target position TP that has been shifted by the movement amount Δx in this way.

FIG. 16 shows the image content obtained by performing composition adjustment control in accordance with the composition determination result described with reference to FIG. 15.
With the total subject gravity center Gt located at the target position TP in this way, the subjects SBJ1 and SBJ2 are placed toward the left as a whole in the image frame 300. In other words, the composition leaves a larger space on the right side of the image frame 300, the side the subjects as a whole are regarded as facing, than on the left side. Moreover, the size of the space left on the right side, that is, the amount by which the subjects SBJ1 and SBJ2 are offset to the left in the image frame, corresponds to the degree (vector amount) to which the faces of the subjects as a whole are turned.
That is, FIG. 16 shows a good composition suited both to the direction in which the subjects are facing in the screen and to the degree to which their faces are turned.

As the simplest example of obtaining the movement amount Δx, a fixed number of pixels px is assigned to each unit of face direction vector amount, and, with S_total as the face direction vector amount of the subject overall, Δx is calculated as Δx = S_total · px.
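The simplest rule, Δx = S_total · px, can be sketched as follows. Several points here are assumptions rather than statements from the specification: the x coordinate is taken to increase toward the right of the image frame, the function names are hypothetical, and the values (100 pixels per unit, a basic target position at (320, 240)) are illustrative. Under that coordinate assumption, right-facing subjects (negative S) move the target position to the left, which matches the behavior described for FIG. 15.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def movement_amount(face_amounts: Iterable[float], px_per_unit: float) -> float:
    """Return dx = S_total * px, where S_total is the sum of S1..Sn."""
    s_total = sum(face_amounts)
    return s_total * px_per_unit


def shifted_target(base_tp: Point, face_amounts: Iterable[float],
                   px_per_unit: float) -> Point:
    """Shift the basic target position horizontally by dx.

    Assumption: x grows to the right, so a negative dx moves TP left.
    """
    dx = movement_amount(face_amounts, px_per_unit)
    return (base_tp[0] + dx, base_tp[1])


# Two subjects both turned to the right (negative S), illustrative values.
dx = movement_amount([-0.5, -0.25], px_per_unit=100.0)
tp = shifted_target((320.0, 240.0), [-0.5, -0.25], px_per_unit=100.0)
```

If every subject faces the front, each S is 0 and the target position stays at the basic position.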

As noted earlier, in the present embodiment the face direction can be detected not only in the horizontal direction but also in the vertical direction. That is, a face direction vector amount for the vertical direction, indicating how far the face is turned upward or downward, can be obtained.
Following the description of FIGS. 15 and 16, it is then also possible to obtain, from the vertical face direction vector amount, a movement amount Δy of the target position TP in the vertical direction, and to move the target position TP accordingly.

7. Algorithm

The flowchart of FIG. 18 shows an example of a processing procedure for composition control according to the present embodiment. The processing shown in this figure can be regarded as being executed as appropriate by the functional parts of the digital still camera 1 shown in FIG. 5. The processing executed by each of these functional parts can in turn be regarded as control and processing procedures realized by the control unit (CPU) 27 of FIG. 1 executing a program.

In FIG. 18, first, in step S101, the composition determination block 62 (signal processing unit 24) starts capturing (inputting) the captured image data being obtained at that time by the imaging recording block 61.

In the next step S102, the composition determination block 62 (signal processing unit 24) executes subject detection processing using the captured image data that has been taken in.
In this subject detection processing, face detection technology is applied as described above, for example, and as the detection result a face frame FR is set for each detected subject in correspondence with the region of the image portion of its face. Basic information about the subjects, such as the number of subjects and the subject size and position in the image frame at the time of detection, can be obtained from the number, size, and position of the face frames FR. Once the face frames FR are set, the subject gravity centers G1, G2, ... of the individual subjects and the total subject gravity center Gt are also obtained at this stage.
Several face detection methods and techniques are known, but the present embodiment is not limited to any particular one; a method judged suitable in view of detection accuracy, design difficulty, and the like may be adopted.

In step S103, it is determined whether at least one subject has been detected by the subject detection processing in step S102. If a negative determination result is obtained here, the process returns to step S102 and subject detection processing for subject search is executed. Subject search here means that the digital still camera 1 controls movement of the camera platform 10 in the pan and tilt directions and also performs zoom control to change the imaging viewing angle, so as to reach a state in which captured image data containing a subject is obtained.

If a positive determination result is obtained in step S103 because a subject has been detected, the process proceeds to step S104.
In step S104, processing for detecting the attributes required for composition determination is executed for each subject detected in step S102.

The description so far has covered composition control based on each of the subject attributes of age, sex, and face direction. In the algorithm described here, composition determination processing and composition control are executed using all of these attributes together. That is, this algorithm aims to obtain an appropriate composition that reflects all of the age, sex, and face direction attributes.
For this purpose, in step S104, the age (adult or child), sex (male or female), and face direction are detected as attributes of each subject detected in step S102.

In practice, this attribute detection processing may also be executed by, for example, the signal processing unit 24 serving as a DSP.
For the detection of age, sex, and face direction, known techniques and algorithms may be applied.

When the processing of step S104 is finished, the following subject information has been obtained for each detected subject: information on the face frame FR (position, size, and so on), the subject gravity center and the total subject gravity center, and information indicating the age, sex, and face direction detected as attributes.
In step S105, composition determination processing is executed using this subject information, as processing of the composition determination block 62 executed by the control unit 27.
This composition determination processing determines the correction of the total subject gravity center Gt, the setting of the target position TP, the zoom magnification (enlargement ratio of the subject size), and so on. The information on the composition determination result obtained in step S105 is passed to, for example, the pan/tilt/zoom control block 63.

In step S106, the pan/tilt/zoom control block 63 executes pan/tilt/zoom control so as to obtain the imaging viewing angle corresponding to the composition determination result. That is, composition adjustment control is executed.

After the composition adjustment control of step S106 has started, the composition determination block 62 determines in step S107 whether the composition actually obtained in the image of the captured image data at that time has reached a state that can be regarded as the same as the composition determined in step S105 (for example, a state with at least a certain degree of similarity), that is, whether the composition is OK.

If, for some reason, the composition does not become OK even though pan/tilt/zoom driving has been performed by the amount of movement required for composition adjustment, a negative determination result is obtained in step S107. In this case, the process returns to step S102 to resume the subject search processing.
In contrast, if a determination result that the composition is OK is obtained in step S107, the process proceeds to step S108.

In step S108, with the determined composition obtained for the image content of the captured image data, the process waits for the timing at which imaging and recording should be performed.
For example, the digital still camera 1 of the present embodiment can detect at least a smile as the facial expression of a detected subject. Suppose that a mode in which imaging and recording are to be performed at the timing when a detected subject is found to be smiling has been set in advance, for example by a user operation. In step S108, in accordance with such a smile shooting mode, it is determined whether the timing for imaging and recording has been reached. That is, it is determined whether the facial expression of the subject detected in the currently obtained captured image data is a smile.

In step S109, it is determined whether the imaging and recording timing has become OK.
For example, suppose that, during the recording timing determination period executed as step S108, the facial expression of the detected subject is detected to have become a smile. Then a positive determination result is obtained in step S109, and the process proceeds to step S110. In contrast, if no smile is detected for the detected subject even after the recording timing determination period has elapsed, a negative determination result is obtained in step S109. In this case, the process returns to step S102, and subject detection processing accompanied by subject search is executed.

In step S110, the control unit 27, for example, instructs the imaging recording block 61 to perform imaging and recording. In response, an operation is executed that records the captured image data obtained at that time to the memory card 40 as a still image file.
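The flow of steps S101 to S110 can be summarized as a loop that returns to subject detection whenever a check fails. The following is a sketch only, not the embodiment's implementation: every callable passed in (`detect_subjects`, `search`, `composition_ok`, and so on) is a hypothetical stand-in for the corresponding functional block, and the `max_rounds` limit is an assumption added so that the sketch terminates, whereas the described procedure simply returns to step S102.

```python
from typing import Any, Callable, List


def run_composition_control(
    detect_subjects: Callable[[], List[Any]],
    detect_attributes: Callable[[List[Any]], Any],
    determine_composition: Callable[[List[Any], Any], Any],
    adjust: Callable[[Any], None],
    composition_ok: Callable[[Any], bool],
    wait_for_smile: Callable[[], bool],
    record: Callable[[], None],
    search: Callable[[], None],
    max_rounds: int = 10,
) -> bool:
    """Return True once an image has been recorded, False if rounds run out."""
    for _ in range(max_rounds):
        subjects = detect_subjects()            # S101-S102: input and detection
        if not subjects:                        # S103: no subject found
            search()                            # pan/tilt/zoom subject search
            continue
        attributes = detect_attributes(subjects)                    # S104
        composition = determine_composition(subjects, attributes)   # S105
        adjust(composition)                     # S106: composition adjustment
        if not composition_ok(composition):     # S107: composition not reached
            continue
        if not wait_for_smile():                # S108-S109: timing not reached
            continue
        record()                                # S110: imaging and recording
        return True
    return False
```

Passing the blocks in as callables keeps the control flow separate from detection and drive details, which the description leaves open.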

The flowchart of FIG. 19 shows an example of the procedure for the composition determination processing of step S105 in FIG. 18.
First, in step S201, the composition determination block 62 sets the variable n, which indicates the number assigned to a detected subject, to an initial value of 1.
Then, in step S202, the subject gravity center Gn of the n-th subject is calculated. As mentioned earlier, in the simplest example this can be obtained as the intersection of the diagonals of the face frame set at the time of subject detection.

In step S203, it is determined whether the variable n has reached its maximum value.
If a negative determination result is obtained here, subjects whose gravity centers have not yet been obtained remain. The variable n is therefore incremented in step S204 and the process returns to step S202, so that the subject gravity centers Gn of the remaining subjects are obtained.
When the subject gravity centers G1 to Gn have been obtained for all detected subjects and a positive determination result is therefore obtained in step S203, the process proceeds to step S205.
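The loop of steps S201 to S204 can be sketched as below. The face frame representation (left, top, width, height) and the function names are assumptions made for illustration; the only point taken from the description is that, in the simplest case, the gravity center is the intersection of the face frame's diagonals, which for a rectangle is its center.

```python
from typing import List, Tuple

Frame = Tuple[float, float, float, float]  # (left, top, width, height), assumed
Point = Tuple[float, float]


def face_frame_center(frame: Frame) -> Point:
    """Return the intersection of the diagonals of a rectangular face frame."""
    left, top, width, height = frame
    return (left + width / 2.0, top + height / 2.0)


def subject_gravity_centers(face_frames: List[Frame]) -> List[Point]:
    """Compute G1..Gn following the S201-S204 loop structure."""
    centers: List[Point] = []
    if not face_frames:
        return centers
    n = 1                                                     # S201
    while True:
        centers.append(face_frame_center(face_frames[n - 1]))  # S202
        if n == len(face_frames):                             # S203: n at max?
            break
        n += 1                                                # S204
    return centers


centers = subject_gravity_centers([(100.0, 50.0, 40.0, 40.0),
                                   (300.0, 60.0, 60.0, 60.0)])
```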

In step S205, the age attribute in the attribute detection result of step S104 in FIG. 18 is referred to in order to determine whether the detected subjects include both adults and children.
If a negative determination result is obtained because the detected subjects are adults only or children only, the process proceeds to step S206.
In step S206, the total subject gravity center Gt is obtained by calculation with a normal function that does not take the deviation rate g into account. For example, when there are two subjects, the total subject gravity center Gt is obtained as the midpoint of the line segment connecting the subject gravity centers G1 and G2.

In contrast, if a positive determination result is obtained in step S205, the procedure of steps S207 and S208 is executed.
In step S207, the deviation rate g is set.
As described earlier, the deviation rate g is a parameter used, when child and adult subjects are mixed, to bring the normal position of the total subject gravity center Gt closer to the subject gravity center of the child subject, with the aim of placing the child subject nearer the center in the horizontal and vertical directions.
Also as described earlier, different values of the deviation rate g can be set for the horizontal component and for the vertical component, depending on the positional relationship between the child subject and the adult subject.
In step S208, the total subject gravity center Gt is then obtained using this deviation rate g.
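This passage does not give a formula for applying the deviation rate g. One possible reading, shown purely as an assumption, is a linear interpolation that moves the normal total subject gravity center Gt toward the child's gravity center by a fraction g, with separate horizontal and vertical rates. The function name and the numeric values are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def biased_gravity_center(gt: Point, g_child: Point,
                          rate: Tuple[float, float]) -> Point:
    """Move gt toward g_child by the fractions (gx, gy).

    Assumption: a rate of 0 leaves gt unchanged and a rate of 1 places it
    on the child's gravity center. The specification may define g differently.
    """
    gx, gy = rate
    return (gt[0] + gx * (g_child[0] - gt[0]),
            gt[1] + gy * (g_child[1] - gt[1]))


# Normal Gt at (200, 220), child at (300, 240); shift only horizontally.
gt_biased = biased_gravity_center((200.0, 220.0), (300.0, 240.0), (0.5, 0.0))
```

Because pan/tilt control later moves Gt to the target position, shifting Gt toward the child places the child nearer that position.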

In step S209, the basic target position TP is set. That is, the coordinates of the target position TP in the image frame are determined. The basic target position TP here is the target position TP whose position in the image frame is determined by the basic target position setting algorithm, as described with reference to FIGS. 6 and 10 and elsewhere, before it is moved by the movement amount Δx corresponding to the face direction vectors. As described earlier with reference to FIGS. 6 to 11 and elsewhere, this basic target position TP is determined based on the number of subjects, their positional relationship, and so on.

In step S210, the movement amount Δx for the target position TP is calculated from the face direction vector amounts S1 to Sn obtained for the individual subjects. Then, in step S211, processing for moving the coordinates of the target position TP according to the calculated movement amount Δx is executed.
Note, for confirmation, that if all detected subjects are facing the front, the movement amount Δx obtained is 0.

In step S212, it is determined whether the detected subjects are a couple consisting of a man and a woman. That is, it is determined whether the number of detected subjects is two and the sex attributes are a combination of male and female.
If a negative determination result is obtained in step S212, the process proceeds to step S213, and the normal enlargement ratio Z to be applied to subjects other than a couple is set.
In contrast, if a positive determination result is obtained in step S212, the process proceeds to step S214, and the enlargement ratio Z for couple subjects is set. As described earlier, the enlargement ratio set here has a larger value than the normal enlargement ratio set in step S213.
Note, for confirmation, that the enlargement ratio Z corresponds to the zoom magnification used for zoom control in composition adjustment control.
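The branch of steps S212 to S214 can be sketched by selecting the occupancy before computing Z = Cx · occupancy / K1. The string labels "male" and "female", the function names, and the occupancy values 0.5 and 0.75 are illustrative assumptions; the specification only requires m > n.

```python
from typing import Sequence


def is_couple(sexes: Sequence[str]) -> bool:
    """S212: exactly two subjects, one male and one female."""
    return len(sexes) == 2 and sorted(sexes) == ["female", "male"]


def select_enlargement_ratio(sexes: Sequence[str], k1: float, cx: float,
                             n: float, m: float) -> float:
    """Return Z using occupancy m for a couple (S214), else n (S213)."""
    if not 0.0 < n < m:
        raise ValueError("occupancies must satisfy 0 < n < m")
    occupancy = m if is_couple(sexes) else n
    return cx * occupancy / k1


z_couple = select_enlargement_ratio(["male", "female"], 160.0, 640.0, 0.5, 0.75)
z_normal = select_enlargement_ratio(["female", "female"], 160.0, 640.0, 0.5, 0.75)
```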

In step S215, the composition determination block 62 instructs the pan/tilt/zoom control block 63 to perform composition adjustment control. At this time, the composition determination block 62 outputs to the pan/tilt/zoom control block 63, as parameters for composition adjustment control, the total subject gravity center Gt obtained in step S206 or step S208, the coordinates of the target position TP obtained in step S211, and the enlargement ratio Z obtained in step S213 or step S214.
In response to this instruction from the composition determination block 62, the pan/tilt/zoom control block 63 uses these parameters to execute the composition adjustment control (pan/tilt/zoom control) of step S106 in FIG. 18.

According to the processing procedure of FIG. 19, three kinds of composition control are combined: composition control that brings a child subject closer to the center when adults and children are mixed as subjects, composition control that shifts the subjects toward the appropriate side of the image frame according to their face direction, and composition control that enlarges the subjects to a larger size than normal when the subjects are a couple.
For example, if two subjects, an adult and a child, are detected and both are facing right, a composition can be obtained in which the subjects are shifted to the left side of the image frame while the child subject is placed closer to the center than it would be if the subjects were adults only (or children only).

However, as one easy-to-understand example, suppose that the subjects are a male-female couple and both are facing considerably to the right. Based on the face direction vectors, the target position TP is then set considerably to the left in the image frame. In other words, the subjects are placed at a position well toward the left side.
In this case, if the zoom magnification Z that satisfies the condition K = Cx·m is set as described with reference to FIG. 14 and the subject size is enlarged according to this zoom magnification Z as it is, the faces of the subjects may protrude from the image frame. The result would then be a rather poor composition.

Therefore, in the present embodiment, with regard to composition control according to attributes, the composition control for the case where adults and children are mixed and the composition control according to face direction are given priority over the composition control related to subject size (zoom magnification). The enlargement ratio Z set for couple subjects is then adjusted so that, within a range whose maximum is the enlargement ratio satisfying the condition K = Cx·m, it takes the largest value for which the faces of the subjects are regarded as properly contained within the image frame.

The flowchart of FIG. 20 shows an example procedure in which the above adjustment of the zoom magnification Z is performed as the process of setting the zoom magnification Z for couple subjects in step S214 of FIG. 19.

In step S301 of FIG. 20, first, the enlargement ratio Z that satisfies the condition K = Cx·m described with reference to FIG. 14 is calculated. The enlargement ratio Z calculated here is the value that should originally be set for couple subjects, and it is the maximum value in the range of enlargement ratios Z that can be set for couple subjects.

In step S302, the arrangement of the subjects' face frames under the composition obtained with the currently set enlargement ratio (current target zoom ratio) Z is calculated.
That is, the positions and sizes of the face frames are obtained for the case where composition control is performed using, as parameters, the total subject gravity center Gt obtained in step S206 or step S208 of FIG. 19, the coordinates of the target position TP obtained in step S211, and the current enlargement ratio Z.

Here, in the present embodiment, as long as a subject's face frame does not protrude from the image frame, a composition in which the subject's face appears within the image frame is considered to be secured. Therefore, in step S303, it is determined from the position and size information of the subjects' face frames obtained in step S302 whether or not any face frame protrudes from the image frame.

If a positive determination result is obtained here, the current enlargement ratio Z is too large. In this case, the process proceeds to step S304, where the enlargement ratio Z is multiplied by a reduction coefficient dec (0 < dec < 1), a predetermined fixed value, to obtain a new current enlargement ratio Z. The process then returns to step S302.
By repeating this processing, a negative determination result is eventually obtained in step S303 at the point where the largest enlargement ratio Z at which the face frames do not protrude from the image frame has been set, and the process is exited. The enlargement ratio Z at this time becomes the enlargement ratio Z set in step S214.
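The loop of FIG. 20 (steps S301 to S304) can be sketched as below. Several points are assumptions made only for illustration: face frames are `(x, y, w, h)` rectangles with a top-left origin, composition control is modeled as scaling every face frame about Gt by Z and then moving Gt onto TP, and the lower bound `z_min` is added so that the loop always terminates. None of the function names come from the source.

```python
def protrudes(face, frame_w, frame_h):
    """Step S303 check for one face frame (x, y, w, h)."""
    x, y, w, h = face
    return x < 0 or y < 0 or x + w > frame_w or y + h > frame_h


def place_face(face, gt, tp, z):
    """Step S302 (assumed model): scale about Gt by z, then move Gt onto TP."""
    x, y, w, h = face
    gx, gy = gt
    tx, ty = tp
    return (tx + (x - gx) * z, ty + (y - gy) * z, w * z, h * z)


def adjust_enlargement_ratio(faces, gt, tp, z_max, frame_w, frame_h,
                             dec=0.9, z_min=1.0):
    """Start from the ratio of step S301 and shrink it until all faces fit."""
    z = z_max                                   # S301: upper limit of the range
    while z > z_min:
        placed = [place_face(f, gt, tp, z) for f in faces]            # S302
        if not any(protrudes(p, frame_w, frame_h) for p in placed):   # S303
            return z                            # no face frame protrudes
        z *= dec                                # S304: 0 < dec < 1
    return z_min                                # assumed fallback, not in the source


# Hypothetical 640x480 frame with one face and a target far to the left.
z = adjust_enlargement_ratio([(300, 200, 40, 40)], gt=(320, 220), tp=(60, 240),
                             z_max=4.0, frame_w=640, frame_h=480)
print(z)
```

Because Z is reduced in fixed multiplicative steps, the result is the largest of the candidate values `z_max * dec**k` that fits, not the exact geometric maximum; a smaller step (dec closer to 1) trades more iterations for a tighter result.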

Note that, depending on the target position TP set according to the face direction, the possibility that a subject's face protrudes from the image frame as a result of composition adjustment control also applies to the normal enlargement ratio Z set in step S213.
As a countermeasure, processing equivalent to that of FIG. 20 may also be applied in step S213 of FIG. 19. In that case, the enlargement ratio Z that satisfies the condition K = Cx·n described with reference to FIG. 13 is calculated in step S301. The reduction coefficient dec in this case need not be the same as in step S214, and a value suited to the normal enlargement ratio setting may be used.

Depending on the design approach, it is also possible, contrary to the above, to give the composition control of subject size (enlargement ratio) priority over the setting of the target position TP according to face direction (the composition control that shifts the subjects toward one side of the image frame).
In this case, the enlargement ratio Z that satisfies the condition K = Cx·n or K = Cx·m is set first. Then, taking the movement amount Δx obtained from the face direction vector amounts S1 to Sn of the subjects as the maximum value, an appropriate movement amount Δx at which the subjects' face frames do not protrude is obtained within this range.
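When size takes priority, the horizontal shift can be limited instead of the zoom. The sketch below is one possible way to do this, not a procedure given in the source: it assumes a frame centered at x = 0 with half-width `half_width`, face-frame extents `(left, right)` already scaled by the fixed Z and measured relative to Gt, and it clamps directly rather than searching iteratively. All names are hypothetical.

```python
def limit_shift(dx_max, extents, half_width):
    """Return the shift closest to dx_max, between 0 and dx_max, that keeps
    every face frame inside [-half_width, +half_width]; None if none exists."""
    lo = -half_width - min(left for left, _ in extents)   # left edges stay inside
    hi = half_width - max(right for _, right in extents)  # right edges stay inside
    # The shift may not exceed the one derived from the face direction vectors.
    lo = max(lo, min(0.0, dx_max))
    hi = min(hi, max(0.0, dx_max))
    if lo > hi:
        return None  # faces do not fit at this zoom; handling is left open
    return lo if dx_max < 0 else hi


# Hypothetical values: the face spans 60 px either side of Gt, frame is 640 px wide.
print(limit_shift(-300.0, [(-60.0, 60.0)], 320.0))
```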

8. Modified examples

Next, modified examples of the imaging system of the present embodiment will be described.
First, in the imaging system shown in FIG. 21, the digital still camera 1 transmits the captured image data obtained by the imaging recording block 61 from the communication control processing block 64 to the communication control processing block 71 on the camera platform 10 side.

In FIG. 21, a communication control processing block 71, a pan/tilt control processing block 72, and a composition determination block 73 are shown as the configuration of the camera platform 10.
The captured image data received by the communication control processing block 71 is output to the composition determination block 73. The configuration of the composition determination block 62 shown earlier in FIG. 5, for example, is applied to this composition determination block 73. That is, subject detection, attribute detection, and composition determination processing are executed based on the input captured image data. In this case, for example, the movement amounts of the pan mechanism unit and the tilt mechanism unit needed to reach the imaging direction (imaging viewing angle) in which the determined optimum composition is obtained are calculated, and a pan/tilt control signal specifying these movement amounts is output to the pan/tilt control processing block 72. Panning and tilting are thereby performed so that the optimum composition determined by the composition determination block 73 is obtained.
In this way, the imaging system shown in FIG. 21 is configured so that captured image data is transmitted from the digital still camera 1 to the camera platform 10, and the camera platform 10 side executes composition determination based on the received captured image data and the corresponding pan/tilt control.
In the configuration shown in FIG. 21, to enable zoom (angle of view) control as part of the imaging viewing angle control, communication between the communication control processing blocks 64 and 71 may be used to instruct the imaging recording block 61 of the angle of view corresponding to the optimum composition determined by the composition determination block 73 of the camera platform 10, with the imaging recording block 61 driving the zoom lens so that the instructed angle of view is obtained.

FIG. 22 shows a configuration example as another modified example of the imaging system corresponding to the present embodiment. In this figure, the same parts as those in FIG. 21 are denoted by the same reference numerals, and their description is omitted.
In this system, an imaging recording block 75 is provided on the camera platform 10 side. Like the imaging recording block 61 of FIG. 5, for example, this imaging recording block 75 includes an optical system for imaging and an imaging element (image sensor) so as to obtain a signal based on imaging light (imaging signal), together with a signal processing unit for generating captured image data from this imaging signal and a recording control system for the captured image data.
The captured image data generated by the imaging recording block 75 is output to the composition determination block 73.
Note that the direction in which the imaging recording block 75 takes in imaging light (imaging direction) is preferably set to match, as closely as possible, the imaging direction of the digital still camera 1 mounted on the camera platform 10. That is, the imaging recording block 75 is provided on the camera platform 10 so that the image it captures is as close as possible to the image captured by the imaging recording block 61 on the digital still camera 1 side.

In this case, the composition determination block 73 and the pan/tilt control processing block 72 execute composition determination and drive control of the pan/tilt mechanism according to the composition determination result in the same manner as in FIG. 21.
However, at the timing at which the digital still camera 1 is to execute imaging recording, the composition determination block 73 in this case causes an instruction signal instructing execution of imaging recording to be transmitted to the digital still camera 1 via the communication control processing block 71. On receiving this instruction signal, the digital still camera 1 executes imaging recording of the captured image data being obtained by the imaging recording block 61 at that time.
In this way, in this other modified example, all control and processing for composition determination and composition acquisition control, other than the imaging recording operation itself, can be completed on the camera platform 10 side.

In the above description, pan control and tilt control are performed by controlling the movement of the pan/tilt mechanism of the camera platform 10. Instead of the camera platform 10, however, it is also conceivable to adopt a configuration in which, for example, imaging light reflected by a reflecting mirror is made incident on the optical system unit (21) of the digital still camera 1, and the reflected light is moved so that the image obtained from the imaging light shows the result of panning and tilting.
A result equivalent to panning and tilting can also be obtained by control that shifts, in the horizontal and vertical directions, the pixel region from which an imaging signal valid as an image is taken from the imaging element (image sensor 22) of the digital still camera 1. In this case, there is no need to prepare the camera platform 10 or any equivalent pan/tilt device other than the digital still camera 1, and the digital still camera 1 alone can complete the operation corresponding to the composition acquisition control of the present embodiment.
Angle-of-view control (zoom control) can likewise be realized, instead of by driving the zoom lens, by executing image processing that cuts out a partial image region from the captured image data.
Panning and tilting can also be performed by providing a mechanism that can change the optical axis of the lens in the optical system unit of the digital still camera 1 in the horizontal and vertical directions and controlling the movement of this mechanism.
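The pixel-region shift and the cut-out zoom described above both reduce to choosing a rectangular window on the sensor image. A minimal sketch follows, assuming a hypothetical sensor resolution and a window that is clamped so it never leaves the sensor area; the function name and parameters are illustrative only.

```python
def readout_window(sensor_w, sensor_h, out_w, out_h, pan_px=0, tilt_px=0):
    """Return (left, top, right, bottom) of the pixel region to read out.

    A smaller out_w/out_h acts as zoom by cropping; pan_px/tilt_px shift the
    window from the center, emulating panning and tilting.
    """
    left = (sensor_w - out_w) // 2 + pan_px
    top = (sensor_h - out_h) // 2 + tilt_px
    # Clamp so the window stays on the sensor.
    left = max(0, min(left, sensor_w - out_w))
    top = max(0, min(top, sensor_h - out_h))
    return left, top, left + out_w, top + out_h


# Hypothetical 4000x3000 sensor, 2x crop zoom, shifted 300 px right and 100 px up.
print(readout_window(4000, 3000, 2000, 1500, pan_px=300, tilt_px=-100))
```

The clamping shows the main limitation of this approach: the available pan/tilt range shrinks as the window grows, and vanishes when the full sensor area is read out.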

In the description so far, the imaging system has the digital still camera 1 attached to the camera platform 10. However, the composition determination of the present embodiment, and the imaging recording based on the composition determination result, can also be realized by the digital still camera 1 alone, without the camera platform.
That is, even when the digital still camera 1 of the present embodiment is simply placed in a fixed position, composition determination is performed on the images captured there, and automatic imaging recording is executed according to the determination result. Even this way of using the digital still camera 1 is sufficiently useful depending on the situation.

Next, examples in which the basic configuration for composition determination of the present embodiment is applied to systems other than the above imaging system will be described.
First, in FIG. 23, the configuration for composition determination according to the embodiment is applied to a stand-alone imaging apparatus such as a digital still camera. When, for example, the image being captured by the imaging apparatus in the imaging mode takes on an appropriate composition according to the determination result, this imaging apparatus notifies the user of the fact by means of a display.
As the configuration the imaging apparatus should have for this purpose, a composition determination block 81, a notification control processing block 82, and a display unit 83 are shown here. The composition determination block 81 has a configuration equivalent to that of the composition determination block 62 shown in FIG. 5.
For example, assume that the user has set the imaging apparatus to the imaging mode and is holding it in hand, so that imaging recording can be performed at any time by a release operation (shutter button operation).
In this state, the composition determination block 81 first takes in the captured image data obtained at that time, executes the series of composition determination processes, and finally determines the optimum composition.
In addition, the composition determination block 81 in this case obtains the degree of match, or similarity, between the composition of the captured image data actually being obtained at that time and the determined optimum composition. Then, for example, when the similarity reaches or exceeds a certain level, it is determined that the image content of the captured image data actually being obtained has reached the optimum composition. In practice, the algorithm may be configured to judge the composition to be optimum once a similarity at or above a predetermined level is obtained, that is, a level at which the composition of the captured image data can be regarded as matching the optimum composition. Various algorithms are conceivable for obtaining the degree of match and similarity, and they differ depending on the composition-forming elements adopted, so no specific example is given here.
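Since the text leaves the similarity measure open, the following is only one conceivable sketch and should not be read as the embodiment's method. It assumes a composition is described by a subject center position and a subject size, combines a normalized position error with a size ratio, and compares the result with a threshold; the score formula, the dictionary keys, and the threshold of 0.9 are all assumptions.

```python
import math


def composition_similarity(current, target, frame_diag):
    """Score in [0, 1]; 1.0 means identical center position and subject size."""
    # Position term: distance between centers, normalized by the frame diagonal.
    dist = math.dist(current["center"], target["center"]) / frame_diag
    position_score = max(0.0, 1.0 - dist)
    # Size term: ratio of the smaller subject size to the larger one.
    size_score = (min(current["size"], target["size"])
                  / max(current["size"], target["size"]))
    return position_score * size_score


def is_optimum(current, target, frame_diag, threshold=0.9):
    """Judge the composition optimum once the similarity reaches the threshold."""
    return composition_similarity(current, target, frame_diag) >= threshold


target = {"center": (200.0, 240.0), "size": 120.0}
current = {"center": (210.0, 240.0), "size": 118.0}
print(is_optimum(current, target, frame_diag=800.0))
```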

Information on the determination result that the image content of the captured image data has reached the optimum composition is output to the notification control processing block 82. In response to the input of this information, the notification control processing block 82 executes display control so that a display in a predetermined form, notifying the user that the image currently being captured has the optimum composition, is shown on the display unit 83. The notification control processing block 82 is realized by, for example, a display control function of a microcomputer (CPU) provided in the imaging apparatus and a display image processing function for realizing image display on the display unit 83. The notification to the user that the optimum composition has been reached may also be given by sound, such as an electronic sound or synthesized speech.

The display unit 83 corresponds to, for example, the display unit 33 of the digital still camera 1 of the present embodiment. Its display panel is provided so as to be exposed at a predetermined position on the imaging apparatus, and in the shooting mode it generally displays the image being captured at that time, known as a through image. In an actual imaging apparatus, therefore, an image notifying the user that the optimum composition has been reached is displayed on the display unit 83 superimposed on the through image. The user performs the release operation when this notification appears. As a result, even a user without much knowledge of or skill in photography can easily take a photograph with an appropriate composition, that is, with good image content.

The example shown in FIG. 24 likewise applies the composition determination configuration of the embodiment to a stand-alone imaging apparatus such as a digital still camera, as in FIG. 23.
In the configuration shown in this figure, as in FIG. 23, the composition determination block 81 first executes processing to determine the optimum composition based on the input captured image data, and then determines whether the image content of the captured image data obtained at subsequent timings has reached the determined optimum composition. When it determines that the optimum composition has been reached, it notifies the release control processing block 84 of this.
The release control processing block 84 is the part that executes control for recording captured image data (imaging recording), and is realized by, for example, control executed by a microcomputer provided in the imaging apparatus. On receiving the above notification, the release control processing block 84 executes image signal processing and recording control processing so that the captured image data being obtained at that time is stored in, for example, a storage medium.
With this configuration, when shooting with the digital still camera 1 held in hand, the captured image can be recorded automatically at the timing when image content having the optimum composition is obtained.

The configurations of FIG. 23 and FIG. 24 can be applied, within the category of still cameras, to digital still cameras. They can also be applied to so-called silver halide cameras, which record captured images on silver halide film or the like, by providing, for example, an image sensor that receives imaging light split off from the light obtained by the optical system and a digital image signal processing unit that receives and processes the signal from this image sensor.

FIG. 25 is also an example in which the basic configuration for composition determination of the embodiment is applied to an imaging apparatus such as a digital still camera. As illustrated, the imaging apparatus 100 shown in this figure includes a composition determination block 101, a metadata creation processing block 102, and a file creation processing block 103. Here, the composition determination block 101 corresponds to the composition determination block 202 of FIG. 2.

Captured image data obtained by imaging with an imaging recording block, not shown here, is input to the composition determination block 101 and the file creation processing block 103 in the imaging apparatus 100. In this case, the captured image data input into the imaging apparatus 100 is captured image data that is to be stored in a storage medium in response to, for example, a release operation, and is generated from the imaging signal obtained by imaging in the imaging recording block, not shown here.

First, the composition determination block 101 executes composition determination processing repeatedly on a regular basis.
As part of the composition determination processing in this case, based on the determination result, processing is further executed to identify where in the entire image area of the input captured image data the image portion with a predetermined aspect ratio that yields the determined optimum composition (trimmed image portion) is located. Information indicating the identified trimmed image portion is then output to the metadata creation processing block 102.
While executing this processing, the composition determination block 101 retains history information on the determination results (determination result history information) and refers to it. If the determined composition has been exhausted, then, in line with FIG. 3, no further trimmed image portion is identified based on the same composition determination result. Alternatively, in line with FIG. 4, the composition determination algorithm is changed, and composition determination processing and identification of the trimmed image portion are then carried out.

Based on the input information, the metadata creation processing block 102 creates metadata (edit metadata) consisting of the information needed to obtain an image having the optimum composition from the corresponding captured image data, and outputs it to the file creation processing block 103. This edit metadata is, for example, information indicating where the trimmed image portion is located within the picture of the corresponding captured image data.

The imaging apparatus 100 shown in this figure records captured image data on a storage medium so that it is managed as a still image file in a predetermined format. Accordingly, the file creation processing block 103 converts the captured image data into (creates) the still image file format.
The file creation processing block 103 first performs image compression encoding corresponding to the image file format on the input captured image data, and creates a file body portion consisting of the captured image data. It also creates data portions such as a header and an additional information block so that the edit metadata input from the metadata creation processing block 102 is stored at a predetermined storage position. It then creates a still image file from the file body portion, header, additional information block, and so on, and outputs it. As illustrated, the still image file to be recorded on the storage medium thus has a structure that contains the metadata (edit metadata) together with the captured image data.

FIG. 26 shows a configuration example of an editing apparatus that edits the still image file created by the apparatus of FIG. 25.
The editing apparatus 110 shown in the figure takes in the data of a still image file and first inputs it to the metadata separation processing block 111. The metadata separation processing block 111 separates the data of the still image file into the captured image data, which corresponds to the file body portion, and the metadata. The separated metadata is output to the metadata analysis processing block 112, and the captured image data is output to the trimming processing block 113.

The metadata analysis processing block 112 is the part that analyzes the metadata it takes in. When it analyzes the editing metadata, it recognizes the trimmed image portion in which the optimum composition is obtained. It then outputs trimming instruction information, which instructs trimming of the recognized image portion, to the trimming processing block 113.
The trimming processing block 113 executes image processing that extracts, from the captured image data input from the metadata separation processing block 111, the image portion indicated by the trimming instruction information input from the metadata analysis processing block 112. It outputs the extracted image portion as edited captured image data, which is one independent piece of image data.
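The flow through blocks 111, 112, and 113 can be illustrated with a minimal sketch. The following is an assumption-laden illustration, not the implementation of the embodiment: the still image file is modeled as a plain dictionary, the image as a list of pixel rows, and the metadata key `optimum_trim` and all function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Image = List[List[int]]  # hypothetical stand-in for captured image data (rows of pixels)


@dataclass(frozen=True)
class TrimInstruction:
    """Hypothetical form of the trimming instruction information."""
    left: int
    top: int
    width: int
    height: int


def separate_metadata(still_file: dict) -> Tuple[Image, dict]:
    """Role of block 111: split the file into image data and metadata."""
    return still_file["image"], still_file.get("metadata", {})


def analyze_metadata(metadata: dict) -> Optional[TrimInstruction]:
    """Role of block 112: read the editing metadata, if present."""
    rect = metadata.get("optimum_trim")  # assumed key name
    return TrimInstruction(**rect) if rect is not None else None


def trim(image: Image, instr: Optional[TrimInstruction]) -> Image:
    """Role of block 113: extract the indicated portion as a new image."""
    if instr is None:
        return [row[:] for row in image]  # nothing to trim; return a copy
    bottom, right = instr.top + instr.height, instr.left + instr.width
    total_width = len(image[0]) if image else 0
    if instr.left < 0 or instr.top < 0 or bottom > len(image) or right > total_width:
        raise ValueError("trim rectangle lies outside the image")
    return [row[instr.left:right] for row in image[instr.top:bottom]]


def edit(still_file: dict) -> Image:
    """Run the three stages in order; the original file is left untouched."""
    image, metadata = separate_metadata(still_file)
    return trim(image, analyze_metadata(metadata))
```

Because `trim` builds new rows rather than modifying its input, the original captured image data remains available unprocessed, which mirrors the point that the original can be stored as it is.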

According to the system composed of the imaging apparatus and the editing apparatus shown in FIGS. 25 and 26, the original still image data (captured image data) obtained by, for example, photographing can be stored as it is, unprocessed, while editing that extracts the image portion having the optimum composition from this original still image data can be performed by using the metadata. In addition, the extracted image portion corresponding to the optimum composition is determined automatically, so editing becomes very easy for the user.
The function of the editing apparatus shown in FIG. 26 may be adopted, for example, in an image data editing application installed on a personal computer or the like, or in an image editing function of an application that manages image data.

FIG. 27 shows an example in which the composition determination configuration of the embodiment is applied to an imaging apparatus capable of capturing and recording moving images, such as a video camera.
Moving image data is input to the imaging apparatus 120 shown in this figure. This moving image data is generated based on an imaging signal obtained by, for example, an imaging unit that the imaging apparatus 120 itself is assumed to have. The moving image data is input to the composition determination block 122 and the file creation / recording processing block 124 in the imaging apparatus 120.
In this case, the composition determination block 122 corresponds to the composition determination block 200 shown in FIG. 2. The composition determination block 122 continuously performs a series of subject detection, attribute detection, and composition determination processes on the images of the input moving image data. It then judges whether the composition is good by comparing the actual image content of the moving image data with the optimum composition obtained as the determination result and evaluating the difference (degree of approximation) between them.
If this comparison shows that the composition obtained in the actual captured image approximates the determined optimum composition to at least a certain degree, the composition is judged to be good; if the degree of approximation is below that level, the composition is judged not to be good.
When the composition determination block 122 judges in this way that a good composition has been obtained for the moving image data, it outputs to the metadata creation processing block 123 information (good composition section instruction information) indicating where in the moving image data the image section judged to have the good composition (good composition section) is located. The good composition section instruction information is, for example, information indicating the start position and the end position of the good composition section in the moving image data.
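As a rough illustration of how good composition sections might be derived, the sketch below assumes that each frame has already been assigned a degree of approximation to the optimum composition as a number, and that the "certain level" is a fixed threshold. The function name, the per-frame score list, and the use of inclusive frame indices for the start and end positions are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def good_composition_sections(
    approximations: Sequence[float], threshold: float
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, inclusive, of runs of good frames.

    A frame counts as good when its degree of approximation to the
    optimum composition is at or above the threshold.
    """
    sections: List[Tuple[int, int]] = []
    start = None  # index where the current good run began, if any
    for index, score in enumerate(approximations):
        if score >= threshold:
            if start is None:
                start = index
        elif start is not None:
            sections.append((start, index - 1))
            start = None
    if start is not None:  # a good run reaches the final frame
        sections.append((start, len(approximations) - 1))
    return sections


if __name__ == "__main__":
    scores = [0.2, 0.8, 0.9, 0.4, 0.95]
    print(good_composition_sections(scores, threshold=0.75))
```

Each returned pair plays the role of the start position and end position carried by the good composition section instruction information.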

In addition, the composition determination block 122 holds composition determination result history information and, based on this history, does not create the good composition section instruction information for a composition determination result that has already been exhausted. This avoids a result in which only images with similar compositions are designated as good composition sections.
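One way such history-based suppression could work is sketched below. The text does not specify how a determination result becomes exhausted, so the sketch assumes a simple rule: each composition is identified by a hypothetical key, and it is treated as exhausted once it has been designated `max_uses` times. The class, the key strings, and the limit are illustrative assumptions.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple

Section = Tuple[int, int]  # (start, end) positions of a good composition section


class CompositionHistory:
    """Hypothetical determination result history keyed by composition."""

    def __init__(self, max_uses: int = 1) -> None:
        if max_uses < 1:
            raise ValueError("max_uses must be at least 1")
        self._max_uses = max_uses
        self._counts: Counter = Counter()

    def is_exhausted(self, key: Hashable) -> bool:
        # A Counter returns 0 for keys it has not seen yet.
        return self._counts[key] >= self._max_uses

    def record(self, key: Hashable) -> None:
        self._counts[key] += 1


def filter_sections(
    candidates: Iterable[Tuple[Hashable, Section]], history: CompositionHistory
) -> List[Section]:
    """Keep only sections whose composition is not yet exhausted."""
    emitted: List[Section] = []
    for key, section in candidates:
        if history.is_exhausted(key):
            continue  # no instruction information is created for this one
        history.record(key)
        emitted.append(section)
    return emitted
```

With `max_uses=1`, a second section that repeats an already designated composition is dropped, so the designated sections do not consist only of similar compositions.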

In this case, the metadata creation processing block 123 generates the various required metadata for the moving image data that is recorded as a file on the storage medium by the moving image recording processing block 124 described next. In addition, when good composition section instruction information is input from the composition determination block 122 as described above, it generates metadata indicating that the image section indicated by that information has a good composition, and outputs the metadata to the moving image recording processing block 124.
The moving image recording processing block 124 executes control for recording the input moving image data on the storage medium so that it is managed as a moving image file in a predetermined format. When metadata is output from the metadata creation processing block 123, it executes control so that this metadata is recorded as part of the metadata accompanying the moving image file.
As a result, as shown in the figure, the moving image file recorded on the storage medium consists of the moving image data obtained by imaging, accompanied by metadata indicating the image sections in which a good composition is obtained.
Note that an image section with a good composition, as indicated by the metadata described above, may be a moving image section having a certain time width, or may be a still image extracted from the moving image data. Instead of the above metadata, a configuration is also conceivable in which moving image data or still image data of the image section with the good composition is generated and recorded as secondary image data accompanying the moving image file (or as a file independent of the moving image file).
Further, in a configuration in which the imaging apparatus 120 includes the composition determination block 122 as shown in FIG. 27, it is also conceivable to record as a moving image file only the sections of the moving image that the composition determination block 122 has judged to be good composition sections. Furthermore, a configuration is conceivable in which image data corresponding to an image section judged by the composition determination block 122 to have a good composition is output to an external device via a data interface or the like.

As an apparatus corresponding to the imaging apparatus 100 of FIG. 25, the printing apparatus 130 shown in FIG. 28 can be considered in addition to the editing apparatus shown in FIG. 26.
In this case, the printing apparatus 130 takes in a still image file as the image to be printed. Such still image files include, for example, files generated and recorded by the imaging apparatus 100, and as shown in the figure they have a structure consisting of the image data itself as a still image and metadata. This metadata therefore includes composition editing metadata with the same meaning as that in the still image file shown in FIGS. 25 and 26.

The file taken in this way is input to the metadata separation processing block 131. In the same manner as the metadata separation processing block 111 of FIG. 26, the metadata separation processing block 131 separates the data of the still image file into the image data corresponding to the file body portion and the metadata accompanying it. The separated metadata is output to the metadata analysis processing block 132, and the image data is output to the trimming processing block 133.

The metadata analysis processing block 132 performs, on the metadata it takes in, analysis processing similar to that of the metadata analysis processing block 112 of FIG. 26, and outputs trimming instruction information to the trimming processing block 133.

In the same manner as the trimming processing block 113 in FIG. 26, the trimming processing block 133 executes image processing that extracts, from the image data input from the metadata separation processing block 131, the image portion indicated by the trimming instruction information input from the metadata analysis processing block 132. It then outputs image data in a print format, generated from the extracted image portion, to the print control processing block 134 as print image data.

The print control processing block 134 uses the input print image data to execute control for operating a printing mechanism, which is not shown here.
Through this operation, the printing apparatus 130 automatically extracts, from the entire image of the input image data, the image portion regarded as having the optimum composition and prints it as a single picture.

Note that the composition determination configuration according to the present invention can also be applied to apparatuses, systems, application software, and the like other than the imaging systems and imaging apparatuses described so far.

In the embodiments so far, the body part frame, which is the face frame, is rectangular. However, it may have a shape other than a rectangle, with its size enlarged in a direction that is appropriate for that shape.
In addition, many different compositions could be targeted by the composition control (composition determination processing, composition adjustment control) performed after the size and shape of the face frame have been corrected, and no particular limitation is placed on this here.

As described so far, at least part of the configuration based on the present application can be realized by causing a CPU or a DSP to execute a program.
Such a program may be written and stored in, for example, a ROM at the time of manufacture. Alternatively, it may be stored on a removable storage medium and then installed (including updates) from that medium into a non-volatile storage area for the DSP, the flash memory 30, or the like. It is also conceivable to allow the program to be installed via a data interface such as USB or IEEE 1394 under the control of another device acting as a host. Further, the program may be stored in a storage device of a server or the like on a network, with the digital still camera 1 given a network function so that it can download and acquire the program from the server.

A diagram showing the digital still camera and the pan head that constitute the imaging system of the embodiment.
A diagram schematically showing an example of movement along the pan direction and the tilt direction of the digital still camera attached to the pan head in the imaging system of the embodiment.
A block diagram showing an internal configuration example of the digital still camera constituting the imaging system of the embodiment.
A block diagram showing an internal configuration example of the pan head constituting the imaging system of the embodiment.
A block diagram showing an internal system configuration example of the imaging system in the embodiment.
A diagram for explaining a basic specific example of composition determination processing.
A diagram for explaining a basic specific example of composition determination processing.
A diagram for explaining, as an embodiment, composition determination processing when adult and child subjects are mixed (first example).
A diagram for explaining, as an embodiment, composition determination processing when adult and child subjects are mixed (first example).
A diagram for explaining, as an embodiment, composition determination processing when adult and child subjects are mixed (second example).
A diagram for explaining, as an embodiment, composition determination processing when adult and child subjects are mixed (second example).
A diagram for explaining composition determination processing relating to subject size (zoom magnification).
A diagram for explaining an example of composition determination processing relating to subject size (zoom magnification) in the normal case.
A diagram for explaining an example of composition determination processing relating to subject size (zoom magnification) in the case of a couple.
A diagram for explaining an example of composition determination processing according to face direction.
A diagram for explaining an example of composition determination processing according to face direction.
A diagram for explaining the face direction vector amount.
A flowchart showing an example of an algorithm for composition control as an embodiment.
A diagram showing an example of the procedure for composition determination processing in the composition control of the embodiment.
A flowchart showing an example of the processing procedure for setting the zoom magnification in the composition determination processing of the embodiment.
A block diagram showing another internal system configuration example of the imaging system of the embodiment.
A block diagram showing another internal system configuration example of the imaging system of the embodiment.
A block diagram showing an application example of the composition determination block in an embodiment other than the imaging system.
A block diagram showing an application example of the composition determination block in an embodiment other than the imaging system.
A block diagram showing an application example of the composition determination block in an embodiment other than the imaging system.
A block diagram showing a configuration example of an editing apparatus corresponding to the imaging apparatus of FIG. 25.
A block diagram showing an application example of the composition determination block in an embodiment other than the imaging system.
A block diagram showing a configuration example of an editing apparatus corresponding to the imaging apparatus of FIG. 25.

Explanation of symbols

1 digital still camera, 2 shutter button, 3 lens unit, 10 pan head, 21 optical system, 22 image sensor, 23 A/D converter, 24 signal processing unit, 25 encoding/decoding unit, 26 media controller, 27 control unit, 28 ROM, 29 RAM, 30 flash memory, 31 operation unit, 32 display driver, 33 display unit, 34 pan head communication unit, 40 memory card, 51 control unit, 52 communication unit, 53 pan mechanism unit, 54 pan motor, 55 pan drive unit, 56 tilt mechanism unit, 57 tilt motor, 58 tilt drive unit, 61 imaging/recording block, 62 composition determination block, 63 pan/tilt/zoom control block, 64 communication control processing block

Claims

A composition determination apparatus comprising:
subject detection means for inputting image data and detecting a subject present in the image content of the image data;
attribute detection means for detecting a predetermined attribute for each subject detected by the subject detection means; and
composition determination means for determining a composition based on a predetermined relationship with respect to the attributes detected for each subject by the attribute detection means.

The composition determination apparatus according to claim 1, wherein
the attribute detection means detects a face direction for each subject as an attribute, and
the composition determination means determines the position of the subjects in the image frame based on an overall face direction of the subjects obtained from the face direction detected for each subject.

The composition determination apparatus according to claim 1 or 2, wherein
the attribute detection means detects the age of each subject as an attribute, and
the composition determination means determines, based on the age detected for each subject, the position of the subjects so that when adult subjects and child subjects are mixed, the child subject is placed closer to the center of the image frame than when adult subjects and child subjects are not mixed.

The composition determination apparatus according to claim 1, wherein
the attribute detection means detects the sex of each subject as an attribute, and
the composition determination means, when the number of detected subjects is two and the sex detection result by the attribute detection means shows that the two detected subjects are a male and a female, sets a subject size for a male/female pair that is larger than the subject size in the image frame set in the normal case.

The composition determination apparatus according to claim 2, wherein
the composition determination means, when setting the enlargement ratio of the image according to the subject size for the male/female pair, sets the enlargement ratio corresponding to that subject size so that the face portions of the two subjects fit within the image frame at the subject position determined along the direction indicated by the overall face direction of the subjects.

A composition determination method that executes:
a subject detection procedure for inputting image data and detecting a subject present in the image content of the image data;
an attribute detection procedure for detecting a predetermined attribute for each subject detected by the subject detection procedure; and
a composition determination procedure for determining a composition based on a predetermined relationship with respect to the attributes detected for each subject by the attribute detection procedure.

A program that causes a composition determination apparatus to execute:
a subject detection procedure for inputting image data and detecting a subject present in the image content of the image data;
an attribute detection procedure for detecting a predetermined attribute for each subject detected by the subject detection procedure; and
a composition determination procedure for determining a composition based on a predetermined relationship with respect to the attributes detected for each subject by the attribute detection procedure.