JP5440588B2

JP5440588B2 - Composition determination apparatus, composition determination method, and program

Info

Publication number: JP5440588B2
Application number: JP2011245273A
Authority: JP
Inventors: 真吾善積; 央樹山脇
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2011-11-09
Filing date: 2011-11-09
Publication date: 2014-03-12
Anticipated expiration: 2027-10-17
Also published as: JP2012029338A

Description

The present invention relates to a composition determination apparatus, and to a corresponding method, for determining the composition of the image content of image data such as a still image. The invention also relates to a program executed by such an apparatus.

Composition setting is one of the technical elements of taking a photograph that gives a good impression. Composition here, also called framing, refers to the arrangement of the subject within the frame of, for example, a photograph.
Although several general, basic methods for achieving a good composition exist, it is by no means easy for an ordinary camera user to take a well-composed photograph without sufficient knowledge of and skill in photography. A technical configuration that makes it quick and easy to obtain a well-composed photographic image is therefore desirable.

For example, Patent Document 1 discloses an automatic tracking device that detects the difference between images taken at fixed time intervals, calculates the center of gravity of that difference, detects the amount and direction of movement of the subject image relative to the imaging screen from the amount and direction of movement of the center of gravity, and controls the imaging apparatus so that the subject image is set within a reference area of the imaging screen.
Patent Document 2 discloses an automatic tracking device that, when automatically tracking a person, centers the screen on the position that bounds the upper 20% of the area of the whole person image on the screen, so that the person's face is at the center of the screen and is reliably captured during tracking.
Viewed in terms of composition determination, these configurations can automatically search for a person as a subject and place that subject in a fixed, predetermined composition on the shooting screen.

Patent Document 1: JP 59-208983 A
Patent Document 2: JP 2001-268425 A

The optimum composition may differ depending on, for example, the situation or state of the subject. The techniques of the above patent documents, however, can only place the tracked subject in one fixed composition. They therefore cannot change the composition for shooting in response to the situation of the subject.
The present invention therefore aims to propose a technique for easily obtaining a good composition for an image such as a photograph and, in doing so, to make the composition determination more sophisticated and flexible by adapting to changes in the situation and state of the subject.

In view of the above problems, the present invention is configured as a composition determination apparatus as follows.
The apparatus includes subject detection means for detecting the presence of a specific subject in an image based on image data, and composition determination means for determining a composition according to the number of detected subjects, that is, the subjects detected by the subject detection means. When the number of detected subjects is equal to or greater than a predetermined value, the composition determination means determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends.
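The portrait/landscape decision described above can be illustrated with a minimal sketch. The text states only that the decision is based on the distance between the leftmost and rightmost subjects once the subject count reaches a predetermined value; the direction of the rule (narrow group gives portrait), the names DetectedSubject and choose_orientation, and the numeric thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DetectedSubject:
    # Hypothetical representation: horizontal center of one detected subject, in pixels.
    x: float


def choose_orientation(subjects: List[DetectedSubject],
                       min_count: int = 2,
                       max_span: float = 200.0) -> Optional[str]:
    """Pick 'portrait' or 'landscape' from the span between the outermost subjects.

    min_count and max_span are illustrative values, not taken from the text.
    Returns None when fewer than min_count subjects are detected, because the
    rule described above only applies at or above the predetermined count.
    """
    if len(subjects) < min_count:
        return None
    xs = [s.x for s in subjects]
    span = max(xs) - min(xs)  # distance between leftmost and rightmost subjects
    # Assumption: a narrow group fits a portrait frame, a wide group needs landscape.
    return "portrait" if span <= max_span else "landscape"
```

In a real apparatus the threshold would presumably depend on the frame size and zoom state; a fixed pixel value is used here only to keep the sketch short.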

With this configuration, the composition regarded as optimum is determined according to the number of subjects detected in the image based on the image data. The optimum composition differs with, for example, the number of subjects present in one frame; according to the present invention, an optimum composition is obtained that adapts to changes in the number of subjects.

Thus, according to the present invention, an optimum composition corresponding to the number of subjects can be obtained for the content of the image carried by the image data. Compared with simply placing the subject in a fixed composition, the composition is determined automatically in a more sophisticated and flexible way. A user of an apparatus to which the present invention is applied can therefore obtain an image with the optimum composition without troublesome effort, and the apparatus can offer greater convenience than before.

A diagram showing an example of the external configuration of an imaging system (digital still camera and pan head) as an embodiment of the present invention.
A diagram schematically showing, as an operation of the imaging system of the embodiment, an example of the movement of the digital still camera attached to the pan head in the pan and tilt directions.
A diagram showing a configuration example of the digital still camera of the embodiment.
A diagram showing a configuration example of the pan head of the embodiment.
A diagram showing, as a block configuration, the functions that the digital still camera of the embodiment provides for composition control.
A diagram explaining the center of gravity of an individual subject and the total subject center of gravity for a plurality of individual subjects.
A diagram explaining the origin coordinates set on the screen of captured image data.
A diagram schematically showing an example of composition control in the first composition control when one individual subject is detected.
A diagram schematically showing an example of composition control in the first composition control when two individual subjects are detected.
A diagram schematically showing an example of composition control in the first composition control when three or more individual subjects are detected.
A flowchart showing an example of the processing procedure for the first composition control.
A diagram schematically showing an example of composition control in the second composition control when one individual subject is detected.
A diagram schematically showing an example of composition control in the second composition control when two individual subjects are detected (captured) with the distance between them at or below a certain value.
A flowchart showing an example of the processing procedure for the second composition control.
A diagram for explaining subject discrimination in the embodiment.
A flowchart showing an example of the processing procedure for implementing subject discrimination in the embodiment.
A diagram showing a configuration example as a modification of the imaging system of the embodiment.
A diagram showing a configuration example as another modification of the imaging system of the embodiment.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.
A diagram showing an application example of composition determination based on the present invention.

The best mode for carrying out the present invention (hereinafter, the embodiment) is described below. The embodiment takes as an example the case where the configuration based on the present invention is applied to an imaging system consisting of a digital still camera and a pan head to which the digital still camera is attached.

FIG. 1 is a front view showing an example of the external configuration of the imaging system of the present embodiment.
As shown in the figure, the imaging system of the present embodiment consists of a digital still camera 1 and a pan head 10.
The digital still camera 1 generates still image data from the imaging light obtained through the lens unit 3 provided on the front panel of the main body, and can store that data in a storage medium loaded inside it. That is, it has a function of storing an image taken as a photograph in the storage medium as still image data. To take such a photograph manually, the user presses the shutter (release) button provided on the top of the main body.

The digital still camera 1 can be attached to and fixed on the pan head 10. That is, the pan head 10 and the digital still camera 1 each have a mechanism that allows them to be attached to one another.

The pan head 10 has a pan/tilt mechanism for moving the attached digital still camera 1 in both the pan (horizontal) direction and the tilt direction.
The movements of the digital still camera 1 in the pan and tilt directions given by the pan/tilt mechanism of the pan head 10 are, for example, as shown in FIGS. 2(a) and 2(b). FIGS. 2(a) and 2(b) show the digital still camera 1, taken to be attached to the pan head 10, in isolation, in plan view and side view respectively.
In the pan direction, taking as reference the position in which the lateral direction of the main body of the digital still camera 1 is aligned with the straight line X1 shown in FIG. 2(a), rotation in the direction +α about the rotation axis Ct1 gives a panning movement to the right, and rotation in the direction −α gives a panning movement to the left.
In the tilt direction, taking as reference the state in which the vertical direction of the main body of the digital still camera 1 coincides with the vertical straight line Y1, rotation in the direction +β about the rotation axis Ct2 gives a downward tilting movement, and rotation in the direction −β gives an upward tilting movement.
The maximum movable rotation angles in the ±α and ±β directions shown in FIGS. 2(a) and 2(b) are not specified here, but if the opportunities to capture a subject are to be maximized, the maximum movable rotation angles should preferably be as large as possible.
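As a rough illustration of the sign conventions above (+α pans right, −α pans left, +β tilts down, −β tilts up), the following sketch tracks a pan/tilt pose and limits it to a maximum movable angle. The class and function names are hypothetical, and the limit values are placeholders, since the text deliberately leaves the maximum angles unspecified.

```python
from dataclasses import dataclass


@dataclass
class PanTiltPose:
    pan_deg: float = 0.0   # positive = +alpha (right), negative = -alpha (left)
    tilt_deg: float = 0.0  # positive = +beta (down), negative = -beta (up)


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def rotate(pose: PanTiltPose, d_pan: float, d_tilt: float,
           max_pan: float = 90.0, max_tilt: float = 30.0) -> PanTiltPose:
    """Apply a relative rotation, never exceeding the movable range.

    max_pan and max_tilt are illustrative placeholders, not values from the text.
    """
    return PanTiltPose(
        pan_deg=clamp(pose.pan_deg + d_pan, -max_pan, max_pan),
        tilt_deg=clamp(pose.tilt_deg + d_tilt, -max_tilt, max_tilt),
    )
```

A larger movable range simply widens the clamp interval, which is why a larger maximum angle gives more chances to bring a subject into the frame.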

The block diagram of FIG. 3 shows an example of the internal configuration of the digital still camera 1 of the present embodiment.
In this figure, the optical system unit 21 includes a group of a predetermined number of imaging lenses, including for example a zoom lens and a focus lens, together with a diaphragm and the like, and focuses incident light as imaging light on the light receiving surface of the image sensor 22.
The optical system unit 21 also includes drive mechanisms for driving the zoom lens, focus lens, diaphragm, and so on. The operation of these drive mechanisms is controlled by so-called camera control, such as zoom (angle of view) control, automatic focus adjustment control, and automatic exposure control, executed for example by the control unit 27.

The image sensor 22 performs so-called photoelectric conversion, converting the imaging light obtained by the optical system unit 21 into an electric signal. To do so, the image sensor 22 receives the imaging light from the optical system unit 21 on the light receiving surface of a photoelectric conversion element and sequentially outputs, at predetermined timing, the signal charges accumulated according to the intensity of the received light. An electric signal (imaging signal) corresponding to the imaging light is thereby output. The photoelectric conversion element (imaging element) used as the image sensor 22 is not particularly limited; at present, examples include a CMOS sensor and a CCD (Charge Coupled Device). When a CMOS sensor is used, the device (component) corresponding to the image sensor 22 may also incorporate an analog-to-digital converter corresponding to the A/D converter 23 described next.

The imaging signal output from the image sensor 22 is input to the A/D converter 23, converted into a digital signal, and input to the signal processing unit 24.
The signal processing unit 24 takes in the digital imaging signal output from the A/D converter 23 in units corresponding to, for example, one still image (frame image), and applies the required signal processing to each such unit. It can thereby generate captured image data (captured still image data), that is, image signal data corresponding to one still image.

When the captured image data generated by the signal processing unit 24 as described above is to be recorded as image information on the memory card 40, which is a storage medium (storage medium device), the captured image data corresponding to, for example, one still image is output from the signal processing unit 24 to the encode/decode unit 25.
The encode/decode unit 25 compression-encodes the still-image-unit captured image data output from the signal processing unit 24 using a predetermined still image compression encoding method, adds a header and the like under the control of, for example, the control unit 27, and thereby converts the data into captured image data compressed in a predetermined format. It then transfers the captured image data generated in this way to the media controller 26. Under the control of the control unit 27, the media controller 26 writes the transferred captured image data to the memory card 40 and records it. The memory card 40 in this case is a storage medium that has, for example, a card-shaped outer form conforming to a predetermined standard and contains a nonvolatile semiconductor storage element such as flash memory. The storage medium for storing image data may be of a type or format other than the memory card described above.

The signal processing unit 24 of the present embodiment can also use the captured image data acquired as described above to execute image processing for subject detection. The nature of the subject detection processing in the present embodiment is described later.

The digital still camera 1 can also display a so-called through image, that is, the image currently being captured, by having the display unit 33 display images using the captured image data obtained by the signal processing unit 24. As described above, the signal processing unit 24 takes in the imaging signal output from the A/D converter 23 and generates captured image data corresponding to one still image; by continuing this operation, it sequentially generates captured image data corresponding to the frame images of a moving image. The captured image data generated sequentially in this way is transferred to the display driver 32 under the control of the control unit 27. The through image is thereby displayed.

The display driver 32 generates a drive signal for driving the display unit 33 based on the captured image data input from the signal processing unit 24 as described above, and outputs it to the display unit 33. The display unit 33 thereby displays, one after another, images based on the still-image-unit captured image data. To the user, the image being captured at that moment appears on the display unit 33 as a moving image. That is, a monitor image is displayed. The display screen unit 5 described earlier with reference to FIG. 1 corresponds to the screen portion of the display unit 33 here.

The digital still camera 1 can also play back captured image data recorded on the memory card 40 and display the image on the display unit 33.
To do so, the control unit 27 designates captured image data and instructs the media controller 26 to read data from the memory card 40. In response to this instruction, the media controller 26 accesses the address on the memory card 40 where the designated captured image data is recorded, reads the data, and transfers the read data to the encode/decode unit 25.

Under the control of, for example, the control unit 27, the encode/decode unit 25 extracts the payload, namely the compressed still image data, from the captured image data transferred from the media controller 26, and decodes this compressed still image data to obtain captured image data corresponding to one still image. It then transfers this captured image data to the display driver 32. The display unit 33 thereby plays back and displays the image of the captured image data recorded on the memory card 40.

The display unit 33 can also display a user interface image along with the monitor image or the playback image of captured image data. In this case, the control unit 27 generates display image data for the required user interface image according to, for example, the current operating state, and outputs it to the display driver 32. The user interface image is thereby displayed on the display unit 33. The user interface image can be displayed on the screen of the display unit 33 separately from the monitor image or playback image, as with a specific menu screen, or it can be superimposed and composited onto part of the monitor image or playback image.

In practice, the control unit 27 includes, for example, a CPU (Central Processing Unit) and forms a microcomputer together with the ROM 28, the RAM 29, and so on. The ROM 28 stores, for example, the programs to be executed by the CPU serving as the control unit 27, as well as various setting information related to the operation of the digital still camera 1. The RAM 29 serves as the main storage device for the CPU.
The flash memory 30 in this case is provided as a nonvolatile storage area for storing various setting information that needs to be changed (rewritten) according to, for example, user operations or the operation history. If a nonvolatile memory such as flash memory is adopted for the ROM 28, part of the storage area of the ROM 28 may be used in place of the flash memory 30.

The operation unit 31 collectively represents the various operating elements provided on the digital still camera 1 and the operation information signal output section that generates operation information signals corresponding to operations performed on those elements and outputs them to the CPU. The control unit 27 executes predetermined processing according to the operation information signal input from the operation unit 31. The digital still camera 1 thereby operates in accordance with the user's operation.

The pan head communication unit 34 performs communication between the pan head 10 and the digital still camera 1 according to a predetermined communication method. It has, for example, a physical layer configuration that enables wired or wireless communication signals to be exchanged with the communication unit on the pan head 10 side while the digital still camera 1 is attached to the pan head 10, and a configuration for implementing the communication processing of predetermined layers above the physical layer.

FIG. 4 is a block diagram showing a configuration example of the pan head 10.
As described above, the pan head 10 has a pan/tilt mechanism and, as the corresponding parts, includes a pan mechanism unit 53, a pan motor 54, a tilt mechanism unit 56, and a tilt motor 57.
The pan mechanism unit 53 has a mechanism for giving the digital still camera 1 attached to the pan head 10 the movement in the pan (lateral) direction shown in FIG. 2(a); this movement is obtained by the pan motor 54 rotating in the forward or reverse direction. Similarly, the tilt mechanism unit 56 has a mechanism for giving the digital still camera 1 attached to the pan head 10 the movement in the tilt (vertical) direction shown in FIG. 2(b); this movement is obtained by the tilt motor 57 rotating in the forward or reverse direction.

The control unit 51 includes a microcomputer formed by combining, for example, a CPU, a ROM, and a RAM, and controls the movement of the pan mechanism unit 53 and the tilt mechanism unit 56. When the control unit 51 controls the movement of the pan mechanism unit 53, it outputs a control signal corresponding to the amount and direction of movement required of the pan mechanism unit 53 to the pan drive unit 55. The pan drive unit 55 generates a motor drive signal corresponding to the input control signal and outputs it to the pan motor 54. This motor drive signal causes the pan motor 54 to rotate in, for example, the required direction and through the required angle, and as a result the pan mechanism unit 53 is driven to move by the corresponding amount and in the corresponding direction.
Similarly, when controlling the movement of the tilt mechanism unit 56, the control unit 51 outputs a control signal corresponding to the amount and direction of movement required of the tilt mechanism unit 56 to the tilt drive unit 58. The tilt drive unit 58 generates a motor drive signal corresponding to the input control signal and outputs it to the tilt motor 57. This motor drive signal causes the tilt motor 57 to rotate in, for example, the required direction and through the required angle, and as a result the tilt mechanism unit 56 is driven to move by the corresponding amount and in the corresponding direction.
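The chain from control unit to drive unit to motor can be sketched as a simple conversion from a control signal (movement amount and direction) into a motor drive signal. The text does not say what kind of motor or signal format is used; the stepper-style step count, the steps_per_degree factor, and all names below are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ControlSignal:
    amount_deg: float  # required movement amount, in degrees (non-negative)
    direction: int     # +1 or -1, the required movement direction


@dataclass
class MotorDriveSignal:
    steps: int         # hypothetical: number of motor steps to perform
    forward: bool      # True for forward rotation, False for reverse


def drive_unit(signal: ControlSignal, steps_per_degree: float = 10.0) -> MotorDriveSignal:
    """Convert a control signal into a motor drive signal.

    steps_per_degree is an illustrative gear/step ratio, not a value from the text.
    """
    if signal.direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    if signal.amount_deg < 0:
        raise ValueError("amount_deg must be non-negative")
    return MotorDriveSignal(
        steps=round(signal.amount_deg * steps_per_degree),
        forward=signal.direction > 0,
    )
```

The same function would serve both the pan and the tilt path, since the two drive units are described as working in the same way.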

The communication unit 52 performs communication according to a predetermined communication method with the pan head communication unit 34 in the digital still camera 1 attached to the pan head 10. Like the pan head communication unit 34, it has a physical layer configuration that enables wired or wireless communication signals to be exchanged with the counterpart communication unit, and a configuration for implementing the communication processing of predetermined layers above the physical layer.

In the imaging system composed of the digital still camera 1 and the pan head 10 configured as described above, a person, for example, is treated as the main subject (hereinafter simply called the subject). The system searches for the subject and, when a subject is detected, drives the pan/tilt mechanism of the pan head 10 so that framing yields the composition regarded as optimal for an image containing that subject (the optimal composition). At the timing when the optimal composition is obtained, the captured image data at that moment is recorded on the storage medium (memory card 40).
That is, when the imaging system of the present embodiment takes a photograph with the digital still camera, it automatically determines the optimal composition for the subject found by the search and then shoots and records. This makes it possible to obtain a photograph of reasonably good quality without the user having to judge the composition and shoot. In addition, because nobody needs to hold the camera, everyone present at the shooting location can be a subject. Furthermore, a photograph containing the subject is obtained even if the user who is the subject makes no particular effort to enter the camera's viewing angle range. In other words, there are more opportunities to capture the people at the shooting location as they naturally are, and many photographs with an atmosphere that was previously uncommon can be obtained.
The optimal composition can also be considered to differ with the number of subjects, and the composition determination of the present embodiment is configured so that a different optimal composition can be determined based on the number of detected subjects. Compared with determining the composition without regard to the number of subjects, the present embodiment can therefore obtain better images overall.

The composition control of the present embodiment is described below.
FIG. 5 shows a configuration example of the functional parts, provided on the digital still camera 1 side, that correspond to the composition control of the present embodiment.
In this figure, the subject detection processing block 61 executes subject detection processing, including subject search control, using the captured image data that the signal processing unit 24 obtains from the imaging signal of the image sensor 22. The subject detection processing here first discriminates and detects subjects that are people from the image content of the captured image data. The information obtained as the detection result (detection information) includes the number of human subjects, position information within the screen for each individual subject, and the size (occupied area) within the image of each individual subject. Depending on the configuration, such as the composition determination algorithm, the composition control of the present embodiment can be realized even if only the number of subjects is obtained as detection information.
Face detection can be used as a specific technique for this subject detection processing. Several face detection methods are known, and the present embodiment places no particular limit on which one is adopted; an appropriate method may be chosen in consideration of detection accuracy, design difficulty, and the like.
The subject detection processing executed by the subject detection processing block 61 can be realized as image signal processing in the signal processing unit 24. When the signal processing unit 24 is a DSP as described earlier, the subject detection processing is realized by the program and instructions given to that DSP.
During subject search control, the subject detection processing block 61 outputs, via the communication control processing block 63, a control signal for driving the pan/tilt mechanism of the pan head 10.
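The detection information described above (number of subjects, per-subject position, per-subject size) can be pictured as a small record. The following is a minimal sketch and not the data layout of the embodiment; the class and field names (`IndividualSubject`, `DetectionInfo`, `size_px`) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndividualSubject:
    # Position of the subject within the screen (hypothetical field names).
    x: float
    y: float
    # Size, taken here as the number of pixels in the detected face region.
    size_px: int


@dataclass
class DetectionInfo:
    # One entry per detected individual subject; empty when nothing is found.
    subjects: List[IndividualSubject] = field(default_factory=list)

    @property
    def count(self) -> int:
        # Number of individual subjects; 0 means no subject was detected.
        return len(self.subjects)
```

A record that exposes only `count` would correspond to the reduced case mentioned above, in which the number of subjects alone is used.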

The detection information resulting from the subject detection processing of the subject detection processing block 61 is input to the composition control processing block 62.
The composition control processing block 62 uses the input detection information about the subjects to determine the composition regarded as optimal (the optimal composition), and then executes control to obtain that composition (composition control). The composition control in this case consists of control that changes the angle of view (in the present embodiment, the viewing angle that can be changed by, for example, controlling the zoom lens), control of the imaging direction along the pan (left-right) direction (pan control), and control of the imaging direction along the tilt (up-down) direction (tilt control). To change the angle of view, at least one of the following is performed: zoom control that moves the zoom lens in the optical system unit 21 of the digital still camera 1, or image signal processing such as cropping the captured image data. Pan control and tilt control are performed by controlling and moving the pan/tilt mechanism of the pan head 10. When controlling the pan/tilt mechanism, the composition control processing block 62 has a control signal for bringing the pan/tilt mechanism to the required position transmitted to the pan head 10 side via the communication control processing block 63.
The composition determination and composition control executed by the composition control processing block 62 can be configured, for example, to be executed by the control unit 27 (CPU) based on a program. A configuration that combines this with processing executed by the signal processing unit 24 based on a program is also conceivable. The communication control processing block 63 is configured to execute communication processing with the communication unit 52 on the pan head 10 side according to a predetermined protocol, and is the functional part corresponding to the pan-head compatible communication unit 34.

Next, an example of the subject detection processing executed by the subject detection processing block 61 is described with reference to FIG. 6.
Suppose that the subject detection processing block 61 has taken in captured image data with the image content shown in FIG. 6(a). This image content was obtained by photographing a scene in which one human subject is present. FIG. 6(a) (and FIG. 6(b)) shows one screen divided into a matrix; this schematically indicates that the screen of the captured image data consists of a set of pixels with predetermined horizontal and vertical counts.
Performing subject detection (face detection) on the captured image data of FIG. 6(a) detects the face of the one individual subject SBJ shown in the figure. That is, the detection of one face by the face detection processing is treated here as the detection of one individual subject. As the result of detecting an individual subject in this way, information on the number, position, and size of the individual subjects is obtained, as described earlier.
First, the number of individual subjects can be obtained, for example, as the number of faces found by face detection. In the case of FIG. 6(a), one face is detected, so the number of individual subjects is 1.
As the position information of each individual subject, at least the center of gravity G(X, Y) of the individual subject SBJ within the image of the captured image data is obtained. The X, Y origin coordinates P(0, 0) on the screen of the captured image data, which serve as the reference for the center of gravity G(X, Y), are, as shown for example in FIG. 7, the intersection of the midpoint of the width Cx in the X-axis (horizontal) direction (the horizontal image size) and the midpoint of the width Cy in the Y-axis (vertical) direction (the vertical image size), both corresponding to the screen size.
For the definition of the position of the center of gravity G within the image of an individual subject, and for how the center of gravity G is set, a known subject center-of-gravity detection method can be adopted, for example.
The size of each individual subject can be obtained, for example, as the number of pixels in the region that the face detection processing identifies and detects as the face portion.
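Because the origin P(0, 0) is the screen center rather than a corner, a centroid measured in ordinary pixel coordinates has to be shifted by half the image size. The sketch below illustrates that conversion. It assumes pixel coordinates with the origin at the top-left corner and assumes that Y increases upward in the centered system; the text does not state either convention, and the function name `to_centered` is hypothetical.

```python
def to_centered(px: float, py: float, cx: int, cy: int) -> tuple:
    """Convert top-left pixel coordinates to coordinates about the screen center.

    cx and cy are the horizontal and vertical image sizes (Cx and Cy).
    """
    x = px - cx / 2.0
    # Assumption: the centered Y axis points upward, so the pixel row is flipped.
    y = cy / 2.0 - py
    return (x, y)
```

For a 640 x 480 frame, the pixel at (320, 240) maps to the origin and the top-left pixel maps to (-320, 240) under these assumptions.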

When the subject detection processing block 61 takes in the captured image data shown in FIG. 6(b) and executes the subject detection processing, face detection first identifies the presence of two faces, so the number of individual subjects is 2. Here the two individual subjects are distinguished by calling the one on the left individual subject SBJ0 and the one on the right individual subject SBJ1. The coordinates of the centers of gravity obtained for SBJ0 and SBJ1 are shown as G0(X0, Y0) and G1(X1, Y1), respectively.
When two or more individual subjects are detected in this way, the total subject center of gravity Gt(Xg, Yg) is also obtained. This is the center of gravity when the plural individual subjects are regarded as a single group of subjects (the total subject).
The total subject center of gravity Gt can be set in several ways. As the simplest example, the case shown here sets Gt to the midpoint of the line segment connecting the centers of gravity of the individual subjects located at the leftmost and rightmost positions in the screen among the detected individual subjects. The total subject center of gravity Gt is information that can be used in composition control, as described later, and can be calculated once the centers of gravity of the individual subjects are known. Accordingly, the subject detection processing block 61 may obtain Gt and output it as detection information, or the composition control processing block 62 may obtain it from the detection information, using the centers of gravity of the leftmost and rightmost individual subjects.
Another conceivable setting method gives each individual subject a weighting coefficient according to its size and uses these coefficients so that, for example, the total subject center of gravity Gt lies closer to larger individual subjects.
The size of the individual subjects can be obtained, for example, as the number of pixels occupied by the detected face for each of the individual subjects SBJ0 and SBJ1.
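The two ways of setting the total subject center of gravity Gt mentioned above can be sketched as follows. This is an illustration only: the function names are hypothetical, centroids are given as (X, Y) tuples, and the size-weighted mean is just one possible reading of "weighting according to size", since the text does not specify the weighting formula.

```python
def total_centroid_midpoint(centroids):
    """Gt as the midpoint between the leftmost and rightmost centroids."""
    left = min(centroids, key=lambda c: c[0])
    right = max(centroids, key=lambda c: c[0])
    return ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)


def total_centroid_weighted(centroids, sizes):
    """Gt as a size-weighted mean (assumed weighting: pixel count of each face)."""
    total = float(sum(sizes))
    xg = sum(c[0] * s for c, s in zip(centroids, sizes)) / total
    yg = sum(c[1] * s for c, s in zip(centroids, sizes)) / total
    return (xg, yg)
```

With the midpoint rule, subjects between the two outermost ones do not affect Gt; with the weighted rule, Gt shifts toward the subject with the larger face region.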

Next, the composition obtained by the composition control of the first example of the present embodiment is described with reference to FIGS. 8 to 10.
FIG. 8(a) shows a case in which the captured image data obtained as the result of subject detection before composition control has image content in which one individual subject SBJ0 is captured. In the present embodiment, when the pan head 10 with the digital still camera 1 attached is installed in the normal way, the orientation of the digital still camera 1 is set so that a landscape (horizontally long) image is captured. The first example therefore assumes that imaging yields a landscape image.

When one individual subject is detected as shown in FIG. 8(a), zoom control that narrows the angle of view is executed so that the occupancy ratio of the individual subject SBJ0 within the screen of the captured image data reaches a predetermined value, thereby changing the size of the individual subject, as shown by the transition from FIG. 8(a) to FIG. 8(b). FIGS. 8(a) and 8(b) show the case in which the angle of view is narrowed to enlarge the individual subject SBJ0. If, at the stage when the individual subject is detected, its occupancy ratio within the screen already exceeds the predetermined value, zoom control that widens the angle of view is executed instead so that the occupancy ratio decreases to the predetermined value.
As for the position within the screen when there is one individual subject, the present embodiment places it approximately at the center in the left-right direction. To do this, the left-right position of the center of gravity G of the individual subject SBJ0 may, for example, be brought approximately to the center.
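The zoom adjustment for a single subject can be illustrated numerically. The sketch below is not the control law of the embodiment, which is not specified. It assumes that the occupancy ratio is the face pixel count divided by the total pixel count, and that the subject's area scales with the square of the linear magnification, so the required magnification is the square root of the ratio between target and current occupancy. The names `zoom_factor` and `target_ratio` are hypothetical.

```python
import math


def zoom_factor(size_px: int, cx: int, cy: int, target_ratio: float) -> float:
    """Linear magnification that would bring the occupancy ratio to target_ratio.

    A result above 1 means narrowing the angle of view (zoom in);
    a result below 1 means widening it (zoom out).
    """
    current_ratio = size_px / float(cx * cy)
    if current_ratio <= 0:
        raise ValueError("subject size must be positive")
    # Assumption: area grows with the square of the linear magnification.
    return math.sqrt(target_ratio / current_ratio)
```

For example, a face occupying 4800 pixels of a 640 x 480 frame (1/64 of the screen) with a target occupancy of 1/16 gives a magnification of 2 under these assumptions.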

Next, when two individual subjects are detected as shown in FIG. 9(a), the composition control first obtains the distance K between the two individual subjects SBJ0 and SBJ1 (the inter-subject distance). This distance K can be expressed, for example, as the difference (X1 - X0) between the X coordinate (X0) of the center of gravity G0 of the individual subject SBJ0 and the X coordinate (X1) of the center of gravity G1 of the individual subject SBJ1. The angle of view is then adjusted so that the inter-subject distance K becomes 1/3 of the horizontal image size Cx (K = Cx/3), as shown in FIG. 9(b). In this case as well, the region containing the two individual subjects SBJ0 and SBJ1 is placed approximately at the center of the screen in the left-right direction; to do this, the total subject center of gravity Gt of SBJ0 and SBJ1 is, for example, placed at the center in the left-right direction.
The inter-subject distance K is set to 1/3 of the horizontal image size Cx on the basis of the composition-setting technique known as the rule of thirds. The rule of thirds is one of the most basic composition-setting techniques; it seeks a good composition by placing the subject on the virtual lines that divide a rectangular screen into three equal parts in the vertical and horizontal directions. When K = Cx/3 and the total subject center of gravity Gt is placed at the center in the left-right direction as described above, the center of gravity G0 of the individual subject SBJ0 lies approximately on the left virtual line running in the vertical direction of the screen, and the center of gravity G1 of the individual subject SBJ1 lies approximately on the right virtual line. That is, a composition that follows the rule of thirds is obtained.
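The claim that both centers of gravity land on the thirds lines can be checked with a short calculation. The derivation below assumes that X is measured from the screen center (the origin P of FIG. 7) and that Gt is the midpoint of G0 and G1, as in the simplest setting described earlier.

```latex
\begin{aligned}
X_0 + X_1 &= 0, \qquad X_1 - X_0 = K = \frac{C_x}{3}
\;\;\Longrightarrow\;\; X_0 = -\frac{C_x}{6}, \quad X_1 = +\frac{C_x}{6} \\[4pt]
\text{left thirds line:}\quad & \frac{C_x}{3} - \frac{C_x}{2} = -\frac{C_x}{6},
\qquad
\text{right thirds line:}\quad \frac{2C_x}{3} - \frac{C_x}{2} = +\frac{C_x}{6}
\end{aligned}
```

The two thirds lines, measured from the left edge at Cx/3 and 2Cx/3, therefore coincide with X0 and X1 once they are expressed relative to the screen center.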

When three individual subjects are detected as in FIG. 10(a), the composition control obtains the inter-subject distance K between the individual subject SBJ0 located leftmost in the screen and the individual subject SBJ2 located rightmost. More generally, when n individual subjects are detected, they are numbered from 0 to n-1 from the left of the screen to the right. With the X coordinate of the center of gravity G0 of the leftmost individual subject SBJ0 written as (X0) and the X coordinate of the center of gravity Gn-1 of the rightmost individual subject SBJ(n-1) written as (Xn-1), the inter-subject distance K is given by the general expression (Xn-1) - (X0).
In this case, the angle of view is controlled so that the inter-subject distance K becomes 1/2 of the horizontal image size Cx, as shown in FIG. 10(b). As for the left-right position of the subjects, the total subject center of gravity Gt is placed approximately at the center of the screen in the left-right direction, so that the region containing the three individual subjects comes approximately to the center in the left-right direction. In the present embodiment, the composition control shown in FIG. 10 is performed whenever three or more individual subjects are detected.
When three or more individual subjects are in the screen, a better composition is generally obtained by making the ratio of the inter-subject distance K to the horizontal image size Cx larger than the 1/3 that strict adherence to the rule of thirds would give. For this reason, when three or more individual subjects are detected, the present embodiment forms a composition in which the inter-subject distance K = Cx/2.
In this way, the present embodiment performs composition control with a different adjustment of the angle of view for each of three cases: one detected individual subject, two, and three or more.
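The dependence of the target inter-subject distance on the number of subjects can be summarized in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the magnification formula assumes that on-screen distances scale linearly with zoom, which the text does not state.

```python
from fractions import Fraction


def target_distance_ratio(n: int):
    """Target K / Cx for n detected subjects; None when K is undefined (n < 2)."""
    if n < 2:
        return None  # a single subject is framed by occupancy ratio instead
    if n == 2:
        return Fraction(1, 3)  # rule of thirds
    return Fraction(1, 2)  # three or more subjects


def zoom_for_distance(k: float, cx: int, n: int) -> float:
    """Magnification that would bring the current distance k to the target."""
    ratio = target_distance_ratio(n)
    if ratio is None or k <= 0:
        raise ValueError("need at least two subjects with k > 0")
    # Assumption: distances on screen scale linearly with the magnification.
    return float(ratio) * cx / k
```

For three subjects spanning 160 pixels in a 640-pixel-wide frame, the target distance is 320 pixels, which corresponds to a magnification of 2 under the stated assumption.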

FIG. 11 shows an example of the procedure executed by the subject detection processing block 61, the composition control processing block 62, and the communication control processing block 63 shown in FIG. 5, corresponding to the composition control of the first example described with reference to FIGS. 8 to 10. The processing shown in this figure can be regarded as realized by the signal processing unit 24, as a DSP, and the CPU in the control unit 27 executing programs. Such programs may be written to and stored in, for example, a ROM at the time of manufacture. Alternatively, they may be stored on a removable storage medium and installed (or updated) from that medium into a nonvolatile storage area for the DSP, the flash memory 30, or the like. It is also conceivable to allow the programs to be installed under the control of another host device via a data interface such as USB or IEEE 1394. Furthermore, the programs may be stored in a storage device of a server on a network, and the digital still camera 1 may be given a network function so that it can download them from the server.

First, steps S101 to S106 form the procedure for searching for and detecting a subject, and are executed mainly by the subject detection processing block 61.
In step S101, captured image data based on the imaging signal from the image sensor 22 is acquired. In step S102, subject detection processing is executed using the captured image data acquired in step S101. This subject detection processing first detects, for example by the face detection technique described earlier, whether an individual subject is present in the image content of the captured image data. When individual subjects are present, at least the number of individual subjects and the position (center of gravity) and size of each individual subject are obtained as detection information.

In step S103, it is determined whether the subject detection processing of step S102 detected the presence of an individual subject. If the determination is negative because no individual subject was detected (the number of detected individual subjects is 0), the process proceeds to step S104, where zoom lens movement control that widens the angle of view (zoom-out control) is executed. Widening the angle of view images a wider range, which makes it correspondingly easier to capture an individual subject. Together with this, in step S105, control for moving the pan/tilt mechanism of the pan head 10 (pan/tilt control) is executed to search for a subject. At this time, the subject detection processing block 61 passes a control signal for pan/tilt control to the communication control processing block 63 so that it is transmitted to the communication unit 52 of the pan head 10.
The pattern in which the pan/tilt mechanism of the pan head 10 is moved during this search may be decided, for example, with a view to making the search efficient.
In step S106, the mode flag f is set to 0 (f = 0), and the process returns to step S101.
In this way, steps S101 to S106 are repeated until at least one individual subject is detected in the image content of the captured image data. During this time, the system composed of the digital still camera 1 and the pan head 10 is in a state in which the digital still camera 1 is being moved in the pan and tilt directions to search for a subject.
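The search loop of steps S101 to S106 can be sketched as below. This is an illustration under stated assumptions rather than the firmware of the embodiment: the camera and pan head operations are passed in as plain callables with hypothetical names, and the `max_iterations` bound is an addition for the sketch, since the procedure as described simply repeats until a subject is found.

```python
def search_for_subjects(capture, detect, zoom_out, pan_tilt_step,
                        max_iterations=100):
    """Repeat S101-S106 until detect() reports at least one subject.

    Returns (mode_flag, subjects). mode_flag is always 0 here because
    S106 resets it on every unsuccessful pass.
    """
    mode_flag = 0
    for _ in range(max_iterations):
        frame = capture()          # S101: acquire captured image data
        subjects = detect(frame)   # S102: subject (face) detection
        if subjects:               # S103: any individual subject found?
            return mode_flag, subjects
        zoom_out()                 # S104: widen the angle of view
        pan_tilt_step()            # S105: move the pan/tilt mechanism
        mode_flag = 0              # S106: f = 0, then back to S101
    return mode_flag, []
```

Passing the hardware operations as arguments keeps the loop testable without a camera; in a real system they would wrap the zoom control and the control signals sent to the pan head.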

When the determination in step S103 is affirmative because the presence of an individual subject has been detected, the process proceeds to step S107 and the subsequent steps. The procedure from step S107 onward is executed mainly by the composition control processing block 62.

In step S107, the value currently set in the mode flag f is determined.
If f == 0, the first, rough subject capture mode should be executed as composition control, and the procedure starting from step S108 is executed as shown in the figure.
In step S108, it is determined whether the total subject center of gravity Gt is located at the origin coordinates P(0, 0) (see FIG. 7) of the screen of the captured image data (the screen obtained when the image content of the captured image data is displayed). If the determination is negative because Gt is not yet at the origin coordinates, step S109 executes control for moving the pan/tilt mechanism of the pan head 10 so that Gt comes to the origin coordinates, and the process returns to step S101. In this way, the capture mode, which is the first composition control procedure once an individual subject has been detected, controls the pan/tilt mechanism of the pan head 10 so that the total subject center of gravity Gt is first brought to the origin coordinates, the initial reference position, thereby positioning the image region containing the detected individual subjects at the center of the screen.

An example of an algorithm for actually performing the pan/tilt control of step S109 is given here.
When individual subjects are detected while the mode flag f == 0, the subject detection processing block 61 performs the calculation expressed by (Equation 1) to obtain the required movement amount Span in the pan direction and the required movement amount Stilt in the tilt direction. In (Equation 1), n denotes the number of detected individual subjects, and p(Xi, Yi) denotes the X, Y coordinates of the center of gravity of the i-th individual subject among the individual subjects numbered 0 to n-1. As shown in FIG. 7, the origin coordinates (0, 0) in this case are the intersection of the horizontal midpoint and the vertical midpoint of the screen.
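(Equation 1) itself is not reproduced in this text. One reading that is consistent with the symbols defined above, offered here only as an assumption, takes the required movement amounts as the mean of the n individual centers of gravity, that is (Span, Stilt) = (1/n) Σ p(Xi, Yi) for i = 0 to n-1. The sketch below computes that mean; the function name is hypothetical, and the sign convention and the conversion from screen coordinates to pan and tilt angles are left open because the text does not give them.

```python
def required_movement(centroids):
    """Assumed form of (Equation 1): the mean of the individual centroids.

    centroids is a list of (Xi, Yi) tuples measured from the screen center.
    Returns (s_pan, s_tilt) in the same screen units.
    """
    n = len(centroids)
    if n == 0:
        raise ValueError("at least one individual subject is required")
    s_pan = sum(x for x, _ in centroids) / float(n)
    s_tilt = sum(y for _, y in centroids) / float(n)
    return (s_pan, s_tilt)
```

Under this reading, both amounts are zero exactly when the mean centroid sits on the origin P(0, 0), which matches the way step S108 uses them.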

For example, in step S108, a determination equivalent to whether or not the total subject center of gravity Gt is at the origin coordinates P can be made by determining whether or not the absolute values of the required movement amounts Span and Stilt obtained as described above are within a predetermined value (strictly speaking 0, although a value larger than 0 may be used). Then, in step S109, pan/tilt control is executed so that the absolute values of the required movement amounts Span and Stilt fall within the predetermined range. The speeds of the pan mechanism unit 53 and the tilt mechanism unit 56 during this pan/tilt control may be constant, but they may also be varied, for example by increasing the speed as the required movement amounts Span and Stilt become larger. In this way, even when the required amount of panning or tilting is large, the total subject center of gravity Gt can be brought close to the origin coordinates in a relatively short time.
If a positive determination result is obtained in step S108 because the total subject center of gravity Gt is located at the origin coordinates, the mode flag f is set to 1 (f = 1) in step S110, and the process returns to step S101. The state in which the mode flag f has been set to 1 in step S110 indicates that the capture mode, which is the first procedure in composition control, has been completed, and that the next stage, the first composition adjustment control (composition adjustment mode), is to be executed.
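The step S108 test and the variable-speed idea described above can be sketched as follows. This is an illustration under stated assumptions: the names `at_origin` and `drive_speed`, the linear speed law, and the default constants are hypothetical, since the text only says that the speed may be raised as the required movement grows.

```python
def at_origin(s_pan: float, s_tilt: float, tolerance: float = 0.0) -> bool:
    """Step S108 equivalent: both |Span| and |Stilt| are within the tolerance.

    A tolerance of 0 is the strict case; the text allows a larger value.
    """
    return abs(s_pan) <= tolerance and abs(s_tilt) <= tolerance


def drive_speed(movement: float, base: float = 1.0, gain: float = 0.05,
                max_speed: float = 10.0) -> float:
    """Hypothetical speed law: grow linearly with |movement|, capped at max_speed."""
    return min(max_speed, base + gain * abs(movement))
```

A capped linear law is only one option; any monotonically increasing mapping would satisfy the description.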

When the mode flag f == 1 and the first composition adjustment mode is to be executed, the process proceeds from step S107 to step S111. As will be understood from the following description, the first composition adjustment mode is a zoom (angle of view) adjustment for obtaining the optimum composition according to the number of detected individual subjects. Note that, depending on the angle-of-view adjustment, the size of an individual subject on the screen or the distance between multiple individual subjects changes.

In step S111, the number of individual subjects currently detected is determined. If the number is 1, the procedure starting from step S112 is executed.
In step S112, it is determined whether or not the size of the detected individual subject is OK. The size of the individual subject is OK when, as shown in FIG. 8(b), the proportion of the screen occupied by the image portion of the individual subject (the occupancy rate) falls within a predetermined range. If a negative determination result is obtained in step S112, the process proceeds to step S113, where drive control of the zoom lens (zoom control) is executed so that the occupancy rate falls within the predetermined range, and the process returns to step S101. At this time, zoom control is performed so that the horizontal (left-right) position of the center of gravity G of the individual subject (the total subject center of gravity Gt) stays at the position corresponding to the X coordinate (X = 0) set in step S109. This keeps the individual subject positioned approximately at the center in the left-right direction. Because zoom-out control is performed in step S104 during the subject search and detection operations, the zoom control in step S113 is expected to be zoom-in control in most cases. However, if a negative determination result is obtained in step S112 because, for some reason, the occupancy rate exceeds the predetermined range, zoom-out is executed in step S113 so that the occupancy rate falls within the predetermined range.
If a positive determination result is obtained in step S112, the process proceeds to step S114, where the mode flag f is set to 2, and the process returns to step S101. As will be understood from the following description, the mode flag f == 2 indicates that the first composition adjustment has been completed and that the release operation is to be executed after the next stage, the second composition adjustment, has been performed.
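The single-subject decision of steps S112 and S113 reduces to comparing the occupancy rate with a range. A minimal sketch, assuming the occupancy rate is expressed as a fraction of the screen area; the function name `zoom_action` and the range bounds passed in are hypothetical, because the source does not state numerical limits.

```python
def zoom_action(occupancy: float, low: float, high: float) -> str:
    """Decide the step S113 zoom direction from the subject's occupancy rate.

    Returns "zoom_in" when the subject is too small, "zoom_out" when it is
    too large, and "ok" when step S112 would give a positive result.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if occupancy < low:
        return "zoom_in"
    if occupancy > high:
        return "zoom_out"
    return "ok"
```

For example, with an illustrative range of 0.2 to 0.3, an occupancy of 0.05 after the zoom-out of the search phase calls for zooming in, which matches the expectation stated in the text.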

If it is determined in step S111 that the number of individual subjects is 2, the procedure starting from step S115 is executed.
In step S115, it is determined whether or not the inter-subject distance K between the two individual subjects on the screen of the captured image data is 1/3 of the horizontal image size Cx (K == Cx/3), as shown in FIG. 9(b). If a negative determination result is obtained here, the process proceeds to step S116, and zoom control is executed so that K == Cx/3 is attained. At this time as well, zoom control is performed so that the horizontal position of the total subject center of gravity Gt stays at the X coordinate (X = 0) set in step S109. The same applies to step S119 described later. If a positive determination result is obtained in step S115 because K == Cx/3, the process proceeds to step S117, where the mode flag f is set to 2, and the process returns to step S101.

If it is determined in step S111 that the number of individual subjects is 3 or more, the procedure starting from step S118 is executed.
In step S118, it is determined whether or not the inter-subject distance K on the screen of the captured image data (in this case, the distance from the center of gravity of the leftmost individual subject on the screen to the center of gravity of the rightmost individual subject on the screen) is 1/2 of the horizontal image size Cx (K == Cx/2), as shown in FIG. 10(b). If a negative determination result is obtained here, the process proceeds to step S119, and zoom control is executed so that K == Cx/2. If a positive determination result is obtained because K == Cx/2, the process proceeds to step S120, where the mode flag f is set to 2, and the process returns to step S101.
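The distance targets used for two subjects and for three or more subjects can be sketched together. The helper names below are hypothetical, and the `tolerance` parameter is an added assumption: the text states the conditions as equalities, but a measured pixel distance will rarely match a target exactly.

```python
from typing import Sequence


def subject_span(xs: Sequence[float]) -> float:
    """Inter-subject distance K: leftmost to rightmost centroid X coordinate."""
    if len(xs) < 2:
        raise ValueError("K is defined for two or more subjects")
    return max(xs) - min(xs)


def target_distance(num_subjects: int, cx: float) -> float:
    """Target K: Cx/3 for two subjects, Cx/2 for three or more."""
    if num_subjects == 2:
        return cx / 3
    if num_subjects >= 3:
        return cx / 2
    raise ValueError("distance-based zoom applies to two or more subjects")


def distance_ok(k: float, num_subjects: int, cx: float, tolerance: float = 0.0) -> bool:
    """Steps S115 / S118 equivalent, with an assumed tolerance band."""
    return abs(k - target_distance(num_subjects, cx)) <= tolerance
```

When `distance_ok` is false, zooming in enlarges K and zooming out reduces it, so the sign of `k - target_distance(...)` indicates the zoom direction.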

With the mode flag f set to 2 in this way, the procedure up to the composition control described with reference to FIGS. 8 to 10 for the cases of one, two, or three or more individual subjects has been completed. Accordingly, when it is determined in step S107 that the mode flag f is 2, the second composition adjustment mode is executed by the procedure from step S121 onward.

For example, in the description of composition control with reference to FIGS. 8 to 10, no mention was made, for simplicity, of how the position of the center of gravity of the individual subjects in the vertical direction of the screen is set. In practice, however, a better composition may be obtained by moving (offsetting) the center of gravity upward from the center of the screen by, for example, a certain required amount. Therefore, in the actual composition control of the present embodiment, an offset amount in the vertical direction can be set for the total subject center of gravity Gt so that a better optimum composition is obtained. The procedure for this is the second composition adjustment mode, which is executed as step S121 and step S122 described next.
In step S121, it is determined whether or not the position of the total subject center of gravity Gt (when there is one individual subject, the center of gravity G of that individual subject) is offset by a predetermined offset amount from the horizontal straight line (X axis) passing through the origin coordinates P on the screen (that is, whether or not the center-of-gravity offset is OK).
If a negative determination result is obtained in step S121, then in step S122, tilt control is executed to move the tilt mechanism of the camera platform 10 so that the center of gravity is offset by the set offset amount, and the process returns to step S101. When a positive determination result is obtained in step S121, the optimum composition corresponding to the number of individual subjects is regarded as having been obtained.
Several methods are conceivable for setting the value of the center-of-gravity offset used in steps S121 and S122, so no particular limitation is imposed here. One of the simplest settings is to give an offset whose length corresponds to 1/6 of the vertical image size Cy from the center position in the vertical direction, based on, for example, the rule of thirds. Of course, a configuration is also conceivable in which, for example, different offset values are set according to the number of individual subjects in accordance with a predetermined rule.
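The Cy/6 figure follows from the rule of thirds: the upper third line lies Cy/3 below the top edge, which is Cy/2 - Cy/3 = Cy/6 above the vertical center. A small sketch of the step S121 check under that setting; the function names, the assumption that Y increases upward from the origin, and the `tolerance` parameter are illustrative and not taken from the source.

```python
def vertical_offset(cy: float) -> float:
    """Rule-of-thirds offset: distance from the vertical center to the upper third line."""
    return cy / 2 - cy / 3  # equals Cy/6


def offset_ok(gt_y: float, cy: float, tolerance: float = 1e-9) -> bool:
    """Step S121 equivalent: Gt sits at the target offset above the X axis.

    Assumption: gt_y is measured from the origin with upward as positive.
    """
    return abs(gt_y - vertical_offset(cy)) <= tolerance
```

A per-subject-count offset, which the text also mentions as a possibility, could be obtained by replacing `vertical_offset` with a lookup keyed on the number of subjects.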

When a positive determination result is obtained in step S121, the processing procedure corresponding to the release operation, starting from step S123, is executed. The release operation here refers to an operation for storing the captured image data obtained at that time in a storage medium (the memory card 40) as still image data. In other words, it corresponds to the operation that, when the shutter is operated manually, records the captured image data obtained at that time on the storage medium as still image data in response to the shutter operation.

In step S123, it is determined whether or not the conditions under which the release operation can be executed are currently satisfied. Examples of such conditions include being in focus (when autofocus control is enabled) and the pan/tilt mechanism of the camera platform 10 being stopped.
If a negative determination result is obtained in step S123, the process returns to step S101. This makes it possible to wait until the conditions for executing the release operation are satisfied. When a positive determination result is obtained in step S123, the release operation is executed in step S124. In this way, in the present embodiment, captured image data having the optimum composition can be recorded.
When the release operation is finished, the required parameters are initialized in step S125. By this process, the mode flag f is set to its initial value of 0. The position of the zoom lens is also returned to a preset initial position.
After the processing of step S125 is executed, the process returns to step S101. By returning from step S125 to step S101 in this way, the operation of searching for subjects, obtaining the optimum composition corresponding to the number of individual subjects detected by the search, and performing image recording (the release operation) is repeated automatically.
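Taken together, the mode flag f drives a small state machine: capture (f == 0), first composition adjustment (f == 1), and second composition adjustment followed by release (f == 2). The sketch below models one pass through the loop, assuming at least one individual subject is detected. The function name, the boolean inputs, and the action labels are hypothetical stand-ins for the determinations made in steps S108, S112/S115/S118, S121, and S123.

```python
from typing import Tuple


def next_step(f: int, centered: bool, size_ok: bool,
              offset_ok: bool, release_ready: bool) -> Tuple[int, str]:
    """Return (new mode flag, action) for one pass of the control loop."""
    if f == 0:  # capture mode: bring Gt to the origin
        return (1, "none") if centered else (0, "pan_tilt")
    if f == 1:  # first composition adjustment: zoom
        return (2, "none") if size_ok else (1, "zoom")
    if f == 2:  # second composition adjustment, then release
        if not offset_ok:
            return 2, "tilt"
        if not release_ready:
            return 2, "wait"
        return 0, "release"  # parameters are reset afterwards, so f returns to 0
    raise ValueError("mode flag must be 0, 1, or 2")
```

Each call corresponds to one return to step S101, which is why a pass that only updates the flag reports the action "none".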

Note that the release operation in the case of FIG. 11 is an operation of recording a still image from the captured image on a recording medium. In a broader sense, however, the release operation in the present embodiment refers to acquiring the required still image data from the captured image, including recording the still image on a recording medium as described above. Therefore, for example, an operation in which the digital still camera 1 of the present embodiment acquires still image data from the captured image in order to transmit it to another recording device or the like via a data interface is also a release operation.

In FIG. 11, the configuration in which the zoom control of steps S112 and S113, the zoom control of steps S115 and S116, or the zoom control of steps S118 and S119 is executed according to the determination result of step S111 can be regarded as changing the composition determination method according to the number of individual subjects currently detected.
Changing the composition determination method here means, for example, changing the algorithm for composition determination and composition control, or changing the parameters for composition determination and composition control. When the number of detected individual subjects is determined to be 1 in step S111, zoom control is performed in steps S112 and S113 based on the occupancy rate of the image portion of the individual subject on the screen. In contrast, when the number of detected individual subjects is determined to be 2 or more in step S111, zoom control is performed based on the inter-subject distance K rather than the occupancy rate. This can be described as changing the algorithm for the composition determination and composition control that adjust the size of the individual subjects according to the number of detected individual subjects. Furthermore, when the number of detected individual subjects is 2 or more, different values, Cx/3 and Cx/2, are set for the inter-subject distance K that is determined to give the optimum composition, depending on whether the number of individual subjects is 2 or is 3 or more. This can be described as changing the parameters for the composition determination and composition control that adjust the size of the individual subjects according to the number of detected individual subjects.

Next, the second composition control in the present embodiment will be described. In the second composition control, the screen setting (composition) of the captured image data is switched between portrait (vertically long) and landscape (horizontally long) according to the number of detected individual subjects, as described below.
In the second composition control, subject detection is first performed with a landscape composition set as the initial state.
Then, suppose for example that one individual subject SBJ0 is detected on the screen of the captured image data as shown in FIG. 12(a). When the number of detected individual subjects is one in this way, the second composition control sets the composition to portrait, as shown by the transition from FIG. 12(a) to FIG. 12(b).
In addition, size adjustment (zoom) control is performed for the individual subject SBJ0 so that its occupancy rate on the screen falls within a predetermined range. In this case, the individual subject SBJ0 is positioned approximately at the center in the horizontal direction and, in the vertical direction, is offset upward from the center according to a predetermined rule.

Regarding the composition when there is one subject, particularly when the subject is a person, the view can be taken that a portrait composition gives a better overall composition than a landscape composition. Based on this view, when there is one individual subject, the second composition control sets a portrait composition and then adjusts the size and position of the individual subject.

In the present embodiment, to change the composition from landscape to portrait as described above, it is conceivable, for example, to cut out an image area of portrait size from the captured image data obtained as a landscape composition and to use the portrait-size image data portion obtained in this way.
Alternatively, a configuration is also conceivable in which, for example, the camera platform 10 is provided with a mechanism that can move the digital still camera 1 between a horizontally placed state and a vertically placed state, and the composition is changed by driving and controlling this mechanism.
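The first approach, cutting a portrait-size area out of a landscape frame, amounts to choosing a crop rectangle. The source does not specify the size or placement of that area, so the sketch below makes its own assumptions: the crop keeps the full frame height, takes the frame's aspect ratio rotated by 90 degrees, is centered horizontally on a given X position, and is clamped to the frame. Pixel coordinates with the origin at the top-left corner are also assumed, and the function name is hypothetical.

```python
from typing import Tuple


def portrait_crop(width: int, height: int, center_x: int) -> Tuple[int, int, int, int]:
    """Return (left, top, crop_width, crop_height) of a portrait crop.

    Assumptions: the source frame is landscape (width > height) and the crop
    has the rotated aspect ratio, i.e. crop_width / height == height / width.
    """
    if width <= height:
        raise ValueError("source frame must be landscape")
    crop_w = height * height // width
    left = center_x - crop_w // 2
    left = max(0, min(left, width - crop_w))  # keep the crop inside the frame
    return left, 0, crop_w, height
```

Centering the crop on the X position of the subject's center of gravity would keep the subject near the horizontal center of the portrait frame, in line with the positioning described for FIG. 12(b).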

Next, suppose that two individual subjects SBJ0 and SBJ1 are detected on the screen of the captured image data as shown in FIG. 13(a). In the second composition control, when two individual subjects are detected in this way, it is determined whether or not the inter-subject distance K, at the angle of view in effect at the time of detection, is equal to or less than a predetermined threshold.
When the inter-subject distance K is equal to or less than the threshold, the two individual subjects are considerably close to each other. In such a state, the view can be taken that a portrait composition is preferable to a landscape composition. In this case, therefore, the composition is changed to portrait, as shown by the transition from FIG. 13(a) to FIG. 13(b). The composition can be changed by, for example, the methods described above. Zoom control and pan/tilt control are then performed so that the sizes and positions of the individual subjects SBJ0 and SBJ1 become appropriate. In this case as well, the image portion made up of the individual subjects SBJ0 and SBJ1 is positioned approximately at the center of the screen in the horizontal direction and, in the vertical direction, above the center according to a predetermined rule.
On the other hand, when the inter-subject distance K between the two detected individual subjects SBJ0 and SBJ1 exceeds the threshold, the two individual subjects are correspondingly far apart, and a landscape composition is preferable. In this case, therefore, composition control similar to that described above with reference to FIG. 9 is performed.

Next, suppose that three or more individual subjects SBJ0 to SBJn (n is a natural number of 3 or more) are detected on the screen of the captured image data. Here, the view is taken that when three or more individual subjects are detected, a landscape composition is preferable as an overall composition. In this case, therefore, composition control similar to that described with reference to FIG. 10, for example, is also performed as the second composition control.
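The orientation rule of the second composition control, as described for one, two, and three or more subjects, can be summarized in a few lines. The function name and the string labels are hypothetical, and the threshold value is left to the caller because the source gives no number for it.

```python
from typing import Optional


def choose_orientation(num_subjects: int, k: Optional[float] = None,
                       k_threshold: Optional[float] = None) -> str:
    """Pick "portrait" or "landscape" from the subject count and distance K."""
    if num_subjects < 1:
        raise ValueError("at least one individual subject is required")
    if num_subjects == 1:
        return "portrait"
    if num_subjects == 2:
        if k is None or k_threshold is None:
            raise ValueError("K and its threshold are needed for two subjects")
        # Close together: portrait. Farther apart: landscape.
        return "portrait" if k <= k_threshold else "landscape"
    return "landscape"  # three or more subjects
```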

FIG. 14 shows an example of the procedure executed by the subject detection processing block 61, the composition control processing block 62, and the communication control processing block 63 shown in FIG. 5 for the second composition control.
In this figure, the procedure of steps S201 to S210 is the same as the procedure of steps S101 to S110 in FIG. 11. In step S204, however, in addition to executing zoom-out control as in step S104, control is also executed to reset the composition to landscape, the initial state, if the composition set so far was portrait.

With the mode flag f == 1, step S211 determines, in the same manner as step S111, whether the number of detected individual subjects is 1, 2, or 3 or more.
First, when it is determined in step S211 that the number of individual subjects is 1, the procedure starting from step S212 is executed.
In step S212, if the composition set so far is landscape, control for changing it to portrait is executed. For this processing, signal processing may be executed, for example as described earlier, to cut out an image area of portrait size from the captured image data obtained as a landscape composition. Such processing is to be realized by the function of the composition control processing block 62 in the signal processing unit 24. After the procedure of step S212 is executed, the process proceeds to step S213.
Steps S213 to S215 are the same as steps S112 to S114 in FIG. 11, respectively.
By the procedure of steps S212 to S215 above, the composition control described with reference to FIG. 12 (excluding the upward offset of the individual subject) is performed.

When it is determined in step S211 that the number of individual subjects is 2, the procedure starting from step S216 is executed.
In step S216, it is determined whether or not the inter-subject distance K between the two detected individual subjects is equal to or less than a threshold. If a positive determination result is obtained, the process proceeds to step S217, where, if the composition set so far is landscape, control for changing it to portrait is executed. The procedure of steps S213 to S215 is then executed.
However, when step S213 is reached from step S217, the procedure of steps S213 to S215 is executed with a predetermined occupancy-rate range that is set for the case of two individual subjects and differs from the range used when there is one individual subject. When the occupancy rate of the two individual subjects falls within this predetermined range, the individual subjects are regarded as having an appropriate size, and the mode flag f is set to 2 in step S215.
On the other hand, if a negative determination result is obtained in step S216, the procedure starting from step S218 is executed.
In step S218, if the composition set so far is portrait, control for changing it to landscape is executed. The subsequent procedure of steps S219 to S221 is the same as steps S115 to S117 in FIG. 11.
By the procedure of steps S216 to S221 above, together with the procedure of steps S213 to S215 that follows step S217, the second composition control for the case where the number of individual subjects is 2 is performed. That is, composition control that makes the screen portrait when the inter-subject distance K is short and composition control that makes the screen landscape when the inter-subject distance K is long are used selectively.

なお、構図を縦長と横長のどちらに設定するのかに関して、ステップＳ２１６においては、画面水平方向に対応する個別被写体間距離Ｋのみを判定の要素としている。しかし、実際においては、例えば、画面水平（左右）方向に対応する個別被写体間距離Ｋに加えて、画面垂直(上下)方向に対応する個別被写体間距離（Ｋv）も判定要素に加えることが考えられる。個別被写体間距離（Ｋv）は、画面において最も上に位置する個別被写体の重心と最も下に位置する個別被写体の重心との間の距離として定義できる。
例えば、実際の画面においては、上下方向において２つの個別被写体間の距離が相当に離れている場合がある。このようなときには、たとえ左右方向における２つの個別被写体間の距離が在る程度離れているとしても、縦長の構図としたほうが良好な全体構図となる場合があると考えられる。
ステップＳ２１６として、画面水平方向に対応する個別被写体間距離Ｋとともに、画面垂直方向に対応する個別被写体間距離Ｋvを判定要素に利用する場合のアルゴリズムの例を挙げておく。
例えば、画面水平方向に対応する個別被写体間距離Ｋと画面垂直方向に対応する個別被写体間距離Ｋvについての比としてＫ／Ｋｖを求める。そして、このＫ／Ｋｖについて、所定の閾値以上であるか否かについて判別する。Ｋ／Ｋｖが閾値以上である場合には、上下方向における個別被写体間の距離と比べれば、水平方向における２つの個別被写体間の距離のほうが相応に広いということになる。この場合には、ステップＳ２１８により横長の構図を設定する。これに対して、Ｋ／Ｋｖが閾値未満である場合には、上下方向における個別被写体間の距離が相応に離れていることになる。そこで、この場合には、ステップＳ２１７により縦長の構図を設定する。
あるいは、先の説明と同様に、画面水平方向に対応する個別被写体間距離Ｋと所定の閾値を比較し、個別被写体間距離Ｋが閾値以下であれば、このときには、ステップＳ２１７に進んで縦長の構図を設定する。これに対して、個別被写体間距離Ｋが閾値を越えた場合には、次に画面垂直方向に対応する個別被写体間距離Ｋvについて、所定の閾値と比較する。なお、個別被写体間距離Ｋvと比較する閾値は、個別被写体間距離Ｋに対応する閾値と同じ値である必要はなく、個別被写体間距離Ｋvに対応して適切に設定された値を利用すればよい。そして、個別被写体間距離Ｋvが閾値以内であれば、ステップＳ２１８により横長の構図を設定するが、閾値を越えるのであれば、ステップＳ２１７により縦長の構図を設定する。 Note that regarding whether to set the composition to portrait or landscape, in step S216, only the individual subject distance K corresponding to the horizontal direction of the screen is used as a determination element. However, in practice, for example, in addition to the distance K between individual subjects corresponding to the horizontal (left and right) direction of the screen, the distance between individual subjects (Kv) corresponding to the vertical (vertical) direction of the screen may be added to the determination element. It is done. The distance between individual subjects (Kv) can be defined as the distance between the center of gravity of the individual subject located at the top of the screen and the center of gravity of the individual subject located at the bottom.
For example, on an actual screen, the distance between two individual subjects may be considerably separated in the vertical direction. In such a case, even if the distance between the two individual subjects in the left-right direction is as far as possible, it may be considered that the overall composition may be better when the composition is vertically long.
As step S216, an example of an algorithm will be described in which both the individual subject distance K corresponding to the horizontal direction of the screen and the individual subject distance Kv corresponding to the vertical direction of the screen are used as determination elements.
For example, the ratio K / Kv is obtained between the individual subject distance K corresponding to the horizontal direction of the screen and the individual subject distance Kv corresponding to the vertical direction of the screen. It is then determined whether K / Kv is equal to or greater than a predetermined threshold. If K / Kv is equal to or greater than the threshold, the distance between the two individual subjects in the horizontal direction is correspondingly wide compared with the distance between the individual subjects in the vertical direction. In this case, a landscape (horizontally long) composition is set in step S218. On the other hand, if K / Kv is less than the threshold, the individual subjects are correspondingly far apart in the vertical direction. In this case, a portrait (vertically long) composition is set in step S217.
Alternatively, as in the previous description, the individual subject distance K corresponding to the horizontal direction of the screen is compared with a predetermined threshold. If the individual subject distance K is equal to or less than the threshold, the process proceeds to step S217 and a portrait composition is set. On the other hand, if the individual subject distance K exceeds the threshold, the individual subject distance Kv corresponding to the vertical direction of the screen is then compared with a predetermined threshold. The threshold compared with Kv need not be the same value as the threshold for K; a value set appropriately for Kv may be used. If Kv is within its threshold, a landscape composition is set in step S218; if Kv exceeds the threshold, a portrait composition is set in step S217.
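The two variants of step S216 described above can be sketched as follows. This is only an illustrative sketch: the function names, the `Orientation` type, and the handling of Kv = 0 are assumptions made for the example and are not taken from the embodiment, and the threshold values are left to the caller.

```python
from enum import Enum


class Orientation(Enum):
    PORTRAIT = "portrait"    # corresponds to step S217
    LANDSCAPE = "landscape"  # corresponds to step S218


def orientation_by_ratio(k: float, kv: float, ratio_threshold: float) -> Orientation:
    """Variant 1: compare the ratio K / Kv with a single threshold."""
    if kv == 0:
        # Assumption for this sketch: no vertical separation at all,
        # so the horizontal spread dominates.
        return Orientation.LANDSCAPE
    if k / kv >= ratio_threshold:
        return Orientation.LANDSCAPE
    return Orientation.PORTRAIT


def orientation_by_two_thresholds(
    k: float, kv: float, k_threshold: float, kv_threshold: float
) -> Orientation:
    """Variant 2: test K first, then Kv against its own threshold."""
    if k <= k_threshold:
        return Orientation.PORTRAIT
    if kv <= kv_threshold:
        return Orientation.LANDSCAPE
    return Orientation.PORTRAIT
```

In the second variant, a wide horizontal spread alone is not enough for a landscape composition; the vertical spread must also stay within its own threshold.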

ステップＳ２１１にて個別被写体数が３であると判別された場合には、ステップＳ２２２から始まる手順を実行する。ステップＳ２２２では、これまでにおいて設定されていた構図が縦長であった場合には、これを横長に変更するための制御を実行する。これに続くステップＳ２２３〜Ｓ２２５の手順は、図１１のステップＳ１１８〜Ｓ１２０と同様となる。 If it is determined in step S211 that the number of individual subjects is 3, the procedure starting from step S222 is executed. In step S222, if the composition set so far is vertically long, control is performed to change it to horizontally long. The subsequent steps S223 to S225 are the same as steps S118 to S120 in FIG. 11.

これまでの手順を経た結果として、モードフラグｆについて２が設定された状態では、ステップＳ２２６から始まる手順を実行することになる。
ステップＳ２２６、Ｓ２２７による手順は、図１１のＳ１２１、Ｓ１２２と同様となる。この手順が実行されることで、例えば先に図１２、図１３により述べたようにして、個別被写体が画面内において、中央よりも上側に位置するような構図を得ることができる。 As a result of the procedure so far, in a state where 2 is set for the mode flag f, the procedure starting from step S226 is executed.
The procedures in steps S226 and S227 are the same as steps S121 and S122 in FIG. 11. By executing this procedure, a composition in which the individual subject is positioned above the center of the screen can be obtained, for example as described above with reference to FIGS. 12 and 13.

ステップＳ２２８〜Ｓ２３０は、図１１のステップＳ１２３〜Ｓ１２５と同様にして、レリーズ動作に関連した手順となる。この手順が実行されることで、構図制御によって最適構図が得られている撮像画像データを記憶媒体に記録できることになる。 Steps S228 to S230 are procedures related to the release operation, in the same manner as steps S123 to S125 of FIG. 11. By executing this procedure, captured image data for which the optimum composition has been obtained by composition control can be recorded in a storage medium.

なお、先の図１１及び上記図１４に示される各構図制御の手順は、その全体の流れからしてみると、検出される個別被写体の数に応じて最適とみなされる構図を判定、決定し、この判定した構図の撮像画像データが実際に得られる（反映される）ようにして、ズーム制御、及びパン・チルト制御を適宜実行しているものであるとみることができる。 Viewed as an overall flow, the composition control procedures shown in FIG. 11 and FIG. 14 can be regarded as determining the composition considered optimal according to the number of detected individual subjects, and then appropriately executing zoom control and pan / tilt control so that captured image data with the determined composition is actually obtained (reflected).

また、図１１及び図１４に示される各構図制御の手順にあっては、基本的には、検出される個別被写体数が１つの場合、２つの場合、３以上の場合の３つの条件分岐に対応させて構図を判定するようにしている。しかし、これはあくまでも一例であって、例えば個別被写体数が３以上の場合において、さらに具体的な個別被写体数ごとに区分して構図が判定されるように構成してもよい。
例えば、縦長と横長の何れの構図を設定するのか、という構図判定のアルゴリズムに関して、図１４の場合には、検出される個別被写体数が２のときには、個別被写体間距離Ｋに応じて縦長と横長の何れかが選択されるが、検出される個別被写体数が３以上の場合には、一律に横長を設定することとしている。しかし、例えば、検出される個別被写体数が３以上の場合であっても、検出される個別被写体数ごとに適合するようにして設定した閾値と、そのときの個別被写体間距離Ｋとを比較した結果に基づいて、横長と縦長の何れの構図とするのかを決定するように構成してもよい。つまり、検出される個別被写体数が２以上であれば、個別被写体間距離Ｋに基づいた縦長構図と横長構図の判定を行うようにして構成できる。また、このときにも、先にステップＳ２１６において述べた、垂直方向に対応する個別被写体間距離Ｋvを判定要素に加えることができる。 In each composition control procedure shown in FIGS. 11 and 14, the composition is basically determined according to three conditional branches: one detected individual subject, two, or three or more. However, this is merely an example. For example, when the number of individual subjects is three or more, the composition may be determined separately for each specific number of individual subjects.
For example, regarding the composition determination algorithm for deciding between portrait and landscape, in the case of FIG. 14, when the number of detected individual subjects is 2, either portrait or landscape is selected according to the individual subject distance K, whereas when the number is 3 or more, landscape is set uniformly. However, even when the number of detected individual subjects is 3 or more, the configuration may decide between landscape and portrait based on the result of comparing the individual subject distance K at that time with a threshold set to suit each number of detected individual subjects. In other words, when the number of detected individual subjects is two or more, portrait or landscape composition can be determined based on the individual subject distance K. In this case too, the individual subject distance Kv corresponding to the vertical direction, described above for step S216, can be added as a determination element.
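A minimal sketch of the per-count threshold idea follows. The function name, the dictionary of thresholds, and the fallback `default_threshold` are hypothetical names chosen for illustration; the embodiment does not specify how the thresholds are stored or what values they take.

```python
def choose_orientation(
    subject_count: int,
    k: float,
    thresholds_by_count: dict[int, float],
    default_threshold: float,
) -> str:
    """Pick portrait or landscape from K using a threshold per subject count."""
    if subject_count < 2:
        raise ValueError("K is only defined for two or more individual subjects")
    # Use the threshold tuned for this subject count, or a fallback value.
    threshold = thresholds_by_count.get(subject_count, default_threshold)
    # As in the two-subject case: a small K gives portrait, a large K landscape.
    return "portrait" if k <= threshold else "landscape"
```

A larger threshold for larger groups would reflect that more subjects naturally occupy more horizontal space, but that choice is an assumption of this sketch.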

ところで、本実施の形態の撮像システムを利用するのにあたって、その周囲において相応に多くの人がいる環境の中で、特定の１又は複数の人物のみ対象にして構図制御を実行させたいような状況もあると考えられる。しかしながら、被写体検出処理が顔検出技術に基づくものであるとして、単純に検出された顔を全て個別被写体数として認識してしまうようなアルゴリズム構成であると、上記のようにして、特定人物のみを対象とする構図制御は適正に行われなくなってしまう。特に、本実施の形態の構図制御は、個別被写体数に応じて異なる構図を設定するので、ユーザが望まない構図となる結果が生じる可能性も相応に高くなると考えられる。 Incidentally, when using the imaging system of the present embodiment, there may be situations in which composition control should be performed on only one or more specific persons in an environment with a correspondingly large number of people nearby. However, if the subject detection process is based on face detection and the algorithm simply counts every detected face as an individual subject, composition control targeting only the specific persons cannot be performed properly. In particular, because the composition control of the present embodiment sets a different composition according to the number of individual subjects, the possibility of obtaining a composition the user does not want is correspondingly high.

そこで、本実施の形態として上記のような状況に対応させる場合には、図１１のステップＳ１０２若しくは図１４のステップＳ２０２における被写体検出処理において、次のような被写体の弁別処理が可能なように構成することができる。 Therefore, to handle such a situation in the present embodiment, the subject detection process in step S102 of FIG. 11 or step S202 of FIG. 14 can be configured to allow the following subject discrimination process.

この場合には、先ず、例えばデジタルスチルカメラ１に対する操作により、構図制御の対象とする個別被写体（対象個別被写体）の上限数を設定できるようにしておく。設定された対象個別被写体上限数の情報は、例えば被写体検出処理ブロック６１が保持しておくようにされる。ここでは、具体例として、対象個別被写体上限数として２が設定されているものとする。
そして、例えば被写体探索動作（ステップ１０５、Ｓ２０５）を実行した結果、図１５（ａ）に示す画内容の撮像画像データが得られたとする。これに対応するステップＳ１０１若しくはＳ２０２の被写体検出処理としては、顔検出により４つの個別被写体の存在を検出することになる。この段階において検出される個別被写体は、ここでは、「被写体候補」として扱われる。図では、画面内の４つの被写体候補について、左から右にかけて、被写体候補ＤＳＢＪ０、ＤＳＢＪ１、ＤＳＢＪ２、ＤＳＢＪ３と符号を付している。 In this case, first, an upper limit number of individual subjects (target individual subjects) to be subject to composition control can be set by, for example, an operation on the digital still camera 1. For example, the subject detection processing block 61 holds information on the set target individual subject upper limit number. Here, as a specific example, it is assumed that 2 is set as the target individual subject upper limit number.
For example, assume that captured image data having the image content shown in FIG. 15A is obtained as a result of executing the subject search operation (steps S105, S205). In the corresponding subject detection process of step S101 or S202, the presence of four individual subjects is detected by face detection. The individual subjects detected at this stage are treated here as "subject candidates". In the figure, the four subject candidates in the screen are labeled DSBJ0, DSBJ1, DSBJ2, and DSBJ3 from left to right.

このようにして、単純に顔検出を行った結果としては４つの被写体（候補被写体）が検出される。しかし、上記したように、この場合においては対象個別被写体上限数として２が設定されていることとしている。このことに基づいて、被写体検出処理ブロック６１は、４つの候補被写体ＤＳＢＪ０、ＤＳＢＪ１、ＤＳＢＪ２、ＤＳＢＪ３のうちで、サイズの大きなほうから順に２つの候補被写体を選び、これらの候補被写体を対象個別被写体とする。この場合には、候補被写体ＤＳＢＪ０、ＤＳＢＪ１、ＤＳＢＪ２、ＤＳＢＪ３のうちで、サイズの最も大きな２つは、候補被写体ＤＳＢＪ２、ＤＳＢＪ３であることになる。そこで、被写体検出処理ブロック６１は、図１５（ｂ）に示すようにして、候補被写体ＤＳＢＪ２、ＤＳＢＪ３を、それぞれ、対象個別被写体ＳＢＪ０、ＳＢＪ１として扱うこととして、候補被写体ＤＳＢＪ０、ＤＳＢＪ１については、対象個別被写体ではないものとして無視する。そして、図１１のステップＳ１０７以降若しくは図１４のＳ２０７以降の構図制御のための手順を実行する際には、対象個別被写体のみを制御の対象とするようにされる。このような被写体の弁別が行われることで、例えば多くの人が周囲にいるような環境、状況であっても、撮像システムに対して最も手前の位置に構図制御の対象としたい人物が居るようにすることで、これらの人物のみを対象とした適正な構図制御による撮影が行えることになる。 In this way, simple face detection yields four subjects (subject candidates). However, as described above, 2 is set as the target individual subject upper limit number in this case. Based on this, the subject detection processing block 61 selects two subject candidates in descending order of size from the four subject candidates DSBJ0, DSBJ1, DSBJ2, and DSBJ3, and treats them as target individual subjects. Here, the two largest of the subject candidates are DSBJ2 and DSBJ3. The subject detection processing block 61 therefore treats the subject candidates DSBJ2 and DSBJ3 as the target individual subjects SBJ0 and SBJ1, respectively, as shown in FIG. 15B, and ignores the subject candidates DSBJ0 and DSBJ1 as not being target individual subjects. When the composition control procedure from step S107 of FIG. 11 or from step S207 of FIG. 14 is executed, only the target individual subjects are controlled. With this subject discrimination, even in an environment or situation where many people are nearby, photographing with appropriate composition control for only the intended persons can be performed by having those persons stand closest to the imaging system.

図１６のフローチャートは、図１１のステップＳ１０２或いは図１４のステップＳ２０２の被写体検出処理の一部として実行される、上記の被写体弁別のための手順例を示している。
この処理にあっては、例えば先ず、顔検出処理により検出された全ての被写体を被写体候補として扱うこととしている。そして、ステップＳ３０１においては、この被写体候補が少なくとも１つ検出されるのを待機しており、被写体候補が検出されたのであれば、ステップＳ３０２に進む。 The flowchart in FIG. 16 shows an example of the procedure for subject discrimination described above, which is executed as part of the subject detection process in step S102 in FIG. 11 or step S202 in FIG.
In this process, for example, first, all subjects detected by the face detection process are treated as subject candidates. In step S301, the process waits until at least one subject candidate is detected. If a subject candidate is detected, the process proceeds to step S302.

ステップＳ３０２においては、現在において設定されている対象個別被写体上限数が、上記ステップＳ３０１に対応して検出された被写体候補数以上であるか否かについて判別することとしている。
ステップＳ３０２において肯定の判別結果が得られた場合には、被写体候補の数が対象個別被写体上限数を超えていないことになる。そこで、この場合には、ステップＳ３０３により、検出された全ての被写体候補を対象個別被写体として扱うものとして設定する。
これに対して、ステップＳ３０２において否定の判別結果が得られた場合には、被写体候補の数が対象個別被写体上限数よりも多いことになる。この場合には、ステップＳ３０４により、検出された被写体候補のサイズの大きい順から、対象個別被写体上限数分の被写体候補を選別する。そして、ステップＳ３０５により選別した被写体候補を、対象個別被写体として扱うものとして設定する。これにより、被写体弁別が行われたこととなる。
このような図１６の手順を経ることで、図１１のステップＳ１０２或いは図１４のステップＳ２０２の被写体検出処理の結果としては、上記ステップＳ３０３又はステップＳ２０５により設定された対象個別被写体の数、対象個別被写体ごとのサイズ、位置などの情報を、検出情報として構図制御処理ブロック６２に出力することになる。構図制御処理ブロック６２は、この検出情報を利用して、図１１のステップＳ１０７以降、或いは図１４のステップＳ２０７以降の構図制御を実行する。 In step S302, it is determined whether or not the currently set target individual subject upper limit number is equal to or greater than the number of subject candidates detected corresponding to step S301.
If a positive determination result is obtained in step S302, the number of subject candidates does not exceed the target individual subject upper limit number. Therefore, in this case, all detected subject candidates are set to be handled as target individual subjects in step S303.
On the other hand, if a negative determination result is obtained in step S302, the number of subject candidates is larger than the target individual subject upper limit number. In this case, in step S304, as many subject candidates as the target individual subject upper limit number are selected in descending order of size. Then, in step S305, the selected subject candidates are set to be handled as target individual subjects. Subject discrimination is thereby performed.
Through the procedure of FIG. 16, the subject detection process in step S102 of FIG. 11 or step S202 of FIG. 14 outputs, as detection information, information such as the number of target individual subjects set in step S303 or step S305 and the size and position of each target individual subject to the composition control processing block 62. The composition control processing block 62 uses this detection information to execute composition control from step S107 of FIG. 11 or from step S207 of FIG. 14.
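The subject discrimination of FIG. 16 (steps S302 to S305) can be sketched as below. The `SubjectCandidate` type, its fields, and the function name are assumptions made for the example; the sizes used in the usage note are invented values chosen only to reproduce the FIG. 15 situation. Returning the selected candidates in their original detection order is also an assumption, made so that DSBJ2 and DSBJ3 map naturally onto SBJ0 and SBJ1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectCandidate:
    label: str    # e.g. "DSBJ0"
    size: float   # detected face size (units are arbitrary in this sketch)
    x: float      # position in the screen
    y: float


def discriminate_subjects(
    candidates: list[SubjectCandidate], upper_limit: int
) -> list[SubjectCandidate]:
    """Return the candidates to be treated as target individual subjects."""
    # Step S302: is the upper limit at least the number of candidates?
    if upper_limit >= len(candidates):
        # Step S303: every candidate becomes a target individual subject.
        return list(candidates)
    # Step S304: rank candidate indices by size, largest first.
    ranked = sorted(
        range(len(candidates)), key=lambda i: candidates[i].size, reverse=True
    )
    # Step S305: keep the largest ones, restored to detection order.
    keep = sorted(ranked[:upper_limit])
    return [candidates[i] for i in keep]
```

With four candidates whose sizes make DSBJ2 and DSBJ3 the largest and an upper limit of 2, the function returns DSBJ2 and DSBJ3, matching the example of FIG. 15.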

図１７は、本実施の形態の撮像システムの変形例としての構成例を示している。
この図では、先ず、デジタルスチルカメラ１から通信制御処理ブロック６３を経由して、撮像に基づいて信号処理部２４にて生成される撮像画像データを、雲台１０に対して送信するようにされている。
この図においては、雲台１０の構成として通信制御処理ブロック７１、パン・チルト制御処理ブロック７２、被写体検出処理ブロック７３、及び構図制御処理ブロック７４が示されている。
通信制御処理ブロック７１は、図４の通信部５２に対応する機能部位であって、デジタルスチルカメラ１側の通信制御処理ブロック部６３（雲台対応通信部３４）との通信処理を所定のプロトコルに従って実行するようにして構成される部位である。
通信制御処理ブロック７１により受信された撮像画像データは、被写体検出処理ブロック７３に渡される。この被写体検出ブロッ７３は、例えば図５に示した被写体検出処理ブロック６１と同等の被写体検出処理が少なくとも可能なだけの信号処理部を備えて構成され、取り込んだ撮像画像データを対象として被写体検出処理を実行し、その検出情報を構図制御処理ブロック７４に出力する。
構図制御処理ブロック７４は、図４の構図制御処理ブロック６２と同等の構図制御を実行可能とされており、この構図制御処理の結果としてパン制御、チルト制御を行うときには、そのための制御信号をパン・チルト制御処理ブロック７２に対して出力する。
パン・チルト制御処理ブロック７２は、例えば図４における制御部５１が実行する制御処理のうちで、パン・チルト制御に関する処理の実行機能に対応するもので、入力される制御信号に応じてパン機構部５３、チルト機構部５６の動きをコントロールするための信号をパン用駆動部５５、チルト用駆動部５８に対して出力する。これにより、構図制御処理ブロック６２にて判定した構図が得られるようにしてパンニング、チルティングが行われる。
このようにして、図１７に示す撮像システムは、デジタルスチルカメラ１から雲台１０に撮像画像データを送信させることとして、雲台１０側により、取り込んだ撮像画像データに基づく被写体検出処理と構図制御とを実行するようにして構成しているものである。 FIG. 17 shows a configuration example as a modified example of the imaging system of the present embodiment.
In this figure, first, the captured image data generated by the signal processing unit 24 based on imaging is transmitted from the digital still camera 1 to the camera platform 10 via the communication control processing block 63.
In this figure, a communication control processing block 71, a pan / tilt control processing block 72, a subject detection processing block 73, and a composition control processing block 74 are shown as the configuration of the camera platform 10.
The communication control processing block 71 is a functional part corresponding to the communication unit 52 of FIG. 4, and is configured to execute communication processing with the communication control processing block 63 (the pan head compatible communication unit 34) on the digital still camera 1 side in accordance with a predetermined protocol.
The captured image data received by the communication control processing block 71 is passed to the subject detection processing block 73. The subject detection processing block 73 includes a signal processing unit capable of at least subject detection processing equivalent to that of the subject detection processing block 61 shown in FIG. 5, for example. It executes subject detection processing on the received captured image data and outputs the detection information to the composition control processing block 74.
The composition control processing block 74 can execute composition control equivalent to that of the composition control processing block 62 of FIG. 4. When pan control and tilt control are performed as a result of this composition control processing, it outputs a control signal for that purpose to the pan / tilt control processing block 72.
The pan / tilt control processing block 72 corresponds, for example, to the function of executing processing related to pan / tilt control among the control processing executed by the control unit 51 in FIG. 4. In accordance with the input control signal, it outputs signals for controlling the movement of the pan mechanism unit 53 and the tilt mechanism unit 56 to the pan driving unit 55 and the tilt driving unit 58. Panning and tilting are thereby performed so that the composition determined by the composition control processing block 62 is obtained.
In this manner, the imaging system shown in FIG. 17 is configured so that captured image data is transmitted from the digital still camera 1 to the camera platform 10, and the camera platform 10 side executes subject detection processing and composition control based on the received captured image data.

図１８は、本実施の形態の撮像システムについての他の変形例としての構成例を示している。なお、この図において、図１７と同一部分には同一符号を付して説明を省略する。
このシステムにおいては、雲台１０側において撮像部７５が備えられる。この撮像部７５は、例えば撮像のための光学系と撮像素子（イメージャ）を備えて、撮像光に基づいた信号（撮像信号）を得るようにされているとともに、この撮像信号から撮像画像データを生成するための信号処理部から成る。これは、例えば図１に示した光学系部２１、イメージセンサ２２、Ａ／Ｄコンバータ２３、及び信号処理部２４において撮像画像データを得るまでの信号処理段から成る部位に対応する構成となる。撮像部７５により生成される撮像画像データは被写体検出処理ブロック７３に出力される。なお、撮像部７５が撮像光を取り込む方向（撮像方向）は、例えば雲台１０に載置されるデジタルスチルカメラ１の光学系部２１（レンズ部３）の撮像方向とできるだけ一致するようにして設定される。 FIG. 18 shows a configuration example as another modified example of the imaging system of the present embodiment. In this figure, the same parts as those in FIG. 17 are denoted by the same reference numerals, and their description is omitted.
In this system, an imaging unit 75 is provided on the camera platform 10 side. The imaging unit 75 includes, for example, an optical system for imaging and an imaging element (imager) to obtain a signal based on imaging light (imaging signal), together with a signal processing unit for generating captured image data from this imaging signal. This corresponds, for example, to the part shown in FIG. 1 consisting of the optical system unit 21, the image sensor 22, the A / D converter 23, and the signal processing stages of the signal processing unit 24 up to the point where captured image data is obtained. The captured image data generated by the imaging unit 75 is output to the subject detection processing block 73. The direction in which the imaging unit 75 captures imaging light (imaging direction) is set to coincide as closely as possible with the imaging direction of the optical system unit 21 (lens unit 3) of the digital still camera 1 placed on the camera platform 10, for example.

この場合の被写体検出処理ブロック７３及び構図制御処理ブロック７４は、上記図１７と同様にして被写体検出処理、構図制御処理を実行する。但し、この場合の構図制御処理ブロック７３は、パン・チルト制御に加えて、レリーズ動作を実行させるタイミングに対応してはレリーズ指示信号を、通信制御処理ブロック７１からデジタルスチルカメラ１に対して送信させる。デジタルスチルカメラ１では、レリーズ指示信号が受信されることに応じてレリーズ動作を実行するようにされる。
このようにして他の変形例では、被写体検出処理と構図制御に関して、レリーズ動作自体に関する以外の全ての制御・処理を雲台１０側で完結して行うことができる。 In this case, the subject detection processing block 73 and the composition control processing block 74 execute subject detection processing and composition control processing in the same manner as in FIG. 17. However, in addition to pan / tilt control, the composition control processing block 74 in this case causes the communication control processing block 71 to transmit a release instruction signal to the digital still camera 1 at the timing at which the release operation is to be executed. The digital still camera 1 executes the release operation in response to receiving the release instruction signal.
In this way, in this other modified example, all control and processing for subject detection and composition control, other than the release operation itself, can be completed on the camera platform 10 side.

さらに、本実施の形態の撮像システムによる被写体検出であるとか構図制御については、下記のような変形を考えることもできる。
例えば、これまでにおいては、特に水平方向（左右方向）に関しての構図制御に関しては述べていないが、例えば三分割法などによれば、被写体を中央に配置するのではなく、左右何れかの方向に偏らせることによっても良い構図が得られるものとされている。そこで、実際においては、個別被写体数に応じた構図制御として、例えば被写体重心（個別被写体重心、総合被写体重心）についての左右方向における所要量の移動が行われるように構成することとしてもよい。 Furthermore, the following modifications can be considered for subject detection and composition control by the imaging system of the present embodiment.
For example, composition control in the horizontal (left-right) direction has not been specifically described so far. However, according to the rule of thirds, for example, a good composition can also be obtained by offsetting the subject to the left or right rather than placing it in the center. In practice, therefore, composition control according to the number of individual subjects may be configured to move the subject center of gravity (individual subject center of gravity or total subject center of gravity) by a required amount in the left-right direction.
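As one possible illustration of such a left-right offset, the sketch below computes how far the subject center of gravity would have to move to sit on a vertical thirds line. Using the thirds line as the exact target, and the function and parameter names, are assumptions of this example; the description above only says that a required amount of horizontal movement may be applied.

```python
def thirds_target_x(frame_width: float, side: str) -> float:
    """Horizontal position of the left or right thirds line."""
    if side == "left":
        return frame_width / 3
    if side == "right":
        return 2 * frame_width / 3
    raise ValueError("side must be 'left' or 'right'")


def required_horizontal_shift(
    centroid_x: float, frame_width: float, side: str
) -> float:
    """Signed shift (positive = rightward) that puts the centroid on the line."""
    return thirds_target_x(frame_width, side) - centroid_x
```

The resulting shift would then be converted into a pan amount by whatever pan control the system uses.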

また、図１１、図１４に示される構図制御において実行されるパン制御、チルト制御は、雲台１０のパン・チルト機構の動きを制御することにより行うこととしているが、雲台１０に代えて、例えば、デジタルスチルカメラ１のレンズ部３に対しては、反射鏡により反射された撮像光が入射されるようにしたうえで、撮像光に基づいて得られる画像についてパンニング・チルティングされた結果が得られるようにして上記反射光を動かす構成を採用することも考えられる。
また、デジタルスチルカメラ１のイメージセンサ２２から画像として有効な撮像信号を取り込むための画素領域を水平方向と垂直方向にシフトさせるという制御を行うことによっても、パンニング・チルティングが行われるのと同等の結果を得ることができる。この場合には、雲台若しくはこれに準ずる、デジタルスチルカメラ１以外のパン・チルトのための装置部を用意する必要が無く、デジタルスチルカメラ１単体により本実施の形態としての構図制御を完結させることが可能となる。
また、光学系部２１におけるレンズの光軸を水平・垂直方向に変更することのできる機構を備えて、この機構の動きを制御するように構成しても、パンニング・チルティングを行うことが可能である。 In addition, the pan control and tilt control executed in the composition control shown in FIGS. 11 and 14 are performed by controlling the movement of the pan / tilt mechanism of the camera platform 10. Instead of the camera platform 10, however, a configuration may be adopted in which, for example, imaging light reflected by a reflecting mirror is made incident on the lens unit 3 of the digital still camera 1, and the reflected light is moved so that the image obtained from the imaging light shows the result of panning and tilting.
Further, a result equivalent to panning and tilting can also be obtained by control that shifts, in the horizontal and vertical directions, the pixel region used to capture an effective imaging signal as an image from the image sensor 22 of the digital still camera 1. In this case, there is no need to prepare a camera platform or an equivalent pan / tilt device separate from the digital still camera 1, and the composition control of the present embodiment can be completed by the digital still camera 1 alone.
Panning and tilting can also be performed by providing a mechanism that can change the optical axis of the lens in the optical system unit 21 in the horizontal and vertical directions and controlling the movement of this mechanism.
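The pixel-region shifting approach can be illustrated with a small sketch that moves a readout window across the sensor while keeping it inside the sensor bounds. The function name, the parameter layout, and the clamping behavior at the sensor edge are assumptions made for illustration and are not specified in the embodiment.

```python
def shift_readout_window(
    x: int, y: int, dx: int, dy: int,
    win_w: int, win_h: int, sensor_w: int, sensor_h: int,
) -> tuple[int, int]:
    """Return the new top-left corner of the effective pixel region.

    (x, y) is the current top-left corner; (dx, dy) is the requested shift,
    playing the role of pan (horizontal) and tilt (vertical).
    """
    if win_w > sensor_w or win_h > sensor_h:
        raise ValueError("readout window must fit inside the sensor")
    # Clamp so the window never leaves the sensor area.
    new_x = min(max(x + dx, 0), sensor_w - win_w)
    new_y = min(max(y + dy, 0), sensor_h - win_h)
    return new_x, new_y
```

Because the window is clamped, the achievable pan and tilt range in this sketch is limited by the margin between the window size and the full sensor size.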

また、本願発明に基づく構図判定のための構成は、これまでに実施の形態として説明してきた撮像システム以外にも適用することができる。そこで以降、本願発明による構図判定の適用例について述べる。 Further, the configuration for composition determination based on the present invention can be applied to other than the imaging system described as the embodiment so far. Therefore, hereinafter, application examples of composition determination according to the present invention will be described.

先ず、図１９は、本願発明による構図判定を、デジタルスチルカメラなどの撮像装置単体に対して適用したもので、例えば撮像モード時において撮像装置により撮像している画像が適正な構図になったときに、このことを表示によってユーザに通知しようとするものである。
このために撮像装置が備えるべき構成として、ここでは被写体検出・構図判定処理ブロック８１、通知制御処理ブロック８２、表示部８３を示している。
被写体検出・構図判定処理ブロック８１は、撮像画像データを取り込んで、例えば図５の被写体検出処理ブロック６１と同等の被写体検出処理と、この被写体検出処理の結果としての検出情報を利用して、例えば図５と同等の構図判定のための処理とを行うようにされた部位である。
例えばユーザは、撮像装置を撮像モードに設定したうえで、撮像装置を手に持っており、いつでもレリーズ操作（シャッターボタン操作）を行えば撮像画像の記録が行える状況にあるものとする。
このような状態の下、被写体検出・構図判定処理ブロック８１では、そのときに撮像して得られる撮像画像データを取り込んで被写体検出を行う。すると構図制御処理によっては、先ず、検出された個別被写体の数等に応じて最適構図がどのようなものであるのかが特定されることになるが、この場合の構図判定処理としては、そのときに得られている撮像画像データの画内容の構図と、最適構図との一致性、類似度を求めるようにされる。そして、例えば類似度が一定以上になったときに、実際に撮影して得られている撮像画像データの画内容が最適構図になったと判定するようにされる。なお、例えば実際においては、撮像画像データの画内容の構図と最適構図とが一致したとみなされる程度の、所定以上の類似度が得られたら、最適構図と判断するようにしてアルゴリズムを構成することが考えられる。また、ここでの一致性、類似度をどのようにして求めるのかについては多様なアルゴリズムを考えることができるので、ここでは、その具体例については特に言及しない。
このようにして撮像画像データの画面内容が最適構図になったことの判定結果の情報は通知制御処理ブロック８２に対して出力される。通知制御処理ブロック８２は、上記の情報の入力に応じて、現在において撮像されている画像が最適構図であることをユーザに通知するための所定態様による表示が表示部８３にて行われるように表示制御を実行する。なお、通知制御処理ブロック８２は、撮像装置が備えるマイクロコンピュータ（ＣＰＵ）などによる表示制御機能と、表示部８３に対する画像表示を実現するための表示用画像処理機能などにより実現される。なお、ここでの最適構図であることのユーザへの通知は、電子音、若しくは合成音声などをはじめとした音により行われるように構成してもよい。
また、表示部８３は、例えば本実施の形態のデジタルスチルカメラ１の表示部３３に対応するもので、例えば撮像装置における所定位置に対してそのディスプレイパネルが表出するようにして設けられ、撮影モード時にはいわゆるスルー画といわれる、そのときに撮像されている画像が表示されることが一般的である。従って、この撮像装置の実際にあっては、表示部８３において、スルー画に対して重畳される態様で最適構図であることを通知する内容の画像が表示されることになる。ユーザは、この最適構図であることを通知する表示が現れたときにレリーズ操作を行うようにされる。これにより、写真撮影の知識や技術に長けていないようなユーザであっても、良好な構図の写真撮影を簡単に行うことが可能になる。 First, FIG. 19 shows a case where the composition determination according to the present invention is applied to a single imaging apparatus such as a digital still camera. For example, when the image being captured by the imaging apparatus in the imaging mode attains an appropriate composition, the user is notified of this by a display.
As the configuration that the imaging apparatus should have for this purpose, a subject detection / composition determination processing block 81, a notification control processing block 82, and a display unit 83 are shown here.
The subject detection / composition determination processing block 81 takes in captured image data and performs, for example, subject detection processing equivalent to that of the subject detection processing block 61 of FIG. 5 and, using the detection information resulting from this subject detection processing, composition determination processing equivalent to that of FIG. 5, for example.
For example, it is assumed that the user sets the imaging device in the imaging mode, holds the imaging device in his hand, and can record the captured image at any time by performing a release operation (shutter button operation).
In this state, the subject detection / composition determination processing block 81 takes in the captured image data obtained at that time and performs subject detection. In the composition control processing, the optimum composition is first specified according to, for example, the number of detected individual subjects. The composition determination processing in this case then obtains the degree of match, or similarity, between the composition of the image content of the captured image data obtained at that time and the optimum composition. When the similarity reaches a certain level or higher, for example, the image content of the captured image data actually being obtained is determined to have the optimum composition. In practice, the algorithm may be configured to judge the composition optimum once a similarity at or above a predetermined level is obtained, that is, a level at which the composition of the image content and the optimum composition can be regarded as matching. Various algorithms are conceivable for obtaining this match or similarity, so no specific example is given here.
Information on the determination result that the screen content of the captured image data has the optimum composition in this way is output to the notification control processing block 82. In response to the input of the above information, the notification control processing block 82 displays on the display unit 83 in a predetermined manner for notifying the user that the currently captured image has the optimum composition. Execute display control. The notification control processing block 82 is realized by a display control function by a microcomputer (CPU) provided in the imaging device, a display image processing function for realizing image display on the display unit 83, and the like. Note that the notification to the user of the optimal composition here may be made by sound such as electronic sound or synthesized speech.
The display unit 83 corresponds to, for example, the display unit 33 of the digital still camera 1 according to the present embodiment. It is provided, for example, so that its display panel is exposed at a predetermined position on the imaging apparatus, and in the shooting mode it generally displays the image being captured at that time, the so-called through image. In practice, therefore, the display unit 83 displays an image notifying the user of the optimum composition superimposed on the through image. The user performs a release operation when this notification appears. As a result, even a user who is not well versed in the knowledge and techniques of photography can easily take a photograph with a good composition.
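The description deliberately leaves the similarity measure open. Purely as a hypothetical illustration of the threshold-based decision, the sketch below scores similarity from the distance between the actual subject center of gravity and the position it would have in the optimum composition, normalized by the frame diagonal. The measure itself, the function names, and the default threshold of 0.95 are all assumptions of this example and are not taken from the embodiment.

```python
import math


def composition_similarity(
    actual: tuple[float, float],
    target: tuple[float, float],
    frame_size: tuple[float, float],
) -> float:
    """Hypothetical similarity in [0, 1]; 1.0 means the centroids coincide."""
    width, height = frame_size
    distance = math.hypot(actual[0] - target[0], actual[1] - target[1])
    return max(0.0, 1.0 - distance / math.hypot(width, height))


def is_optimal_composition(
    actual: tuple[float, float],
    target: tuple[float, float],
    frame_size: tuple[float, float],
    min_similarity: float = 0.95,
) -> bool:
    """True when the similarity reaches the predetermined level."""
    return composition_similarity(actual, target, frame_size) >= min_similarity
```

A result of True would be the point at which the notification control processing block 82 is informed and the display unit 83 shows the notification.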

また、図２０も、上記図１９と同様にデジタルスチルカメラなどの撮像装置単体に対して本願発明による構図判定を適用したものとなる。
先ず、この図に示す構成においては、図１９と同様に、被写体検出・構図判定処理ブロック８１により、そのときの撮像により得られる撮像画像データを取り込んで被写体検出処理を行うとともに、被写体検出情報に基づいて、上記の撮像画像データの画内容が最適構図であるか否かを判定するようにされる。そして、最適構図になったことを判定すると、このことをレリーズ制御処理ブロック８４に対して通知する。
レリーズ制御処理ブロック８４は、撮像画像データを記録するための制御を実行する部位とされ、例えば撮像装置が備えるマイクロコンピュータが実行する制御などにより実現される。上記の通知を受けたレリーズ制御処理ブロック８４は、そのときに得られている撮像画像データが、例えば記憶媒体に記憶されるようにして画像信号処理、記録制御処理を実行する。
このような構成であれば、例えば最適な構図の画像が撮像されたときには、自動的にその撮像画像の記録が行われるようにした撮像装置を得ることができる。 FIG. 20 also applies the composition determination according to the present invention to a single imaging apparatus such as a digital still camera as in FIG.
First, in the configuration shown in this figure, as in FIG. 19, the subject detection / composition determination processing block 81 captures captured image data obtained by imaging at that time and performs subject detection processing, as well as subject detection information. Based on this, it is determined whether or not the image content of the captured image data has an optimal composition. When it is determined that the optimum composition has been achieved, this is notified to the release control processing block 84.
The release control processing block 84 is a part that executes control for recording captured image data, and is realized by, for example, control executed by a microcomputer included in the imaging apparatus. Upon receiving the above notification, the release control processing block 84 executes image signal processing and recording control processing so that the captured image data obtained at that time is stored in, for example, a storage medium.
With such a configuration, for example, when an image with an optimal composition is captured, an imaging apparatus can be obtained in which the captured image is automatically recorded.

なお、上記図１９及び図２０の構成は、例えばスチルカメラの範疇であれば、例えば図１により示されるような構成のデジタルスチルカメラに適用できるほか、銀塩フィルムなどに撮像画像を記録するいわゆる銀塩カメラといわれるものにも、例えば光学系により得られた撮像光を分光して取り入れるイメージセンサと、このイメージセンサからの信号を入力して処理するデジタル画像信号処理部などを設けることで適用が可能である。 Within the category of still cameras, the configurations of FIGS. 19 and 20 can be applied to a digital still camera configured as shown in FIG. 1, for example. They can also be applied to a so-called silver halide camera, which records captured images on silver halide film or the like, by providing, for example, an image sensor that receives imaging light split off from the light obtained by the optical system, and a digital image signal processing unit that receives and processes the signal from this image sensor.

また、図２１は、既に存在する画像データに対して画像編集を行う編集装置に本願発明を適用した例である。
この図においては編集装置９０が示されている。ここでの編集装置９０は、既に存在する画像データとして、例えば記憶媒体に記憶されていたものを再生して得た画像データ（再生画像データ）を得るようにされている。なお、記憶媒体から再生したものの他に、例えばネットワーク経由でダウンロードしたものを取り込んでもよい。即ち、編集装置９０が取り込むべき撮像画像データをどのような経路で取得するのかについては、特に限定されるべきものではない。 FIG. 21 shows an example in which the present invention is applied to an editing apparatus that performs image editing on already existing image data.
In this figure, an editing device 90 is shown. Here, the editing device 90 is configured to obtain image data (reproduced image data) obtained by reproducing, for example, data stored in a storage medium, as already existing image data. In addition to the data reproduced from the storage medium, for example, data downloaded via a network may be imported. That is, the route through which the captured image data to be captured by the editing apparatus 90 is acquired is not particularly limited.

The reproduced captured image data taken in by the editing apparatus 90 is input to both the trimming processing block 91 and the subject detection/composition determination processing block 92.
The subject detection/composition determination processing block 92 first executes subject detection processing similar to that of FIGS. 19 and 20 and outputs detection information. Then, as composition determination processing using this detection information, it identifies, within the full frame of the input reproduced captured image data, where the image portion with a predetermined aspect ratio that yields the optimum composition (the image portion of the optimum composition) is located. Once the image portion of the optimum composition has been identified, it outputs information indicating, for example, the position of that image portion (trimming instruction information) to the trimming processing block 91.
In response to the input of the trimming instruction information, the trimming processing block 91 executes image processing that extracts the image portion indicated by the trimming instruction information from the input reproduced captured image data, and outputs the extracted image portion as one independent piece of image data. This is the edited captured image data.
With such a configuration, trimming is performed automatically as an image data editing process: new image data is obtained whose content is the portion with the optimum composition extracted from the image content of the original image data. Such an editing function could be adopted, for example, in an image data editing application installed on a personal computer or the like, or as an image editing function in an application that manages image data.
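The division of labor between block 92 (decide where to trim) and block 91 (perform the trimming) can be sketched as follows. This is a minimal illustration, not the method of the description: `TrimInstruction`, `decide_trim`, and `trim` are hypothetical names, the image is modeled as a list of pixel rows, and the placement rule (the largest region of the given aspect ratio, centered on a subject position as far as the image borders allow) is only a stand-in for the actual composition determination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrimInstruction:
    """Hypothetical trimming instruction: top-left corner and size in pixels."""
    x: int
    y: int
    width: int
    height: int


def decide_trim(image_w, image_h, subject_cx, subject_cy, aspect_w=4, aspect_h=3):
    """Stand-in for block 92: choose a region with the predetermined aspect ratio.

    Assumption: the region is the largest aspect_w:aspect_h rectangle that fits,
    centered on the subject and clamped so that it stays inside the image.
    """
    scale = min(image_w // aspect_w, image_h // aspect_h)
    w, h = aspect_w * scale, aspect_h * scale
    x = min(max(subject_cx - w // 2, 0), image_w - w)
    y = min(max(subject_cy - h // 2, 0), image_h - h)
    return TrimInstruction(x, y, w, h)


def trim(pixels, instr):
    """Stand-in for block 91: return the indicated portion as independent image data."""
    rows = pixels[instr.y:instr.y + instr.height]
    return [row[instr.x:instr.x + instr.width] for row in rows]
```

Because `trim` builds new row lists, the original image data is left untouched, which matches the description of the extracted portion as one independent piece of image data.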

FIG. 22 shows an example of a configuration in which the composition determination of the present invention is applied to an imaging apparatus such as a digital still camera.
Here, captured image data obtained by imaging with an imaging unit (not shown) is input to the subject detection/composition determination processing block 101 and the file creation processing block 103 in the imaging apparatus 100. In this case, the captured image data input into the imaging apparatus 100 is captured image data that is to be stored on a storage medium in response to, for example, a release operation, and is generated based on the imaging signal obtained by imaging with the imaging unit (not shown).
First, the subject detection/composition determination processing block 101 performs subject detection on the input captured image data and determines, based on the detection information, what the optimum composition is. Specifically, as in the case of FIG. 21, for example, it suffices to obtain information identifying the image portion that gives the optimum composition within the full frame of the input captured image data. The information representing the determination result for the optimum composition is then output to the metadata creation processing block 102.
Based on the input information, the metadata creation processing block 102 creates metadata (composition editing metadata) consisting of the information needed to obtain the optimum composition from the corresponding captured image data, and outputs it to the file creation processing block 103. The content of this composition editing metadata is, for example, position information that can indicate which image region of the frame of the corresponding captured image data is to be trimmed.
The imaging apparatus 100 shown in this figure records captured image data on a storage medium so that it is managed as a still image file in a predetermined format. Accordingly, the file creation processing block 103 converts the captured image data into (creates) the still image file format.
The file creation processing block 103 first performs image compression encoding corresponding to the image file format on the input captured image data to create the file body portion consisting of the captured image data. At the same time, it creates data portions such as a header and an additional information block so that the composition editing metadata input from the metadata creation processing block 102 is stored at a predetermined storage position. It then creates a still image file from the file body portion, header, additional information block, and so on, and outputs it. As a result, as illustrated, the still image file to be recorded on the storage medium has a structure that contains the metadata (composition editing metadata) together with the captured image data.
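The file structure produced by block 103 (a compressed body plus a header carrying the composition editing metadata) can be illustrated with a toy container. Everything about the layout here is an assumption made for the sketch: the description does not name a file format, so a 4-byte length prefix, a JSON header, and `zlib` compression merely stand in for the "predetermined format" and the image compression encoding. `create_still_image_file` is a hypothetical name.

```python
import json
import struct
import zlib


def create_still_image_file(pixel_bytes: bytes, trim_region: dict) -> bytes:
    """Build a toy still image file: [header length][JSON header][compressed body].

    Assumptions: zlib stands in for the image compression encoding, and the
    composition editing metadata is stored under a hypothetical header key.
    """
    # File body portion: the (compressed) captured image data.
    body = zlib.compress(pixel_bytes)
    # Header: the composition editing metadata at a fixed, known position.
    header = json.dumps({"composition_edit": trim_region}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + body
```

The original pixels are stored whole; only the header records where trimming should later take place.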

FIG. 23 shows a configuration example of an editing apparatus that edits a still image file created by the apparatus of FIG. 22.
The editing apparatus 110 shown in the figure takes in the data of a still image file and first inputs it to the metadata separation processing block 111. The metadata separation processing block 111 separates the still image file data into the captured image data corresponding to the file body portion and the metadata. The separated metadata is output to the metadata analysis processing block 112, and the captured image data is output to the trimming processing block 113.

The metadata analysis processing block 112 is the part that analyzes the metadata it receives. As the analysis processing for the composition editing metadata, it identifies, from the information for obtaining the optimum composition that the metadata contains, at least the image region to be trimmed from the corresponding captured image data. It then outputs trimming instruction information instructing trimming of the identified image region to the trimming processing block 113.
Like the trimming processing block 91 of FIG. 21, the trimming processing block 113 executes image processing that extracts, from the captured image data input from the metadata separation processing block 111, the image portion indicated by the trimming instruction information input from the metadata analysis processing block 112, and outputs the extracted image portion as edited captured image data, which is one independent piece of image data.
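The three stages of the editing apparatus 110 (separate, analyze, trim) map naturally onto three small functions. The sketch below assumes a hypothetical in-memory representation of the still image file, a dict with a `header` holding the metadata and a `body` holding rows of pixels; the key names and the `composition_edit` field are illustrative and not taken from the description.

```python
def separate(still_file):
    """Stand-in for block 111: split the file into image data and metadata."""
    return still_file["body"], still_file["header"]


def analyze(metadata):
    """Stand-in for block 112: derive a trimming instruction from the metadata.

    Returns (x, y, width, height), or None when no composition editing
    metadata is present (a case the description does not cover).
    """
    region = metadata.get("composition_edit")
    if region is None:
        return None
    return (region["x"], region["y"], region["width"], region["height"])


def trim(pixels, instruction):
    """Stand-in for block 113: extract the indicated portion as new image data."""
    x, y, w, h = instruction
    return [row[x:x + w] for row in pixels[y:y + h]]


def edit(still_file):
    """Run the whole pipeline and return the edited captured image data."""
    pixels, metadata = separate(still_file)
    instruction = analyze(metadata)
    if instruction is None:
        return [row[:] for row in pixels]  # assumption: pass the image through
    return trim(pixels, instruction)
```

A short usage example: for a 4x4 image whose metadata requests a 2x1 region at (1, 2), `edit` returns just that region and leaves the stored body unchanged.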

With the system consisting of the imaging apparatus and the editing apparatus shown in FIGS. 22 and 23, the original still image data (captured image data) obtained by shooting, for example, can be kept unprocessed, and the metadata can then be used to perform editing that extracts an image with the optimum composition from this original still image data. Moreover, the extracted image portion corresponding to the optimum composition is decided automatically.

FIG. 24 shows an example in which the present invention is applied to an imaging apparatus capable of shooting and recording moving images, such as a video camera.
Moving image data is input to the imaging apparatus 120 shown in this figure. This moving image data is generated based on, for example, an imaging signal obtained by imaging with an imaging unit that the imaging apparatus 120 itself has. The moving image data is input to the subject detection/composition determination processing block 122 and the file creation/recording processing block 124 in the imaging apparatus 120.
In this case, the subject detection/composition determination processing block 122 judges whether the composition of the input moving image data is good or not. For example, the block 122 holds in advance parameters that describe what a good composition is (good-composition parameters). These parameters are, for example, the occupancy ratio of the individual subjects within the frame and the inter-subject distance K, each set to a value deemed appropriate for the number of individual subjects detected. The block 122 continuously determines what composition the input moving image data has (for example, it obtains composition parameters such as the actual occupancy ratio of the individual subjects and the inter-subject distance K in the moving image data) and compares the composition parameters obtained as the determination result with the good-composition parameters. If the similarity of the composition parameters of the moving image data to the good-composition parameters is at or above a certain level, the composition is judged to be good; if the similarity is at or below that level, the composition is judged not to be good.
When the subject detection/composition determination processing block 122 determines that a good composition has been obtained in the moving image data, it outputs to the metadata creation processing block 123 information (good-composition image section instruction information) indicating where in the moving image data the image section judged this time to have a good composition (good-composition image section) is located. The good-composition image section instruction information is, for example, information indicating the start position and end position of the good-composition image section in the moving image data.
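The per-frame comparison and the resulting start/end positions can be sketched as below. The description does not define how similarity is computed, so the metric here (one minus the mean relative error over the reference parameters) is purely an assumption, as are the names `similarity` and `good_sections`, the parameter keys, and the default threshold.

```python
def similarity(measured, reference):
    """Hypothetical similarity in [0, 1]: 1 minus the mean relative error.

    Assumes every reference value is positive.
    """
    errors = [min(abs(measured[name] - ref) / ref, 1.0)
              for name, ref in reference.items()]
    return 1.0 - sum(errors) / len(errors)


def good_sections(frames, good_params, threshold=0.9):
    """Return (start, end) frame indices, inclusive, of good-composition sections.

    frames:      one dict per frame with 'count', 'occupancy', 'distance_k'
    good_params: reference parameters keyed by number of detected subjects
    """
    sections, start = [], None
    for i, frame in enumerate(frames):
        reference = good_params.get(frame["count"])
        ok = reference is not None and similarity(frame, reference) >= threshold
        if ok and start is None:
            start = i                        # a good section begins here
        elif not ok and start is not None:
            sections.append((start, i - 1))  # the section ended on the previous frame
            start = None
    if start is not None:                    # section still open at end of data
        sections.append((start, len(frames) - 1))
    return sections
```

Each returned pair corresponds to one piece of good-composition image section instruction information, expressed here as frame indices rather than any particular time code.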

The metadata creation processing block 123 in this case generates the various required metadata for the moving image data that the moving image recording processing block 124, described next, records as a file on the storage medium. In addition, when good-composition image section instruction information is input from the subject detection/composition determination processing block 122 as described above, it generates metadata indicating that the image section indicated by that information has a good composition, and outputs it to the moving image recording processing block 124.
The moving image recording processing block 124 executes control for recording the input moving image data on the storage medium so that it is managed as a moving image file in a predetermined format. When metadata is output from the metadata creation processing block 123, it executes control so that this metadata is recorded as part of the metadata accompanying the moving image file.
As a result, as illustrated, the moving image file recorded on the storage medium consists of the moving image data obtained by imaging accompanied by metadata indicating the image sections in which a good composition is obtained.
The image section with a good composition indicated by the metadata may be a moving image section with a certain duration, or it may be a still image extracted from the moving image data. Instead of the metadata, a configuration is also conceivable in which moving image data or still image data of the image section with a good composition is generated and recorded as secondary image data accompanying the moving image file (or as a file independent of the moving image file).
Furthermore, in a configuration in which the imaging apparatus 120 includes the subject detection/composition determination processing block 122 as shown in FIG. 24, it is also conceivable to record only the moving image sections judged by the block 122 to be good-composition image sections as a moving image file. It is further conceivable to output the image data corresponding to the image sections judged by the block 122 to have a good composition to an external device via a data interface or the like.

FIG. 25 shows an example in which the present invention is applied to a printing apparatus that performs printing.
In this case, the printing apparatus 130 takes in image data (a still image) having the image content to be printed, and the data taken in is input to the trimming processing block 131 and the subject detection/composition determination processing block 132.
First, the subject detection/composition determination processing block 132 executes the same subject detection processing and composition determination processing as the subject detection/composition determination processing block 92 of FIG. 21 to identify the image portion of the optimum composition within the full frame of the input image data, generates trimming instruction information according to the result, and outputs it to the trimming processing block 131.
In the same manner as the trimming processing block 91 of FIG. 21, the trimming processing block 131 executes image processing that extracts the image portion indicated by the trimming instruction information from the input image data. It then outputs the data of the extracted image portion to the print control processing block 133 as image data for printing.
The print control processing block 133 uses the input image data for printing to execute control for operating a printing mechanism (not shown).
Through this operation, the printing apparatus 130 automatically extracts the image portion deemed to have the optimum composition from the image content of the input image data and prints it as a single picture.

The example shown in FIG. 26 is suitable for an apparatus or system that stores a large number of still image files and provides a service using these still image files.
A large number of still image files are stored in the storage unit 141.
At a predetermined timing, the subject detection/composition determination processing block 142 takes in a still image file stored in the storage unit 141 and extracts the still image data stored in its file body portion. It then executes on this still image data the same processing as, for example, the subject detection/composition determination processing block 101 of FIG. 22 to obtain information representing the determination result for the optimum composition, and outputs this information to the metadata creation processing block 143.

Based on the input information, the metadata creation processing block 143 creates metadata (composition editing metadata) in the same way as the metadata creation processing block 102 of FIG. 22. In this case, it then registers the created metadata in a metadata table stored in the storage unit 141. The metadata table is an information unit that stores metadata in such a way that its correspondence with the still image data stored in the same storage unit 141 is shown. That is, the metadata table shows the correspondence between each piece of metadata (composition editing metadata) and the still image file that was subjected to subject detection processing and composition determination processing by the subject detection/composition determination processing block 142 in order to create that metadata.

When a still image file stored in the storage unit 141 is to be output in response to, for example, an external request for a still image file (in the case of a server, for example, when a still image file is downloaded in response to a download request from a client), the still image file output processing block 144 retrieves the requested still image file from the storage unit 141 and also retrieves the metadata (composition editing metadata) corresponding to that still image file from the metadata table.

The still image file output processing block 144 includes at least functional blocks corresponding to, for example, the metadata analysis processing block 112 and the trimming processing block 113 shown in FIG. 23.
In the still image file output processing block 144, the internal metadata analysis processing block analyzes the retrieved metadata to obtain trimming instruction information. The internal trimming processing block then trims the still image data stored in the retrieved still image file according to that trimming instruction information. The image portion obtained by trimming is generated anew as one piece of still image data and output.
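The relationship between the stored files, the metadata table, and the output step can be sketched with a small in-memory class. This is only an illustration under stated assumptions: `StillImageStore` and its methods are hypothetical, files are keyed by an arbitrary identifier, images are lists of pixel rows, and the metadata is reduced to a single trimming rectangle.

```python
class StillImageStore:
    """Toy model of storage unit 141 with its metadata table (hypothetical API)."""

    def __init__(self):
        self.files = {}           # file id -> still image data (rows of pixels)
        self.metadata_table = {}  # file id -> (x, y, width, height)

    def register(self, file_id, pixels, trim_region):
        """Store a file and register its composition editing metadata."""
        self.files[file_id] = pixels
        self.metadata_table[file_id] = trim_region

    def output(self, file_id):
        """Stand-in for block 144: look up file and metadata, then trim."""
        pixels = self.files[file_id]
        x, y, w, h = self.metadata_table[file_id]
        # The stored original stays untouched; a new image is generated.
        return [row[x:x + w] for row in pixels[y:y + h]]
```

Keeping the metadata in a separate table, rather than inside each file, means the stored files never need to be rewritten when the composition determination is run or rerun.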

The system configuration of FIG. 26 can be applied to a variety of services.
One example is a photo print service provided over a network. The user uploads the image data (still image files) to be printed to the print service's server via the network. The server stores each uploaded still image file in the storage unit 141, creates the metadata corresponding to the file, and registers it in the metadata table. When the image is actually printed, the still image file output processing block 144 outputs still image data with the optimum composition extracted as the image data for printing. With this service, a user who orders photo prints receives prints that have been corrected to the optimum composition.

Another example is a server for blogs or the like. The storage unit 141 stores image data uploaded together with the text data of a blog. This makes it possible, for example, to paste onto the blog page an image with the optimum composition extracted from the image data the user uploaded.

The examples described with reference to FIGS. 17 to 26 are only some of the possibilities; other apparatuses, systems, application software, and the like to which the composition determination of the present invention can be applied are also conceivable.

In the description of the embodiments so far, the subject (individual subject) has been assumed to be a person, but the present invention could also be applied when the subject is something other than a person, such as an animal or a plant.
The image data subjected to subject detection is not limited to data derived from imaging (captured image data); image data whose content is, for example, a painting or a design drawing could also be targeted.
The composition (optimum composition) determined under the present invention is not necessarily limited to one decided solely by the rule of thirds. For example, composition setting based on the golden ratio is known alongside the rule of thirds, and there is no particular obstacle to adopting such other techniques. Nor is the composition limited to those generally regarded as good, such as the rule of thirds or the golden ratio. Even a composition generally regarded as poor may, depending on how it is set, strike the user as interesting or even as good. The composition (optimum composition) determined under the present invention may therefore be set arbitrarily in consideration of practicality, entertainment value, and so on, and is not particularly limited in practice.

DESCRIPTION OF SYMBOLS
1 digital still camera; 2 shutter button; 3 lens unit; 10 pan head; 21 optical system; 22 image sensor; 23 A/D converter; 24 signal processing unit; 25 encoding/decoding unit; 26 media controller; 27 control unit; 28 ROM; 29 RAM; 30 flash memory; 31 operation unit; 32 display driver; 33 display unit; 34 pan-head communication unit; 40 memory card; 51 control unit; 52 communication unit; 53 pan mechanism unit; 54 pan motor; 55 pan drive unit; 56 tilt mechanism unit; 57 tilt motor; 58 tilt drive unit; 61 subject detection processing block; 62 composition control processing block; 63 communication control processing block; SBJ (SBJ0 to SBJn) individual subjects; 71 communication control processing block; 72 pan/tilt control processing block; 73 subject detection processing block; 74 composition control processing block; 75 imaging unit; 81, 92, 101, 122, 132, 142 subject detection/composition determination processing block; 82 notification control processing block; 83 display unit; 84 release control processing block; 91, 131 trimming processing block; 102, 123, 143 metadata creation processing block; 103 file creation processing block; 111 metadata separation processing block; 112 metadata analysis processing block; 113 trimming processing block; 124 file creation/recording processing block; 133 print control processing block; 141 storage unit; 144 still image file output processing block

Claims

A composition determination apparatus comprising:
subject detection means for detecting the presence of a specific subject in an image based on image data; and
composition determination means for determining a composition according to the number of detected subjects, the detected subjects being the subjects detected by the subject detection means,
wherein the composition determination means determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends when the number of detected subjects is equal to or greater than a predetermined value.

The composition determination apparatus according to claim 1, wherein the predetermined value is 2 or greater, and the composition determination means determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends when the number of detected subjects is equal to or greater than that predetermined value.

The composition determination apparatus according to claim 1 or 2, wherein an upper limit can be set in the composition determination means on the number of subjects targeted by composition control.

The composition determination apparatus according to claim 3, wherein, among the detected subjects, subjects are set as composition control targets in descending order of detected size, up to the set upper limit number.

The composition determination apparatus according to claim 1, wherein the composition determination means makes the rectangular image portrait (vertically long) when the number of subjects is 1.

The composition determination apparatus according to claim 2, wherein, when the number of subjects is 2, the composition determination means makes the rectangular image portrait or landscape based on the relationship between the distance between the subjects and a predetermined threshold.

The composition determination apparatus according to claim 2, wherein the composition determination means makes the rectangular image landscape (horizontally long) when the number of subjects is 3.
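As a reading aid only, the count-dependent choice between portrait and landscape recited above can be sketched as follows. The claims fix the outcome for one and three subjects but not the direction of the threshold comparison, so the rule used here (a wide spread selects landscape) and the threshold value are assumptions; `decide_orientation` is a hypothetical name and subjects are represented by their horizontal center positions.

```python
def decide_orientation(subject_xs, image_width, threshold_ratio=0.33):
    """Choose 'portrait' or 'landscape' from horizontal subject positions.

    Assumptions: one subject gives portrait and three give landscape, as
    recited; for any other count the distance between the leftmost and
    rightmost subjects is compared with threshold_ratio * image_width, and a
    distance above the threshold selects landscape. The threshold value and
    the direction of the comparison are illustrative only.
    """
    count = len(subject_xs)
    if count == 0:
        raise ValueError("no subject detected")
    if count == 1:
        return "portrait"
    if count == 3:
        return "landscape"
    distance = max(subject_xs) - min(subject_xs)
    return "landscape" if distance > threshold_ratio * image_width else "portrait"
```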

The composition determination apparatus according to claim 1, wherein, when the number of detected subjects is equal to or greater than a predetermined value, the composition determination means determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends and the distance between the subjects located at the top and bottom ends.

The composition determination apparatus according to claim 8, wherein the predetermined value is 2 or greater, and the composition determination means determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends and the distance between the subjects located at the top and bottom ends when the number of detected subjects is equal to or greater than that predetermined value.

The composition determination apparatus according to claim 1, further comprising mechanism control means for controlling the mechanism so that the composition determined by the composition determination means is obtained.

The composition determination apparatus according to claim 1, further comprising notification control means for notifying a user when the determined composition is reached.

The composition determination apparatus according to claim 1, wherein the composition determination means changes the composition determination method according to the number of detected subjects and, when the number of detected subjects is 2 or more, uses a different value depending on the number of detected subjects for the ratio of the distance between the subjects located at the left and right ends to the horizontal length of the image.

The composition determination apparatus according to claim 12, wherein the composition determination means increases the ratio as the number of detected subjects increases.
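A count-dependent ratio of this kind can be illustrated with a lookup that grows with the number of subjects. The numeric values below are assumptions chosen for the sketch, not values recited in the claims, and using the ratio to derive a zoom factor is likewise only one conceivable application. `target_ratio` and `required_zoom` are hypothetical names.

```python
def target_ratio(count: int) -> float:
    """Target ratio of end-to-end subject distance to horizontal image length.

    Illustrative values only: 1/3 for two subjects, 1/2 for three, then
    0.1 more per additional subject, capped at 0.8 so that the subjects
    keep some margin (the ratio stops growing once the cap is reached).
    """
    if count < 2:
        raise ValueError("the ratio applies only to two or more subjects")
    table = {2: 1 / 3, 3: 1 / 2}
    if count in table:
        return table[count]
    return min(0.8, 0.5 + 0.1 * (count - 3))


def required_zoom(end_to_end_distance: float, image_width: float, count: int) -> float:
    """Magnification that would bring the measured distance to the target ratio."""
    return target_ratio(count) * image_width / end_to_end_distance
```

For example, three subjects spanning 100 pixels of a 600-pixel-wide frame would call for a magnification of about 3 under these assumed values.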

A composition determination method comprising executing:
a subject detection procedure of detecting the presence of a specific subject in an image based on image data; and
a composition determination procedure of determining a composition according to the number of detected subjects, the detected subjects being the subjects detected by the subject detection procedure,
wherein the composition determination procedure determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends when the number of detected subjects is equal to or greater than a predetermined value.

A program causing a composition determination apparatus to execute:
a subject detection procedure of detecting the presence of a specific subject in an image based on image data; and
a composition determination procedure of determining a composition according to the number of detected subjects, the detected subjects being the subjects detected by the subject detection procedure,
wherein the composition determination procedure determines whether the rectangular image is to be portrait or landscape based on the distance between the subjects located at the left and right ends when the number of detected subjects is equal to or greater than a predetermined value.