JP6512208B2 - Image processing apparatus, image processing method and program

Info

Publication number: JP6512208B2
Application number: JP2016235668A
Authority: JP
Inventors: 孝一斉藤
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2016-12-05
Filing date: 2016-12-05
Publication date: 2019-05-15
Anticipated expiration: 2034-12-25
Also published as: JP2017054541A

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

In recent years, the pixel count of image sensors has increased, and moving-image standards have advanced from standard definition to full high definition and then to 4K, so that sufficient image quality can now be obtained even when a portion is cut out of a single screen.
Against this background, a technique has been disclosed that tracks a person of interest and performs cutout within a range that does not degrade image quality (see, for example, Patent Document 1).

Patent Document 1: JP 2013-115597 A

However, the conventional technique of tracking and cutting out a person of interest is effective only when there is a single person of interest, or when there are multiple persons whose temporal and spatial positions in the image are both close together. When either the temporal positions or the spatial positions of the multiple persons in the image are far apart, the technique has no effect. The same applies not only to persons but to any case in which the portions of interest in an image are located apart from one another.

An object of the present invention is to generate an effective image regardless of the positional relationship of multiple subjects present in the image.

To achieve the above object, an image processing apparatus according to one aspect of the present invention is characterized by comprising:
cutout means for cutting out moving images from a single moving image in which a first subject and a second subject present in a separate frame period are captured, by cutting out the frame periods in which the first subject or the second subject is present; and
generation means for generating one new moving image by combining the moving image corresponding to the first subject and the moving image corresponding to the second subject, both cut out by the cutout means, in accordance with a ranking of the subjects based on a temporal combination criterion.

According to the present invention, an effective image can be generated regardless of the positional relationship of multiple subjects present in the image.

FIG. 1 is a block diagram showing the hardware configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a functional block diagram showing, of the functional configuration of the image processing apparatus of FIG. 1, the functional configuration for executing image generation processing.
FIG. 3 is a diagram showing an example of a layout table.
FIG. 4 is a schematic diagram showing examples of layouts; FIG. 4(A) shows layout 2A, FIG. 4(B) layout 2B, FIG. 4(C) layout 3G, and FIG. 4(D) layout 3H.
FIG. 5 is a schematic diagram showing the concept of the spatial cutout criteria; FIGS. 5(A) and 5(B) show spatial cutout criterion 1, FIGS. 5(C) and 5(D) spatial cutout criterion 2, FIGS. 5(E) and 5(F) spatial cutout criterion 3, and FIGS. 5(G), 5(H), and 5(I) spatial cutout criterion 4.
FIG. 6 is a schematic diagram showing the concept of the overall image generation procedure.
FIG. 7 is a schematic diagram showing the concept of the overall image generation procedure.
FIG. 8 is a schematic diagram showing the concept of the overall image generation procedure.
FIG. 9 is a schematic diagram showing the concept of the overall image generation procedure.
FIG. 10 is a flowchart explaining the flow of the image generation processing executed by the image processing apparatus of FIG. 1 having the functional configuration of FIG. 2.
FIG. 11 is a flowchart explaining the flow of the cutout processing of step S3 in FIG. 10.
FIG. 12 is a flowchart explaining the flow of the spatial combination processing of step S7 in FIG. 10.
FIG. 13 is a flowchart explaining the flow of the temporal combination processing of step S8 in FIG. 10.
FIG. 14 is a flowchart explaining the flow of the image generation processing executed by the image processing apparatus in the second embodiment.
FIG. 15 is a flowchart explaining the flow of the cutout processing of step S103 in FIG. 14.
FIG. 16 is a flowchart explaining the flow of the combination processing of step S106 in FIG. 14.
FIG. 17 is a schematic diagram showing the concept of spatially cutting out multiple subjects together.

Embodiments of the present invention are described below with reference to the drawings.

[First Embodiment]
[Configuration]
FIG. 1 is a block diagram showing the hardware configuration of an image processing apparatus 1 according to an embodiment of the present invention.
The image processing apparatus 1 is configured as, for example, a digital camera.

The image processing apparatus 1 includes a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a bus 14, an input/output interface 15, an imaging unit 16, an input unit 17, an output unit 18, a storage unit 19, a communication unit 20, and a drive 21.

The CPU 11 executes various kinds of processing in accordance with a program recorded in the ROM 12 or a program loaded from the storage unit 19 into the RAM 13. For example, the CPU 11 executes image generation processing in accordance with a program for the image generation processing described later.

The RAM 13 also stores, as appropriate, data and the like that the CPU 11 needs in order to execute the various kinds of processing.

The CPU 11, the ROM 12, and the RAM 13 are connected to one another via the bus 14. The input/output interface 15 is also connected to the bus 14. The imaging unit 16, the input unit 17, the output unit 18, the storage unit 19, the communication unit 20, and the drive 21 are connected to the input/output interface 15.

Although not illustrated, the imaging unit 16 includes an optical lens unit and an image sensor.

The optical lens unit is composed of lenses that collect light in order to photograph a subject, such as a focus lens and a zoom lens.
The focus lens is a lens that forms a subject image on the light-receiving surface of the image sensor. The zoom lens is a lens that freely changes the focal length within a certain range.
The optical lens unit is also provided, as necessary, with peripheral circuits that adjust setting parameters such as focus, exposure, and white balance.

The image sensor is composed of a photoelectric conversion element, an AFE (Analog Front End), and the like.
The photoelectric conversion element is, for example, a CMOS (Complementary Metal Oxide Semiconductor) type photoelectric conversion element. A subject image enters the photoelectric conversion element from the optical lens unit. The photoelectric conversion element photoelectrically converts (captures) the subject image, accumulates the image signal for a certain period of time, and sequentially supplies the accumulated image signal to the AFE as an analog signal.
The AFE performs various kinds of signal processing, such as A/D (Analog/Digital) conversion, on this analog image signal. The signal processing generates a digital signal, which is output as the output signal of the imaging unit 16.
This output signal of the imaging unit 16 is hereinafter referred to as a "captured image." The captured image is supplied to the CPU 11 and other units as appropriate.

The input unit 17 is composed of various buttons and the like, and inputs various kinds of information in response to the user's instruction operations.
The output unit 18 is composed of a display, a speaker, and the like, and outputs images and sound.
The storage unit 19 is composed of a hard disk, flash memory, or the like, and stores data of various images.
The communication unit 20 controls communication with other devices (not illustrated) over a network including the Internet.

A removable medium 31, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted in the drive 21 as appropriate. A program read from the removable medium 31 by the drive 21 is installed in the storage unit 19 as necessary. Like the storage unit 19, the removable medium 31 can also store various data, such as the image data stored in the storage unit 19.

FIG. 2 is a functional block diagram showing, of the functional configuration of the image processing apparatus 1, the functional configuration for executing image generation processing.
Image generation processing refers to a series of processes in which, from a single moving image in which multiple persons are captured, the face portion of each person is cut out spatially or temporally on the basis of registered face information, and the cut-out moving images are combined to generate a new moving image.

When the image generation processing is executed, as shown in FIG. 2, an image selection unit 51, a cutout criterion specifying unit 52, a cutout processing unit 53, a combination criterion specifying unit 54, a combination processing unit 55, and a layout selection unit 56 function in the CPU 11.

In addition, an image storage unit 71, a layout storage unit 72, a generated image storage unit 73, and a face information storage unit 74 are set up in one area of the storage unit 19.

The image storage unit 71 stores data of moving images captured by the image processing apparatus 1 or another device.
The layout storage unit 72 stores the layout data used when generating a new moving image, and a layout table that defines the conditions for selecting a layout.

FIG. 3 is a diagram showing an example of the layout table.
As shown in FIG. 3, the layout table stores the following in association with one another: the number of cut-out images, which indicates the number of portions cut out from the moving image; the highest-priority cutout target, which indicates the portion assigned the highest priority among the portions cut out from the moving image; and the corresponding layout.
Specifically, the layout table associates numerical values such as "2" and "3" as the number of cut-out images; cutout-target attributes such as "the faces of persons A, B, and C," "face orientation such as front, right-facing, or left-facing," and "no highest-priority cutout target" as the highest-priority cutout target; and layout data such as "layouts 2A, 2B, ..., 2N and layouts 3A, 3B, ..., 3N" as the corresponding layout.
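
The association held by the layout table can be pictured as a lookup keyed by the number of cut-out images and the highest-priority cutout target. The following Python sketch is illustrative only. The table rows, the key strings, the use of "2N" and "3N" for the "no highest-priority cutout target" rows, the fallback rule, and the function name `select_layout` are all assumptions made for this example; only the pairing of three cut-out images and person A with layout 3A follows the description of FIG. 6(D).

```python
# Hypothetical layout table: (number of cut-out images, top-priority target) -> layout name.
# Only (3, "person A") -> "3A" follows the description; every other row is invented.
LAYOUT_TABLE = {
    (2, "person A"): "2A",
    (2, "person B"): "2B",
    (2, None): "2N",   # assumed row for "no highest-priority cutout target"
    (3, "person A"): "3A",
    (3, "person B"): "3B",
    (3, None): "3N",   # assumed row for "no highest-priority cutout target"
}


def select_layout(num_clips, top_target=None):
    """Return the layout name for a clip count and top-priority target.

    Falls back to the "no top-priority target" row when the target is not
    listed (an assumption; the description does not state a fallback).
    """
    key = (num_clips, top_target)
    if key in LAYOUT_TABLE:
        return LAYOUT_TABLE[key]
    return LAYOUT_TABLE[(num_clips, None)]
```

For instance, `select_layout(3, "person A")` returns `"3A"` under these assumed rows. A clip count with no row at all raises `KeyError` in this sketch.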

FIG. 4 is a schematic diagram showing examples of layouts; FIG. 4(A) shows layout 2A, FIG. 4(B) layout 2B, FIG. 4(C) layout 3G, and FIG. 4(D) layout 3H.
As shown in the schematic diagrams of FIG. 4, a layout defines a background, the number of images to be composited onto the background, and their sizes and positional relationship. When images are composited over the entire screen, as in layout 2A of FIG. 4(A) and layout 2B of FIG. 4(B), only a frame is displayed as the background.

Specifically, a layout defines compositing positions for a predetermined number of images to be composited onto the background, and a priority is set for each compositing position. The size of the image to be composited is also set for each compositing position.
For example, layout 2A of FIG. 4(A) places two images of the same size at compositing positions arranged side by side; the left compositing position has priority 1 and the right compositing position has priority 2. Layout 3G of FIG. 4(C) places three images at compositing positions arranged at the upper left, center, and lower right, with the center position large and the upper-left and lower-right positions small; the center compositing position has priority 1, the upper-left compositing position has priority 2, and the lower-right compositing position has priority 3.
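
As a rough sketch of how such layout data might be represented, each compositing position can carry a priority, a position, and a size. The `Slot` type, the normalized coordinates, and every numeric value below are assumptions chosen for illustration. The description states only the relative arrangement, the relative sizes, and the priorities of layouts 2A and 3G.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One compositing position (hypothetical representation)."""
    priority: int  # 1 is the highest priority
    x: float       # left edge as a fraction of the output width (assumed unit)
    y: float       # top edge as a fraction of the output height (assumed unit)
    w: float       # width as a fraction of the output width
    h: float       # height as a fraction of the output height


# Layout 2A: two equal-size images side by side; left is priority 1.
LAYOUT_2A = [
    Slot(priority=1, x=0.0, y=0.0, w=0.5, h=1.0),  # left
    Slot(priority=2, x=0.5, y=0.0, w=0.5, h=1.0),  # right
]

# Layout 3G: large center image (priority 1), small upper-left (2) and lower-right (3).
LAYOUT_3G = [
    Slot(priority=2, x=0.05, y=0.05, w=0.25, h=0.25),  # upper left, small
    Slot(priority=1, x=0.25, y=0.25, w=0.50, h=0.50),  # center, large
    Slot(priority=3, x=0.70, y=0.70, w=0.25, h=0.25),  # lower right, small
]


def slots_in_priority_order(layout):
    """Return the compositing positions sorted from highest to lowest priority."""
    return sorted(layout, key=lambda slot: slot.priority)
```

Sorting by priority gives the order in which cut-out moving images would be placed, so the first element of `slots_in_priority_order(LAYOUT_3G)` is the large center position.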

Returning to FIG. 2, the generated image storage unit 73 stores the data of the new moving image obtained by combining the cut-out moving images according to a layout.
The face information storage unit 74 stores authentication data for persons' faces (front, right-facing, and left-facing). The face authentication data is used to authenticate whether a face detected in a moving image is the face of a specific person. Various known face detection and face authentication techniques can be used for detecting a face in a moving image and for authenticating whether the detected face belongs to a specific person.

The image selection unit 51 selects, from the image data stored in the image storage unit 71, the moving image data corresponding to the user's instruction input. The selected moving image data is hereinafter referred to as the "original moving image."
The cutout criterion specifying unit 52 specifies the cutout criterion to be used when cutting out a person's face portion from the moving image.
In this embodiment, spatial cutout criteria and temporal cutout criteria are defined as cutout criteria, and when a moving image is to be cut out, the cutout criterion specifying unit 52 specifies whichever spatial or temporal cutout criterion was used the previous time. The cutout criterion specifying unit 52 may instead accept the user's specification of an arbitrary cutout criterion.
Specifically, the spatial cutout criteria and the temporal cutout criteria are set as follows.

(Spatial cutout criterion 1)
Each face that is detected in the screen of each frame constituting the moving image and authenticated as a specific person is a cutout target.
(Spatial cutout criterion 2)
Each face detected in the screen of each frame constituting the moving image, and each face authenticated as a specific person, is a cutout target.
(Spatial cutout criterion 3)
Faces authenticated as a specific person in the screen of each frame constituting the moving image are not cutout targets.
(Spatial cutout criterion 4)
The front-facing, right-facing, and left-facing states of one face detected in the screen of each frame constituting the moving image, or of one face authenticated as a specific person, are separate cutout targets. An authenticated face takes precedence over a merely detected face, and when multiple faces are detected or authenticated, the cutout target is determined according to, for example, the priority set for the faces registered for authentication or the size of the detected or authenticated faces.
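
The four spatial cutout criteria amount to different filters over the faces found in a frame. The sketch below is one possible reading, not the apparatus's actual implementation. The `Face` type, its field names, and the tie-break order used for criterion 4 (authenticated first, then registered priority, then larger size) are assumptions; the description says only that priority and size are taken into account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Face:
    """A face found in one frame (hypothetical representation)."""
    label: str
    authenticated: bool              # True if matched to a registered person
    size: int                        # face area in pixels
    priority: Optional[int] = None   # registered priority, 1 = highest
    orientation: str = "front"       # "front", "right", or "left"


def select_targets(faces, criterion):
    """Return the faces to cut out of one frame under spatial cutout criterion 1 to 4."""
    if criterion == 1:   # authenticated faces only
        return [f for f in faces if f.authenticated]
    if criterion == 2:   # detected faces and authenticated faces
        return list(faces)
    if criterion == 3:   # authenticated faces are excluded
        return [f for f in faces if not f.authenticated]
    if criterion == 4:   # a single face; its orientations are cut out separately
        if not faces:
            return []

        def rank(f):
            # Assumed ordering: authenticated first, then priority, then larger size.
            prio = f.priority if f.priority is not None else float("inf")
            return (not f.authenticated, prio, -f.size)

        return [min(faces, key=rank)]
    raise ValueError(f"unknown spatial cutout criterion: {criterion}")
```

Under criterion 4 the function returns the single chosen face. Its `orientation` field would then decide which of the three separate cutouts (front, right, left) the frame contributes to.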

(Temporal cutout criterion 1)
When a face is detected, or authenticated as that of a specific person, in consecutive frames constituting the moving image, a moving image corresponding to each face is cut out with the same length of time, regardless of whether the detected or authenticated face is present in a given frame. For frame periods in which the face was not detected or authenticated, the original moving image is inserted as-is, so that every cut-out moving image has the same number of frames.
(Temporal cutout criterion 2)
When a face is detected, or authenticated as that of a specific person, in consecutive frames constituting the moving image, a moving image corresponding to each face is cut out with the same length of time, regardless of whether the detected or authenticated face is present in a given frame. Frame periods in which the face was not detected or authenticated are not cut out, so that a temporally compressed moving image is obtained; the frame rates of the shorter moving images are then lowered to match the length of the longest moving image, so that the moving images have the same duration even though their frame counts differ.
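
The difference between temporal cutout criteria 1 and 2 can be illustrated with a small calculation. In the hedged sketch below, `fill_with_original` and `matched_frame_rates` are hypothetical helper names, and a per-frame list of booleans stands in for the detection result. For criterion 2, the playback rate of each shorter clip is taken as its frame count divided by the duration of the longest clip, which is one straightforward way to obtain the same duration from different frame counts.

```python
def fill_with_original(presence):
    """Criterion 1: one output frame per input frame.

    `presence` holds one boolean per frame (True if the face was found).
    Frames without the face are filled with the original frame, so the
    clip keeps the frame count of the original moving image.
    """
    return ["crop" if present else "original" for present in presence]


def matched_frame_rates(frame_counts, original_fps):
    """Criterion 2: lower the frame rate of shorter clips to match the longest.

    `frame_counts` maps a subject label to the number of frames in which
    that subject's face was found. The longest clip keeps `original_fps`.
    """
    longest = max(frame_counts.values())
    if longest == 0:
        raise ValueError("no frames were cut out for any subject")
    duration = longest / original_fps  # seconds, shared by all clips
    return {label: count / duration for label, count in frame_counts.items()}
```

For example, with clips of 300, 150, and 100 frames cut from a 30 fps original, the shared duration is 10 seconds and the three clips play at 30, 15, and 10 frames per second respectively.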

(Temporal cutout criterion 3)
When a face is detected, or authenticated as that of a specific person, in consecutive frames constituting the moving image, only the frame periods in which the detected or authenticated face is present are cut out, yielding a temporally compressed moving image. Only portions in which the face is continuously detected or authenticated for at least a predetermined threshold time are cut out. The front-facing, right-facing, and left-facing states of the detected or authenticated face may also be distinguished and cut out temporally as separate portions.
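
Temporal cutout criterion 3 keeps only runs of consecutive frames in which the face is present for at least the threshold time. A minimal sketch of that run detection follows. The function name `runs_at_least`, the boolean-per-frame input, and the half-open `(start, end)` index convention are assumptions made for this example.

```python
def runs_at_least(presence, fps, min_seconds):
    """Return (start, end) frame ranges, end exclusive, worth cutting out.

    `presence` holds one boolean per frame (True if the face was found).
    A run of True values is kept only if it lasts at least `min_seconds`.
    """
    runs = []
    start = None
    for index, present in enumerate(presence):
        if present and start is None:
            start = index                      # a new run begins
        elif not present and start is not None:
            if (index - start) / fps >= min_seconds:
                runs.append((start, index))    # run is long enough to keep
            start = None
    # Close a run that continues to the last frame.
    if start is not None and (len(presence) - start) / fps >= min_seconds:
        runs.append((start, len(presence)))
    return runs
```

Concatenating the frames of the returned ranges gives the temporally compressed moving image for that face; shorter appearances are dropped.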

The cutout processing unit 53 executes cutout processing that cuts out a person's face portion from the moving image in accordance with the cutout criterion specified by the cutout criterion specifying unit 52. Specifically, the cutout processing unit 53 searches the original moving image and identifies the subjects to be cut out. Then, in accordance with the specified cutout criterion, the cutout processing unit 53 spatially cuts out a rectangular region containing a specific person's face from the screen of each frame, or temporally cuts out the frame periods containing a specific person's face. When a spatially cut-out moving image is composited into a layout as described later, part of the cut-out region may be trimmed, depending on the layout, and its size, aspect ratio, and the like may be changed. For this reason, when the cutout processing unit 53 spatially cuts out a subject, it may include a fixed margin region around the subject.
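
Cutting out a rectangular region with a surrounding margin can be sketched as follows. The function name `crop_with_margin`, the `(x, y, width, height)` box convention, and the idea of expressing the margin as a ratio of the face size are assumptions; the description says only that a fixed margin region may be included around the subject.

```python
def crop_with_margin(face_box, frame_size, margin_ratio=0.25):
    """Expand a face box by a margin and clip it to the frame.

    face_box:   (x, y, width, height) of the face in pixels.
    frame_size: (frame_width, frame_height) in pixels.
    Returns the expanded box as (x, y, width, height).
    """
    x, y, w, h = face_box
    frame_w, frame_h = frame_size
    margin_x = int(w * margin_ratio)
    margin_y = int(h * margin_ratio)
    # Clip each edge so the region never leaves the frame.
    left = max(0, x - margin_x)
    top = max(0, y - margin_y)
    right = min(frame_w, x + w + margin_x)
    bottom = min(frame_h, y + h + margin_y)
    return left, top, right - left, bottom - top
```

Keeping the margin at cutout time leaves room to trim the region later when a layout calls for a different size or aspect ratio.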

The combination criterion specifying unit 54 specifies the combination criterion to be used when combining the moving images cut out by the cutout processing unit 53.
In this embodiment, spatial combination criteria and temporal combination criteria are defined as combination criteria, and when moving images are to be combined, the combination criterion specifying unit 54 specifies whichever spatial or temporal combination criterion was used the previous time. The combination criterion specifying unit 54 may instead accept the user's specification of an arbitrary combination criterion.
Specifically, the spatial combination criteria and the temporal combination criteria are set as follows.

(Spatial combination criterion 1)
A layout corresponding to the number of cut-out images and to the cutout target with the highest pre-registered priority is selected, and the cut-out moving images are assigned to the regions of the compositing positions in order of their priority.
(Spatial combination criterion 2)
A layout corresponding to the number of cut-out images and to the cutout target with the largest cut-out portion is selected, and the cut-out moving images are assigned to the regions of the compositing positions in order of their priority.
(Spatial combination criterion 3)
A layout corresponding to the number of cut-out images and to the cutout target with the longest cutout time (the original cutout time, if the length of time is changed after cutout) is selected, and the cut-out moving images are assigned to the regions of the compositing positions in order of their priority.
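
The three spatial combination criteria differ in how the top cutout target is chosen. The sketch below ranks the cut-out moving images under each criterion and maps them to compositing-position priorities. The `Clip` type, its field names, and the function `assign_to_slots` are hypothetical. Using the same ranking key to order the remaining clips under criteria 2 and 3 is also an assumption, since the description states only that clips are assigned in order of priority.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """One cut-out moving image (hypothetical representation)."""
    label: str
    registered_priority: int   # pre-registered priority, 1 = highest
    crop_area: int             # size of the cut-out portion in pixels
    original_seconds: float    # cutout time before padding or frame-rate change


# Sort keys for spatial combination criteria 1 to 3 (smaller key = placed first).
RANK_KEYS = {
    1: lambda clip: clip.registered_priority,   # highest registered priority first
    2: lambda clip: -clip.crop_area,            # largest cut-out portion first
    3: lambda clip: -clip.original_seconds,     # longest original cutout time first
}


def assign_to_slots(clips, criterion):
    """Return (top target label, {compositing-position priority: clip label})."""
    ranked = sorted(clips, key=RANK_KEYS[criterion])
    slots = {position + 1: clip.label for position, clip in enumerate(ranked)}
    return ranked[0].label, slots
```

The returned top target, together with the number of clips, is what a layout lookup would use to choose the layout; the mapping then says which clip goes into which compositing position.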

(Temporal combination criterion 1)
The moving images are combined in order of pre-registered priority. An authenticated face takes precedence over a merely detected face, and when multiple faces are detected or authenticated, the combination order is determined according to, for example, the priority set for the faces registered for authentication or the size of the detected or authenticated faces.
(Temporal combination criterion 2)
The moving images are combined in order of the size of the cut-out portion.
(Temporal combination criterion 3)
The moving images are combined in order of cutout time (if the length of time is changed after cutout, the original cutout time, that is, the time before the original moving image is inserted or the frame rate is lowered).
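
Temporal combination amounts to ordering the cut-out moving images and joining them end to end. The following sketch shows one way this could be done. The dictionary keys, the function name `combine_temporally`, and the choice of largest-first and longest-first for criteria 2 and 3 are assumptions, as is the exact tie-break order under criterion 1.

```python
def _priority_key(clip):
    # Criterion 1: authenticated faces first, then registered priority
    # (1 = highest), then larger cut-out portions (assumed tie-break order).
    priority = clip.get("priority")
    return (
        not clip["authenticated"],
        priority if priority is not None else float("inf"),
        -clip["area"],
    )


ORDER_KEYS = {
    1: _priority_key,
    2: lambda clip: -clip["area"],      # criterion 2: largest cut-out portion first
    3: lambda clip: -clip["seconds"],   # criterion 3: longest original cutout time first
}


def combine_temporally(clips, criterion):
    """Return (order of labels, concatenated frames) for temporal criterion 1 to 3."""
    ordered = sorted(clips, key=ORDER_KEYS[criterion])
    frames = []
    for clip in ordered:
        frames.extend(clip["frames"])   # append each clip after the previous one
    return [clip["label"] for clip in ordered], frames
```

Each clip here is a plain dictionary with `label`, `authenticated`, `priority`, `area`, `seconds`, and `frames` entries; `seconds` is meant to hold the original cutout time before any padding or frame-rate change.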

The combination processing unit 55 combines the moving images cut out by the cutout processing unit 53 in accordance with the combination criterion specified by the combination criterion specifying unit 54. Specifically, the combination processing unit 55 executes either spatial combination processing, which combines the cut-out moving images according to a spatial combination criterion, or temporal combination processing, which combines the cut-out moving images according to a temporal combination criterion.
When the combination processing unit 55 combines moving images spatially, the layout selection unit 56 refers to the layout table and selects layout data on the basis of the number of cut-out images and the highest-priority cutout target.

[Concept of the image generation procedure]
FIG. 5 is a schematic diagram showing the concept of the spatial cutout criteria; FIGS. 5(A) and 5(B) show spatial cutout criterion 1, FIGS. 5(C) and 5(D) spatial cutout criterion 2, FIGS. 5(E) and 5(F) spatial cutout criterion 3, and FIGS. 5(G), 5(H), and 5(I) spatial cutout criterion 4. The broken lines in the schematic diagrams of FIG. 5 indicate spatial cutout regions.
Under spatial cutout criterion 1, as shown in FIG. 5(A), when the moving image contains the faces of persons A to C authenticated as specific persons, each of the faces of persons A to C is a cutout target. As shown in FIG. 5(B), when the moving image contains the faces of persons A and B authenticated as specific persons and the face of another person X, each of the faces of persons A and B is a cutout target, and the face of person X is not.

Under spatial cutout criterion 2, as shown in FIG. 5(C), when the moving image contains the faces of persons A to C authenticated as specific persons, each of the faces of persons A to C is a cutout target, as in FIG. 5(A). On the other hand, as shown in FIG. 5(D), when the moving image contains the faces of persons A and B authenticated as specific persons and the face of another person X, the faces of persons A and B and the face of person X are each cutout targets.

Under spatial cutout criterion 3, as shown in FIG. 5(E), when the moving image contains the faces of persons A to C authenticated as specific persons, the faces of persons A to C are not cutout targets. As shown in FIG. 5(F), when the moving image contains the faces of persons A and B authenticated as specific persons and the face of another (unspecified) person X, the faces of persons A and B are not cutout targets, and the face of person X is.
Under spatial cutout criterion 4, as shown in FIGS. 5(G) to 5(I), when the moving image contains the face of a person authenticated as a specific person or of any other person, the front-facing, right-facing, and left-facing states of the highest-priority face are each separate cutout targets.

FIGS. 6 to 9 are schematic diagrams showing the concept of the overall image generation procedure. Taking the case in which moving images are cut out according to a temporal cutout criterion as an example, FIGS. 6 to 9 show the concept of generating a moving image according to the temporal combination criteria and the spatial combination criteria.
Specifically, FIG. 6 shows an example in which moving images are cut out according to temporal cutout criterion 1, and FIG. 7 an example in which they are cut out according to temporal cutout criterion 2. FIG. 8 shows an example in which moving images are cut out according to temporal cutout criterion 3 and the cut-out frames overlap between the cut-out moving images, and FIG. 9 an example in which moving images are cut out according to temporal cutout criterion 3 and the cut-out frames do not overlap between the cut-out moving images.

As shown in FIG. 6, under temporal cutout criterion 1, when the original moving image shown in FIG. 6(A) includes the faces of persons A, B, and C authenticated as specific persons, a moving image corresponding to each of the faces of persons A, B, and C is cut out with the same length of time (see FIG. 6(B)). In the original moving image shown in FIG. 6(A), the faces of persons A, B, and C are authenticated throughout the period of all frames. If, as in the original moving image shown in FIG. 7(A), there is a period of frames in which a face is not detected or authenticated, the original moving image is inserted for that period. Consequently, the duration, number of frames, and frame rate of each cut-out moving image are the same as those of the original moving image.
Here, assume that the pre-registered priority is highest for person A, followed by person B and then person C, and that the size of the cut-out portion is largest for person B, followed by person A and then person C.

Then, as shown in FIG. 6(C), when the cut-out moving images are combined according to temporal combination criterion 1, they are combined in the order of person A, person B, person C, according to the pre-registered priority.
When the cut-out moving images are combined according to temporal combination criterion 2, they are combined in the order of person B, person A, person C, according to the size of the cut-out portion. When they are combined according to temporal combination criterion 3, the cut-out moving images have the same duration for persons A, B, and C. However, if, as with the original moving image shown in FIG. 7(A), the cut-out moving images differed in length before the original moving image was inserted, they are combined in order of that length, that is, person A, person B, person C.
As shown in FIG. 6(D), when the cut-out moving images are combined according to spatial combination criterion 1, person A is placed in the region of the priority-1 composite position, person B in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the pre-registered priority. Because the number of cut-out images is 3 and person A is the highest-priority cutout target, layout 3A is selected by referring to the layout table.

When the cut-out moving images are combined according to spatial combination criterion 2, person B is placed in the region of the priority-1 composite position, person A in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the size of the cut-out portion. Because the number of cut-out images is 3 and person B is the highest-priority cutout target, layout 3B is selected by referring to the layout table. When they are combined according to spatial combination criterion 3, the cut-out moving images have the same duration for persons A, B, and C. However, if, as with the original moving image shown in FIG. 7(A), the cut-out moving images differed in length before the original moving image was inserted, they are placed in order of that length, that is, person A, person B, person C, starting from the composite position with the highest priority.
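The ordering rules applied in the FIG. 6 example can be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in this document: the `Clip` record, its field names, the numeric values, and the `order_clips` helper are hypothetical, chosen so that the assumptions of the example hold (priority A, B, C; size B, A, C; length A, B, C).

```python
from dataclasses import dataclass

@dataclass
class Clip:
    subject: str   # label of the cutout target, e.g. "A" (hypothetical field)
    priority: int  # pre-registered priority rank, 1 = highest
    area: int      # size of the cut-out portion (illustrative value)
    length: int    # frames in the clip before any padding (illustrative value)

def order_clips(clips, criterion):
    """Return the clips in combination order for criterion 1, 2 or 3."""
    if criterion == 1:   # pre-registered priority
        return sorted(clips, key=lambda c: c.priority)
    if criterion == 2:   # larger cut-out portion first
        return sorted(clips, key=lambda c: c.area, reverse=True)
    if criterion == 3:   # longer cut-out clip first
        return sorted(clips, key=lambda c: c.length, reverse=True)
    raise ValueError(f"unknown criterion: {criterion}")

clips = [Clip("A", 1, 9000, 300), Clip("B", 2, 12000, 240), Clip("C", 3, 6000, 180)]
orders = {k: [c.subject for c in order_clips(clips, k)] for k in (1, 2, 3)}
```

For temporal combination the resulting order would be the playback order; for spatial combination it would be the assignment to the priority-1, priority-2, and priority-3 composite positions.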

As shown in FIG. 7, under temporal cutout criterion 2, when the original moving image shown in FIG. 7(A) includes the faces of persons A, B, and C authenticated as specific persons, a moving image corresponding to each of the faces of persons A, B, and C is cut out with the same length of time (see FIG. 7(B)). In the original moving image shown in FIG. 7(A), the faces of persons B and C are not authenticated during some frames. Where there is a period of frames in which a face is not detected or authenticated, that period is not cut out and the cut-out moving image is compressed in time. Therefore, to match the lengths of time, the frame rate is lowered relative to the original moving image.
Here, assume that the pre-registered priority is highest for person A, followed by person B and then person C, that the size of the cut-out portion is largest for person B, followed by person A and then person C, and that the frame rate of the cut-out moving image is highest for person A, followed by person B and then person C (that is, the length before the durations are matched is longest for person A, followed by person B and then person C).
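The relationship between the number of frames that survive the cutout and the lowered frame rate can be illustrated with a little arithmetic. The description does not give a formula, so the sketch below assumes the kept frames are simply spread evenly over the original duration; the function name and the example numbers are hypothetical.

```python
from fractions import Fraction

def stretched_frame_rate(original_fps, total_frames, kept_frames):
    """Frame rate needed to play `kept_frames` over the original duration.

    Assumption: the clip keeps only the frames in which the face was
    authenticated and is stretched to the original length.
    """
    duration = Fraction(total_frames) / Fraction(original_fps)  # seconds
    return Fraction(kept_frames) / duration

# Example: a 300-frame original at 30 fps (10 seconds).
rate_a = stretched_frame_rate(30, 300, 300)  # face present in every frame
rate_b = stretched_frame_rate(30, 300, 240)  # face missing in 60 frames
rate_c = stretched_frame_rate(30, 300, 180)  # face missing in 120 frames
```

With these numbers the rates decrease in the order A, B, C, which matches the assumption that the clip that was longest before matching has the highest frame rate.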

Then, as shown in FIG. 7(C), when the cut-out moving images are combined according to temporal combination criterion 1, they are combined in the order of person A, person B, person C, according to the pre-registered priority.
When the cut-out moving images are combined according to temporal combination criterion 2, they are combined in the order of person B, person A, person C, according to the size of the cut-out portion.
When the cut-out moving images are combined according to temporal combination criterion 3, the cutout durations are the same. In this case, the cut-out moving images are combined in the order of person A, person B, person C, according to the length of each cut-out moving image before its frame rate was lowered.

As shown in FIG. 7(D), when the cut-out moving images are combined according to spatial combination criterion 1, person A is placed in the region of the priority-1 composite position, person B in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the pre-registered priority. Because the number of cut-out images is 3 and person A is the highest-priority cutout target, layout 3A is selected by referring to the layout table.
When the cut-out moving images are combined according to spatial combination criterion 2, person B is placed in the region of the priority-1 composite position, person A in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the size of the cut-out portion. Because the number of cut-out images is 3 and person B is the highest-priority cutout target, layout 3B is selected by referring to the layout table.

When the cut-out moving images are combined according to spatial combination criterion 3, the cutout durations are the same. In this case, according to the length of each cut-out moving image before its frame rate was lowered, person A is placed in the region of the priority-1 composite position, person B in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position. Because the number of cut-out images is 3 and person A is the highest-priority cutout target, layout 3A is selected by referring to the layout table.

As shown in FIG. 8, under temporal cutout criterion 3, when the original moving image shown in FIG. 8(A) includes the faces of persons A, B, and C authenticated as specific persons, only the portions that include the faces of persons A, B, and C are cut out as the moving images corresponding to the respective faces (see FIG. 8(B)). In the example shown in FIG. 8, the frames of the original moving image that include the faces of person A, person B, and person C overlap in time. In the original moving image shown in FIG. 8(A), the faces of persons B and C are not authenticated during some frames. Where there is a period of frames in which a face is not detected or authenticated, the cut-out moving image is no longer than the original moving image.
Here, assume that the pre-registered priority is highest for person A, followed by person B and then person C, that the size of the cut-out portion is largest for person B, followed by person A and then person C, and that the cut-out moving image is longest for person A, followed by person B and then person C.
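Temporal cutout criterion 3 keeps, for each target, only the frames in which that target appears. A minimal sketch of that selection is shown below; the per-frame detection sets and the `cut_out_by_presence` helper are hypothetical stand-ins for the face authentication results, which this section does not specify in code form.

```python
def cut_out_by_presence(detections):
    """Map each subject to the indices of the frames in which it appears.

    detections[i] is the set of subjects recognised in frame i.
    """
    clips = {}
    for index, present in enumerate(detections):
        for subject in sorted(present):
            clips.setdefault(subject, []).append(index)
    return clips

# Four-frame toy example in which the subjects overlap in time, as in FIG. 8.
detections = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"A"}]
clips = cut_out_by_presence(detections)
```

Each resulting clip is at most as long as the original, and in this toy input person A has the longest clip.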

Then, as shown in FIG. 8(C), when the cut-out moving images are combined according to temporal combination criterion 1, they are combined in the order of person A, person B, person C, according to the pre-registered priority.
When the cut-out moving images are combined according to temporal combination criterion 2, they are combined in the order of person B, person A, person C, according to the size of the cut-out portion.
When the cut-out moving images are combined according to temporal combination criterion 3, they are combined in the order of person A, person B, person C, according to the cutout duration.

As shown in FIG. 8(D), when the cut-out moving images are combined according to spatial combination criterion 1, person A is placed in the region of the priority-1 composite position, person B in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the pre-registered priority. Because the number of cut-out images is 3 and person A is the highest-priority cutout target, layout 3A is selected by referring to the layout table.
When the cut-out moving images are combined according to spatial combination criterion 2, person B is placed in the region of the priority-1 composite position, person A in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the size of the cut-out portion. Because the number of cut-out images is 3 and person B is the highest-priority cutout target, layout 3B is selected by referring to the layout table.

When the cut-out moving images are combined according to spatial combination criterion 3, person A is placed in the region of the priority-1 composite position, person B in the region of the priority-2 composite position, and person C in the region of the priority-3 composite position, according to the cutout duration. Because the number of cut-out images is 3 and person A is the highest-priority cutout target, layout 3A is selected by referring to the layout table.

As shown in FIG. 9, under temporal cutout criterion 3, when the original moving image shown in FIG. 9(A) includes front-facing, right-facing, and left-facing faces of a person authenticated as a specific person, only the portions that include the front-facing, right-facing, and left-facing faces are each cut out as moving images (see FIG. 9(B)). In the original moving image shown in FIG. 9(A), the person's face is authenticated throughout the period of all frames, and the frames that include the front-facing, right-facing, and left-facing faces do not overlap in time. Consequently, the total length of the cut-out moving images is the same as the length of the original moving image.
Here, assume that the pre-registered priority is highest for front-facing, followed by right-facing and then left-facing, that the size of the cut-out portion is largest for right-facing, followed by front-facing and then left-facing, and that the cut-out moving image is longest for front-facing, followed by right-facing and then left-facing.

Then, as shown in FIG. 9(C), when the cut-out moving images are combined according to temporal combination criterion 1, they are combined in the order of front-facing, right-facing, left-facing, according to the pre-registered priority.
When the cut-out moving images are combined according to temporal combination criterion 2, they are combined in the order of right-facing, front-facing, left-facing, according to the size of the cut-out portion.
When the cut-out moving images are combined according to temporal combination criterion 3, they are combined in the order of front-facing, right-facing, left-facing, according to the cutout duration.

As shown in FIG. 9(D), when the cut-out moving images are combined according to spatial combination criterion 1, the front-facing image is placed in the region of the priority-1 composite position, the right-facing image in the region of the priority-2 composite position, and the left-facing image in the region of the priority-3 composite position, according to the pre-registered priority. Because the number of cut-out images is 3 and the front-facing image is the highest-priority cutout target, layout 3G is selected by referring to the layout table.
When the cut-out moving images are combined according to spatial combination criterion 2, the right-facing image is placed in the region of the priority-1 composite position, the front-facing image in the region of the priority-2 composite position, and the left-facing image in the region of the priority-3 composite position, according to the size of the cut-out portion. Because the number of cut-out images is 3 and the right-facing image is the highest-priority cutout target, layout 3H is selected by referring to the layout table.

When the cut-out moving images are combined according to spatial combination criterion 3, the front-facing image is placed in the region of the priority-1 composite position, the right-facing image in the region of the priority-2 composite position, and the left-facing image in the region of the priority-3 composite position, according to the cutout duration. Because the number of cut-out images is 3 and the front-facing image is the highest-priority cutout target, layout 3G is selected by referring to the layout table.
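The layout selections named in the FIG. 6 to FIG. 9 examples amount to a lookup keyed by the number of cut-out images and the highest-priority cutout target. The sketch below is an assumption-laden illustration: the dictionary contains only the four combinations mentioned in this section, the key format and the `select_layout` helper are hypothetical, and the actual layout table is defined elsewhere in the description.

```python
# Only the combinations that appear in the FIG. 6 to FIG. 9 examples.
LAYOUT_TABLE = {
    (3, "A"): "3A",
    (3, "B"): "3B",
    (3, "front"): "3G",
    (3, "right"): "3H",
}

def select_layout(ordered_targets):
    """Pick a layout from the target count and the highest-priority target.

    ordered_targets must already be sorted by the chosen combination criterion.
    """
    key = (len(ordered_targets), ordered_targets[0])
    return LAYOUT_TABLE[key]

layout_size_order = select_layout(["B", "A", "C"])            # spatial criterion 2, FIG. 6
layout_orientation = select_layout(["front", "right", "left"])  # spatial criterion 1, FIG. 9
```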

[Operation]
Next, the operation will be described.
[Image generation processing]
FIG. 10 is a flowchart illustrating the flow of the image generation processing executed by the image processing apparatus 1 of FIG. 1 having the functional configuration of FIG. 2.
The image generation processing starts when the user performs an operation on the input unit 17 to start it.

In step S1, the image selection unit 51 selects, from the image data stored in the image storage unit 71, the moving image data corresponding to the user's instruction input. The selected moving image data serves as the original moving image.
In step S2, the cutout criterion specifying unit 52 specifies the cutout criterion to be used when cutting out a person's face portion from the moving image. Here, the cutout criterion selected the previous time is specified as the one to use this time, but the criterion may instead be specified by a user operation on the input unit 17 for selecting a cutout criterion.

In step S3, the cutout processing unit 53 executes cutout processing that cuts out a person's face portion from the moving image according to the cutout criterion specified by the cutout criterion specifying unit 52. The cutout processing is described in detail later.
In step S4, the combination criterion specifying unit 54 specifies the combination criterion to be used when combining the moving images cut out by the cutout processing unit 53. Here, the combination criterion selected the previous time is specified as the one to use this time, but the criterion may instead be specified by a user operation on the input unit 17 for selecting a combination criterion.

In step S5, the combination processing unit 55 determines, based on the combination criterion specified in step S4, whether the cut-out moving images are to be combined spatially.
If the cut-out moving images are not to be combined spatially, the determination in step S5 is NO, and the processing proceeds to step S8.
If the cut-out moving images are to be combined spatially, the determination in step S5 is YES, and the processing proceeds to step S6.

In step S6, based on the combination criterion specified in step S4, the layout selection unit 56 refers to the layout table and selects layout data according to the number of images cut out by the cutout processing in step S3 and the highest-priority cutout target.
In step S7, the combination processing unit 55 executes spatial combination processing that combines the cut-out moving images according to the spatial combination criterion. The spatial combination processing is described in detail later.

In step S8, the combination processing unit 55 executes temporal combination processing that combines the cut-out moving images according to the temporal combination criterion. The temporal combination processing is described in detail later.
After step S7 or step S8, the image generation processing ends.
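The branch at step S5 means that each run executes either steps S6 and S7 or step S8, never both. The following sketch only traces which steps run; it is a reading of the FIG. 10 flow as described in the text, and the function name and boolean flag are hypothetical.

```python
def image_generation_steps(combine_spatially):
    """Return the step labels executed for one run of the FIG. 10 flow."""
    steps = ["S1", "S2", "S3", "S4", "S5"]
    if combine_spatially:
        steps += ["S6", "S7"]   # select a layout, then combine spatially
    else:
        steps += ["S8"]         # combine temporally
    return steps

spatial_run = image_generation_steps(True)
temporal_run = image_generation_steps(False)
```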

[Cutout processing]
FIG. 11 is a flowchart illustrating the flow of the cutout processing of step S3 in FIG. 10.
In step S31, the cutout processing unit 53 searches the original moving image and identifies the subjects that are cutout targets.

In step S32, the cutout processing unit 53 selects the highest-priority subject as the processing target, based on the specified spatial cutout criterion or temporal cutout criterion.
In step S33, the cutout processing unit 53 cuts out the selected subject spatially or temporally and generates an intermediate moving image. An intermediate moving image is a set of temporarily stored frame data that has not been put into a specific file format.

In step S34, the cutout processing unit 53 determines whether the cutout processing has been completed for all subjects that are cutout targets.
If the cutout processing has not been completed for all subjects that are cutout targets, the determination in step S34 is NO, and the processing proceeds to step S35.

In step S35, the cutout processing unit 53 selects the subject with the next priority as the processing target.
After step S35, the processing proceeds to step S33.
If the cutout processing has been completed for all subjects that are cutout targets, the determination in step S34 is YES, and the processing returns to the image generation processing of FIG. 10.
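The loop formed by steps S31 to S35 can be sketched as follows. The `cut_out` callback stands in for the actual spatial or temporal cutout, and the early return for an empty target list is an added assumption, since the flowchart as described does not cover that case; all names here are hypothetical.

```python
def cutout_process(subjects_by_priority, cut_out):
    """Produce one intermediate clip per target, highest priority first."""
    remaining = list(subjects_by_priority)   # S31: targets already identified
    intermediates = []
    if not remaining:                        # assumption: nothing to cut out
        return intermediates
    current = remaining.pop(0)               # S32: highest-priority target
    while True:
        intermediates.append(cut_out(current))  # S33: generate intermediate clip
        if not remaining:                    # S34: all targets processed?
            return intermediates
        current = remaining.pop(0)           # S35: next-priority target

result = cutout_process(["A", "B", "C"], lambda subject: f"clip-{subject}")
```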

[Spatial combination processing]
FIG. 12 is a flowchart illustrating the flow of the spatial combination processing of step S7 in FIG. 10.
In step S51, the combination processing unit 55 combines the intermediate moving images corresponding to the priority of each composite position in the selected layout (intermediate moving images whose size and aspect ratio have been adjusted) by compositing them according to the layout.

In step S52, the combination processing unit 55 converts the moving image obtained by combining the intermediate moving images into a file. In this embodiment, the file format used when converting a moving image into a file can be, for example, one conforming to MPEG-4 (Moving Picture Experts Group 4), H.264, or H.265.
After step S52, the processing returns to the image generation processing.
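The compositing in step S51 can be illustrated for a single frame as below. This is a sketch under assumptions rather than the described implementation: frames are plain nested lists of pixel values, the size adjustment is a nearest-neighbour resize, and the layout format (priority rank mapped to an x, y, width, height rectangle) and all names are hypothetical.

```python
def resize_nearest(image, width, height):
    """Nearest-neighbour resize of a 2D list of pixels."""
    src_h, src_w = len(image), len(image[0])
    return [[image[y * src_h // height][x * src_w // width]
             for x in range(width)]
            for y in range(height)]

def compose(layout, tiles, canvas_w, canvas_h, background=0):
    """Place each tile into the rectangle of its priority rank.

    layout maps rank -> (x, y, w, h); tiles maps rank -> 2D pixel list.
    """
    canvas = [[background] * canvas_w for _ in range(canvas_h)]
    for rank, (x0, y0, w, h) in layout.items():
        tile = resize_nearest(tiles[rank], w, h)   # adjust size to the region
        for dy in range(h):
            canvas[y0 + dy][x0:x0 + w] = tile[dy]
    return canvas

# One large region for priority 1 and two small regions for priorities 2 and 3.
layout = {1: (0, 0, 4, 4), 2: (4, 0, 2, 2), 3: (4, 2, 2, 2)}
tiles = {1: [[1]], 2: [[2, 2], [2, 2]], 3: [[3]]}
frame = compose(layout, tiles, canvas_w=6, canvas_h=4)
```

For a moving image, the same placement would be repeated for every output frame before the result is written to a file.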

[Temporal combination processing]
FIG. 13 is a flowchart illustrating the flow of the temporal combination processing of step S8 in FIG. 10.
In step S71, the combination processing unit 55 combines the intermediate moving images by joining them end to end in the priority order given by the specified temporal combination criterion.

In step S72, the combination processing unit 55 converts the moving image obtained by combining the intermediate moving images into a file.
After step S72, the processing returns to the image generation processing.
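Joining the intermediate moving images in step S71 is, in essence, a concatenation of frame sequences in priority order. The sketch below assumes each intermediate moving image is a list of frames held in memory; the `concatenate` helper and the frame labels are hypothetical.

```python
from itertools import chain

def concatenate(intermediates, order):
    """Join the frame lists of the intermediate clips in the given order."""
    return list(chain.from_iterable(intermediates[subject] for subject in order))

intermediates = {"A": ["a0", "a1"], "B": ["b0"], "C": ["c0", "c1"]}
joined = concatenate(intermediates, ["B", "A", "C"])   # e.g. ordered by cutout size
```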

As a result of this processing, even when the moving image selected by the user (the original moving image) includes a plurality of persons as subjects, a plurality of cutout targets are cut out according to the spatial cutout criterion or the temporal cutout criterion. The cut-out moving images are then combined in priority order according to the spatial combination criterion or the temporal combination criterion, and a new moving image is generated.
Therefore, even when a plurality of persons are apart from one another in the screen, each person can be appropriately cut out from the moving image to generate a new moving image.
Accordingly, an effective image can be generated regardless of the positional relationship of the plurality of subjects present in the screen.

In addition, cutout targets cut out according to a spatial cutout criterion can be combined according to a temporal combination criterion to generate a new moving image.
This makes it possible to display subjects that appear at separate positions in the moving image one after another in time, in conformity with a preset criterion.
Likewise, cutout targets cut out according to a temporal cutout criterion can be combined according to a spatial combination criterion to generate a new moving image.
This makes it possible to display the moving images of a plurality of subjects in a spatially arranged state, in conformity with a preset criterion, regardless of whether the subjects appear in the moving image at the same time.

In this embodiment, a moving image has been described as an example of the source from which images are cut out. However, images may also be cut out from a series of still images, such as a plurality of continuously shot still images.

[Second Embodiment]
Next, a second embodiment of the present invention will be described.
The image processing apparatus 1 according to the second embodiment spatially cuts out subjects from a still image and spatially combines the cut-out still images to generate a new still image.
That is, the image processing apparatus 1 according to the second embodiment shares its hardware configuration and the functional configuration for executing the image generation processing with the first embodiment, and differs from the first embodiment mainly in the content of the image generation processing.

FIG. 14 is a flowchart illustrating the flow of the image generation processing executed by the image processing apparatus 1 in the second embodiment.
The image generation processing starts when the user performs an operation on the input unit 17 to start it.

In step S101, the image selection unit 51 selects, from the image data stored in the image storage unit 71, the still image data corresponding to the user's instruction input. Hereinafter, the selected still image data is referred to as the "original still image".
In step S102, the cutout criterion specifying unit 52 specifies the cutout criterion (spatial cutout criterion) to be used when cutting out a person's face portion from the still image. In this embodiment, one of spatial cutout criteria 1 to 3, out of spatial cutout criteria 1 to 4 of the first embodiment, is specified. As in the first embodiment, the cutout criterion selected the previous time is specified as the one to use this time, but the criterion may instead be specified by a user operation on the input unit 17 for selecting a cutout criterion.

ステップＳ１０３において、切り出し処理部５３は、切り出し基準特定部５２によって特定された切り出し基準に従って、静止画像から人物の顔の部分を切り出す切り出し処理を実行する。なお、切り出し処理の詳細は後述する。
ステップＳ１０４において、結合基準特定部５４は、切り出し処理部５３によって切り出された静止画像を結合する際の結合基準（空間的結合基準）を特定する。なお、本実施形態においては、第１実施形態における空間的結合基準１〜３のうち、空間的結合基準１，２のいずれかが特定される。ここでも、第１実施形態同様、前回選択された結合基準を今回使用する結合基準として特定するが、ユーザによる入力部１７への切り出し基準の選択のための操作により特定してもよい。 In step S103, the clipping processing unit 53 executes clipping processing to clip out the face portion of the person from the still image in accordance with the clipping reference identified by the clipping reference identifying unit 52. The details of the clipping process will be described later.
In step S104, the combination criterion specifying unit 54 specifies the combination criterion (spatial combination criterion) to be used when combining the still images cut out by the cutout processing unit 53. In the present embodiment, one of spatial combination criteria 1 and 2 is specified from among the spatial combination criteria 1 to 3 of the first embodiment. Here, as in the first embodiment, the combination criterion selected last time is specified as the combination criterion to be used this time, but the criterion may instead be specified by a user operation on the input unit 17 for selecting the cutout criterion.

ステップＳ１０５において、レイアウト選択部５６は、ステップＳ１０４において特定された結合基準に基づき、ステップＳ１０３において切り出し処理により切り出された切り出し画像数及び最優先の切り出し対象に基づいて、レイアウトテーブルを参照し、レイアウトデータを選択する。
ステップＳ１０６において、結合処理部５５は、切り出された静止画像を空間的結合基準に従って結合する結合処理を実行する。なお、結合処理の詳細は後述する。
ステップＳ１０６の後、画像生成処理は終了となる。 In step S105, the layout selection unit 56 refers to the layout table and selects layout data based on the combination criterion specified in step S104, the number of images cut out by the cutout process in step S103, and the highest-priority cutout target.
In step S106, the combining processing unit 55 performs a combining process of combining the cut out still images in accordance with the spatial combining criteria. The details of the combining process will be described later.
After step S106, the image generation process ends.
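The layout lookup of step S105 can be pictured as a table keyed by the number of cut-out images and the highest-priority cutout target. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `LAYOUT_TABLE` and `select_layout`, the `"*"` wildcard, the slot representation, and all pixel values are hypothetical and do not appear in the specification.

```python
from typing import Dict, List, Tuple

# A slot is (x, y, width, height) on the output canvas, listed in priority order.
Slot = Tuple[int, int, int, int]

# Hypothetical layout table: (number of cut-out images, top-priority subject) -> slots.
# The subject "*" stands for "any subject"; every entry is a made-up example value.
LAYOUT_TABLE: Dict[Tuple[int, str], List[Slot]] = {
    (2, "*"): [(0, 0, 640, 720), (640, 0, 640, 720)],
    (3, "*"): [(0, 0, 853, 720), (853, 0, 427, 360), (853, 360, 427, 360)],
    (3, "person_A"): [(0, 0, 960, 720), (960, 0, 320, 360), (960, 360, 320, 360)],
}


def select_layout(num_images: int, top_subject: str) -> List[Slot]:
    """Prefer a subject-specific layout, then fall back to the generic one."""
    for key in ((num_images, top_subject), (num_images, "*")):
        if key in LAYOUT_TABLE:
            return LAYOUT_TABLE[key]
    raise KeyError(f"no layout defined for {num_images} images")
```

Keeping the slots in priority order lets a later combining step pair the i-th highest-priority cut-out with the i-th slot without any further lookup.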

［切り出し処理］
図１５は、図１４におけるステップＳ１０３の切り出し処理の流れを説明するフローチャートである。
ステップＳ１２１において、切り出し処理部５３は、オリジナルの静止画像をサーチし、切り出し対象となる被写体を特定する。 [Cut-out process]
FIG. 15 is a flowchart for explaining the flow of the clipping process of step S103 in FIG. 14.
In step S121, the clipping processing unit 53 searches the original still image and identifies the subjects to be clipped.

ステップＳ１２２において、切り出し処理部５３は、特定されている空間的切り出し基準に基づいて、最優先の被写体を処理対象として選択する。
ステップＳ１２３において、切り出し処理部５３は、選択された被写体を空間的に切り出し、中間静止画像を生成する。中間静止画像とは、特定のファイル形式とされていない一時的に記憶される画素データの集合である。 In step S122, the clipping processing unit 53 selects a subject with the highest priority as a processing target based on the specified spatial clipping standard.
In step S123, the cutout processing unit 53 spatially cuts out the selected subject and generates an intermediate still image. The intermediate still image is a set of temporarily stored pixel data that is not in a specific file format.

ステップＳ１２４において、切り出し処理部５３は、全ての切り出し対象となる被写体の切り出し処理が終了したか否かの判定を行う。
全ての切り出し対象となる被写体の切り出し処理が終了していない場合、ステップＳ１２４においてＮＯと判定されて、処理はステップＳ１２５に移行する。 In step S124, the clipping processing unit 53 determines whether or not the clipping processing of all the subjects to be clipped is completed.
If the clipping process of all the subjects to be clipped is not completed, it is determined as NO in step S124, and the process proceeds to step S125.

ステップＳ１２５において、切り出し処理部５３は、次の優先順位の被写体を処理対象として選択する。
ステップＳ１２５の後、処理はステップＳ１２３に移行する。
一方、全ての切り出し対象となる被写体の切り出し処理が終了した場合、ステップＳ１２４においてＹＥＳと判定されて、処理は図１４の画像生成処理に戻る。 In step S125, the cutout processing unit 53 selects a subject of the next priority as a processing target.
After step S125, the process proceeds to step S123.
On the other hand, when the clipping process has been completed for all the subjects to be clipped, YES is determined in step S124, and the process returns to the image generation process of FIG. 14.
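The loop of steps S122 to S125 amounts to visiting the identified subjects in priority order and cropping one intermediate still image per subject. The sketch below illustrates that loop under assumptions made here for illustration only: an image is modelled as a list of pixel rows, a subject as a `(priority, box)` pair in which a smaller number means higher priority, and the function names `crop` and `clip_subjects` are hypothetical. Subject identification itself (step S121) is assumed to have already produced the list.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height) of a detected face
Image = List[List[int]]          # rows of pixel values; a stand-in for real image data


def crop(image: Image, box: Box) -> Image:
    """Spatially cut out one rectangular region (step S123)."""
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def clip_subjects(image: Image, subjects: List[Tuple[int, Box]]) -> List[Image]:
    """Cut out every subject in priority order (steps S122 to S125).

    `subjects` holds (priority, box) pairs; a lower number means higher priority.
    The returned list of intermediate images is in that same priority order.
    """
    intermediates = []
    for _priority, box in sorted(subjects, key=lambda s: s[0]):
        intermediates.append(crop(image, box))
    return intermediates
```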

［結合処理］
図１６は、図１４におけるステップＳ１０６の結合処理の流れを説明するフローチャートである。
ステップＳ１４１において、結合処理部５５は、選択されたレイアウトの各合成位置の優先順位に対応する中間静止画像（サイズやアスペクト比を調整した中間静止画像）をレイアウトに従って合成することにより結合する。 [Join process]
FIG. 16 is a flowchart for explaining the flow of the combining process of step S106 in FIG. 14.
In step S141, the combining processing unit 55 combines the intermediate still images (intermediate still images whose sizes and aspect ratios have been adjusted) corresponding to the priorities of the combining positions of the selected layout according to the layout.

ステップＳ１４２において、結合処理部５５は、中間静止画像を結合した静止画像をファイル化する。なお、本実施形態において、静止画像をファイル化する際のファイル形式としては、例えば、ＪＰＥＧ（ＪｏｉｎｔＰｈｏｔｏｇｒａｐｈｉｃＥｘｐｅｒｔｓＧｒｏｕｐ）或いはＧＩＦ（ＧｒａｐｈｉｃＩｎｔｅｒｃｈａｎｇｅＦｏｒｍａｔ）等に準拠したものとすることができる。
ステップＳ１４２の後、処理は図１４の画像生成処理に戻る。 In step S142, the combining processing unit 55 saves the still image obtained by combining the intermediate still images as a file. In the present embodiment, the file format used when saving the still image may conform to, for example, JPEG (Joint Photographic Experts Group) or GIF (Graphic Interchange Format).
After step S142, the process returns to the image generation process of FIG. 14.
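Step S141 can be sketched as pasting each intermediate still image, resized to fit, into the layout position of matching priority. This is only an illustration under assumptions: images are lists of pixel rows, a layout is a list of `(x, y, width, height)` slots in priority order, nearest-neighbour resizing is chosen here for brevity, and the names `resize_nearest` and `combine` are hypothetical. Writing the result to a JPEG or GIF file (step S142) is left out.

```python
from typing import List, Tuple

Image = List[List[int]]           # rows of pixel values
Slot = Tuple[int, int, int, int]  # (x, y, width, height) on the output canvas


def resize_nearest(image: Image, width: int, height: int) -> Image:
    """Nearest-neighbour resize so that a cut-out fits its slot."""
    src_h, src_w = len(image), len(image[0])
    return [[image[y * src_h // height][x * src_w // width] for x in range(width)]
            for y in range(height)]


def combine(cutouts: List[Image], slots: List[Slot],
            canvas_size: Tuple[int, int], background: int = 0) -> Image:
    """Place the i-th highest-priority cut-out into the i-th slot (step S141)."""
    canvas_w, canvas_h = canvas_size
    canvas = [[background] * canvas_w for _ in range(canvas_h)]
    for cutout, (x0, y0, w, h) in zip(cutouts, slots):
        scaled = resize_nearest(cutout, w, h)
        for dy in range(h):
            # Overwrite one row segment of the canvas with the scaled cut-out row.
            canvas[y0 + dy][x0:x0 + w] = scaled[dy]
    return canvas
```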

このような処理の結果、ユーザによって選択された静止画像において、複数の人物が被写体として含まれる場合であっても、空間的切り出し基準に従って、複数の切り出し対象が切り出される。そして、切り出された静止画像が、空間的結合基準に従って、優先順位に応じて結合され、新たな静止画像が生成される。
そのため、複数の人物が画面内で離れている場合であっても、それぞれの人物を適切に静止画像から切り出して、新たな静止画像を生成することができる。
したがって、画面内に存在する複数の被写体の位置関係によらず、効果的な画像を生成することが可能となる。 As a result of such processing, even if a plurality of persons are included as subjects in the still image selected by the user, a plurality of clipping targets are clipped according to the spatial clipping standard. Then, the clipped still images are combined according to the priority according to the spatial combining criteria to generate a new still image.
Therefore, even when a plurality of persons are separated in the screen, each person can be appropriately cut out from the still image, and a new still image can be generated.
Therefore, it becomes possible to generate an effective image regardless of the positional relationship of a plurality of subjects present in the screen.

［変形例１］
上述の実施形態においては、人物の顔等の被写体を空間的に切り出す際に、各被写体を個別に切り出す場合を例に挙げて説明した。これに対し、人物の顔等の被写体を空間的に切り出す際に、画面内で近接している複数の被写体を、１つの領域でまとめて切り出すこととしてもよい。
図１７は、複数の被写体をまとめて空間的に切り出す概念を示す模式図である。
図１７においては、動画像のフレームまたは静止画像内に、特定の人物として認証された人物Ａ〜Ｃの顔が含まれており、人物Ａ及び人物Ｂの顔が設定された閾値以内の距離に位置している。
このとき、人物Ａ及び人物Ｂの顔をまとめて１つの領域で空間的に切り出すことができる。
これにより、空間的に関連性が高い被写体が分離されることなく切り出されるため、新たな画像を生成する際に、被写体の状況に応じて、被写体間の関係をより適切なものとすることができる。 [Modification 1]
In the above-described embodiment, the case where each subject, such as a person's face, is cut out individually when subjects are spatially cut out was described as an example. Alternatively, when subjects such as persons' faces are spatially cut out, a plurality of subjects that are close to each other in the screen may be cut out together as one region.
FIG. 17 is a schematic view showing a concept of collectively cutting out a plurality of subjects.
In FIG. 17, a frame of a moving image or a still image includes the faces of persons A to C, each authenticated as a specific person, and the faces of person A and person B are located within a set threshold distance of each other.
At this time, the faces of the person A and the person B can be collectively cut out spatially in one region.
As a result, subjects with high spatial relevance are cut out without being separated, so that when a new image is generated, the relationship between the subjects can be made more appropriate to their situation.
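One way to realise the grouping described in Modification 1 is to merge face rectangles whose centres lie within the threshold distance and cut out the bounding rectangle of each group. The specification does not say how the distance is measured, so the centre-to-centre measure, the chained grouping rule, and the names `group_close_faces`, `center_distance` and `union` below are assumptions made for this sketch.

```python
from math import hypot
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def center_distance(a: Box, b: Box) -> float:
    """Distance between the centres of two rectangles (an assumed measure)."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return hypot(ax - bx, ay - by)


def union(a: Box, b: Box) -> Box:
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def group_close_faces(boxes: List[Box], threshold: float) -> List[Box]:
    """Merge faces whose centres lie within `threshold` into one cutout region."""
    groups: List[List[Box]] = []
    for box in boxes:
        # Collect every existing group that has a member close to this face.
        near = [g for g in groups
                if any(center_distance(m, box) <= threshold for m in g)]
        merged = [box]
        for g in near:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    regions = []
    for g in groups:
        region = g[0]
        for m in g[1:]:
            region = union(region, m)
        regions.append(region)
    return regions
```

With three faces where only A and B are close, the function returns two regions: one covering A and B together, and one covering C alone.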

以上のように構成される画像処理装置１は、切り出し処理部５３と、結合処理部５５と、を備える。
切り出し処理部５３は、１つの動画像から、空間的及び／又は時間的な切り出し基準に基づき、複数の動画像を切り出す。
結合処理部５５は、切り出された前記複数の動画像を、空間的及び／又は時間的な結合基準に基づき、時間的又は空間的に結合して、新たな１つの動画像を生成する。
これにより、空間的及び／又は時間的な切り出し基準に従って切り出された複数の動画像から、空間的及び／又は時間的な結合基準に従って１つの動画像を生成することができる。
そのため、複数の被写体が画面内で離れている場合であっても、それぞれの被写体を適切に動画像から切り出して、新たな動画像を生成することができる。
したがって、画面内に存在する複数の被写体の位置関係によらず、効果的な画像を生成することが可能となる。 The image processing apparatus 1 configured as described above includes a cutout processing unit 53 and a combination processing unit 55.
The cutout processing unit 53 cuts out a plurality of moving images from one moving image based on spatial and / or temporal cutout criteria.
The combination processing unit 55 combines the extracted moving images temporally and / or spatially based on the spatial and / or temporal connection criteria to generate one new moving image.
Thereby, one moving image can be generated according to the spatial and / or temporal connection criterion from a plurality of moving images segmented according to the spatial and / or temporal cutout criterion.
Therefore, even when a plurality of subjects are separated in the screen, each subject can be appropriately cut out from the moving image, and a new moving image can be generated.
Therefore, it becomes possible to generate an effective image regardless of the positional relationship of a plurality of subjects present in the screen.

また、結合処理部５５は、空間的な切り出し基準で切り出された複数の動画像を、時間的な結合基準で結合する。
これにより、動画像において離れた位置に写っている被写体を、予め設定された時間的な結合基準に適合させて、時間的に連続して表示することが可能となる。 Further, the combining processing unit 55 combines a plurality of moving images cut out based on the spatial cut-out reference based on the temporal connection reference.
As a result, it becomes possible to match the subject appearing at a distant position in the moving image with the preset temporal connection reference and to display the subject continuously in time.

また、結合処理部５５は、時間的な切り出し基準で切り出された複数の動画像を、空間的な結合基準で結合する。
これにより、動画像において同時に写っている被写体であるか否かに関わらず、予め設定された空間的な結合基準に適合させて、複数の被写体の動画像を空間的に配置された状態で表示することが可能となる。 Further, the combining processing unit 55 combines a plurality of moving images cut out based on the temporal cut-out reference based on the spatial connection reference.
As a result, regardless of whether or not the subjects appear in the moving image at the same time, the moving images of a plurality of subjects can be displayed in a spatially arranged state in conformity with a preset spatial combination criterion.

また、切り出し処理部５３は、空間的な切り出し基準として、動画像を構成する個々の画面内に含まれる所定の被写体部分を空間的に切り出す。
これにより、動画像の各画面において、所定の被写体の領域を適切に切り出すことができる。 Also, the clipping processing unit 53 spatially cuts out a predetermined subject portion included in each screen constituting the moving image as a spatial clipping reference.
As a result, it is possible to appropriately cut out the area of the predetermined subject on each screen of the moving image.

また、切り出し処理部５３は、時間的な切り出し基準として、動画像を構成するフレームのうち所定の被写体が含まれるフレームを時間的に切り出す。
これにより、動画像において、所定の被写体が含まれるフレームの期間を適切に切り出すことができる。 Further, the clipping processing unit 53 cuts out, in time, a frame including a predetermined subject among frames constituting a moving image, as a temporal clipping reference.
As a result, it is possible to appropriately cut out the period of a frame in which a predetermined subject is included in a moving image.
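Temporal cutout can be illustrated as finding the runs of consecutive frames in which the subject is present. In this sketch the per-frame detection result is assumed to be available as a list of booleans, and the function name `frame_periods` and its optional `min_length` parameter (a minimum run length in frames, analogous to a threshold time) are hypothetical.

```python
from typing import List, Optional, Tuple


def frame_periods(present: List[bool], min_length: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, for each run of True.

    Runs shorter than `min_length` frames are discarded.
    """
    periods = []
    start: Optional[int] = None
    for i, flag in enumerate(present):
        if flag and start is None:
            start = i                      # a new run begins at frame i
        elif not flag and start is not None:
            if i - start >= min_length:    # the run ended at frame i - 1
                periods.append((start, i))
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(present) - start >= min_length:
        periods.append((start, len(present)))
    return periods
```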

また、所定の被写体は、予め登録されている人物の顔である。
これにより、予め登録された複数の人物が画面内で離れて写っている場合であっても、それぞれの被写体を適切に動画像から切り出すことができる。 The predetermined subject is the face of a person registered in advance.
As a result, even when a plurality of persons registered in advance are captured separately in the screen, each subject can be appropriately cut out of the moving image.

また、所定の被写体は、更に、不特定の人物の顔を含む。
これにより、不特定の人物の顔が写っている動画において、複数の人物を適切に動画像から切り出すことができる。 In addition, the predetermined subject further includes the face of an unspecified person.
Thus, in a moving image in which the face of an unspecified person is taken, a plurality of persons can be appropriately cut out from the moving image.

また、切り出し処理部５３による切り出しの対象としない人物の顔が予め登録される。
これにより、動画像に写っている複数の人物の中から、特定の人物を除外して、効果的な動画像を生成することができる。 In addition, the faces of persons not to be extracted by the extraction processing unit 53 are registered in advance.
Thus, it is possible to generate an effective moving image by excluding a specific person from among a plurality of persons appearing in the moving image.

また、結合処理部５５は、空間的な結合基準として、切り出された動画像の個々を優先順位に基づいて、画面を空間的に分割し優先順位が付けられた各領域に割り当てる。
これにより、切り出された被写体の動画像の優先順位と空間的な優先順位と対応させて、効果的な動画像を生成することができる。 Further, as the spatial combination criterion, the combining processing unit 55 assigns each of the cut-out moving images, based on its priority, to one of the prioritized regions into which the screen is spatially divided.
In this way, an effective moving image can be generated by matching the priority of each cut-out subject's moving image to the spatial priority of the regions.

また、画像処理装置１は、レイアウト選択部５６を備える。
レイアウト選択部５６は、切り出された動画像の数、或いは画像に含まれる被写体に対応する、複数の画像を結合する数、大きさ或いは位置関係が定義されたレイアウトを選択する。
結合処理部５５は、切り出された複数の画像を、選択されたレイアウトに結合して新たな１つの画像を生成する。
これにより、切り出された動画像を自動的に適切なレイアウトに結合して、新たな動画像を生成することができる。 The image processing apparatus 1 further includes a layout selection unit 56.
The layout selection unit 56 selects a layout that defines the number, sizes, or positional relationship of the images to be combined, and that corresponds to the number of cut-out moving images or to the subjects included in the images.
The combining processing unit 55 combines the plurality of cut out images into the selected layout to generate one new image.
Thereby, the extracted moving image can be automatically combined into an appropriate layout to generate a new moving image.

また、結合処理部５５は、時間的な結合基準として、切り出された動画像の個々を優先順位に基づいて、時間的に繋げる。
これにより、切り出された動画像の優先順位に対応する順序で、複数の被写体の動画像を繋げて新たな動画像を生成することができる。 Further, the combining processing unit 55 temporally connects each of the cut out moving images based on the priority as a temporal combining reference.
Thus, moving images of a plurality of subjects can be connected to generate a new moving image in the order corresponding to the priority of the cut out moving images.

また、優先順位は、画像に含まれる被写体に対応する予め登録されている優先順位、切り出された被写体部分の空間的な大きさ、或いは切り出された動画像の時間的な長さである。
これにより、切り出された動画像の属性に応じて、適切な優先順位を設定することができる。 Further, the priority is a priority registered in advance corresponding to a subject included in the image, a spatial size of the clipped subject portion, or a temporal length of the clipped moving image.
Thereby, an appropriate priority can be set according to the attribute of the extracted moving image.
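The three bases for priority named above (a pre-registered priority, the spatial size of the cut-out portion, and the temporal length of the cut-out moving image) can each be expressed as a sort key, after which temporal combination is a concatenation in sorted order. The sketch below is illustrative only: the `Clip` fields, the `PRIORITY_KEYS` table, the function `join_in_time`, and the choice that larger or longer clips rank higher are assumptions not stated in the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Clip:
    subject: str          # who appears in the clip
    registered_rank: int  # pre-registered priority; 1 is highest
    area: int             # spatial size of the cut-out region, in pixels
    frames: List[str]     # stand-in for frame data


# Each key maps a clip to a sort value; clips with smaller values come first.
PRIORITY_KEYS: Dict[str, Callable[[Clip], int]] = {
    "registered": lambda c: c.registered_rank,
    "size": lambda c: -c.area,           # assumed: larger region first
    "length": lambda c: -len(c.frames),  # assumed: longer clip first
}


def join_in_time(clips: List[Clip], basis: str) -> List[str]:
    """Order clips by the chosen priority basis and concatenate their frames."""
    ordered = sorted(clips, key=PRIORITY_KEYS[basis])
    return [frame for clip in ordered for frame in clip.frames]
```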

また、画像処理装置１は、切り出し処理部５３と、レイアウト選択部５６と、結合処理部５５と、を備える。
切り出し処理部５３は、１つの画像から、空間的な切り出し基準に基づき、複数の画像を切り出す。
レイアウト選択部５６は、切り出された画像の特徴に基づき、複数の画像を結合する数、大きさ或いは位置関係が定義されたレイアウトを選択する。
結合処理部５５は、切り出された複数の画像を、選択されたレイアウトに結合して新たな１つの画像を生成する。
これにより、空間的な切り出し基準に従って切り出された複数の画像から、空間的な結合基準に従って１つの静止画像または動画像を生成することができる。
そのため、複数の被写体が画面内で離れている場合であっても、それぞれの被写体を適切に静止画像から切り出して、新たな静止画像または動画像を生成することができる。
したがって、画面内に存在する複数の被写体の位置関係によらず、効果的な画像を生成することが可能となる。 The image processing apparatus 1 further includes a cutout processing unit 53, a layout selection unit 56, and a combination processing unit 55.
The cutout processing unit 53 cuts out a plurality of images from one image based on spatial cutout criteria.
The layout selection unit 56 selects, based on the features of the cut-out images, a layout that defines the number, sizes, or positional relationship of the images to be combined.
The combining processing unit 55 combines the plurality of cut out images into the selected layout to generate one new image.
Thus, it is possible to generate one still image or moving image according to the spatial connection criterion from a plurality of images extracted according to the spatial extraction criterion.
Therefore, even when a plurality of subjects are separated in the screen, each subject can be appropriately cut out from the still image, and a new still image or moving image can be generated.
Therefore, it becomes possible to generate an effective image regardless of the positional relationship of a plurality of subjects present in the screen.

なお、本発明は、上述の実施形態に限定されるものではなく、本発明の目的を達成できる範囲での変形、改良等は本発明に含まれるものである。 The present invention is not limited to the above-described embodiment, and modifications, improvements, and the like in the range in which the object of the present invention can be achieved are included in the present invention.

上述の実施形態において、人物の顔を切り出しの対象として説明したが、これに限られない。即ち、人物の他の部位であってもよく、更に他の生物でも、物でも、それらの部分でも、画像から認証或いは検出が可能であればよい。 In the above-described embodiment, the face of a person was described as the cutout target, but the target is not limited to this. That is, the target may be another part of a person, or another living thing, an object, or a part thereof, as long as it can be authenticated or detected from the image.

上述の実施形態において、オリジナルの静止画像または動画像から切り出された画像を結合して生成される新たな静止画像または動画像のサイズ及びアスペクト比は、オリジナルの静止画像または動画像と同一または異なるものとすることができる。 In the above-described embodiment, the size and aspect ratio of the new still image or moving image generated by combining the images cut out from the original still image or moving image may be the same as or different from those of the original still image or moving image.

また、上述の実施形態において、レイアウトには、背景と、背景に合成する画像の数と、大きさ及び位置関係とが定義されているものとして説明したが、これに限られない。即ち、レイアウトには、背景、合成する画像の数、大きさ、位置関係のうちの一部を定義したり、これら以外の要素を定義したりすることができる。 Further, in the above-described embodiment, it has been described that the layout defines the background, the number of images to be combined with the background, and the size and positional relationship, but the present invention is not limited to this. That is, in the layout, a part of the background, the number of images to be synthesized, the size, and the positional relationship can be defined, or other elements can be defined.

また、上述の実施形態において、画像記憶部７１に記憶されている画像のデータを対象として画像生成処理を行うこととして説明したが、これに限られない。例えば、撮像部１６によって撮像される静止画像または動画像を対象として画像生成処理を行うこととしてもよい。 In the above-described embodiment, the image generation processing is performed on the data of the image stored in the image storage unit 71. However, the present invention is not limited to this. For example, the image generation processing may be performed on a still image or a moving image captured by the imaging unit 16.

また、上述の実施形態において、空間的切り出し基準４では、動画像において検出された一人の顔または特定の人物として認証された一人の顔の正面／右向き／左向きを別々に切り出し対象としたが、これに限られない。例えば、動画像において検出された一人の顔または特定の人物として認証された一人の顔の笑顔、怒った顔、泣き顔等を別々に切り出し対象としてもよい。 Further, in the above-described embodiment, under spatial cutout criterion 4, the front-facing, right-facing, and left-facing views of one face detected in a moving image, or of one face authenticated as a specific person, are treated as separate cutout targets, but the criterion is not limited to this. For example, a smiling face, an angry face, a crying face, and the like of one face detected in a moving image, or of one face authenticated as a specific person, may be treated as separate cutout targets.

また、上述の実施形態において、レイアウトの種類は、切り出された画像を結合できるものであれば種々のものを採用することができる。例えば、レイアウトの種類は、画面内により小さい画面を重ねて画像を表示するＰｉｎＰ（ＰｉｃｔｕｒｅＩｎＰｉｃｔｕｒｅ）の形態等とすることができる。 Further, in the above-described embodiment, various types of layout can be adopted as long as they can combine cut-out images. For example, the type of layout can be a form of PinP (Picture In Picture) in which images are displayed by overlapping smaller screens in the screen.

また、上述の実施形態では、本発明が適用される画像処理装置１は、デジタルカメラを例として説明したが、特にこれに限定されない。
例えば、本発明は、画像生成処理機能を有する電子機器一般に適用することができる。具体的には、本発明は、ノート型のパーソナルコンピュータ、プリンタ、テレビジョン受像機、ビデオカメラ、携帯型ナビゲーション装置、携帯電話機、スマートフォン、ポータブルゲーム機等に適用可能である。 Moreover, in the above-mentioned embodiment, although the image processing apparatus 1 to which this invention is applied was demonstrated as an example of the digital camera, it is not specifically limited to this.
For example, the present invention can be applied to electronic devices in general having an image generation processing function. Specifically, the present invention is applicable to a laptop personal computer, a printer, a television receiver, a video camera, a portable navigation device, a portable telephone, a smart phone, a portable game machine, and the like.

上述した一連の処理は、ハードウェアにより実行させることもできるし、ソフトウェアにより実行させることもできる。
換言すると、図２の機能的構成は例示に過ぎず、特に限定されない。即ち、上述した一連の処理を全体として実行できる機能が画像処理装置１に備えられていれば足り、この機能を実現するためにどのような機能ブロックを用いるのかは特に図２の例に限定されない。
また、１つの機能ブロックは、ハードウェア単体で構成してもよいし、ソフトウェア単体で構成してもよいし、それらの組み合わせで構成してもよい。 The series of processes described above can be performed by hardware or software.
In other words, the functional configuration of FIG. 2 is merely illustrative and not particularly limited. That is, it is sufficient if the image processing apparatus 1 has a function capable of executing the above-described series of processes as a whole, and the functional blocks used to realize this function are not particularly limited to the example of FIG. 2.
Further, one functional block may be configured by hardware alone, may be configured by software alone, or may be configured by a combination of them.

一連の処理をソフトウェアにより実行させる場合には、そのソフトウェアを構成するプログラムが、コンピュータ等にネットワークや記録媒体からインストールされる。
コンピュータは、専用のハードウェアに組み込まれているコンピュータであってもよい。また、コンピュータは、各種のプログラムをインストールすることで、各種の機能を実行することが可能なコンピュータ、例えば汎用のパーソナルコンピュータであってもよい。 When the series of processes are executed by software, a program that configures the software is installed on a computer or the like from a network or a recording medium.
The computer may be a computer incorporated in dedicated hardware. The computer may be a computer capable of executing various functions by installing various programs, for example, a general-purpose personal computer.

このようなプログラムを含む記録媒体は、ユーザにプログラムを提供するために装置本体とは別に配布される図１のリムーバブルメディア３１により構成されるだけでなく、装置本体に予め組み込まれた状態でユーザに提供される記録媒体等で構成される。リムーバブルメディア３１は、例えば、磁気ディスク（フロッピディスクを含む）、光ディスク、または光磁気ディスク等により構成される。光ディスクは、例えば、ＣＤ−ＲＯＭ（ＣｏｍｐａｃｔＤｉｓｋ−ＲｅａｄＯｎｌｙＭｅｍｏｒｙ），ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ），Ｂｌｕ−ｒａｙ（登録商標）Ｄｉｓｃ（ブルーレイディスク）等により構成される。光磁気ディスクは、ＭＤ（Ｍｉｎｉ−Ｄｉｓｋ）等により構成される。また、装置本体に予め組み込まれた状態でユーザに提供される記録媒体は、例えば、プログラムが記録されている図１のＲＯＭ１２や、図１の記憶部１９に含まれるハードディスク等で構成される。 A recording medium containing such a program is constituted not only by the removable medium 31 of FIG. 1, which is distributed separately from the apparatus main body in order to provide the program to the user, but also by a recording medium or the like that is provided to the user in a state of being incorporated in the apparatus main body in advance. The removable medium 31 is composed of, for example, a magnetic disk (including a floppy disk), an optical disk, or a magneto-optical disk. The optical disk is composed of, for example, a CD-ROM (Compact Disk-Read Only Memory), a DVD (Digital Versatile Disk), a Blu-ray (registered trademark) Disc, or the like. The magneto-optical disk is composed of an MD (Mini-Disk) or the like. The recording medium provided to the user in a state of being incorporated in the apparatus main body in advance is composed of, for example, the ROM 12 of FIG. 1 in which the program is recorded, or a hard disk included in the storage unit 19 of FIG. 1.

なお、本明細書において、記録媒体に記録されるプログラムを記述するステップは、その順序に沿って時系列的に行われる処理はもちろん、必ずしも時系列的に処理されなくとも、並列的或いは個別に実行される処理をも含むものである。 In this specification, the steps describing the program recorded on the recording medium include not only processes performed in time series in the stated order, but also processes that are executed in parallel or individually and are not necessarily performed in time series.

以上、本発明のいくつかの実施形態について説明したが、これらの実施形態は、例示に過ぎず、本発明の技術的範囲を限定するものではない。本発明はその他の様々な実施形態を取ることが可能であり、さらに、本発明の要旨を逸脱しない範囲で、省略や置換等種々の変更を行うことができる。これら実施形態やその変形は、本明細書等に記載された発明の範囲や要旨に含まれるとともに、特許請求の範囲に記載された発明とその均等の範囲に含まれる。 While some embodiments of the present invention have been described above, these embodiments are merely illustrative and do not limit the technical scope of the present invention. The present invention can take other various embodiments, and furthermore, various changes such as omissions and substitutions can be made without departing from the scope of the present invention. These embodiments and modifications thereof are included in the scope and the gist of the invention described in the present specification, etc., and are included in the invention described in the claims and the equivalent scope thereof.

以下に、本願の出願当初の特許請求の範囲に記載された発明を付記する。
［付記１］
１つの動画像から、空間的及び／又は時間的な切り出し基準に基づき、複数の動画像を切り出す切出手段と、
切り出された前記複数の動画像を、空間的及び／又は時間的な結合基準に基づき、時間的又は空間的に結合して、新たな１つの動画像を生成する生成手段と、
を備えることを特徴とする画像処理装置。
［付記２］
前記生成手段は、前記空間的な切り出し基準で切り出された複数の動画像を、時間的な結合基準で結合する、
ことを特徴とする付記１に記載の画像処理装置。
［付記３］
前記生成手段は、前記時間的な切り出し基準で切り出された複数の動画像を、空間的な結合基準で結合する、
ことを特徴とする付記１に記載の画像処理装置。
［付記４］
前記切出手段は、前記空間的な切り出し基準として、動画像を構成する個々の画面内に含まれる所定の被写体部分を空間的に切り出す、
ことを特徴とする付記１から３のいずれか１つに記載の画像処理装置。
［付記５］
前記切出手段は、前記時間的な切り出し基準として、動画像を構成するフレームのうち所定の被写体が含まれるフレームを時間的に切り出す、
ことを特徴とする付記１から４のいずれか１つに記載の画像処理装置。
［付記６］
前記所定の被写体は、予め登録されている人物の顔である、
ことを特徴とする付記４または５に記載の画像処理装置。
［付記７］
前記所定の被写体は、更に、不特定の人物の顔を含む、
ことを特徴とする付記６に記載の画像処理装置。
［付記８］
前記切出手段による切り出しの対象としない人物の顔が予め登録される、
ことを特徴とする付記６に記載の画像処理装置。
［付記９］
前記生成手段は、前記空間的な結合基準として、切り出された動画像の個々を優先順位に基づいて、画面を空間的に分割し優先順位が付けられた各領域に割り当てる、
ことを特徴とする付記１から８のいずれか１つに記載の画像処理装置。
［付記１０］
切り出された動画像の数、或いは画像に含まれる被写体に対応する、複数の画像を結合する数、大きさ或いは位置関係が定義されたレイアウトを選択する選択手段を更に備え、
前記生成手段は、切り出された複数の画像を、選択された前記レイアウトに結合して新たな１つの画像を生成する、
ことを特徴とする付記９に記載の画像処理装置。
［付記１１］
前記生成手段は、前記時間的な結合基準として、切り出された動画像の個々を優先順位に基づいて、時間的に繋げる、
ことを特徴とする付記４に記載の画像処理装置。
［付記１２］
前記優先順位は、画像に含まれる被写体に対応する予め登録されている優先順位、切り出された被写体部分の空間的な大きさ、或いは切り出された動画像の時間的な長さである、
ことを特徴とする付記９から１１のいずれか１つに記載の画像処理装置。
［付記１３］
１つの画像から、所定の切り出し基準に基づき、複数の画像を切り出す切出手段と、
切り出された前記画像の特徴に基づき、複数の画像を結合する数、大きさ或いは位置関係が定義されたレイアウトを選択する選択手段と、
切り出された前記複数の画像を、選択された前記レイアウトに結合して新たな１つの画像を生成する生成手段と、
を備えることを特徴とする画像処理装置。
［付記１４］
前記選択手段は、切り出された画像の特徴として、画像の数、或いは画像に含まれる被写体に対応する前記レイアウトを選択する、
ことを特徴とする付記１３に記載の画像処理装置。
［付記１５］
前記切出手段は、所定の被写体が含まれる画像から、所定の被写体部分を切り出す、
ことを特徴とする付記１３または１４に記載の画像処理装置。
［付記１６］
１つの動画像から、空間的及び／又は時間的な切り出し基準に基づき、複数の動画像を切り出す切出処理と、
切り出された前記複数の動画像を、空間的及び／又は時間的な結合基準に基づき、時間的又は空間的に結合して、新たな１つの動画像を生成する生成処理と、
を含むことを特徴とする画像処理方法。
［付記１７］
コンピュータに、
１つの動画像から、空間的及び／又は時間的な切り出し基準に基づき、複数の動画像を切り出す切出機能と、
切り出された前記複数の動画像を、空間的及び／又は時間的な結合基準に基づき、時間的又は空間的に結合して、新たな１つの動画像を生成する生成機能と、
を実現させることを特徴とするプログラム。
［付記１８］
１つの画像から、所定の切り出し基準に基づき、複数の画像を切り出す切出処理と、
切り出された前記画像の特徴に基づき、複数の画像を結合する数、大きさ或いは位置関係が定義されたレイアウトを選択する選択処理と、
切り出された前記複数の画像を、選択された前記レイアウトに結合して新たな１つの画像を生成する生成処理と、
を含むことを特徴とする画像処理方法。
［付記１９］
コンピュータに、
１つの画像から、所定の切り出し基準に基づき、複数の画像を切り出す切出機能と、
切り出された前記画像の特徴に基づき、複数の画像を結合する数、大きさ或いは位置関係が定義されたレイアウトを選択する選択機能と、
切り出された前記複数の画像を、選択された前記レイアウトに結合して新たな１つの画像を生成する生成機能と、
を実現させることを特徴とするプログラム。 The invention described in the claims at the beginning of the application of the present application is appended below.
[Supplementary Note 1]
An image processing apparatus comprising:
clipping means for clipping a plurality of moving images from one moving image based on spatial and/or temporal clipping criteria; and
generation means for temporally or spatially combining the plurality of clipped moving images based on spatial and/or temporal combination criteria to generate one new moving image.
[Supplementary Note 2]
The image processing apparatus according to Supplementary Note 1,
wherein the generation means combines a plurality of moving images clipped according to the spatial clipping criterion, according to the temporal combination criterion.
[Supplementary Note 3]
The image processing apparatus according to Supplementary Note 1,
wherein the generation means combines a plurality of moving images clipped according to the temporal clipping criterion, according to the spatial combination criterion.
[Supplementary Note 4]
The image processing apparatus according to any one of Supplementary Notes 1 to 3,
wherein, as the spatial clipping criterion, the clipping means spatially clips a predetermined subject portion included in each screen constituting the moving image.
[Supplementary Note 5]
The image processing apparatus according to any one of Supplementary Notes 1 to 4,
wherein, as the temporal clipping criterion, the clipping means temporally clips frames that include a predetermined subject from among the frames constituting the moving image.
[Supplementary Note 6]
The image processing apparatus according to Supplementary Note 4 or 5,
wherein the predetermined subject is the face of a person registered in advance.
[Supplementary Note 7]
The image processing apparatus according to Supplementary Note 6,
wherein the predetermined subject further includes the face of an unspecified person.
[Supplementary Note 8]
The image processing apparatus according to Supplementary Note 6,
wherein the face of a person not to be clipped by the clipping means is registered in advance.
[Supplementary Note 9]
The image processing apparatus according to any one of Supplementary Notes 1 to 8,
wherein, as the spatial combination criterion, the generation means assigns each of the clipped moving images, based on its priority, to one of the prioritized regions into which the screen is spatially divided.
[Supplementary Note 10]
The image processing apparatus according to Supplementary Note 9, further comprising selection means for selecting a layout that defines the number, sizes, or positional relationship of the images to be combined and that corresponds to the number of clipped moving images or to a subject included in the images,
wherein the generation means combines the plurality of clipped images into the selected layout to generate one new image.
[Supplementary Note 11]
The image processing apparatus according to Supplementary Note 4,
wherein, as the temporal combination criterion, the generation means temporally connects the clipped moving images based on their priorities.
[Supplementary Note 12]
The image processing apparatus according to any one of Supplementary Notes 9 to 11,
wherein the priority is a priority registered in advance for a subject included in the image, the spatial size of the clipped subject portion, or the temporal length of the clipped moving image.
[Supplementary Note 13]
An image processing apparatus comprising:
clipping means for clipping a plurality of images from one image based on a predetermined clipping criterion;
selection means for selecting, based on features of the clipped images, a layout that defines the number, sizes, or positional relationship of the images to be combined; and
generation means for combining the plurality of clipped images into the selected layout to generate one new image.
[Supplementary Note 14]
The image processing apparatus according to Supplementary Note 13,
wherein the selection means selects the layout corresponding to the number of images or to a subject included in the images, as the features of the clipped images.
[Supplementary Note 15]
The image processing apparatus according to Supplementary Note 13 or 14,
wherein the clipping means clips a predetermined subject portion from an image that includes the predetermined subject.
[Supplementary Note 16]
An image processing method comprising:
a clipping process of clipping a plurality of moving images from one moving image based on spatial and/or temporal clipping criteria; and
a generation process of temporally or spatially combining the plurality of clipped moving images based on spatial and/or temporal combination criteria to generate one new moving image.
[Supplementary Note 17]
A program causing a computer to realize:
a clipping function of clipping a plurality of moving images from one moving image based on spatial and/or temporal clipping criteria; and
a generation function of temporally or spatially combining the plurality of clipped moving images based on spatial and/or temporal combination criteria to generate one new moving image.
[Supplementary Note 18]
An image processing method comprising:
a clipping process of clipping a plurality of images from one image based on a predetermined clipping criterion;
a selection process of selecting, based on features of the clipped images, a layout that defines the number, sizes, or positional relationship of the images to be combined; and
a generation process of combining the plurality of clipped images into the selected layout to generate one new image.
[Supplementary Note 19]
A program causing a computer to realize:
a clipping function of clipping a plurality of images from one image based on a predetermined clipping criterion;
a selection function of selecting, based on features of the clipped images, a layout that defines the number, sizes, or positional relationship of the images to be combined; and
a generation function of combining the plurality of clipped images into the selected layout to generate one new image.

１・・・画像処理装置，１１・・・ＣＰＵ，１２・・・ＲＯＭ，１３・・・ＲＡＭ，１４・・・バス，１５・・・入出力インターフェース，１６・・・撮像部，１７・・・入力部，１８・・・出力部，１９・・・記憶部，２０・・・通信部，２１・・・ドライブ，３１・・・リムーバブルメディア，５１・・・画像選択部，５２・・・切り出し基準特定部
，５３・・・切り出し処理部，５４・・・結合基準特定部，５５・・・結合処理部，５６・・・レイアウト選択部，７１・・・画像記憶部，７２・・・レイアウト記憶部，７３・・・生成画像記憶部，７４・・・顔情報記憶部 DESCRIPTION OF SYMBOLS 1 ... image processing apparatus, 11 ... CPU, 12 ... ROM, 13 ... RAM, 14 ... bus, 15 ... input/output interface, 16 ... imaging unit, 17 ... input unit, 18 ... output unit, 19 ... storage unit, 20 ... communication unit, 21 ... drive, 31 ... removable medium, 51 ... image selection unit, 52 ... cutout criterion specifying unit, 53 ... cutout processing unit, 54 ... combination criterion specifying unit, 55 ... combining processing unit, 56 ... layout selection unit, 71 ... image storage unit, 72 ... layout storage unit, 73 ... generated image storage unit, 74 ... face information storage unit

Claims

An image processing apparatus comprising:
clipping means for clipping moving images by cutting out, from one moving image in which a first subject and a second subject present in separate frame periods are captured, the frame periods in which the first subject or the second subject is present; and
generation means for generating one new moving image by combining the moving image corresponding to the first subject and the moving image corresponding to the second subject, both clipped by the clipping means, in accordance with a ranking of the subjects based on a temporal combination criterion.
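As a rough illustration of the two means recited above, the following sketch clips the frame periods in which each subject appears and then orders the resulting clips by a subject ranking. It assumes per-frame subject detection results are already available as sets of subject identifiers. The names `Clip`, `clip_frame_periods`, and `combine_by_ranking` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    subject: str
    start: int  # first frame index of the period (inclusive)
    end: int    # last frame index of the period (inclusive)


def clip_frame_periods(presence):
    """presence[i] is the set of subject ids detected in frame i.
    Returns one Clip per continuous period in which a subject is present."""
    clips = []
    open_start = {}  # subject -> first frame of the period currently open
    for i, subjects in enumerate(presence):
        for s in subjects:
            open_start.setdefault(s, i)
        # Close the periods of subjects that disappeared in this frame.
        for s in [s for s in open_start if s not in subjects]:
            clips.append(Clip(s, open_start.pop(s), i - 1))
    last = len(presence) - 1
    for s, start in open_start.items():
        clips.append(Clip(s, start, last))
    return clips


def combine_by_ranking(clips, rank):
    """Order clips by subject rank (lower value first), then by start frame.
    The returned order is the playback order of the new moving image."""
    return sorted(clips, key=lambda c: (rank[c.subject], c.start))


presence = [{"A"}, {"A"}, set(), {"B"}, {"B"}]
clips = clip_frame_periods(presence)
ordered = combine_by_ranking(clips, {"B": 0, "A": 1})
```

In this example subject B is ranked above subject A, so the clip of B (frames 3 to 4) is placed before the clip of A (frames 0 to 1) in the combined sequence.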

The image processing apparatus according to claim 1, wherein the clipping means treats the first subject or the second subject as a clipping target when that subject is detected or authenticated for a predetermined threshold time or longer.
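The threshold-time condition can be sketched as follows. This sketch assumes that the threshold applies to the longest continuous run of detections, which the claim does not state explicitly; a cumulative interpretation is equally possible. The function name `is_clipping_target` and its parameters are hypothetical.

```python
def is_clipping_target(detected, fps, threshold_seconds):
    """detected is a per-frame list of booleans for one subject.
    Returns True when the longest continuous run of detections lasts at
    least threshold_seconds (assumption: continuous, not cumulative)."""
    longest = run = 0
    for d in detected:
        run = run + 1 if d else 0
        longest = max(longest, run)
    return longest / fps >= threshold_seconds


steady = is_clipping_target([True] * 60 + [False], fps=30, threshold_seconds=2.0)
flicker = is_clipping_target([True, False] * 40, fps=30, threshold_seconds=2.0)
```

A subject seen continuously for 60 frames at 30 frames per second meets a 2-second threshold, while a subject that flickers in and out every other frame does not.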

The image processing apparatus according to claim 1, wherein the temporal combination criterion is a priority registered in advance for each subject, the spatial size of the clipped subject portion, or the temporal length of the clipped moving image.
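The three alternative criteria can be expressed as sort keys, as in the sketch below. The direction of each ordering is an assumption: a lower priority number ranks first, and larger or longer clips rank first. The claim names the criteria but not their direction. The function `rank_subjects` and the dictionary keys are hypothetical.

```python
def rank_subjects(stats, criterion):
    """stats maps a subject id to {"priority": int, "area": float, "duration": float}.
    Returns the subject ids ordered from highest rank to lowest."""
    keys = {
        # Assumption: a smaller registered priority number ranks first.
        "priority": lambda s: stats[s]["priority"],
        # Assumption: a larger clipped subject area ranks first.
        "size": lambda s: -stats[s]["area"],
        # Assumption: a longer clipped moving image ranks first.
        "length": lambda s: -stats[s]["duration"],
    }
    return sorted(stats, key=keys[criterion])


stats = {
    "A": {"priority": 2, "area": 500.0, "duration": 3.0},
    "B": {"priority": 1, "area": 200.0, "duration": 8.0},
}
by_priority = rank_subjects(stats, "priority")
by_size = rank_subjects(stats, "size")
by_length = rank_subjects(stats, "length")
```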

The image processing apparatus according to any one of claims 1 to 3, wherein, when the first subject and the second subject present in an individual frame constituting the moving image are separated within the frame by a distance equal to or greater than a predetermined threshold, the clipping means spatially clips the first subject and the second subject separately.

The image processing apparatus according to claim 4, wherein, when the first subject and the second subject present in an individual frame constituting the moving image are separated within the frame by a distance less than the predetermined threshold, the clipping means spatially clips the first subject and the second subject together.
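The distance test that decides between separate and joint spatial clipping can be sketched as below. It assumes each subject is given as a bounding box and that the distance is measured between box centers; the claims do not specify how the distance is measured. The function name `spatial_crops` is hypothetical.

```python
import math


def spatial_crops(box_a, box_b, threshold):
    """Boxes are (left, top, right, bottom) in frame pixels.
    Returns two crop rectangles when the subjects are at least `threshold`
    apart (center to center), otherwise one rectangle enclosing both."""
    center_a = ((box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2)
    center_b = ((box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2)
    if math.dist(center_a, center_b) >= threshold:
        return [box_a, box_b]  # clip each subject separately
    # Clip both subjects together with their common bounding rectangle.
    return [(
        min(box_a[0], box_b[0]),
        min(box_a[1], box_b[1]),
        max(box_a[2], box_b[2]),
        max(box_a[3], box_b[3]),
    )]


far = spatial_crops((0, 0, 10, 10), (100, 0, 110, 10), threshold=50)
near = spatial_crops((0, 0, 10, 10), (100, 0, 110, 10), threshold=200)
```

With centers 100 pixels apart, a threshold of 50 yields two separate crops and a threshold of 200 yields a single enclosing crop.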

An image processing method comprising:
a clipping process of clipping moving images by cutting out, from one moving image in which a first subject and a second subject present in separate frame periods are captured, the frame periods in which the first subject or the second subject is present; and
a generation process of generating one new moving image by combining the moving image corresponding to the first subject and the moving image corresponding to the second subject, both clipped by the clipping process, in accordance with a ranking of the subjects based on a temporal combination criterion.

A program causing a computer to realize:
a clipping function of clipping moving images by cutting out, from one moving image in which a first subject and a second subject present in separate frame periods are captured, the frame periods in which the first subject or the second subject is present; and
a generation function of generating one new moving image by combining the moving image corresponding to the first subject and the moving image corresponding to the second subject, both clipped by the clipping function, in accordance with a ranking of the subjects based on a temporal combination criterion.

An image processing apparatus comprising:
clipping means for clipping a moving image for each predetermined subject by cutting out, from one moving image in which at least one subject is present in separate frame periods, the frame periods in which a plurality of predetermined subjects are present; and
generation means for generating one new moving image by combining the moving images clipped for the respective predetermined subjects by the clipping means, in accordance with a ranking of the predetermined subjects based on a temporal combination criterion.