JP2008020944A

JP2008020944A - Image processing method, program, and device

Info

Publication number: JP2008020944A
Application number: JP2006189484A
Authority: JP
Inventors: Kenji Ikemizu; 憲治池水
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2006-07-10
Filing date: 2006-07-10
Publication date: 2008-01-31

Abstract

PROBLEM TO BE SOLVED: To avoid, as far as possible, reducing the user's satisfaction with a slide show when effects are applied to objects other than faces extracted from an image.

SOLUTION: When a face region extracted from an original image is judged to be the main subject, the object extraction threshold stays at 5, no superfluous object regions are extracted, and no effects are applied to them. When the face region detected in the original image is not the main subject, the object extraction threshold is lowered from 5 by one level to 4, or to a lower value. This raises the probability that object regions other than faces are extracted, so that effects are also applied to the object the photographer intended as the main subject, which increases the user's satisfaction with the slide show.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a technique for generating a slide show based on still images.

Beyond conventional slide shows that simply switch images at fixed intervals, software is known that automatically generates slide shows with display effects, such as photo clips. An effect zooms into or crops part of an image, or overlays a template image on it.

For example, as in Patent Documents 1 and 2, there are techniques that extract a face region from an image and use that information to create a slide show.

On the other hand, for images that contain no face region, such as landscape or animal photographs, one may want to extract a subject such as a dog or a car. However, unlike the eyes in face extraction, such subjects are not uniform in shape and are not easy to identify, so extraction accuracy tends to be low. Possible extraction methods include extracting regions where the color changes sharply or regions rich in high-frequency components.

Put simply, a threshold is set on the amount of color change or on the magnitude of the frequency components, and a region that exceeds the threshold is extracted as a target region.

Patent Document 1: JP 2005-182196 A
Patent Document 2: JP 2005-354322 A
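The threshold-based extraction described above can be illustrated with a minimal sketch. It assumes a grayscale image held as a nested list and measures "color change" as the largest absolute difference from a 4-connected neighbour; the function name `high_change_pixels` and the sample values are hypothetical and not taken from the publication.

```python
from typing import List, Tuple


def high_change_pixels(image: List[List[int]], threshold: int) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose largest absolute difference from a
    4-connected neighbour exceeds `threshold` (an assumed change measure)."""
    rows, cols = len(image), len(image[0])
    hits = []
    for r in range(rows):
        for c in range(cols):
            diffs = [
                abs(image[r][c] - image[nr][nc])
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            if diffs and max(diffs) > threshold:
                hits.append((r, c))
    return hits


# A bright 2x2 block on a dark background (illustrative values only).
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
strict = high_change_pixels(image, threshold=250)  # nothing exceeds 250
loose = high_change_pixels(image, threshold=100)   # edge pixels of the block qualify
```

With the stricter threshold nothing is selected, while the looser one picks up the pixels along the edge of the bright block, which is the trade-off between missed objects and spurious regions that motivates the invention.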

The lower the object extraction threshold is set, the more likely an object is to be extracted, but the more likely it also becomes that superfluous objects, or regions wrongly identified as objects, are extracted. If effects are applied to such superfluous objects or erroneous regions, the user not only fails to experience the intended effect but may find the effect on the wrong target worthless, and satisfaction with the slide show may fall.

The present invention has been made in view of these problems. Its object is to correctly extract from an image the regions to which effects should be applied, whether or not they are persons, while extracting superfluous objects as little as possible, and thereby to increase the user's satisfaction with the slide show.

The present invention is an image processing method for generating slide show data based on still images, comprising: a step of extracting from a still image a face region, which is a region where a person's face exists; a step of setting an object extraction accuracy; a step of extracting from the still image, according to the set object extraction accuracy, an object region, which is a region where an object other than a face exists; a step of judging, based on at least one of the extracted face region and object region, whether a main subject has been extracted; and a step of setting, in response to the main subject not being judged to have been extracted, a value obtained by reducing the currently set object extraction accuracy by a predetermined value as the new object extraction accuracy.

In this method, when a main subject, which is a person's face or an object, is not extracted from the still image, the object extraction accuracy is reduced by a predetermined value and the result becomes the new object extraction accuracy. With the lowered accuracy setting, objects are more likely to be extracted. Moreover, because the extraction accuracy is lowered only when no main subject has been extracted from the still image, the risk that the accuracy becomes too low, so that unnecessary objects are extracted or erroneous regions are extracted as object regions, is kept to a minimum. There is therefore little risk of dissatisfying the user by running a slide show in which effects are applied to unnecessary objects or erroneous regions.

A program for causing a computer to execute the above image processing method is also included in the present invention.

The present invention also includes an image processing apparatus comprising: a computer that executes the steps of the above image processing method and a step of generating slide show data that sequentially displays still images in which a predetermined effect has been applied to at least one of the extracted face region and object region; and a display control unit that, based on the slide show data generated by the computer, sequentially displays the still images with the effect applied on a display device.

According to the present invention, when a main subject, which is a person's face or an object, is not extracted from the still image, the object extraction accuracy is reduced by a predetermined value and the result becomes the new object extraction accuracy. With the lowered accuracy setting, objects are more likely to be extracted. Moreover, because the extraction accuracy is lowered only when no main subject has been extracted from the still image, the risk that the accuracy becomes too low, so that unnecessary objects are extracted or erroneous regions are extracted as object regions, is kept to a minimum. There is therefore little risk of dissatisfying the user by running a slide show in which effects are applied to unnecessary objects or erroneous regions.

FIG. 1 is a block diagram of a slide show creation device according to a preferred embodiment of the present invention. The slide show creation device comprises a CPU 8, a slide show generation unit 1, a face region extraction unit 2, an object region extraction unit 9, a storage unit 4, a memory 5, a display control unit 3, a display device 6, and an input device 7. The device is typically implemented as a hard disk recorder, a personal computer, a digital camera, or the like.

The input device 7 is a device for inputting still images and music data from outside, and consists of a data reader for portable recording media such as DVDs or flash memory, a network adapter, a serial port, or the like.

The storage unit 4 is a storage medium, such as a hard disk drive (HDD), that accumulates the still images and music input from the input device 7. The storage unit 4 also stores a database of templates and template management information, programs, and the face region DB and object region DB described later.

The memory 5 temporarily stores programs for the CPU 8 and the data needed for program processing.

The display control unit 3, under the direction of the CPU 8, generates a video signal and outputs it to the display device 6, which consists of a display or the like. The display device 6 may be built into the slide show creation device or may be an external device.

The slide show generation unit 1 generates moving image data, or moving image data that also includes audio (a slide show), in a format that sequentially switches between and plays back all or part of a plurality of original images to which predetermined effects have been applied. It does so based on one or more still images selected by the user from those accumulated in the storage unit 4 (for example, image files in Exif or JPEG format; hereinafter called original images) and on the operating conditions defined by a template selected from the templates stored in the storage unit 4.

The original images used for the slide show need not be all of the input still images. For example, images from which neither an object region nor a face region (both described later) is extracted may be discarded. Alternatively, from a group of images whose main subjects, object regions, or face regions are estimated to overlap, or from a group of highly similar images, only one may be used as an original image for the slide show and the rest discarded.

Specific examples of a template's operating conditions include the type of "effect" that gives motion to the original image: displaying an object region or face region in the image (collectively, a specific region) so that it moves horizontally or vertically (pan/tilt) at a predetermined speed; scaling the specific region (zoom in/zoom out); hiding everything other than the specific region (masking, see FIG. 2(a)); compositing an image that emphasizes the specific region (composite frame, see FIG. 2(b)); or rotating the specific region. The title of the "background music (BGM)" played in synchronization with the slide show may also be included in the operating conditions.
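As one illustration of such an effect, a zoom-in on a specific region can be expressed as a sequence of crop rectangles that shrink from the full image to the region. The sketch below assumes linear interpolation and a (left, top, width, height) rectangle convention; neither is specified in the publication, and `zoom_in_frames` is a hypothetical name.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, width, height), assumed convention


def zoom_in_frames(image_size: Tuple[int, int], region: Rect, steps: int) -> List[Rect]:
    """Linearly interpolate crop rectangles from the whole image to `region`.

    `steps` must be at least 1; the result holds steps + 1 rectangles.
    """
    width, height = image_size
    full: Rect = (0.0, 0.0, float(width), float(height))
    frames: List[Rect] = []
    for i in range(steps + 1):
        t = i / steps  # 0.0 at the first frame, 1.0 at the last
        frames.append(tuple(a + (b - a) * t for a, b in zip(full, region)))
    return frames


# Zoom from a 640x480 image into a 160x120 region (illustrative numbers).
frames = zoom_in_frames((640, 480), (200.0, 100.0, 160.0, 120.0), steps=4)
```

Each rectangle would then be cropped and scaled to the output size by whatever renderer produces the moving image; a pan or tilt could reuse the same interpolation with a fixed width and height.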

The generated moving image data may be in a format independent of the template, such as an MP3 file, or may be converted into a format in which the moving image is played back with both the still images and the template present, such as an animated GIF.

Although details are omitted, a template can also define the display state, such as placement coordinates, size, and color, of various objects that accompany playback of the original images, such as text, characters, and icons.

The face region extraction unit 2 extracts the face region of each original image, that is, the region where a person's face is present. The face region extraction unit 2 outputs to the slide show generation unit 1 a signal indicating whether a face region was extracted. The face region may be extracted by any method: for example, a region whose shape resembles a face contour pattern may be extracted as the face region, or the contour of the whole face may be estimated from the locations of the eyes, nose, and mouth and the skin-colored region surrounding them, and the corresponding region extracted as the face region.

The object region extraction unit 9 extracts as object regions the regions containing objects other than faces, such as animals (dogs, cats), vehicles (automobiles), and other objects. Objects may be extracted by any method: for example, a pixel region whose color change relative to the surrounding pixel region exceeds a predetermined threshold, or a pixel region whose frequency components exceed a predetermined threshold, is extracted as an object region.

For the face regions targeted by the face region extraction unit 2, relatively high extraction accuracy can be secured by using the presence of specific parts such as the eyes, nose, and mouth as the criterion. In contrast, for the object regions targeted by the object region extraction unit 9, the nature of the objects is not uniform (for example, when the object is a dog, physical characteristics vary widely by breed, so it is difficult to rely on specific parts as with a person's eyes, nose, and mouth), and the threshold that serves as the extraction criterion is uniform and so is not effective under some shooting conditions. Extraction accuracy is therefore generally lower than for face regions. Consequently, to extract as many objects as possible, the extraction accuracy must be set low.

However, when faces have already been extracted, they are generally taken to be the main subjects of the original image. Low-accuracy object extraction is then likely to apply superfluous effects to regions wrongly extracted as objects, leading to user dissatisfaction, and the low-accuracy extraction processing itself is likely to be wasted. To address this, the present embodiment relaxes the object extraction accuracy only to the minimum extent necessary for object region extraction, according to the result of main subject extraction.

FIGS. 3 and 4 show examples of the information stored in the face region DB and the object region DB held in the storage unit 4.

As shown in FIG. 3, the face region DB stores, in association with one another, the identification number (ID) of each image, the size of each image, the one or more face regions that the face region extraction unit 2 extracted from each image, the center coordinates of each face region, and the size of each face region.

Similarly, as shown in FIG. 4, the object region DB stores, in association with one another, the ID of each image, the size of each image, the one or more object regions that the object region extraction unit 9 extracted from each image, the center coordinates of each object region, and the size of each object region.

The information shown in FIG. 3 or FIG. 4 need not all be stored at once; only the necessary part may be stored.
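The associations held in the two databases might be modelled as follows. This is only a sketch: the class names, field names, and the in-memory dictionaries are assumptions, since the publication describes the stored items but not their concrete layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Region:
    number: int              # number unique to the region within its image
    center: Tuple[int, int]  # center coordinates in the original image
    size: Tuple[int, int]    # (width, height) of the region


@dataclass
class ImageRecord:
    image_id: str
    image_size: Tuple[int, int]
    regions: List[Region] = field(default_factory=list)


# Hypothetical in-memory stand-ins for the face region DB and object region DB.
face_db: Dict[str, ImageRecord] = {}
object_db: Dict[str, ImageRecord] = {}


def register(db: Dict[str, ImageRecord], image_id: str, image_size: Tuple[int, int],
             center: Tuple[int, int], size: Tuple[int, int]) -> Region:
    """Append a region to the record for `image_id`, creating the record if needed."""
    record = db.setdefault(image_id, ImageRecord(image_id, image_size))
    region = Region(number=len(record.regions) + 1, center=center, size=size)
    record.regions.append(region)
    return region


first_face = register(face_db, "img001", (640, 480), (320, 200), (80, 80))
second_face = register(face_db, "img001", (640, 480), (500, 220), (60, 60))
first_object = register(object_db, "img001", (640, 480), (150, 380), (120, 90))
```

Keeping faces and objects in separate stores with the same record shape mirrors FIGS. 3 and 4, and lets later steps treat both kinds of region uniformly as specific regions.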

FIG. 5 is a flowchart showing the flow of the slide show creation process. The CPU 8 executes this process.

In S1, the face region DB and object region DB creation process, described later, is performed.

In S2, it is judged whether S3 (effect generation) below has been executed for all of the original images stored in the memory 5. If there is an original image for which S3 has not been executed, S3 is performed on that image. Once S3 is judged to have been executed for all of the original images, this judgment ends.

In S3, data is generated for applying a predetermined effect to each face region and each object region of each image.

In S4, the display order in the slide show of the original images for which effects have been set is determined according to the input order of the original images, their time stamps, the Japanese syllabary (a-i-u-e-o) or alphabetical order of their file names, or at random.
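The ordering choices in S4 can be sketched as a single selector function. The `OriginalImage` fields and the mode names are assumptions made for illustration, and file names are compared as plain strings; a true syllabary ordering would need a locale-aware sort key, which is not shown here.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OriginalImage:
    file_name: str
    timestamp: float   # e.g. seconds since the epoch (assumed representation)
    input_index: int   # position in which the image was input


def display_order(images: List[OriginalImage], mode: str,
                  seed: Optional[int] = None) -> List[OriginalImage]:
    """Return the images in the display order selected by `mode`."""
    if mode == "input":
        return sorted(images, key=lambda im: im.input_index)
    if mode == "timestamp":
        return sorted(images, key=lambda im: im.timestamp)
    if mode == "file_name":
        return sorted(images, key=lambda im: im.file_name)
    if mode == "random":
        shuffled = list(images)
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown mode: {mode}")


images = [
    OriginalImage("b.jpg", 30.0, 0),
    OriginalImage("a.jpg", 10.0, 1),
    OriginalImage("c.jpg", 20.0, 2),
]
by_name = [im.file_name for im in display_order(images, "file_name")]
by_time = [im.file_name for im in display_order(images, "timestamp")]
by_input = [im.file_name for im in display_order(images, "input")]
shuffled = [im.file_name for im in display_order(images, "random", seed=0)]
```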

In S5, the slide show generation unit 1 is instructed to create slide show data that displays the still images, with effects set on their face regions and object regions, in the display order determined in S4.

FIG. 6 shows the flow of the face region DB and object region DB creation process. The CPU 8 executes this process.

In S11, one or more still images input from the input device 7 are stored in the memory 5 as original images.

In S12, it is judged whether S13 to S18 below have been executed for all of the original images stored in the memory 5. If there is an original image for which S13 to S18 have not been executed, S13 to S18 are performed on that image. Once S13 to S18 are judged to have been executed for all of the original images, this process ends.

In S13, the face region extraction unit 2 is instructed to extract face regions from each original image. The face region extraction unit 2 extracts face regions from the original image in response to the command from the CPU 8. As required, each extracted face region is stored in the face region DB in association with a number unique to that face region, the ID of the original image from which it was extracted, the coordinates of the face region in the original image, and the size of the face region (see FIG. 3).

In S14, the object region extraction threshold is set to a predetermined initial value (for example, the highest level, "5", in FIG. 7).

In S15, the object region extraction unit 9 is instructed to extract object regions from each original image. The object region extraction unit 9 extracts object regions from the original image in response to the command from the CPU 8. As required, each extracted object region is stored in the object region DB in association with a number unique to that object region, the ID of the original image from which it was extracted, the coordinates of the object region in the original image, and the size of the object region (see FIG. 4).

In S16, it is judged whether at least one of the extracted face regions and object regions, that is, the specific regions, is the main subject of the original image. If any specific region is the main subject, the process returns to S12; if none is, the process proceeds to S17.

Which specific region is the main subject is judged, for example, by checking whether a specific region exists at the center of the original image; if so, that specific region is judged to be the main subject. Alternatively, it is checked whether a specific region exists at the focus position recorded in the tag information of the Exif file; if so, that specific region is judged to be the main subject. Alternatively, because the main subject is not necessarily at the center of the original image, when the total area of the extracted specific regions occupies at least a predetermined proportion (for example, 50%) of the area of the original image, all of those specific regions are judged to be main subjects.
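The three judgment criteria can be sketched in one function. The publication presents them as alternatives; chaining them as fallbacks (center first, then focus position, then area ratio) is an assumption of this sketch, as are the names and the (center_x, center_y, width, height) region layout. Overlapping regions are simply summed, which may overstate the covered area.

```python
from typing import List, Optional, Tuple

Region = Tuple[int, int, int, int]  # (center_x, center_y, width, height), assumed layout


def contains(region: Region, point: Tuple[float, float]) -> bool:
    """True if `point` lies inside the axis-aligned rectangle described by `region`."""
    cx, cy, w, h = region
    x, y = point
    return abs(x - cx) <= w / 2 and abs(y - cy) <= h / 2


def main_subjects(image_size: Tuple[int, int], regions: List[Region],
                  focus_point: Optional[Tuple[float, float]] = None,
                  area_ratio: float = 0.5) -> List[Region]:
    """Return the regions judged to be main subjects (empty list if none)."""
    width, height = image_size
    # Criterion 1: a specific region covers the center of the image.
    hits = [r for r in regions if contains(r, (width / 2, height / 2))]
    # Criterion 2: a specific region covers the recorded focus position.
    if not hits and focus_point is not None:
        hits = [r for r in regions if contains(r, focus_point)]
    # Criterion 3: the regions together occupy at least `area_ratio` of the image.
    if not hits:
        total = sum(w * h for _, _, w, h in regions)
        if total >= area_ratio * width * height:
            hits = list(regions)
    return hits


centered = main_subjects((640, 480), [(320, 240, 100, 100)])
off_center = main_subjects((640, 480), [(100, 100, 50, 50)])
focused = main_subjects((640, 480), [(100, 100, 50, 50)], focus_point=(100, 100))
large = main_subjects((640, 480), [(150, 240, 300, 480), (500, 100, 100, 100)])
```

An empty result corresponds to the "no main subject" branch of S16, which is what triggers the threshold relaxation in S17 and S18.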

In S17, it is judged whether the current object region extraction threshold is at its minimum (for example, the lowest level, "1", in FIG. 7). If it is, the process returns to S12; if not, the process proceeds to S18.

In S18, a value one level lower than the current object region extraction threshold (the color change threshold, the frequency threshold, or the like) is set as the new threshold (for example, "4", one level below "5" in FIG. 7). The process then returns to S15.
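The loop formed by S14 to S18 for a single original image can be sketched as follows. The extractor and the main-subject test are passed in as callables because the publication leaves both methods open; the function name, the string stand-ins for regions, and the "strength" table in the example are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

MAX_LEVEL = 5  # initial threshold level (highest level in FIG. 7)
MIN_LEVEL = 1  # lowest threshold level in FIG. 7


def extract_with_relaxation(
    faces: List[str],
    extract_objects: Callable[[int], List[str]],
    is_main_subject: Callable[[str], bool],
) -> Tuple[List[str], int]:
    """Lower the threshold one level at a time until a main subject is found
    or the minimum level is reached; return the objects and the final level."""
    level = MAX_LEVEL                                        # S14
    while True:
        objects = extract_objects(level)                     # S15
        if any(is_main_subject(r) for r in faces + objects): # S16
            return objects, level
        if level == MIN_LEVEL:                               # S17
            return objects, level
        level -= 1                                           # S18


# Hypothetical extractor: a candidate is extracted once the level drops to its strength.
strengths: Dict[str, int] = {"dog": 3, "clutter": 1}


def demo_extractor(level: int) -> List[str]:
    return [name for name, strength in strengths.items() if strength >= level]


with_face = extract_with_relaxation(["face1"], demo_extractor, lambda r: r == "face1")
no_face = extract_with_relaxation([], demo_extractor, lambda r: r == "dog")
```

In the first call the face is already the main subject, so the threshold stays at 5 and no objects are extracted. In the second, the threshold falls only as far as level 3, where the dog appears, so the weaker "clutter" candidate is never extracted.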

The operation and effects of the above processing are as follows.

FIG. 8 shows examples of original images: FIG. 8(a) is an original image in which persons are the main subject, FIG. 8(b) one in which a person and another object are the main subjects, and FIG. 8(c) one in which an object other than a person is the main subject.

If the face regions face 1 to face 3 are detected in an original image like FIG. 8(a), and in S16 the central one, "face 2", is judged to be the main subject, the object extraction threshold stays at 5 and no superfluous object regions are extracted. The effect is applied to the face region that is the main subject rather than to superfluous object regions (S3), so there is little risk of reducing the user's satisfaction with the slide show.

If the face region face 1 is detected in an original image like FIG. 8(b), but in S16 it is judged that the face region is not at the center and there is no main subject, the object extraction threshold is lowered from 5 by one level to 4, or to a lower value. This raises the probability that objects 1 to 4, which are object regions other than faces, are extracted, so that effects are applied to both "face 1" and "object 4", the main subjects the photographer originally intended, and the user's satisfaction with the slide show increases.

In an original image like FIG. 8(c), no face region is detected at all, so the object extraction threshold is lowered from 5 by one level to 4, or to a lower value. This raises the probability that the object regions objects 1 to 4 are extracted, so that an effect is applied to "object 4", the main subject the photographer originally intended, and the user's satisfaction with the slide show increases.

FIG. 1 is a schematic configuration diagram of the slide show creation device.
FIG. 2 is a diagram showing an example of effects applied to a specific region.
FIG. 3 is a diagram conceptually showing the information in the face region DB.
FIG. 4 is a diagram conceptually showing the information in the object region DB.
FIG. 5 is a flowchart showing the flow of the slide show generation process.
FIG. 6 is a flowchart showing the flow of the DB creation process.
FIG. 7 is a diagram showing an example of the levels of the object region extraction threshold.
FIG. 8 is a diagram showing examples of original images.

Explanation of symbols

1: slide show generation unit, 2: face region extraction unit, 3: display control unit, 6: display device, 9: object region extraction unit

Claims

1. An image processing method for generating slide show data based on still images, the method comprising:
extracting, from a still image, a face region, which is a region where a person's face exists;
setting an object extraction accuracy;
extracting, from the still image, an object region, which is a region where an object other than a face exists, according to the set object extraction accuracy;
judging whether a main subject has been extracted, based on at least one of the extracted face region and object region; and
setting, in response to the main subject not being judged to have been extracted, a value obtained by reducing the currently set object extraction accuracy by a predetermined value as a new object extraction accuracy.

2. A program for causing a computer to execute the image processing method according to claim 1.

3. An image processing apparatus comprising:
a computer that executes the steps of the image processing method according to claim 1 and a step of generating slide show data for sequentially displaying still images in which a predetermined effect has been applied to at least one of the extracted face region and object region; and
a display control unit that sequentially displays, on a display device, the still images to which the effect has been applied, based on the slide show data generated by the computer.