JP4857162B2

JP4857162B2 - Image processing apparatus and image processing method

Info

Publication number: JP4857162B2
Application number: JP2007089357A
Authority: JP
Inventors: 芳季石井
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-03-29
Filing date: 2007-03-29
Publication date: 2012-01-18
Anticipated expiration: 2027-03-29
Also published as: JP2008252354A

Description

The present invention relates to an image processing apparatus, an image processing method, a program, and a recording medium, and more particularly to a technique suitable for generating a thumbnail image of a captured moving image.

In recent years, video recorders that use recording media such as optical disks, magnetic disks, and semiconductor memories have become common. Unlike conventional tape media, these recording media allow random access, so a common approach is to display a list of the recorded cuts and let the user select the cut to be played back. The cut list uses what is called a thumbnail image: a reduced version of a representative image selected from the image frames in the cut.

Conventionally, the thumbnail image is a reduced image generated from the first frame of a cut, or from the frame a predetermined number of seconds after the start of the cut. The reason for using a frame a predetermined number of seconds in is that, in a TV drama for example, the cut may begin with a black image and the actual video may be edited with a fade-in, so the first frame of the cut is not necessarily useful for identifying the cut in a list display.

On the other hand, Patent Document 1 proposes, as a database search technique for use in video archives and the like, a method of generating a representative screen for a scene from a screen showing a specific person such as an actor. Because the method described in Patent Document 1 is intended for classification and search by actor name, whenever the target person appears in a scene, the scene is associated with that person regardless of the main content of the scene.

FIG. 14 is a flowchart showing the procedure, disclosed in Patent Document 1, for creating a person representative image for database search. In the flowchart of FIG. 14, a frame in which a designated person appears is extracted from a series of recorded video images as the representative image of each scene.

First, in step S1401, processing of the target video image is started. In step S1402, the flag is reset to 0. Next, the process proceeds to step S1403, where the frame to be processed is read out. Then, in step S1404, it is determined whether a scene change has been detected. If a scene change has been detected, the process proceeds to step S1405, where the representative image of the previous scene is displayed, and the flag is reset to 0 in step S1406. If no scene change has been detected in step S1404, the process proceeds to step S1407.

In step S1407, it is determined whether the flag is 0. If the flag is 0, the representative image is stored in step S1408 and the flag is set to 1 in step S1409. If the flag is not 0 in step S1407, the process proceeds to step S1410.

Then, face image detection is performed in step S1410, and the detection result is evaluated in step S1411. If a face image has been detected, the process proceeds to step S1412, where the representative image is updated, and then returns to step S1402. If no face image has been detected in step S1411, the process returns to step S1402 and the processing is repeated.

Patent Document 1: JP 2001-167110 A

The conventional method of generating a thumbnail image from a fixed frame position, as described above, is effective when a TV drama or the like is recorded with a video recorder that has a TV tuner, because the screen at the beginning of the scene can be checked. However, it is not necessarily effective for displaying a list of cuts on a home video camera.

Shooting with a home video camera does not follow a scenario as a TV drama does; in most cases the aim is simply to record a subject. For this reason, the list display of a home video camera should preferably use images that reflect, to some extent, what the main subject was rather than the flow of the scene. With the conventional fixed representative frame position, it is difficult to generate thumbnail images suited to this purpose, and a video camera capable of generating thumbnail images suited mainly to home shooting has been desired.

The method proposed in Patent Document 1, which generates a representative image based on a specific person, is a representative image generation method that reflects the subject. However, that method is intended for database search by actor name, and it determines the representative image according to whether the specific person appears, regardless of the main content of the cut.

For example, a representative image of the specific person is generated even under conditions that are common in home video camera shooting, such as when the main subject is a landscape and the specific person happens to be captured at the end of the cut. For this reason, that method is not necessarily effective for generating thumbnails for a list display on a home video camera. A video camera has been desired that can generate a thumbnail image reflecting the main subject of the cut, rather than simply whether a specific person appears.

In view of the problems described above, an object of the present invention is to prevent the inconvenience of generating a thumbnail image that uses a person who was captured by chance as the representative image, and to make it possible to generate a thumbnail image that reflects the main subject.

The image processing apparatus of the present invention includes feature amount detection means for detecting a feature amount of the face of a person contained in an input video and recognizing whether the person is a specific person. The apparatus comprises: feature amount storage means for storing, in a storage unit, the feature amount of the face of the specific person to be recognized by the feature amount detection means; recording means for recording video on a recording medium; and thumbnail image generation means. When, within a first predetermined period that starts at the recording start of the video recorded by the recording means and is shorter than the period up to the recording end, the feature amount detection means recognizes the specific person in the recorded video based on the feature amount of the face of the specific person stored in advance in the storage unit, the thumbnail image generation means generates a thumbnail image, as a representative image of the recorded video, from the video frame within the first predetermined period in which the specific person was recognized. When the feature amount detection means does not recognize the specific person in the recorded video within the first predetermined period, the thumbnail image generation means generates the thumbnail image from the video frame at the point when a second predetermined period, shorter than the first predetermined period, has elapsed from the recording start.

The image processing method of the present invention includes a feature amount detection step of detecting a feature amount of the face of a person contained in an input video and recognizing whether the person is a specific person. The method comprises: a feature amount storage step of storing, in a storage unit, the feature amount of the face of the specific person to be recognized in the feature amount detection step; a recording step of recording video on a recording medium; and a thumbnail image generation step. When, within a first predetermined period that starts at the recording start of the video recorded in the recording step and is shorter than the period up to the recording end, the specific person is recognized in the recorded video in the feature amount detection step based on the feature amount of the face of the specific person stored in advance in the storage unit, the thumbnail image generation step generates a thumbnail image, as a representative image of the recorded video, from the video frame within the first predetermined period in which the specific person was recognized. When the specific person is not recognized in the recorded video in the feature amount detection step within the first predetermined period, the thumbnail image generation step generates the thumbnail image from the video frame at the point when a second predetermined period, shorter than the first predetermined period, has elapsed from the recording start.

The program of the present invention causes a computer to execute: a feature amount detection step of detecting a feature amount of the face of a person contained in an input video and recognizing whether the person is a specific person; a feature amount storage step of storing, in a storage unit, the feature amount of the face of the specific person to be recognized in the feature amount detection step; a recording step of recording video on a recording medium; and a thumbnail image generation step. When, within a first predetermined period that starts at the recording start of the video recorded in the recording step and is shorter than the period up to the recording end, the specific person is recognized in the recorded video in the feature amount detection step based on the feature amount of the face of the specific person stored in advance in the storage unit, the thumbnail image generation step generates a thumbnail image, as a representative image of the recorded video, from the video frame within the first predetermined period in which the specific person was recognized. When the specific person is not recognized in the recorded video in the feature amount detection step within the first predetermined period, the thumbnail image generation step generates the thumbnail image from the video frame at the point when a second predetermined period, shorter than the first predetermined period, has elapsed from the recording start.

The recording medium of the present invention has the program described above recorded on it.

According to the present invention, it is possible to prevent the inconvenience of generating a thumbnail image that uses a person who was captured by chance as the representative image, and to determine whether the main subject of the shot is the specific person to be detected or some other subject. A thumbnail image that reflects the main subject can therefore be generated.

(First embodiment)
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a block diagram illustrating a configuration example of the video recording processing circuit of the image processing apparatus 100 according to the present embodiment. The image processing apparatus 100 is, for example, a video camera, and records audio together with video. Audio is not directly related to the present invention, so its description is omitted.

A video signal is input to the first terminal 101 from an imaging unit consisting of an optical system such as a lens, a sensor such as a CCD, and a camera signal processing circuit. The input video signal is compressed and encoded in the moving image encoding circuit 104 using an encoding method such as MPEG2. The details of the compression encoding are not directly related to the present invention, so their description is omitted.

The multiplexer 105 multiplexes audio data, subcode data, and the like with the compressed and encoded moving image data to create AV data. Video camera components other than those involved in this video recording processing are not described. The AV data multiplexed by the multiplexer 105 is recorded on the recording medium 107 by the media recording circuit 106. The recording medium may be an optical disk, a magnetic disk, a solid-state memory, or the like; any medium will do as long as it can record thumbnail images for the list together with the AV data. The system controller 103 controls each block in response to a recording start instruction input from the second terminal 102, and thereby controls the recording of the AV data.

Meanwhile, the representative image generation control circuit 116 controls the first switch 109 so that the video signal input via the first terminal 101 is stored in the frame memory 108 once per cycle of the face image detection processing described later. This cycle may be every frame if the face image detection processing capability is high enough, or it may be intermittent, depending on the processing capability and the required detection interval.

The feature amount detection circuit 112 detects image feature amounts in the video frame (video signal) stored in the frame memory 108. The present embodiment describes an example that uses image feature amounts for identifying a person's face image. For face image determination based on image feature amounts, algorithms have been proposed that use the hue of partial images and structures specific to face images, such as the eyes, nose, and mouth, and these are already in practical use, for example in personal authentication on portable devices. In the present embodiment, face image determination using image feature amounts is performed based on such an algorithm. Because the present invention does not depend on the particular feature amount detection method, a detailed description of the already known face image determination algorithms is omitted.

The feature amount storage memory (storage unit) 114 stores in advance the feature amounts required to recognize the specific person. For example, before recording with the image processing apparatus 100, a specific person such as a family member is imaged as the subject. At that time, the representative image generation control circuit 116 functions as the feature amount storage means and sets the second switch 113 to the writing side, so that the image feature amount output from the feature amount detection circuit 112 is stored in the feature amount storage memory 114. The representative image generation control circuit 116 performs this control of each unit based on instructions sent from the system controller 103.

The feature amount comparison circuit 115 compares the image feature amount output from the feature amount detection circuit 112 with the image feature amount stored in the feature amount storage memory 114, and thereby determines whether the video frame currently being shot contains the face image of the specific person. The feature amount comparison circuit 115 and the feature amount detection circuit 112 together function as the feature amount detection means. Based on the recording start information output from the system controller 103, the recording time counting circuit 117 counts the elapsed time from the start of recording to the video frame on which face image determination was performed.

The representative image generation control circuit 116 determines the video frame to be used as the representative image according to the face image feature amount comparison result from the feature amount comparison circuit 115 and the elapsed time from the start of recording counted by the recording time counting circuit 117. It then controls the third switch 118 to store the image of that video frame in the representative image memory 110. The details of this representative image determination method, which is based on the feature amount comparison result and the elapsed time from the start of recording, are described later with reference to a flowchart.

The thumbnail image generation circuit 111 generates thumbnail image data by reducing the representative image stored in the representative image memory 110 and applying any other necessary conversion. If a video frame containing the face image of the specific person was selected as the representative image, additional information stored in advance, such as the name of the specific person, can also be attached to the thumbnail image data. The generated thumbnail image data is then recorded on the recording medium 107 by the media recording circuit 106.
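The reduction step and the optional name tag can be illustrated with a short sketch. The text does not specify how the circuit shrinks the image, so the block averaging used here, the grayscale list-of-rows `Image` type, and the names `Thumbnail` and `make_thumbnail` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for a video frame: rows of grayscale pixel values.
Image = List[List[int]]


@dataclass
class Thumbnail:
    pixels: Image
    person_name: Optional[str] = None  # optional additional information


def make_thumbnail(frame: Image, factor: int,
                   person_name: Optional[str] = None) -> Thumbnail:
    """Shrink `frame` by averaging each factor x factor block of pixels.

    Block averaging is an assumed reduction method, not one named in the text.
    """
    height = len(frame) // factor
    width = len(frame[0]) // factor
    pixels = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [
                frame[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) // len(block))
        pixels.append(row)
    return Thumbnail(pixels, person_name)
```

Passing `person_name` only when the representative frame was chosen by face recognition would mirror the optional additional information described above.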

FIG. 2 is a flowchart illustrating an example of the processing procedure in the present embodiment for determining the representative image from the face image detection result and the elapsed time from the start of recording, and for creating thumbnail image data.
First, processing starts in step S201. Next, in step S202, the representative image generation control circuit 116 resets the face image detection flag (Flg), which holds the face image detection state, to "0", indicating the undetected state. The elapsed time t counted by the recording time counting circuit 117 is also reset to "0".

Next, in step S203, recording of the video signal is started under the control of the system controller 103. In step S204, the representative image generation control circuit 116 controls the first switch 109 to write the video frame F(t), on which face image detection is to be performed, into the frame memory 108.

Next, in step S205, the representative image generation control circuit 116 compares the elapsed time t with the first time period T1, which is the period during which face image detection is valid. If the elapsed time t is within the first time period T1, the process proceeds to the face image detection processing from step S206 onward. If the elapsed time t exceeds the first time period T1, the face image detection processing is skipped and the process proceeds to step S212.

The comparison in step S205 between the elapsed time t and the face image detection period (first time period) T1 serves to determine whether the main subject of the current shot is the person to be detected or some other subject. This determination rests on the observation that, in shooting with a home video camera, which does not follow a scenario, recording is usually started after the camera has been aimed at the main subject.

Next, in step S206, the representative image generation control circuit 116 checks the face image detection flag (Flg) to determine whether a face image has already been detected. If the flag is set to "1", indicating that detection has already occurred, the detection processing below is skipped and the process proceeds to step S212. If the flag is "0" and no face image has been detected yet, the process proceeds to the face image feature amount detection processing of step S207.

Next, in step S207, the feature amount detection circuit 112 detects, from the video frame F(t), the face image feature amount D(F(t)) used to identify the face image of the already registered specific person, and outputs it to the feature amount comparison circuit 115 as the feature amount P. In step S208, the feature amount comparison circuit 115 compares the face image feature amount M of the specific person stored in the feature amount storage memory 114 with the feature amount P using a predetermined function C(P, M), and calculates the comparison result K.

Next, in step S209, the feature amount comparison circuit 115 compares the comparison result K with a predetermined threshold Kth to determine whether the face image of the specific person has been detected in the video frame F(t). If K is equal to or greater than Kth, the face image is regarded as detected and the process proceeds to step S210. In step S210, the representative image generation control circuit 116 controls the third switch 118 to write the video frame F(t) into the representative image memory 110 as the representative image. In step S211, the representative image generation control circuit 116 sets the face image detection flag (Flg) to "1", indicating that detection has occurred.

If, on the other hand, K is less than Kth in step S209 and no face image has been detected, steps S210 and S211 are skipped and the process proceeds to step S212. In this way, face image detection is carried out until the elapsed time t exceeds the first time period T1, and the first video frame in which the face image is detected is stored in the representative image memory 110.
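The comparison in steps S208 and S209 can be sketched as follows. The text leaves the function C(P, M) and the threshold Kth unspecified, so the cosine similarity used here and the default value 0.9 are purely illustrative assumptions, as are the function names.

```python
import math
from typing import Sequence


def compare_features(p: Sequence[float], m: Sequence[float]) -> float:
    """Hypothetical C(P, M): cosine similarity, where larger means more alike."""
    dot = sum(a * b for a, b in zip(p, m))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in m))
    return dot / norm if norm else 0.0


def is_specific_person(p: Sequence[float], m: Sequence[float],
                       k_th: float = 0.9) -> bool:
    """S209: the person counts as detected when K = C(P, M) is at least Kth."""
    return compare_features(p, m) >= k_th
```

Any similarity measure for which a larger K means a closer match would fit the "K is equal to or greater than Kth" test in the same way.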

Next, in step S212, the representative image generation control circuit 116 compares the elapsed time t with the second time period T2, which is the representative image frame position used when no face image is detected within the first time period T1. If the elapsed time t exceeds the second time period T2, steps S213 and S214 are skipped and the process proceeds to step S215.

If, on the other hand, the elapsed time t is within the second time period T2 in step S212, the representative image generation control circuit 116 checks the face image detection flag (Flg) in step S213. If the flag is set to "1", indicating that detection has occurred, step S214 is skipped and the process proceeds to step S215.

If, on the other hand, the face image detection flag (Flg) is "0" in step S213 and no face image has been detected, the process proceeds to step S214, where the representative image generation control circuit 116 writes the current video frame F(t) into the representative image memory 110. Thus, while no face image has been detected, the contents of the representative image memory 110 are updated with the latest video frame F(t) until the elapsed time t exceeds the second time period T2.

Next, in step S215, the representative image generation control circuit 116 determines whether the system controller 103 has issued a recording stop instruction, for example in response to a user operation. If recording stop has not been instructed, the process proceeds to step S216 and processing continues. In step S216, the recording time counting circuit 117 increments the elapsed time t by a predetermined detection unit. As described above, if the face image detection processing capability is high and a short detection interval is required, t is incremented every frame; if a longer detection interval is used, t is incremented by the corresponding interval. After the increment, the process returns to step S204 and the processing is repeated.

If, on the other hand, recording stop has been instructed in step S215, the process proceeds to step S217, where the thumbnail image generation circuit 111 generates thumbnail image data from the video frame stored in the representative image memory 110. In step S218, the media recording circuit 106 records the generated thumbnail image data on the recording medium 107, and the processing ends.
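The loop of steps S202 to S217 can be condensed into a short sketch that tracks what ends up in the representative image memory. It assumes the sampled frames arrive as `(t, frame)` pairs in time order and that recognition is supplied as a callable; the names `select_representative` and `is_specific_person` are hypothetical, and the hardware switches and memories are reduced to plain variables.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def select_representative(
    frames: Iterable[Tuple[float, Frame]],
    is_specific_person: Callable[[Frame], bool],
    t1: float,
    t2: float,
) -> Optional[Frame]:
    """Return the frame left in the representative image memory at stop."""
    detected = False        # Flg, reset in S202
    representative = None   # contents of the representative image memory
    for t, frame in frames:  # S204: one sampled frame F(t) per iteration
        # S205 to S209: look for the person only within T1 and only until found
        if t <= t1 and not detected and is_specific_person(frame):
            representative = frame  # S210
            detected = True         # S211
        # S212 to S214: until T2, keep the newest frame as the fallback
        if t <= t2 and not detected:
            representative = frame
    return representative   # S217 would shrink this frame into the thumbnail
```

Because the detection check short-circuits once `t` exceeds `t1` or the flag is set, the recognizer is never called after that point, which matches the flow in which steps S207 onward are skipped.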

As the flowchart of FIG. 2 shows, the face image detection processing from step S207 onward is performed only while the elapsed time t is no greater than the first time period T1. The face image detection processing itself can therefore be stopped once the elapsed time t exceeds the first time period T1. In that case, the feature amount detection circuit 112, the feature amount comparison circuit 115, and related circuits can be implemented with power-saving measures such as stopping their clock supply. By limiting the actual detection time to a short period at the beginning of recording in this way, the effect on battery consumption can be kept small even when the image processing apparatus 100 of the present embodiment is applied to a portable device such as a video camera.

図３は、以上の処理によって代表画像を選択する具体例を示す図である。図３に示す例では、顔画像が検出されなかった場合のディフォルトの代表画像位置（第２の時間期間）Ｔ２＜顔画像検出期間（第１の時間期間）Ｔ１として説明する。 FIG. 3 is a diagram showing a specific example of selecting a representative image by the above processing. In the example illustrated in FIG. 3, description will be made assuming that the default representative image position (second time period) T2 <face image detection period (first time period) T1 when no face image is detected.

図３（ａ）に示す例では、経過時間ｔ＜Ｔ２＜Ｔ１であり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されたため、経過時間ｔの位置の映像フレームが代表画像として選択される。また、図３（ｂ）に示す例ではＴ２＜経過時間ｔ＜Ｔ１であり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されたため、経過時間ｔの位置の映像フレームが代表画像として選択される。一方、図３（ｃ）に示す例ではＴ２＜Ｔ１＜経過時間ｔであり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されなかったため、ディフォルトの代表画像位置Ｔ２の映像フレームが代表画像として選択される。 In the example shown in FIG. 3A, elapsed time t < T2 < T1, and a face image of the specific person is detected within the face image detection period T1, so the video frame at the position of the elapsed time t is selected as the representative image. In the example shown in FIG. 3B, T2 < elapsed time t < T1, and a face image of the specific person is likewise detected within the face image detection period T1, so the video frame at the position of the elapsed time t is selected as the representative image. In the example shown in FIG. 3C, on the other hand, T2 < T1 < elapsed time t, and no face image of the specific person was detected within the face image detection period T1, so the video frame at the default representative image position T2 is selected as the representative image.

図４は、代表画像を選択する他の具体例を示す図である。図４に示す例では、顔画像検出期間（第１の時間期間）Ｔ１＜顔画像が検出されなかった場合のディフォルトの代表画像位置（第２の時間期間）Ｔ２として説明する。 FIG. 4 is a diagram illustrating another specific example of selecting a representative image. In the example shown in FIG. 4, description will be made assuming that the face image detection period (first time period) T1 <the default representative image position (second time period) T2 when no face image is detected.

図４（ａ）に示す例では、経過時間ｔ＜Ｔ１＜Ｔ２であり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されたため、経過時間ｔの位置の映像フレームが代表画像として選択される。一方、図４（ｂ）に示す例ではＴ１＜経過時間ｔ＜Ｔ２であり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されなかったため、ディフォルトの代表画像位置Ｔ２の映像フレームが代表画像として選択される。また、図４（ｃ）に示す例ではＴ１＜Ｔ２＜経過時間ｔであり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されなかったため、ディフォルトの代表画像位置Ｔ２の映像フレームが代表画像として選択される。 In the example shown in FIG. 4A, elapsed time t < T1 < T2, and a face image of the specific person is detected within the face image detection period T1, so the video frame at the position of the elapsed time t is selected as the representative image. In the example shown in FIG. 4B, on the other hand, T1 < elapsed time t < T2, and no face image of the specific person was detected within the face image detection period T1, so the video frame at the default representative image position T2 is selected as the representative image. In the example shown in FIG. 4C, T1 < T2 < elapsed time t, and again no face image of the specific person was detected within the face image detection period T1, so the video frame at the default representative image position T2 is selected as the representative image.
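
The six cases of FIGS. 3 and 4 reduce to one rule, which the following minimal Python sketch tries to capture. The function name and parameters are hypothetical and not part of the patent; the sketch assumes the clip is at least as long as both T1 and T2, as in the figures, and treats a missing detection as None.

```python
from typing import Optional


def select_representative_time(detect_time: Optional[float],
                               t1: float,
                               t2: float) -> float:
    """Return the elapsed time of the frame chosen as the representative image.

    detect_time: elapsed time t at which the specific person's face was first
                 detected, or None if it was never detected (assumption).
    t1:          face image detection period T1.
    t2:          default representative image position T2.
    """
    # A detection only counts if it falls inside the detection period T1.
    if detect_time is not None and detect_time <= t1:
        return detect_time
    # Otherwise fall back to the default position T2.
    return t2


# FIG. 3 (T2 < T1) and FIG. 4 (T1 < T2), with illustrative numbers.
print(select_representative_time(3, t1=10, t2=5))     # FIG. 3A
print(select_representative_time(7, t1=10, t2=5))     # FIG. 3B
print(select_representative_time(None, t1=10, t2=5))  # FIG. 3C
print(select_representative_time(7, t1=5, t2=10))     # FIG. 4B
```

Note that the relative order of T1 and T2 does not change the rule itself; it only changes which cases can occur.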

以上のように本実施形態によれば、前述したように家庭用ビデオカメラにおいては、主となる被写体を狙ってから撮影を開始する場合が多い。このことに基づき、顔画像検出結果と撮影開始からの経過時間とを利用して、撮影の主たる被写体が検出対象となる人物であるのかまたは別の被写体であるのかを判定し、撮影意図を反映したサムネール画像を作成することが実現できる。 As described above, with a home video camera, shooting is often started after aiming at the main subject. Based on this, the present embodiment uses the face image detection result and the elapsed time from the start of shooting to determine whether the main subject of the shot is the person to be detected or some other subject, and can thereby create a thumbnail image that reflects the intention of the shot.

（第２の実施形態） (Second Embodiment)
図５は、本実施形態における画像処理装置５００の映像記録処理回路の構成例を示すブロック図である。 FIG. 5 is a block diagram illustrating a configuration example of a video recording processing circuit of the image processing apparatus 500 in the present embodiment.
第１の端子５０１には、レンズ等の光学系、ＣＣＤ等のセンサー、カメラ信号処理回路からなる撮像手段から映像信号が入力される。入力された映像信号は、動画像符号化回路５０４において、例えばＭＰＥＧ２などの符号化方式を用いて圧縮符号化される。マルチプレクサ５０５は、圧縮符号化された動画像データに音声データ、サブコードデータ等を多重化する。なお、第１の実施形態と同様に、これらの映像記録処理以外のビデオカメラ構成要素については説明を省略する。 A video signal is input to the first terminal 501 from an imaging unit including an optical system such as a lens, a sensor such as a CCD, and a camera signal processing circuit. The input video signal is compressed and encoded by the moving image encoding circuit 504 using an encoding method such as MPEG2. The multiplexer 505 multiplexes audio data, subcode data, and the like on the compressed and encoded moving image data. Similar to the first embodiment, description of the video camera components other than the video recording processing is omitted.

マルチプレクサ５０５によって多重化されたＡＶデータは、メディア記録回路５０６によって記録メディア５０７に記録される。システムコントローラ５０３は、第２の端子５０２から入力される記録開始指示によって各ブロックを制御し、ＡＶデータの記録をコントロールする。 The AV data multiplexed by the multiplexer 505 is recorded on the recording medium 507 by the media recording circuit 506. The system controller 503 controls each block according to a recording start instruction input from the second terminal 502 and controls recording of AV data.

一方、第１の端子５０１を介して入力された映像信号は、代表画像生成制御回路５１６が第１のスイッチ５１３を制御することにより、後述の顔画像検出処理のサイクルごとにフレームメモリ５０８に記憶される。なお、このサイクルは顔画像検出処理能力が高ければ毎フレームごとでもよく、処理能力や必要な検出間隔に応じて間歇的であってもよい。 Meanwhile, the video signal input via the first terminal 501 is stored in the frame memory 508 at each cycle of the face image detection processing described later, with the representative image generation control circuit 516 controlling the first switch 513. This cycle may be every frame if the face image detection processing capability is high, or may be intermittent depending on the processing capability and the required detection interval.

特徴量検出回路５１２は、フレームメモリ５０８に記憶された映像フレーム（映像信号）に対して画像特徴量の検出を行う。本実施形態では、人物の顔画像を判定するための画像特徴量を用いた例について説明する。 The feature amount detection circuit 512 detects an image feature amount for a video frame (video signal) stored in the frame memory 508. In the present embodiment, an example using an image feature amount for determining a human face image will be described.

特徴量記憶メモリ（記憶部）５１４には、特定人物の認識に必要な特徴量が予め記憶されている。例えば、画像処理装置５００で記録する前に、家族などの特定人物を被写体として撮像し、その際に、代表画像生成制御回路５１６が特徴量記憶手段として機能し、第２のスイッチ５１８を書き込み側に制御する。これにより、特徴量検出回路５１２から出力される画像特徴量を特徴量記憶メモリ５１４に記憶することができる。これらの制御は、システムコントローラ５０３から送られる指示に基づいて代表画像生成制御回路５１６が各部を制御することにより行う。 The feature amount storage memory (storage unit) 514 stores in advance the feature amounts needed to recognize a specific person. For example, before recording with the image processing apparatus 500, a specific person such as a family member is imaged as a subject; at that time, the representative image generation control circuit 516 functions as a feature amount storage unit and sets the second switch 518 to the writing side. The image feature amount output from the feature amount detection circuit 512 can thereby be stored in the feature amount storage memory 514. This control is performed by the representative image generation control circuit 516 controlling each unit based on instructions sent from the system controller 503.

特徴量比較回路５１５は、特徴量検出回路５１２から出力される画像特徴量と、特徴量記憶メモリ５１４に記憶されている画像特徴量とを比較することによって現在撮影中の映像フレームに特定人物の顔画像が含まれているか否かを判定する。そして、特徴量比較回路５１５及び特徴量検出回路５１２が、特徴量検出手段として機能する。記録時間計数回路５１７は、システムコントローラ５０３から出力される記録開始情報に基づき、記録開始から顔画像判定を行った映像フレームまでの経過時間を計数する。 The feature amount comparison circuit 515 compares the image feature amount output from the feature amount detection circuit 512 with the image feature amount stored in the feature amount storage memory 514, and thereby determines whether or not the video frame currently being captured contains a face image of the specific person. The feature amount comparison circuit 515 and the feature amount detection circuit 512 together function as a feature amount detection unit. Based on the recording start information output from the system controller 503, the recording time counting circuit 517 counts the elapsed time from the start of recording to the video frame on which the face image determination was performed.

代表画像生成制御回路５１６は、特徴量比較回路５１５による顔画像の特徴量比較結果と、記録時間計数回路５１７において計数された記録開始からの経過時間とに応じて、代表画像とすべき映像フレームを決定する。そして、この時点の映像フレームを再生時に特定するためのインデックス情報を代表画像インデックスメモリ５１９に記憶する。インデックス情報としては、記録しているカット先頭からのフレーム番号などを利用する。本実施形態は、第１の実施形態と異なり、実際の代表画像を記憶するかわりにインデックス情報を記憶することによって代表画像候補を保持している。これによって撮影時に必要なメモリ容量を削減することができる。 The representative image generation control circuit 516 determines the video frame to be used as the representative image according to the face image feature amount comparison result from the feature amount comparison circuit 515 and the elapsed time from the start of recording counted by the recording time counting circuit 517. It then stores, in the representative image index memory 519, index information for identifying that video frame at playback time. A frame number counted from the head of the cut being recorded, for example, is used as the index information. Unlike the first embodiment, this embodiment holds the representative image candidate by storing index information instead of the actual representative image. This reduces the memory capacity required during shooting.

代表画像の読み出しは、記録終了後にこのインデックス情報を用いて、記録メディア５０７に記録した動画データを再生することによって行う。システムコントローラ５０３は、記録終了後にメディア再生回路５２０を制御し、記録メディア５０７から今回の記録データのうちインデックス情報で指定された映像フレームの再生に必要なデータを読み出す。 The representative image is read out after recording ends by using this index information to play back the moving image data recorded on the recording medium 507. After recording ends, the system controller 503 controls the media playback circuit 520 to read, from the data just recorded on the recording medium 507, the data needed to play back the video frame specified by the index information.

読み出されたＡＶデータからデマルチプレクサ５２１によって映像データを取り出し、動画像復号化回路５２２によって圧縮符号化された動画像データから動画像を再生する。そして、再生された動画像中におけるインデックス情報で指定された映像フレームを第３のスイッチ５２３を制御することによって代表画像メモリ５１０に記憶する。 Video data is extracted from the read AV data by the demultiplexer 521, and a moving image is reproduced from the moving image data compressed and encoded by the moving image decoding circuit 522. Then, the video frame specified by the index information in the reproduced moving image is stored in the representative image memory 510 by controlling the third switch 523.

サムネール画像生成回路５１１は、代表画像メモリ５１０に記憶された代表画像を縮小その他必要な変換処理を行ってサムネール画像データを生成する。この際、特定人物の顔画像を含む映像フレームが代表画像フレームとして選択された場合は、予め記憶させておいた特定人物の名前等の付加情報をサムネール画像データに付加することも可能である。そして、生成されたサムネール画像データをメディア記録回路５０６によって記録メディア５０７に記録する。 The thumbnail image generation circuit 511 generates thumbnail image data by reducing the representative image stored in the representative image memory 510 and applying any other necessary conversion processing. If a video frame containing a face image of the specific person has been selected as the representative image frame, additional information stored in advance, such as the name of the specific person, can also be attached to the thumbnail image data. The generated thumbnail image data is then recorded on the recording medium 507 by the media recording circuit 506.
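
As a rough illustration of the reduction and optional name tagging described above, the following Python sketch shrinks a grayscale frame by averaging square blocks of pixels. The patent does not specify the reduction method or the data layout; block averaging, the dictionary layout, and the names make_thumbnail, factor, and person_name are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def make_thumbnail(frame: List[List[int]],
                   factor: int,
                   person_name: Optional[str] = None) -> Dict[str, object]:
    """Reduce a non-empty grayscale frame by averaging factor x factor blocks.

    person_name is optional additional information, attached only when the
    representative frame was chosen because the specific person was detected.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # Ignore any partial blocks at the right and bottom edges.
    height = len(frame) - len(frame) % factor
    width = len(frame[0]) - len(frame[0]) % factor
    pixels: List[List[int]] = []
    for y in range(0, height, factor):
        row: List[int] = []
        for x in range(0, width, factor):
            block = [frame[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))  # integer mean of the block
        pixels.append(row)
    thumbnail: Dict[str, object] = {"pixels": pixels}
    if person_name is not None:
        thumbnail["name"] = person_name
    return thumbnail
```

A real device would more likely reduce in hardware and store the result in a standard image format; the sketch only shows where the additional information would be attached.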

図６−１及び図６−２は、本実施形態において、顔画像検出結果と記録開始からの経過時間とによって代表画像を決定し、サムネール画像データを作成する処理手順の一例を示すフローチャートである。 FIGS. 6A and 6B are flowcharts illustrating an example of a processing procedure for determining a representative image based on a face image detection result and an elapsed time from the start of recording and creating thumbnail image data in the present embodiment.
まず、ステップＳ６０１において、処理を開始する。次に、ステップＳ６０２において、代表画像生成制御回路５１６は、顔画像検出状態を保持する顔画像検出フラグ（Flg）を未検出状態を表す「０」にリセットする。また、記録時間計数回路５１７で計数される経過時間ｔを「０」にリセットする。 First, in step S601, processing is started. In step S602, the representative image generation control circuit 516 resets the face image detection flag (Flg) that holds the face image detection state to “0” that represents the undetected state. Further, the elapsed time t counted by the recording time counting circuit 517 is reset to “0”.

次に、ステップＳ６０３において、システムコントローラ５０３の制御によって映像信号の記録を開始する。そして、ステップＳ６０４において、代表画像生成制御回路５１６は第１のスイッチ５１３を制御して、顔画像検出処理を行うための映像フレームＦ（ｔ）をフレームメモリ５０８に書き込む。 Next, in step S603, recording of a video signal is started under the control of the system controller 503. In step S604, the representative image generation control circuit 516 controls the first switch 513 to write the video frame F (t) for performing the face image detection process in the frame memory 508.

次に、ステップＳ６０５において、代表画像生成制御回路５１６は、経過時間ｔと顔画像検出の有効期間である第１の時間期間Ｔ１とを比較する。この比較の結果、経過時間ｔが第１の時間期間Ｔ１以内である場合は、ステップＳ６０６以降の顔画像検出処理に進む。一方、ステップＳ６０５の比較の結果、経過時間ｔが第１の時間期間Ｔ１を越えている場合は、顔画像検出処理をスキップして、ステップＳ６１２に進む。 Next, in step S605, the representative image generation control circuit 516 compares the elapsed time t with a first time period T1 that is an effective period of face image detection. As a result of this comparison, when the elapsed time t is within the first time period T1, the process proceeds to the face image detection process after step S606. On the other hand, if the elapsed time t exceeds the first time period T1 as a result of the comparison in step S605, the face image detection process is skipped and the process proceeds to step S612.

次に、ステップＳ６０６において、代表画像生成制御回路５１６は、顔画像検出フラグ（Flg）をチェックすることにより、顔画像検出が既になされているかどうか判定する。この判定の結果、顔画像検出フラグ（Flg）が検出済みを表す「１」にセットされている場合は、以下の検出処理をスキップしてステップＳ６１２に進む。一方、ステップＳ６０６の判定の結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出の場合は、ステップＳ６０７の顔画像特徴量検出処理に進む。 In step S606, the representative image generation control circuit 516 checks the face image detection flag (Flg) to determine whether face image detection has already been performed. As a result of this determination, if the face image detection flag (Flg) is set to “1” indicating that detection has been completed, the following detection processing is skipped and the process proceeds to step S612. On the other hand, if the result of determination in step S606 is that the face image detection flag (Flg) is “0” and no face image has been detected, the process proceeds to face image feature amount detection processing in step S607.

次に、ステップＳ６０７において、特徴量検出回路５１２は、既に記憶されている特定人物の顔画像を判定するための顔画像特徴量Ｄ（Ｆ（ｔ））を映像フレームＦ（ｔ）から検出して、特徴量Ｐを出力する。そして、ステップＳ６０８において、特徴量比較回路５１５は、特徴量記憶メモリ５１４に記憶された特定人物の顔画像特徴量Ｍと特徴量Ｐとを所定の関数Ｃ（Ｐ，Ｍ）を用いて比較し、比較結果Ｋを算出する。 Next, in step S607, the feature amount detection circuit 512 detects a face image feature amount D (F (t)) for determining a face image of a specific person already stored from the video frame F (t). The feature amount P is output. In step S608, the feature amount comparison circuit 515 compares the face image feature amount M and the feature amount P of the specific person stored in the feature amount storage memory 514 using a predetermined function C (P, M). The comparison result K is calculated.

次に、ステップＳ６０９において、特徴量比較回路５１５は、比較結果Ｋを所定の閾値Ｋｔｈと比較することによって、映像フレームＦ（ｔ）内に特定人物の顔画像が検出されたか否かを判定する。この判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ以上であり、顔画像が検出された場合は、ステップＳ６１０に進む。そして、ステップＳ６１０において、代表画像生成制御回路５１６は、映像フレームＦ（ｔ）のインデックス情報を代表画像インデックスメモリ５１９に書き込む。そして、ステップＳ６１１において、代表画像生成制御回路５１６は、顔画像検出フラグ（Flg）を検出済みを示す「１」にセットする。本実施形態では、インデックス情報として経過時間ｔを用いている。 Next, in step S609, the feature amount comparison circuit 515 determines whether or not a face image of a specific person has been detected in the video frame F (t) by comparing the comparison result K with a predetermined threshold value Kth. If it is determined that the comparison result K is equal to or greater than the predetermined threshold value Kth and a face image is detected, the process proceeds to step S610. In step S610, the representative image generation control circuit 516 writes the index information of the video frame F (t) in the representative image index memory 519. In step S611, the representative image generation control circuit 516 sets the face image detection flag (Flg) to “1” indicating that detection has been performed. In this embodiment, the elapsed time t is used as index information.

一方、ステップＳ６０９の判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ未満であり、顔画像が検出されなかった場合は、以下のステップＳ６１０、Ｓ６１１の処理をスキップしてステップＳ６１２に進む。このように、経過時間ｔが第１の時間期間Ｔ１を越えるまで顔画像検出を実行し、検出された先頭の顔画像に対するインデックス情報が代表画像インデックスメモリ５１９に記憶される。 On the other hand, as a result of the determination in step S609, if the comparison result K is less than the predetermined threshold value Kth and no face image is detected, the processing in steps S610 and S611 below is skipped and the process proceeds to step S612. In this way, face image detection is executed until the elapsed time t exceeds the first time period T1, and index information for the detected first face image is stored in the representative image index memory 519.
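
The comparison of steps S608 and S609 can be sketched as follows. The patent leaves the function C(P, M) unspecified, so this example substitutes cosine similarity purely as an assumption; the feature vectors, the function names, and the sample threshold are likewise hypothetical.

```python
import math
from typing import Sequence


def compare_features(p: Sequence[float], m: Sequence[float]) -> float:
    """Hypothetical C(P, M): cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(p, m))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in m))
    # Treat an all-zero vector as "no similarity" rather than dividing by zero.
    return dot / norm if norm else 0.0


def face_detected(p: Sequence[float], m: Sequence[float], k_th: float) -> bool:
    """Step S609: the face counts as detected when K is at least Kth."""
    k = compare_features(p, m)  # step S608: comparison result K
    return k >= k_th


stored_m = [0.2, 0.9, 0.4]   # feature amount M registered in advance
current_p = [0.1, 0.8, 0.5]  # feature amount P from the current frame
print(face_detected(current_p, stored_m, k_th=0.9))
```

Any similarity measure where a larger K means a closer match would fit the "K is equal to or greater than Kth" test in the text; a distance measure would need the inequality reversed.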

ステップＳ６１２において、代表画像生成制御回路５１６は、顔画像が第１の時間期間Ｔ１内に検出されなかった場合の代表画像フレーム位置である第２の時間期間Ｔ２と経過時間ｔとを比較する。この比較の結果、経過時間ｔが第２の時間期間Ｔ２を越えている場合は、以下のステップＳ６１３、Ｓ６１４の処理をスキップしてステップＳ６１５に進む。 In step S612, the representative image generation control circuit 516 compares the second time period T2, which is a representative image frame position when the face image is not detected within the first time period T1, with the elapsed time t. As a result of this comparison, when the elapsed time t exceeds the second time period T2, the processing of the following steps S613 and S614 is skipped and the process proceeds to step S615.

一方、ステップＳ６１２の比較の結果、経過時間ｔが第２の時間期間Ｔ２以内である場合は、ステップＳ６１３において、代表画像生成制御回路５１６は、顔画像検出フラグ（Flg）をチェックする。このチェックの結果、顔画像検出フラグ（Flg）が検出済みを表す「１」にセットされている場合は、以下のステップＳ６１４の検出処理をスキップしてステップＳ６１５に進む。 On the other hand, if the elapsed time t is within the second time period T2 as a result of the comparison in step S612, the representative image generation control circuit 516 checks the face image detection flag (Flg) in step S613. As a result of this check, when the face image detection flag (Flg) is set to “1” indicating that detection has been completed, the detection process of step S614 below is skipped and the process proceeds to step S615.

一方、ステップＳ６１３のチェックの結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ６１４に進む。そして、ステップＳ６１４において、代表画像生成制御回路５１６は、現在の映像フレームＦ（ｔ）のインデックス情報を代表画像インデックスメモリ５１９に書き込む。このように、顔画像が検出されていない場合は経過時間ｔが第２の時間期間Ｔ２を越えるまで代表画像インデックスメモリ５１９に記憶される内容は、最新の映像フレームＦ（ｔ）のインデックス情報によって更新される。 On the other hand, as a result of the check in step S613, if the face image detection flag (Flg) is “0” and no face image is detected, the process proceeds to step S614. In step S614, the representative image generation control circuit 516 writes the index information of the current video frame F (t) in the representative image index memory 519. In this way, while no face image has been detected, the content of the representative image index memory 519 is updated with the index information of the latest video frame F (t) until the elapsed time t exceeds the second time period T2.

次に、ステップＳ６１５において、代表画像生成制御回路５１６は、ユーザの操作等によってシステムコントローラ５０３から記録停止が指示されたか否かを判定する。この判定の結果、記録停止が指示されていない場合は、ステップＳ６１６に進み、処理を継続する。そして、ステップＳ６１６において、記録時間計数回路５１７は、経過時間ｔを所定の検出単位でインクリメントする。前述したように、顔画像検出処理能力が高くかつ要求される検出間隔が短い場合は毎フレームごとにインクリメントを行い、検出間隔を長く取る場合は所定のインターバル処理となるようにインクリメントする。そして、インクリメントした後はステップＳ６０４に戻り、処理を繰り返す。 Next, in step S615, the representative image generation control circuit 516 determines whether or not a recording stop is instructed from the system controller 503 by a user operation or the like. If the result of this determination is that there is no instruction to stop recording, processing proceeds to step S616 and processing continues. In step S616, the recording time counting circuit 517 increments the elapsed time t by a predetermined detection unit. As described above, when the face image detection processing capability is high and the required detection interval is short, the increment is performed every frame, and when the detection interval is long, the increment is performed so as to be a predetermined interval process. After the increment, the process returns to step S604 and the process is repeated.

一方、ステップＳ６１５の判定の結果、記録停止が指示された場合は、ステップＳ６１７に進み、システムコントローラ５０３の制御により、代表画像インデックスメモリ５１９からインデックス情報ｔｍを読み出す。そして、ステップＳ６１８において、メディア再生回路５２０は、読み出したインデックス情報ｔｍを用いて今回記録したＡＶデータを読み出し、代表画像として選択された映像フレームＦ（ｔｍ）を再生する。 On the other hand, if recording stop is instructed as a result of the determination in step S615, the process proceeds to step S617, and the index information tm is read from the representative image index memory 519 under the control of the system controller 503. In step S618, the media playback circuit 520 reads the AV data recorded this time using the read index information tm, and plays back the video frame F (tm) selected as the representative image.

次に、ステップＳ６１９において、代表画像生成制御回路５１６は、映像フレームＦ（ｔｍ）を代表画像メモリ５１０に書き込む。そして、ステップＳ６２０において、サムネール画像生成回路５１１は、代表画像メモリ５１０に記憶されている映像フレームからサムネール画像データを生成する。そして、ステップＳ６２１において、メディア記録回路５０６は、生成されたサムネール画像データを記録メディア５０７に記録して処理を終了する。 Next, in step S619, the representative image generation control circuit 516 writes the video frame F (tm) in the representative image memory 510. In step S 620, the thumbnail image generation circuit 511 generates thumbnail image data from the video frame stored in the representative image memory 510. In step S621, the media recording circuit 506 records the generated thumbnail image data on the recording medium 507 and ends the process.
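
The recording loop of steps S602 to S617 can be summarized in a short Python sketch. This is a simplified model under stated assumptions rather than the circuit itself: the per-frame detection result is passed in as a list of booleans standing in for steps S607 to S609, the elapsed time t is counted in detection units starting at 0, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def record_and_pick_index(face_in_frame: Sequence[bool],
                          t1: int,
                          t2: int) -> Optional[int]:
    """Return the index information tm left in the index memory at stop.

    face_in_frame[t] is True when the specific person would be detected in
    frame F(t). Recording stops when the sequence ends.
    """
    flag = False                        # S602: Flg reset to 0
    index_memory: Optional[int] = None  # representative image index memory 519
    for t, has_face in enumerate(face_in_frame):  # S604 and S616
        if t <= t1 and not flag:        # S605 and S606
            if has_face:                # stand-in for S607 to S609
                index_memory = t        # S610: store index of F(t)
                flag = True             # S611: Flg set to 1
        if t <= t2 and not flag:        # S612 and S613
            index_memory = t            # S614: keep the latest frame index
    return index_memory                 # S617: read out tm


# Face first appears at t = 2, inside T1 = 5, so tm = 2.
print(record_and_pick_index([False, False, True] + [False] * 7, t1=5, t2=3))
# No face at all: the index stops updating once t exceeds T2 = 3.
print(record_and_pick_index([False] * 10, t1=5, t2=3))
```

Because only an integer index is carried through the loop, the frame itself has to be decoded again from the recording medium afterwards, which is what steps S618 and S619 do.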

（第３の実施形態） (Third Embodiment)
第２の実施形態の構成では、代表画像となる映像フレームの読み出しは記録終了後に行われる。そこで本実施形態では、実際の撮影時間に応じて顔画像検出期間（第１の時間期間）Ｔ１を変える例について説明する。 In the configuration of the second embodiment, the reading of the video frame serving as the representative image is performed after the recording is completed. Therefore, in the present embodiment, an example in which the face image detection period (first time period) T1 is changed according to the actual shooting time will be described.

図８−１及び図８−２は、本実施形態において、顔画像検出結果と記録開始からの経過時間とによって代表画像を決定し、サムネール画像データを作成する処理手順の一例を示すフローチャートである。なお、図８−１及び図８−２に示すフローチャートにおける各処理を実行するための映像記録処理回路は第２の実施形態と同様であるため説明を省略する。 FIG. 8A and FIG. 8B are flowcharts illustrating an example of a processing procedure for determining a representative image based on a face image detection result and an elapsed time from the start of recording and creating thumbnail image data in the present embodiment. The video recording processing circuit for executing each process in the flowcharts shown in FIGS. 8A and 8B is the same as that in the second embodiment, and a description thereof will be omitted.

まず、ステップＳ８０１において、処理を開始する。次に、ステップＳ８０２において、代表画像生成制御回路５１６は、顔画像検出状態を保持する顔画像検出フラグ（Flg）を未検出状態を表す「０」にリセットする。また、記録時間計数回路５１７で計数される経過時間ｔを「０」にリセットする。 First, in step S801, processing is started. Next, in step S802, the representative image generation control circuit 516 resets the face image detection flag (Flg) holding the face image detection state to “0” indicating the undetected state. Further, the elapsed time t counted by the recording time counting circuit 517 is reset to “0”.

次に、ステップＳ８０３において、システムコントローラ５０３の制御によって映像信号の記録を開始する。そして、ステップＳ８０４において、代表画像生成制御回路５１６は第１のスイッチ５１３を制御することにより、顔画像検出処理を行うための映像フレームＦ（ｔ）をフレームメモリ５０８に書き込む。 Next, in step S803, recording of the video signal is started under the control of the system controller 503. In step S804, the representative image generation control circuit 516 controls the first switch 513 to write the video frame F (t) for performing the face image detection process in the frame memory 508.

次に、ステップＳ８０５において、代表画像生成制御回路５１６は、顔画像検出フラグ（Flg）をチェックすることにより、顔画像検出が既になされているかどうかを判定する。この判定の結果、顔画像検出フラグ（Flg）が、検出済みを表す「１」にセットされている場合は、以下の検出処理をスキップしてステップＳ８１２に進む。一方、ステップＳ８０５の判定の結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ８０６の顔画像特徴量検出処理に進む。 In step S805, the representative image generation control circuit 516 checks the face image detection flag (Flg) to determine whether face image detection has already been performed. As a result of this determination, if the face image detection flag (Flg) is set to “1” indicating detection completion, the following detection processing is skipped and the process proceeds to step S812. On the other hand, as a result of the determination in step S805, if the face image detection flag (Flg) is “0” and no face image is detected, the process proceeds to face image feature amount detection processing in step S806.

次に、ステップＳ８０６において、特徴量検出回路５１２は、既に記憶されている特定人物の顔画像を判定するための顔画像特徴量Ｄ（Ｆ（ｔ））を映像フレームＦ（ｔ）から検出して、特徴量比較回路５１５に特徴量Ｐを出力する。そして、ステップＳ８０７において、特徴量比較回路５１５は、特徴量記憶メモリ５１４に記憶された特定人物の顔画像特徴量Ｍと特徴量Ｐとを所定の関数Ｃ（Ｐ，Ｍ）を用いて比較し、比較結果Ｋを算出する。 In step S806, the feature amount detection circuit 512 detects a face image feature amount D (F (t)) for determining a face image of a specific person that has already been stored from the video frame F (t). The feature amount P is output to the feature amount comparison circuit 515. In step S807, the feature amount comparison circuit 515 compares the face image feature amount M and the feature amount P of the specific person stored in the feature amount storage memory 514 using a predetermined function C (P, M). The comparison result K is calculated.

次に、ステップＳ８０８において、特徴量比較回路５１５は、比較結果Ｋを所定の閾値Ｋｔｈと比較することによって、映像フレームＦ（ｔ）内に特定人物の顔画像が検出されたか否かを判定する。この判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ以上であり、顔画像が検出された場合は、ステップＳ８０９に進む。そして、ステップＳ８０９において、代表画像生成制御回路５１６は、映像フレームＦ（ｔ）のインデックス情報を代表画像インデックスメモリ５１９に書き込む。そして、ステップＳ８１０において、代表画像生成制御回路５１６は、顔画像検出フラグ（Flg）を検出済みを示す「１」にセットする。本実施形態では、インデックス情報として経過時間ｔを用いている。 Next, in step S808, the feature amount comparison circuit 515 determines whether or not a face image of a specific person has been detected in the video frame F (t) by comparing the comparison result K with a predetermined threshold value Kth. If it is determined that the comparison result K is equal to or greater than the predetermined threshold value Kth and a face image is detected, the process proceeds to step S809. In step S809, the representative image generation control circuit 516 writes the index information of the video frame F (t) in the representative image index memory 519. In step S810, the representative image generation control circuit 516 sets the face image detection flag (Flg) to “1” indicating that detection has been performed. In this embodiment, the elapsed time t is used as index information.

一方、ステップＳ８０８の判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ未満であり、顔画像が検出されなかった場合は、以下のステップＳ８０９、Ｓ８１０の処理をスキップしてステップＳ８１２に進む。このように、検出された先頭の顔画像に対するインデックス情報が代表画像インデックスメモリ５１９に記憶される。本実施形態と第２の実施形態とが異なる点は、ビデオデータの記録時には顔画像検出の時間制限（第１の時間期間）Ｔ１が無いという点である。 On the other hand, if the result of determination in step S808 is that the comparison result K is less than the predetermined threshold value Kth and no face image is detected, the processing of steps S809 and S810 below is skipped and the process proceeds to step S812. In this way, index information for the detected first face image is stored in the representative image index memory 519. The difference between this embodiment and the second embodiment is that there is no face image detection time limit (first time period) T1 when video data is recorded.

ステップＳ８１２において、ユーザの操作等によってシステムコントローラ５０３から記録停止が指示されたか否かを判定する。この判定の結果、記録停止が指示されていない場合は、ステップＳ８１１に進み、処理を継続する。そして、ステップＳ８１１において、記録時間計数回路５１７は、経過時間ｔを所定の検出単位でインクリメントする。前述したように、顔画像検出処理能力が高くかつ要求される検出間隔が短い場合は毎フレームごとにインクリメントを行い、検出間隔を長く取る場合は所定のインターバル処理となるようにインクリメントする。そして、インクリメントした後はステップＳ８０４に戻り、処理を繰り返す。 In step S812, it is determined whether or not recording stop is instructed from the system controller 503 by a user operation or the like. As a result of the determination, if recording stop is not instructed, the process proceeds to step S811, and the process is continued. In step S811, the recording time counting circuit 517 increments the elapsed time t by a predetermined detection unit. As described above, when the face image detection processing capability is high and the required detection interval is short, the increment is performed every frame, and when the detection interval is long, the increment is performed so as to be a predetermined interval process. After the increment, the process returns to step S804 to repeat the process.

一方、ステップＳ８１２の判定の結果、記録停止が指示された場合は、ステップＳ８１３に進み、代表画像生成制御回路５１６は期間変更手段として機能し、記録開始から記録停止までの経過時間Ｔ３から、第１の時間期間Ｔ１を計算する。本実施形態では、Ｔ１＝Ｔ３／３となるよう第１の時間期間Ｔ１を設定している。すなわち、記録開始からクリップの先頭１／３までが顔画像検出の有効期間となる。 On the other hand, if the result of the determination in step S812 is that a recording stop has been instructed, the process proceeds to step S813, where the representative image generation control circuit 516 functions as a period changing unit and calculates the first time period T1 from the elapsed time T3 from recording start to recording stop. In the present embodiment, the first time period T1 is set so that T1 = T3 / 3. That is, the effective period of face image detection runs from the start of recording through the first third of the clip.

次に、ステップＳ８１４において、代表画像生成制御回路５１６は、ディフォルトの代表画像フレーム位置である第２の時間期間Ｔ２と、記録開始から記録停止までの経過時間Ｔ３とを比較し、第２の時間期間Ｔ２の有効性を検証する。この比較の結果、経過時間Ｔ３が第２の時間期間Ｔ２未満である場合は、第２の時間期間Ｔ２が経過した時の映像フレームは存在しないため、ステップＳ８１５に進み、代表画像生成制御回路５１６は、第２の時間期間Ｔ２を経過時間Ｔ３に置き換える。一方、ステップＳ８１４の比較の結果、経過時間Ｔ３が第２の時間期間Ｔ２以上である場合は、第２の時間期間Ｔ２が経過した時の映像フレームが存在するため、ステップＳ８１５をスキップしてステップＳ８１６に進む。 Next, in step S814, the representative image generation control circuit 516 compares the second time period T2, which is the default representative image frame position, with the elapsed time T3 from recording start to recording stop, and verifies the validity of the second time period T2. If this comparison shows that the elapsed time T3 is less than the second time period T2, no video frame exists at the point where the second time period T2 would have elapsed, so the process proceeds to step S815, where the representative image generation control circuit 516 replaces the second time period T2 with the elapsed time T3. On the other hand, if the comparison in step S814 shows that the elapsed time T3 is equal to or longer than the second time period T2, a video frame exists at the point where the second time period T2 elapsed, so step S815 is skipped and the process proceeds to step S816.

次に、ステップＳ８１６において、代表画像生成制御回路５１６は、顔画像検出フラグ（Flg）をチェックする。このチェックの結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ８１９に進み、代表画像生成制御回路５１６は、代表画像インデックスｔｍの時間を第２の時間期間Ｔ２とする。一方、ステップＳ８１６のチェックの結果、顔画像検出フラグ（Flg）が検出済みを表す「１」にセットされている場合は、ステップＳ８１７に進む。そして、ステップＳ８１７において、システムコントローラ５０３の制御により、代表画像インデックスメモリ５１９からインデックス情報ｔｍを読み出す。 Next, in step S816, the representative image generation control circuit 516 checks the face image detection flag (Flg). If this check shows that the face image detection flag (Flg) is “0” and no face image has been detected, the process proceeds to step S819, where the representative image generation control circuit 516 sets the time of the representative image index tm to the second time period T2. On the other hand, if the check in step S816 shows that the face image detection flag (Flg) is set to “1”, indicating that detection has been performed, the process proceeds to step S817. In step S817, the index information tm is read from the representative image index memory 519 under the control of the system controller 503.

次に、ステップＳ８１８において、代表画像生成制御回路５１６は、インデックス情報ｔｍの時間と顔画像検出の有効期間である第１の時間期間Ｔ１とを比較する。この比較の結果、インデックス情報ｔｍの時間が第１の時間期間Ｔ１以内である場合は、ステップＳ８２０以降の代表画像フレームを読み出す処理に進む。一方、ステップＳ８１８の比較の結果、インデックス情報ｔｍの時間が第１の時間期間Ｔ１を越えている場合は、ステップＳ８１９に進み、代表画像生成制御回路５１６は、代表画像インデックスｔｍの時間を第２の時間期間Ｔ２とする。 Next, in step S818, the representative image generation control circuit 516 compares the time of the index information tm with the first time period T1, which is the effective period of face image detection. If this comparison shows that the time of the index information tm is within the first time period T1, the process proceeds to the processing for reading the representative image frame from step S820 onward. On the other hand, if the comparison in step S818 shows that the time of the index information tm exceeds the first time period T1, the process proceeds to step S819, where the representative image generation control circuit 516 sets the time of the representative image index tm to the second time period T2.

次に、ステップＳ８２０において、メディア再生回路５２０は、このインデックス情報ｔｍを用いて今回記録したＡＶデータを読み出し、代表画像として選択された映像フレームＦ（ｔｍ）を再生する。そして、ステップＳ８２１において、代表画像生成制御回路５１６は、映像フレームＦ（ｔｍ）を代表画像メモリ５１０に書き込む。そして、ステップＳ８２２において、サムネール画像生成回路５１１は、代表画像メモリ５１０に記憶されている映像フレームからサムネール画像データを生成する。そして、ステップＳ８２３において、メディア記録回路５０６は、生成されたサムネール画像データを記録メディア５０７に記録して処理を終了する。 Next, in step S820, the media playback circuit 520 reads the AV data recorded this time using the index information tm, and plays back the video frame F (tm) selected as the representative image. In step S821, the representative image generation control circuit 516 writes the video frame F (tm) in the representative image memory 510. In step S822, the thumbnail image generation circuit 511 generates thumbnail image data from the video frames stored in the representative image memory 510. In step S823, the media recording circuit 506 records the generated thumbnail image data on the recording medium 507 and ends the process.

図７は、以上の処理によって代表画像を選択する具体例を示す図である。なお、本実施形態では、顔画像検出期間（第１の時間期間Ｔ１）については、Ｔ１＝Ｔ３／３であるものとして説明する。 FIG. 7 is a diagram showing a specific example of selecting a representative image by the above processing. In the present embodiment, the face image detection period (first time period T1) is described as T1 = T3 / 3.

図７（ａ）に示す例では、Ｔ１＜Ｔ２＜顔画像を検出した時間ｔｍであり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されなかったため、第２の時間期間Ｔ２の位置の映像フレームが代表画像として選択される。一方、図７（ｂ）に示す例では、Ｔ２＜顔画像を検出した時間ｔｍ＜Ｔ１であり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されたため、顔画像を検出した時間ｔｍの位置の映像フレームが代表画像として選択される。 In the example shown in FIG. 7A, T1 < T2 < tm, where tm is the time at which the face image was detected. Because the face image of the specific person was not detected within the face image detection period T1, the video frame at the position of the second time period T2 is selected as the representative image. On the other hand, in the example shown in FIG. 7B, T2 < tm < T1. Because the face image of the specific person was detected within the face image detection period T1, the video frame at the time tm at which the face image was detected is selected as the representative image.

このように図７（ａ）及び図７（ｂ）に示す例では、記録開始から顔画像が検出されるまでの時間は同一であるが、記録開始から記録停止までのクリップ全体の記録時間Ｔ３が異なっている。このため、代表画像として選択される映像フレームが異なっている。図７（ａ）に示す例では、クリップの終わり付近で特定人物が写っており、この場合は、特定人物が主な被写体とは判定されない。一方、図７（ｂ）に示す例では、クリップ先頭から１／３以内に特定人物が写っており、この場合は、特定人物が主な被写体である可能性が高いと判定される。 As described above, in the examples shown in FIGS. 7A and 7B, the time from the start of recording until the face image is detected is the same, but the recording time T3 of the entire clip, from the start of recording to the stop of recording, is different. For this reason, different video frames are selected as the representative image. In the example shown in FIG. 7A, the specific person appears near the end of the clip, and in this case the specific person is not determined to be the main subject. On the other hand, in the example shown in FIG. 7B, the specific person appears within the first 1/3 of the clip, and in this case it is determined that the specific person is likely to be the main subject.
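The decision made in steps S816 to S819, together with the relation T1 = T3 / 3 used for FIG. 7, can be illustrated by the following minimal Python sketch. The function name `choose_index_time` and the numeric values are assumptions made only for illustration; they do not appear in the embodiment.

```python
from typing import Optional


def choose_index_time(face_time: Optional[float], t1: float, t2: float) -> float:
    """Return the time position tm of the representative frame.

    face_time is the time at which the specific person's face was first
    detected, or None if the detection flag (Flg) is still 0.
    """
    # S816: no face detected -> S819, fall back to T2.
    if face_time is None:
        return t2
    # S818: a detection is only valid inside the detection period T1.
    if face_time <= t1:
        return face_time
    # S819: detected too late in the clip, fall back to T2.
    return t2


# Illustrative values only: same detection time, different clip lengths T3.
t2 = 10.0
tm = 20.0
short_clip_t3 = 24.0   # FIG. 7A-like case: T1 = 8 < T2 < tm
long_clip_t3 = 90.0    # FIG. 7B-like case: T2 < tm < T1 = 30
print(choose_index_time(tm, short_clip_t3 / 3, t2))
print(choose_index_time(tm, long_clip_t3 / 3, t2))
```

Because T1 depends on the total recording time T3, the comparison can only be made after recording has stopped, which is why the sketch takes the detection time as an already stored value.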

（第４の実施形態） (Fourth embodiment)
本実施形態は、ビデオカメラ等による撮影時のみならず、撮影後のビデオクリップデータに対して処理を行う例について説明する。図９は、本実施形態における画像記録装置９００の映像記録処理回路の構成例を示すブロック図である。なお、本実施形態では記録済みのビデオデータに対しての処理を説明するため記録時の構成要素は省略しているが、ビデオカメラなどの画像処理装置に適用することも可能であることは言うまでもない。 In the present embodiment, an example will be described in which processing is performed not only at the time of shooting with a video camera or the like but also on video clip data after shooting. FIG. 9 is a block diagram illustrating a configuration example of a video recording processing circuit of the image recording apparatus 900 according to the present embodiment. In this embodiment, the components used for recording are omitted because processing of already recorded video data is described. However, it goes without saying that the present embodiment can also be applied to an image processing apparatus such as a video camera.

図９において、システムコントローラ９０３の制御に従い、メディア再生回路９２０は記録メディア９０７から対象となるビデオクリップのＡＶデータを読み出す。読み出されたＡＶデータからデマルチプレクサ９２１によって映像データを取り出し、動画像復号化回路９２２によって、例えばＭＰＥＧ等の符号化方式で圧縮符号化された動画像データから動画像を再生する。 In FIG. 9, under the control of the system controller 903, the media playback circuit 920 reads the AV data of the target video clip from the recording medium 907. The demultiplexer 921 extracts video data from the read AV data, and the moving image decoding circuit 922 reproduces a moving image from the moving image data, which has been compressed and encoded using an encoding method such as MPEG.

再生された動画像の映像フレームはフレームメモリ９０８に記憶される。前述した第１〜第３の実施形態では、撮影時におけるリアルタイム処理であった。一方、本実施形態ではシステムコントローラ９０３によって記録メディア９０７から読み出す処理自体を制御することによって特徴量検出処理に同期させて映像フレームを再生することが可能である。このため、特徴量検出処理能力が高くない場合でも各コマ毎の処理をすることが可能である。 The video frames of the reproduced moving image are stored in the frame memory 908. The first to third embodiments described above perform real-time processing at the time of shooting. In the present embodiment, on the other hand, the system controller 903 controls the read-out from the recording medium 907 itself, so video frames can be reproduced in synchronization with the feature amount detection process. For this reason, every frame can be processed even if the feature amount detection processing capability is not high.

特徴量検出回路９１２は、フレームメモリ９０８に記憶された映像フレームに対して画像特徴量の検出を行う。本実施形態では、人物の顔画像を判定するための画像特徴量を用いた例について説明する。特徴量記憶メモリ（記憶部）９１４には、特定人物の認識に必要な特徴量が予め記憶されている。この特徴量を記憶させる手段については、例えば、第１〜第３の実施形態と同様な手順で行う。 The feature amount detection circuit 912 detects an image feature amount for the video frame stored in the frame memory 908. In the present embodiment, an example using an image feature amount for determining a human face image will be described. In the feature amount storage memory (storage unit) 914, feature amounts necessary for recognition of a specific person are stored in advance. These feature amounts are stored by, for example, the same procedure as in the first to third embodiments.

特徴量比較回路９１５は、特徴量検出回路９１２から出力される画像特徴量と、特徴量記憶メモリ９１４に記憶されている画像特徴量とを比較することによって現在再生中の映像フレームに特定人物の顔画像が含まれているか否かを判定する。そして、特徴量比較回路９１５及び特徴量検出回路９１２が、特徴量検出手段として機能する。再生時間計数回路９１７は、システムコントローラ９０３から出力される再生開始情報に基づき、再生開始から顔画像判定を行った映像フレームまでの経過時間を計数する。 The feature amount comparison circuit 915 compares the image feature amount output from the feature amount detection circuit 912 with the image feature amount stored in the feature amount storage memory 914 to determine whether or not the video frame currently being reproduced contains the face image of the specific person. The feature amount comparison circuit 915 and the feature amount detection circuit 912 together function as feature amount detection means. Based on the reproduction start information output from the system controller 903, the reproduction time counting circuit 917 counts the elapsed time from the start of reproduction to the video frame for which the face image determination has been performed.

代表画像生成制御回路９１６は、特徴量比較回路９１５による顔画像の特徴量比較結果と、再生時間計数回路９１７において計数された再生開始からの経過時間に応じて、代表画像とすべき映像フレームを決定する。そして、第１のスイッチ９０９を制御することによって、決定した時点の映像フレームを代表画像メモリ９１０に記憶する。 The representative image generation control circuit 916 determines the video frame to be used as the representative image according to the feature amount comparison result of the face image by the feature amount comparison circuit 915 and the elapsed time from the start of reproduction counted by the reproduction time counting circuit 917. Then, by controlling the first switch 909, it stores the video frame at the determined point in time in the representative image memory 910.

サムネール画像生成回路９１１は、代表画像メモリ９１０に記憶された映像フレームを縮小その他必要な変換処理を行ってサムネール画像データを生成する。この際、特定人物の顔画像を含む映像フレームが代表画像フレームとして選択された場合は、予め記憶させておいた特定人物の名前等の付加情報をサムネール画像データに付加することも可能である。そして、生成されたサムネール画像データをメディア記録回路９０６によって記録メディア９０７に記録する。 The thumbnail image generation circuit 911 generates thumbnail image data by reducing the video frames stored in the representative image memory 910 and performing other necessary conversion processing. At this time, when a video frame including a face image of a specific person is selected as a representative image frame, additional information such as the name of the specific person stored in advance can be added to the thumbnail image data. The generated thumbnail image data is recorded on the recording medium 907 by the media recording circuit 906.
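The reduction and the optional attachment of additional information described above can be pictured with the following minimal Python sketch. The embodiment does not specify the reduction method or the data layout, so the simple pixel subsampling, the `Thumbnail` type, the `make_thumbnail` function, and the example name are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

Image = List[List[int]]  # rows of grayscale pixel values (assumed layout)


@dataclass
class Thumbnail:
    pixels: Image
    person_name: Optional[str] = None  # additional information, if any


def make_thumbnail(frame: Image, step: int,
                   person_name: Optional[str] = None) -> Thumbnail:
    """Reduce a frame by keeping every `step`-th pixel in both directions."""
    if step < 1:
        raise ValueError("step must be >= 1")
    reduced = [row[::step] for row in frame[::step]]
    return Thumbnail(reduced, person_name)


# A 4x4 test frame reduced to 2x2, labelled with a hypothetical name.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
thumb = make_thumbnail(frame, 2, person_name="Alice")
print(thumb.pixels, thumb.person_name)
```

A real implementation would more likely filter before subsampling to avoid aliasing; the subsampling here is only meant to show where the additional information travels with the reduced image.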

図１０は、本実施形態において、顔画像検出結果と記録開始からの経過時間とによって代表画像を決定し、サムネール画像データを作成する処理手順の一例を示すフローチャートである。 FIG. 10 is a flowchart illustrating an example of a processing procedure for determining a representative image based on a face image detection result and an elapsed time from the start of recording and creating thumbnail image data in the present embodiment.
まず、ステップＳ１００１において、処理を開始する。次に、ステップＳ１００２において、代表画像生成制御回路９１６は、顔画像検出状態を保持する顔画像検出フラグ（Flg）を未検出状態を表す「０」にリセットする。また、再生時間計数回路９１７で計数される経過時間ｔを「０」にリセットする。 First, in step S1001, processing is started. Next, in step S1002, the representative image generation control circuit 916 resets the face image detection flag (Flg) holding the face image detection state to “0” indicating the undetected state. Further, the elapsed time t counted by the reproduction time counting circuit 917 is reset to “0”.

ステップＳ１００３において、システムコントローラ９０３の制御によって映像フレームの再生を開始する。そして、ステップＳ１００４において、代表画像生成制御回路９１６は、顔画像検出処理を行うための映像フレームＦ（ｔ）をフレームメモリ９０８に書き込む。 In step S1003, the playback of the video frame is started under the control of the system controller 903. In step S1004, the representative image generation control circuit 916 writes the video frame F (t) for performing the face image detection process in the frame memory 908.

次に、ステップＳ１００５において、代表画像生成制御回路９１６は、経過時間ｔと顔画像検出の有効期間である第１の時間期間Ｔ１とを比較する。この比較の結果、経過時間ｔが第１の時間期間Ｔ１以内である場合は、ステップＳ１００６以降の顔画像検出処理に進む。一方、ステップＳ１００５の比較の結果、経過時間ｔが第１の時間期間Ｔ１を越えている場合は、顔画像検出処理をスキップして、ステップＳ１０１２に進む。 Next, in step S1005, the representative image generation control circuit 916 compares the elapsed time t with a first time period T1 that is an effective period of face image detection. As a result of the comparison, when the elapsed time t is within the first time period T1, the process proceeds to the face image detection process after step S1006. On the other hand, as a result of the comparison in step S1005, when the elapsed time t exceeds the first time period T1, the face image detection process is skipped and the process proceeds to step S1012.

次に、ステップＳ１００６において、代表画像生成制御回路９１６は、顔画像検出フラグ（Flg）をチェックすることにより、顔画像検出が既になされているかどうかを判定する。この判定の結果、顔画像検出フラグ（Flg）が、検出済みを表す「１」にセットされている場合は、以下の検出処理をスキップしてステップＳ１０１２に進む。一方、ステップＳ１００６の判定の結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ１００７の顔画像特徴量検出処理に進む。 Next, in step S1006, the representative image generation control circuit 916 determines whether or not face image detection has already been performed by checking the face image detection flag (Flg). As a result of this determination, if the face image detection flag (Flg) is set to “1” indicating detection completion, the following detection processing is skipped and the process proceeds to step S1012. On the other hand, as a result of the determination in step S1006, if the face image detection flag (Flg) is “0” and no face image is detected, the process proceeds to the face image feature amount detection process in step S1007.

次に、ステップＳ１００７において、特徴量検出回路９１２は、既に記憶されている特定人物の顔画像を判定するための顔画像特徴量Ｄ（Ｆ（ｔ））を映像フレームＦ（ｔ）から検出して、特徴量比較回路９１５に特徴量Ｐを出力する。そして、ステップＳ１００８において、特徴量比較回路９１５は、特徴量記憶メモリ９１４に記憶された特定人物の顔画像特徴量Ｍと特徴量Ｐとを所定の関数Ｃ（Ｐ，Ｍ）を用いて比較し、比較結果Ｋを算出する。 Next, in step S1007, the feature amount detection circuit 912 detects, from the video frame F (t), the face image feature amount D (F (t)) used to determine the face image of the specific person that has already been stored, and outputs it as the feature amount P to the feature amount comparison circuit 915. In step S1008, the feature amount comparison circuit 915 compares the face image feature amount M of the specific person stored in the feature amount storage memory 914 with the feature amount P using a predetermined function C (P, M), and calculates the comparison result K.

次に、ステップＳ１００９において、特徴量比較回路９１５は、比較結果Ｋを所定の閾値Ｋｔｈと比較することによって、映像フレームＦ（ｔ）内に特定人物の顔画像が検出されたか否かを判定する。この判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ以上であり、顔画像が検出された場合は、ステップＳ１０１０に進む。そして、ステップＳ１０１０において、代表画像生成制御回路９１６は、映像フレームＦ（ｔ）を代表画像メモリ９１０に書き込む。そして、ステップＳ１０１１において、代表画像生成制御回路９１６は、顔画像検出フラグ（Flg）を検出済みを示す「１」にセットする。 Next, in step S1009, the feature amount comparison circuit 915 determines whether or not a face image of a specific person is detected in the video frame F (t) by comparing the comparison result K with a predetermined threshold value Kth. . As a result of the determination, if the comparison result K is equal to or greater than the predetermined threshold value Kth and a face image is detected, the process proceeds to step S1010. In step S1010, the representative image generation control circuit 916 writes the video frame F (t) in the representative image memory 910. In step S1011, the representative image generation control circuit 916 sets the face image detection flag (Flg) to “1” indicating detection.

一方、ステップＳ１００９の判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ未満であり、顔画像が検出されなかった場合は、以下のステップＳ１０１０、Ｓ１０１１の処理をスキップしてステップＳ１０１２に進む。このように、経過時間ｔが第１の時間期間Ｔ１を越えるまで顔画像検出を実行し、検出された先頭の顔画像が代表画像メモリ９１０に記憶される。 On the other hand, if the determination in step S1009 shows that the comparison result K is less than the predetermined threshold value Kth and no face image has been detected, steps S1010 and S1011 are skipped and the process proceeds to step S1012. In this manner, face image detection is executed until the elapsed time t exceeds the first time period T1, and the first detected face image is stored in the representative image memory 910.
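The comparison in steps S1008 and S1009 can be sketched as follows. The embodiment only states that a predetermined function C(P, M) yields a result K that is compared with a threshold Kth; the use of cosine similarity for C, the vector representation of the feature amounts, and the value 0.9 for Kth are assumptions chosen purely for illustration.

```python
import math
from typing import Sequence


def compare_features(p: Sequence[float], m: Sequence[float]) -> float:
    """Assumed stand-in for C(P, M): cosine similarity of two feature vectors."""
    dot = sum(a * b for a, b in zip(p, m))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in m))
    return dot / norm if norm else 0.0


def is_target_face(p: Sequence[float], m: Sequence[float],
                   k_th: float = 0.9) -> bool:
    """S1009: the face counts as detected when K >= Kth."""
    k = compare_features(p, m)
    return k >= k_th


stored_m = [1.0, 0.0]          # feature amount M of the registered person
print(is_target_face([1.0, 0.0], stored_m))   # identical direction
print(is_target_face([0.0, 1.0], stored_m))   # orthogonal features
```

Any similarity measure for which a larger value means a closer match would fit the K >= Kth test in the same way.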

次に、ステップＳ１０１２において、代表画像生成制御回路９１６は、顔画像が第１の時間期間Ｔ１内に検出されなかった場合の代表画像フレーム位置である第２の時間期間Ｔ２と経過時間ｔとを比較する。この比較の結果、経過時間ｔが第２の時間期間Ｔ２を越えている場合は、以下のステップＳ１０１３、Ｓ１０１４の処理をスキップしてステップＳ１０１５に進む。 Next, in step S1012, the representative image generation control circuit 916 calculates the second time period T2 that is the representative image frame position and the elapsed time t when the face image is not detected within the first time period T1. Compare. As a result of this comparison, when the elapsed time t exceeds the second time period T2, the processing of the following steps S1013 and S1014 is skipped and the process proceeds to step S1015.

一方、ステップＳ１０１２の比較の結果、経過時間ｔが第２の時間期間Ｔ２以内である場合は、ステップＳ１０１３において、代表画像生成制御回路９１６は、顔画像検出フラグ（Flg）をチェックする。このチェックの結果、顔画像検出フラグ（Flg）が検出済みを表す「１」にセットされている場合は、以下のステップＳ１０１４の処理をスキップしてステップＳ１０１５に進む。 On the other hand, when the elapsed time t is within the second time period T2 as a result of the comparison in step S1012, the representative image generation control circuit 916 checks the face image detection flag (Flg) in step S1013. As a result of this check, when the face image detection flag (Flg) is set to “1” indicating detection, the processing of the following step S1014 is skipped and the process proceeds to step S1015.

一方、ステップＳ１０１３のチェックの結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ１０１４に進み、代表画像生成制御回路９１６は、現在の映像フレームＦ（ｔ）を代表画像メモリ９１０に書き込む。このように、顔画像が検出されていない場合は、経過時間ｔが第２の時間期間Ｔ２を越えるまで代表画像メモリ９１０に記憶される内容は最新の映像フレームＦ（ｔ）によって更新される。 On the other hand, if the face image detection flag (Flg) is “0” as a result of the check in step S1013 and no face image is detected, the process proceeds to step S1014, and the representative image generation control circuit 916 displays the current video frame. F (t) is written into the representative image memory 910. As described above, when the face image is not detected, the content stored in the representative image memory 910 is updated with the latest video frame F (t) until the elapsed time t exceeds the second time period T2.

次に、ステップＳ１０１５において、ユーザの操作等によってシステムコントローラ９０３から記録停止が指示されたか否かを判定する。この判定の結果、記録停止が指示されていない場合は、ステップＳ１０１６に進み、処理を継続する。そして、ステップＳ１０１６において、再生時間計数回路９１７は、経過時間ｔを所定の検出単位でインクリメントする。前述したように、顔画像検出処理能力が高くかつ要求される検出間隔が短い場合は毎フレームごとにインクリメントを行い、検出間隔を長く取る場合は所定のインターバル処理となるようにインクリメントする。そして、インクリメントした後はステップＳ１００４に戻り処理を繰り返す。 Next, in step S1015, it is determined whether or not a recording stop is instructed from the system controller 903 by a user operation or the like. As a result of the determination, if recording stop is not instructed, the process proceeds to step S1016 and the process is continued. In step S1016, the reproduction time counting circuit 917 increments the elapsed time t by a predetermined detection unit. As described above, when the face image detection processing capability is high and the required detection interval is short, the increment is performed every frame, and when the detection interval is long, the increment is performed so as to be a predetermined interval process. After the increment, the process returns to step S1004 to repeat the process.

一方、ステップＳ１０１５の判定の結果、記録停止が指示された場合は、ステップＳ１０１７に進み、サムネール画像生成回路９１１は、代表画像メモリ９１０に記憶されている映像フレームからサムネール画像データを生成する。そして、ステップＳ１０１８において、メディア記録回路９０６は、生成されたサムネール画像データを記録メディア９０７に記録して処理を終了する。 On the other hand, if recording stop is instructed as a result of the determination in step S1015, the process advances to step S1017, and the thumbnail image generation circuit 911 generates thumbnail image data from the video frames stored in the representative image memory 910. In step S1018, the media recording circuit 906 records the generated thumbnail image data on the recording medium 907, and ends the process.
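The loop of FIG. 10 (steps S1002 to S1017) can be summarized by the following Python sketch. It is a simplified software rendering of the flowchart under stated assumptions, not the circuit itself: frames are supplied as `(t, frame)` pairs already sampled at the detection interval, `is_target_face` stands in for steps S1007 to S1009, and the function name `select_representative` is hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def select_representative(
    frames: Iterable[Tuple[float, Frame]],
    t1: float,
    t2: float,
    is_target_face: Callable[[Frame], bool],
) -> Optional[Frame]:
    flg = False                 # S1002: face image detection flag (Flg)
    representative = None       # contents of the representative image memory
    for t, frame in frames:     # S1004 / S1016: next frame at time t
        # S1005, S1006: detect only inside T1 and only until the first hit.
        if t <= t1 and not flg:
            if is_target_face(frame):      # S1007 to S1009
                representative = frame     # S1010
                flg = True                 # S1011
        # S1012, S1013: while no face is found, track the latest frame up to T2.
        if t <= t2 and not flg:
            representative = frame         # S1014
    return representative       # handed to thumbnail generation (S1017)


frames = [(float(i), f"f{i}") for i in range(5)]
print(select_representative(frames, 3.0, 1.0, lambda f: f == "f2"))
print(select_representative(frames, 3.0, 1.0, lambda f: False))
```

With a face in `f2` the function returns that frame; with no face it returns the last frame at or before T2, which matches the fallback described for step S1014.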

本実施形態は、記録済みのビデオデータに対してサムネール画像データを生成するものであり、記録時に処理を行う場合と比べてリアルタイム性が要求されないため、ハードウェアコストを抑えることが可能である。 In the present embodiment, thumbnail image data is generated for recorded video data, and real-time performance is not required as compared with the case where processing is performed at the time of recording, so that hardware costs can be suppressed.

（第５の実施形態） (Fifth embodiment)
図１１は、本実施形態における画像処理装置１１００の映像記録処理回路の構成例を示すブロック図である。 FIG. 11 is a block diagram illustrating a configuration example of a video recording processing circuit of the image processing apparatus 1100 according to the present embodiment.
第１の端子１１０１には、レンズ等の光学系、ＣＣＤ等のセンサー、カメラ信号処理回路からなる撮像手段から映像信号が入力される。入力された映像信号は、動画像符号化回路１１０４において、例えばＭＰＥＧ２などの符号化方式を用いて圧縮符号化される。マルチプレクサ１１０５は、圧縮符号化された動画像データに音声データ、サブコードデータ等を多重化する。なお、第１〜第３の実施形態と同様に、これらの映像記録処理以外のビデオカメラ構成要素については説明を省略する。 A video signal is input to the first terminal 1101 from an imaging unit including an optical system such as a lens, a sensor such as a CCD, and a camera signal processing circuit. The input video signal is compressed and encoded by a moving image encoding circuit 1104 using an encoding method such as MPEG2. The multiplexer 1105 multiplexes audio data, subcode data, and the like on the compressed and encoded moving image data. As in the first to third embodiments, description of video camera components other than those used for video recording processing is omitted.

マルチプレクサ１１０５によって多重化されたＡＶデータは、メディア記録回路１１０６によって記録メディア１１０７に記録される。システムコントローラ１１０３は、第２の端子１１０２から入力される記録開始指示によって各ブロックを制御し、ＡＶデータの記録をコントロールする。 The AV data multiplexed by the multiplexer 1105 is recorded on the recording medium 1107 by the media recording circuit 1106. The system controller 1103 controls each block according to a recording start instruction input from the second terminal 1102, and controls recording of AV data.

一方、第１の端子１１０１を介して入力された映像信号は、代表画像生成制御回路１１１６が第１のスイッチ１１０９を制御することにより、後述の顔画像検出処理のサイクルごとにフレームメモリ１１０８に記憶される。なお、このサイクルは顔画像検出処理能力が高ければ毎フレームごとでもよく、処理能力や必要な検出間隔に応じて間歇的であってもよい。 On the other hand, the video signal input via the first terminal 1101 is stored in the frame memory 1108 for each cycle of face image detection processing to be described later by the representative image generation control circuit 1116 controlling the first switch 1109. Is done. Note that this cycle may be performed every frame if the face image detection processing capability is high, or may be intermittent depending on the processing capability and a necessary detection interval.

特徴量検出回路１１１２は、フレームメモリ１１０８に記憶された映像フレーム（映像信号）に対して画像特徴量の検出を行う。本実施形態では、人物の顔画像を判定するための画像特徴量を用いた例について説明する。 A feature amount detection circuit 1112 detects an image feature amount for a video frame (video signal) stored in the frame memory 1108. In the present embodiment, an example using an image feature amount for determining a human face image will be described.

特徴量記憶メモリ（記憶部）１１１４には、特定人物の認識に必要な特徴量が予め記憶されている。例えば、ビデオカメラの記録前に家族などの特定人物を被写体として撮像し、その際に、代表画像生成制御回路１１１６が特徴量記憶手段として機能し、第２のスイッチ１１１３を書き込み側に制御する。これにより、特徴量検出回路１１１２から出力される画像特徴量を特徴量記憶メモリ１１１４に記憶することができる。これらの制御は、システムコントローラ１１０３から送られる指示に基づいて代表画像生成制御回路１１１６が各部を制御することによって行う。 In the feature amount storage memory (storage unit) 1114, feature amounts necessary for recognition of a specific person are stored in advance. For example, a specific person such as a family is imaged as a subject before recording by the video camera. At that time, the representative image generation control circuit 1116 functions as a feature amount storage unit, and controls the second switch 1113 to the writing side. As a result, the image feature amount output from the feature amount detection circuit 1112 can be stored in the feature amount storage memory 1114. These controls are performed by the representative image generation control circuit 1116 controlling each unit based on an instruction sent from the system controller 1103.

特徴量比較回路１１１５は、特徴量検出回路１１１２から出力される画像特徴量と、特徴量記憶メモリ１１１４に記憶されている画像特徴量とを比較することによって現在撮影中の映像フレームに特定人物の顔画像が含まれているか否かを判定する。そして、特徴量比較回路１１１５及び特徴量検出回路１１１２が、特徴量検出手段として機能する。記録時間計数回路１１１７は、システムコントローラ１１０３から出力される記録開始情報に基づき、記録開始から顔画像判定を行った映像フレームまでの経過時間を計数する。 The feature amount comparison circuit 1115 compares the image feature amount output from the feature amount detection circuit 1112 with the image feature amount stored in the feature amount storage memory 1114, so that a specific person is added to the currently captured video frame. It is determined whether or not a face image is included. The feature quantity comparison circuit 1115 and the feature quantity detection circuit 1112 function as feature quantity detection means. Based on the recording start information output from the system controller 1103, the recording time counting circuit 1117 counts the elapsed time from the recording start to the video frame for which the face image determination is performed.

一方、表情判定回路１１２４は、フレームメモリ１１０８に記憶された映像フレームに対して人物の表情に関する表情特徴量の検出と、その表情特徴量による表情の判定を行う。表情特徴量による表情の判定によって、「まばたき」、「笑い」、「泣き」、「怒り」などの特定表情の識別が可能であることが知られているが、本実施形態では人物のまばたき検知を用いた例について説明する。特定人物が写った映像フレームであっても、特定人物がまばたきしている映像フレームは、代表画像として望ましくない。そこで、「まばたき」を検出することによって代表画像を選択する有効性を高める。 On the other hand, the facial expression determination circuit 1124 detects a facial expression feature amount related to a person's facial expression in the video frame stored in the frame memory 1108 and determines the facial expression based on that feature amount. It is known that specific facial expressions such as “blink”, “laughter”, “crying”, and “anger” can be identified by determining facial expressions from facial expression feature amounts. In this embodiment, an example using blink detection will be described. Even if a video frame shows the specific person, a frame in which that person is blinking is not desirable as a representative image. Therefore, detecting “blink” makes the selection of the representative image more effective.

代表画像生成制御回路１１１６は、特徴量比較回路１１１５による顔画像の特徴量比較結果と、記録時間計数回路１１１７において計数された記録開始からの経過時間と、表情判定回路１１２４におけるまばたき検出結果とに応じて、代表画像とすべき映像フレームを決定する。そして、第３のスイッチ１１１８を制御することによって、代表画像となるべき映像フレームの画像を代表画像メモリ１１１０に記憶する。ここで行われる特徴量比較結果と、記録開始からの経過時間と、まばたき検出結果とに基づいた代表画像決定方法の詳細については、フローチャートを参照しながら後述する。 The representative image generation control circuit 1116 uses the facial image feature amount comparison result by the feature amount comparison circuit 1115, the elapsed time from the recording start counted by the recording time counting circuit 1117, and the blink detection result by the facial expression determination circuit 1124. In response, a video frame to be a representative image is determined. Then, by controlling the third switch 1118, the image of the video frame to be the representative image is stored in the representative image memory 1110. Details of the representative image determination method based on the feature amount comparison result, the elapsed time from the start of recording, and the blink detection result will be described later with reference to a flowchart.

サムネール画像生成回路１１１１は、代表画像メモリ１１１０に記憶された代表画像を縮小その他必要な変換処理を行ってサムネール画像データを生成する。この際、特定人物の顔画像を含む映像フレームが代表画像フレームとして選択された場合は、予め記憶させておいた特定人物の名前等の付加情報をサムネール画像データに付加することも可能である。そして、生成されたサムネール画像データをメディア記録回路１１０６によって記録メディア１１０７に記録する。 The thumbnail image generation circuit 1111 generates thumbnail image data by reducing the representative image stored in the representative image memory 1110 and performing other necessary conversion processing. At this time, when a video frame including a face image of a specific person is selected as a representative image frame, additional information such as the name of the specific person stored in advance can be added to the thumbnail image data. The generated thumbnail image data is recorded on the recording medium 1107 by the media recording circuit 1106.

図１２は、本実施形態において、顔画像検出結果と記録開始からの経過時間と表情判定結果とによって代表画像を決定し、サムネール画像データを作成する処理手順の一例を示すフローチャートである。 FIG. 12 is a flowchart illustrating an example of a processing procedure for determining a representative image based on a face image detection result, an elapsed time from the start of recording, and a facial expression determination result, and creating thumbnail image data in the present embodiment.
まず、ステップＳ１２０１において、処理を開始する。次に、代表画像生成制御回路１１１６は、ステップＳ１２０２において、顔画像検出状態を保持する顔画像検出フラグ（Flg）を未検出状態を表す「０」にリセットする。また、記録時間計数回路１１１７で計数される経過時間ｔを「０」にリセットする。さらに、「まばたき」を検知するためのまばたき検知フラグ（Eflg）を初期状態の「０」にリセットする。 First, in step S1201, processing is started. Next, in step S1202, the representative image generation control circuit 1116 resets the face image detection flag (Flg) holding the face image detection state to “0” indicating the undetected state. Further, the elapsed time t counted by the recording time counting circuit 1117 is reset to “0”. Further, the blink detection flag (Eflg) for detecting “blink” is reset to “0” in the initial state.

次に、ステップＳ１２０３において、システムコントローラ１１０３の制御によって映像信号の記録を開始する。そして、ステップＳ１２０４において、代表画像生成制御回路１１１６は、顔画像検出処理のためのフレームＦ（ｔ）をフレームメモリ１１０８に書き込む。 In step S1203, recording of the video signal is started under the control of the system controller 1103. In step S1204, the representative image generation control circuit 1116 writes the frame F (t) for the face image detection process in the frame memory 1108.

次に、ステップＳ１２０５において、代表画像生成制御回路１１１６は、経過時間ｔと顔画像検出の有効期間である第１の時間期間Ｔ１とを比較する。この比較の結果、経過時間ｔが第１の時間期間Ｔ１以内である場合、ステップＳ１２０６以降の顔画像検出処理に進む。一方、ステップＳ１２０５の比較の結果、経過時間ｔが第１の時間期間Ｔ１を越えている場合は、顔画像検出処理をスキップして、ステップＳ１２１２に進む。 In step S1205, the representative image generation control circuit 1116 compares the elapsed time t with a first time period T1 that is an effective period of face image detection. As a result of the comparison, when the elapsed time t is within the first time period T1, the process proceeds to the face image detection process after step S1206. On the other hand, if the elapsed time t exceeds the first time period T1 as a result of the comparison in step S1205, the face image detection process is skipped and the process proceeds to step S1212.

次に、ステップＳ１２０６において、特徴量検出回路１１１２は、既に記憶されている特定人物の顔画像を判定するための顔画像特徴量Ｄ（Ｆ（ｔ））を映像フレームＦ（ｔ）から検出して、特徴量比較回路１１１５に特徴量Ｐを出力する。そして、ステップＳ１２０７において、特徴量比較回路１１１５は、特徴量記憶メモリ１１１４に記憶された特定人物の顔画像特徴量Ｍと特徴量Ｐとを所定の関数Ｃ（Ｐ，Ｍ）を用いて比較し、比較結果Ｋを算出する。 Next, in step S1206, the feature amount detection circuit 1112 detects a face image feature amount D (F (t)) for determining a face image of a specific person that has already been stored from the video frame F (t). The feature amount P is output to the feature amount comparison circuit 1115. In step S1207, the feature amount comparison circuit 1115 compares the face image feature amount M and the feature amount P of the specific person stored in the feature amount storage memory 1114 using a predetermined function C (P, M). The comparison result K is calculated.

次に、ステップＳ１２０８において、特徴量比較回路１１１５は、比較結果Ｋを所定の閾値Ｋｔｈと比較することによって、映像フレームＦ（ｔ）内に特定人物の顔画像が検出されたか否かを判定する。この判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ未満であり顔画像が検出されなかった場合は、以下のステップＳ１２０９〜Ｓ１２１１の処理をスキップしてステップＳ１２１２に進む。一方、ステップＳ１２０８の判定の結果、比較結果Ｋが所定の閾値Ｋｔｈ以上であり顔画像が検出された場合は、ステップＳ１２０９に進む。 Next, in step S1208, the feature amount comparison circuit 1115 determines whether or not a face image of a specific person has been detected in the video frame F (t) by comparing the comparison result K with a predetermined threshold value Kth. . As a result of the determination, if the comparison result K is less than the predetermined threshold value Kth and no face image is detected, the processing of the following steps S1209 to S1211 is skipped and the process proceeds to step S1212. On the other hand, if the result of determination in step S1208 is that the comparison result K is greater than or equal to the predetermined threshold value Kth and a face image is detected, the process proceeds to step S1209.

次に、ステップＳ１２０９において、代表画像生成制御回路１１１６は、顔画像検出フラグ（Flg）をチェックすることにより、顔画像検出が既になされているかどうかを判定する。この判定の結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ１２１０に進む。そして、ステップＳ１２１０において、代表画像生成制御回路１１１６は、映像フレームＦ（ｔ）を代表画像として代表画像メモリ１１１０に書き込む。そして、ステップＳ１２１１において、代表画像生成制御回路１１１６は、顔画像検出フラグ（Flg）を検出済みを示す「１」にセットする。 In step S1209, the representative image generation control circuit 1116 checks the face image detection flag (Flg) to determine whether face image detection has already been performed. As a result of this determination, if the face image detection flag (Flg) is “0” and no face image has been detected, the process proceeds to step S1210. In step S1210, the representative image generation control circuit 1116 writes the video frame F (t) as a representative image in the representative image memory 1110. In step S1211, the representative image generation control circuit 1116 sets the face image detection flag (Flg) to “1” indicating that detection has been performed.

次に、ステップＳ１２１２において、代表画像生成制御回路１１１６は、顔画像が第１の時間期間Ｔ１内に検出されなかった場合の代表画像フレーム位置である第２の時間期間Ｔ２と経過時間ｔとを比較する。この比較の結果、経過時間ｔが第２の時間期間Ｔ２を越えている場合は、以下のステップＳ１２１３、Ｓ１２１４の処理をスキップしてステップＳ１２１５に進む。 Next, in step S1212, the representative image generation control circuit 1116 calculates a second time period T2 that is a representative image frame position and an elapsed time t when the face image is not detected within the first time period T1. Compare. As a result of this comparison, when the elapsed time t exceeds the second time period T2, the processing of the following steps S1213 and S1214 is skipped and the process proceeds to step S1215.

一方、ステップＳ１２１２の比較の結果、経過時間ｔが第２の時間期間Ｔ２以内である場合は、ステップＳ１２１３において、代表画像生成制御回路１１１６は、顔画像検出フラグ（Flg）をチェックする。このチェックの結果、顔画像検出フラグ（Flg）が検出済みを表す「１」にセットされている場合は、以下のステップＳ１２１４の検出処理をスキップしてステップＳ１２１５に進む。 On the other hand, as a result of the comparison in step S1212, if the elapsed time t is within the second time period T2, the representative image generation control circuit 1116 checks the face image detection flag (Flg) in step S1213. As a result of this check, if the face image detection flag (Flg) is set to “1” indicating detection, the detection processing in the following step S1214 is skipped and the process proceeds to step S1215.

一方、ステップＳ１２１３のチェックの結果、顔画像検出フラグ（Flg）が「０」であり、顔画像が未検出である場合は、ステップＳ１２１４に進み、代表画像生成制御回路１１１６は、現在の映像フレームＦ（ｔ）を代表画像メモリ１１１０に書き込む。このように、顔画像が検出されていない場合は、経過時間ｔが第２の時間期間Ｔ２を越えるまで代表画像メモリ１１１０に記憶される内容は最新の映像フレームＦ（ｔ）によって更新される。 On the other hand, if the face image detection flag (Flg) is “0” as a result of the check in step S1213 and no face image is detected, the process advances to step S1214, and the representative image generation control circuit 1116 F (t) is written into the representative image memory 1110. As described above, when the face image is not detected, the content stored in the representative image memory 1110 is updated with the latest video frame F (t) until the elapsed time t exceeds the second time period T2.

一方、ステップＳ１２０９の判定の結果、顔画像検出フラグ（Flg）が検出済みを表す「１」にセットされている場合は、ステップＳ１２１９からステップＳ１２２３までのまばたき検出処理に進む。 On the other hand, if the result of determination in step S1209 is that the face image detection flag (Flg) has been set to “1” indicating detection, the process proceeds to blink detection processing from step S1219 to step S1223.

ステップＳ１２１９において、表情判定回路１１２４は、まばたき検知フラグ（Eflg）をチェックする。このチェックの結果、まばたき検知フラグ（Eflg）が「１」である場合は、「まばたき」をしていない特定人物の顔画像を含む代表画像が既に選択されていることを示している。このため、以下のステップＳ１２２０〜Ｓ１２２３までの処理をスキップしてステップＳ１２１５に進む。 In step S1219, the facial expression determination circuit 1124 checks the blink detection flag (Eflg). As a result of this check, if the blink detection flag (Eflg) is “1”, it indicates that a representative image including a face image of a specific person who has not performed “blink” has already been selected. For this reason, the process from the following steps S1220 to S1223 is skipped and the process proceeds to step S1215.

一方、ステップＳ１２１９のチェックの結果、まばたき検知フラグ（Eflg）が「０」である場合は、「まばたき」をしていない特定人物の顔画像を含む代表画像がまだ選択されていないことを示している。そこで、現映像フレームのまばたき判定を行うため、ステップＳ１２２０に進む。そして、ステップＳ１２２０において、表情判定回路１１２４は、映像フレームＦ（ｔ）の表情特徴量Ｈ（Ｆ（ｔ））を検出する。そして、「まばたき」に関する表情特徴量Ｅを検出結果とする。 On the other hand, if the check in step S1219 shows that the blink detection flag (Eflg) is “0”, this indicates that a representative image including a face image of the specific person who is not blinking has not yet been selected. Therefore, the process proceeds to step S1220 to perform blink determination on the current video frame. In step S1220, the facial expression determination circuit 1124 detects the facial expression feature amount H (F (t)) of the video frame F (t), and the facial expression feature amount E related to “blink” is taken as the detection result.

次に、ステップＳ１２２１において、表情判定回路１１２４は、検出された「まばたき」に関する表情特徴量Ｅを所定の閾値Ｅｔｈと比較することによって、映像フレームＦ（ｔ）の顔画像が「まばたき」をしているか否かを判定する。この判定の結果、表情特徴量Ｅ＜所定の閾値Ｅｔｈであり、「まばたき」をしていると判定した場合は、以下のステップＳ１２２２、Ｓ１２２３の処理をスキップしてステップＳ１２１５に進む。 Next, in step S1221, the facial expression determination circuit 1124 compares the detected blink-related expression feature amount E with a predetermined threshold Eth to determine whether the face in the video frame F(t) is blinking. If E < Eth, the face is determined to be blinking, steps S1222 and S1223 below are skipped, and the process proceeds to step S1215.

一方、ステップＳ１２２１の判定の結果、表情特徴量Ｅ≧所定の閾値Ｅｔｈであり、「まばたき」をしていないと判定した場合は、ステップＳ１２２２に進む。そして、ステップＳ１２２２において、代表画像生成制御回路１１１６は、映像フレームＦ（ｔ）を代表画像として代表画像メモリ１１１０に書き込む。そして、ステップＳ１２２３において、代表画像生成制御回路１１１６は、まばたき検知フラグ（Eflg）を「１」にセットする。 On the other hand, if the determination in step S1221 finds that E ≧ Eth, the face is determined not to be blinking and the process proceeds to step S1222. In step S1222, the representative image generation control circuit 1116 writes the video frame F(t) into the representative image memory 1110 as the representative image. In step S1223, the representative image generation control circuit 1116 then sets the blink detection flag (Eflg) to “1”.
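As a minimal sketch of the blink check in steps S1219 to S1223, the following Python function models the decision for one processed frame. The function name `blink_step` and its return convention are hypothetical and introduced only for illustration; the description itself specifies only the comparison of E against Eth and the handling of Eflg.

```python
def blink_step(e: float, e_th: float, eflg: bool) -> tuple[bool, bool]:
    """Model steps S1219 to S1223 for one frame.

    e    : blink-related expression feature amount E of the current frame
    e_th : predetermined threshold Eth
    eflg : current value of the blink detection flag (Eflg)

    Returns (write_frame, new_eflg). write_frame is True when the current
    frame should be written to the representative image memory.
    (The name and the tuple return value are illustrative assumptions.)
    """
    if eflg:
        # S1219: a non-blinking representative image is already selected,
        # so S1220 to S1223 are skipped.
        return False, True
    if e < e_th:
        # S1221: E < Eth means the face is judged to be blinking; skip
        # S1222 and S1223 and leave the flag unchanged.
        return False, False
    # S1222: write the frame; S1223: set Eflg to 1.
    return True, True
```

Because Eflg is set on the first frame that passes the threshold, later non-blinking frames never overwrite it, which is why the earliest non-blinking frame is the one that remains in memory.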

このように、経過時間ｔが第１の時間期間Ｔ１を越えるまで顔画像検出を実行し、検出された顔画像のうち、「まばたき」をしていない先頭の映像フレームが代表画像メモリ１１１０に記憶される。なお、第１の時間期間Ｔ１までで複数枚の映像フレームで「まばたき」をしていないと判定した場合は、その中の先頭の映像フレームが代表画像メモリ１１１０に記憶される。また、顔画像が検出されなかった場合は、第２の時間期間Ｔ２を越えない範囲で最後に処理した映像フレームが代表画像メモリ１１１０に記憶される。 In this way, face image detection is performed until the elapsed time t exceeds the first time period T1, and among the frames in which the face image was detected, the first video frame in which the person is not blinking is stored in the representative image memory 1110. If the person is determined not to be blinking in several video frames up to the first time period T1, the earliest of those frames is the one stored in the representative image memory 1110. If no face image is detected, the last video frame processed without exceeding the second time period T2 is stored in the representative image memory 1110.

次に、ステップＳ１２１５において、ユーザの操作等によってシステムコントローラ１１０３から記録停止が指示されたか否かを判定する。この判定の結果、記録停止が指示されていない場合は、ステップＳ１２１６に進み、処理を継続する。そして、ステップＳ１２１６において、記録時間計数回路１１１７は、経過時間ｔを所定の検出単位でインクリメントする。前述したように、顔画像検出処理能力が高くかつ要求される検出間隔が短い場合は毎フレームごとにインクリメントを行い、検出間隔を長く取る場合は所定のインターバル処理となるようにインクリメントする。そして、インクリメントした後はステップＳ１２０４に戻り、処理を繰り返す。 Next, in step S1215, it is determined whether or not recording stop is instructed from the system controller 1103 by a user operation or the like. If the result of this determination is that there is no instruction to stop recording, processing proceeds to step S1216 and processing continues. In step S1216, the recording time counting circuit 1117 increments the elapsed time t by a predetermined detection unit. As described above, when the face image detection processing capability is high and the required detection interval is short, the increment is performed every frame, and when the detection interval is long, the increment is performed so as to be a predetermined interval process. After the increment, the process returns to step S1204 to repeat the process.
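The choice of detection unit in step S1216 can be illustrated with a short Python sketch. The helper name `detection_frame_indices` and the frame counts used are assumptions made for illustration; the description states only that t is incremented every frame when detection is fast enough, and at a predetermined interval otherwise.

```python
def detection_frame_indices(total_frames: int, interval_frames: int = 1) -> list[int]:
    """Return the frame indices at which detection is run.

    interval_frames = 1 models per-frame detection; a larger value models
    interval processing, where t is incremented by a longer detection unit.
    (Both parameter names are hypothetical.)
    """
    if interval_frames < 1:
        raise ValueError("interval_frames must be at least 1")
    return list(range(0, total_frames, interval_frames))


def elapsed_time(frame_index: int, fps: float) -> float:
    """Convert a frame index to elapsed time t in seconds since recording start."""
    return frame_index / fps
```

For example, with 10 frames and an interval of 3 frames, detection runs on frames 0, 3, 6 and 9, so only four frames pass through the face and blink checks.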

一方、ステップＳ１２１５の判定の結果、記録停止が指示された場合は、ステップＳ１２１７に進み、サムネール画像生成回路１１１１は、代表画像メモリ１１１０に記憶されている映像フレームからサムネール画像データを生成する。そして、ステップＳ１２１８において、メディア記録回路１１０６は、生成されたサムネール画像データを記録メディア１１０７に記録して処理を終了する。 On the other hand, if the determination in step S1215 finds that recording stop has been instructed, the process advances to step S1217, where the thumbnail image generation circuit 1111 generates thumbnail image data from the video frame stored in the representative image memory 1110. In step S1218, the media recording circuit 1106 records the generated thumbnail image data on the recording medium 1107, and the process ends.
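Step S1217 reduces the stored representative frame to a thumbnail. The description does not state which reduction method the thumbnail image generation circuit 1111 uses, so the following Python sketch simply assumes nearest-neighbor subsampling on a frame represented as a list of pixel rows. The function name `make_thumbnail` and the data layout are hypothetical.

```python
def make_thumbnail(frame: list[list[int]], out_w: int, out_h: int) -> list[list[int]]:
    """Shrink a frame by nearest-neighbor subsampling (an assumed method).

    frame is a non-empty list of equally long rows of pixel values.
    Each output pixel copies the source pixel at the proportionally
    scaled position.
    """
    in_h = len(frame)
    in_w = len(frame[0])
    return [
        [frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

A real device would more likely filter before subsampling to avoid aliasing; the sketch only shows where the reduction sits in the flow, between reading the representative image memory and recording the result in step S1218.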

図１３は、以上の処理によって代表画像を選択する具体例を示す図である。図１３に示す例では、顔画像が検出されなかった場合のディフォルトの代表画像位置（第２の時間期間）Ｔ２＜顔画像検出期間（第１の時間期間）Ｔ１として説明する。 FIG. 13 is a diagram showing specific examples of selecting a representative image by the above processing. In the examples of FIG. 13, it is assumed that the default representative image position used when no face image is detected (second time period T2) is earlier than the end of the face image detection period (first time period T1), that is, T2 < T1.

図１３（ａ）に示す例では、まず、経過時間ｔｅにおいて、「まばたき」をしている特定人物の顔画像が検出されたが、後に、経過時間ｔ（ｔ＜Ｔ１）において、「まばたき」をしていない特定人物の顔画像が検出されている。このため、経過時間ｔの位置の映像フレームが代表画像として選択される。 In the example shown in FIG. 13(a), a face image of the specific person blinking is first detected at elapsed time te, and a face image of the specific person not blinking is detected later, at elapsed time t (t < T1). The video frame at elapsed time t is therefore selected as the representative image.

また、図１３（ｂ）に示す例では、経過時間ｔ（ｔ＜Ｔ１）において、「まばたき」をしている特定人物の顔画像が検出されたが、顔画像検出期間Ｔ１以内に「まばたき」をしていない顔画像が検出されていない。このため、経過時間ｔの位置の映像フレームが代表画像として選択される。 In the example shown in FIG. 13(b), a face image of the specific person blinking is detected at elapsed time t (t < T1), but no face image in which the person is not blinking is detected within the face image detection period T1. The video frame at elapsed time t, in which the person is blinking, is therefore selected as the representative image.

さらに、図１３（ｃ）に示す例では、Ｔ２＜Ｔ１＜経過時間ｔであり、顔画像検出期間Ｔ１内に特定人物の顔画像が検出されなかったため、ディフォルトの代表画像位置Ｔ２の映像フレームが代表画像として選択される。 Further, in the example shown in FIG. 13(c), T2 < T1 < elapsed time t, and no face image of the specific person was detected within the face image detection period T1, so the video frame at the default representative image position T2 is selected as the representative image.
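The selection outcome described above, including the three cases of FIG. 13, can be summarized as a priority rule over already recorded observations. The Python sketch below is a declarative restatement under stated assumptions, not the circuit's frame-by-frame procedure: the names `Observation`, `has_person`, `blink_e` and `select_representative` are hypothetical, the period bounds are treated as inclusive, and the blink check is assumed to apply to the frame in which the person is first recognized as well.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Observation:
    t: float          # elapsed time since recording start
    has_person: bool  # specific person recognized in this frame (assumed input)
    blink_e: float    # blink-related feature amount E (meaningful if has_person)


def select_representative(
    frames: Sequence[Observation], t1: float, t2: float, e_th: float
) -> Optional[Observation]:
    """Pick the representative frame by priority.

    1. The first frame within T1 showing the person with E >= Eth (not blinking).
    2. Otherwise the first frame within T1 showing the person at all.
    3. Otherwise the last processed frame that does not exceed T2.
    frames must be ordered by increasing t.
    """
    recognized = [f for f in frames if f.t <= t1 and f.has_person]
    not_blinking = [f for f in recognized if f.blink_e >= e_th]
    if not_blinking:
        return not_blinking[0]      # FIG. 13(a)
    if recognized:
        return recognized[0]        # FIG. 13(b)
    default = [f for f in frames if f.t <= t2]
    return default[-1] if default else None  # FIG. 13(c)
```

With T2 = 3 and T1 = 10, a recording in which the person blinks at t = 4 and has open eyes at t = 6 yields the frame at t = 6, a recording with only the blinking frame at t = 4 yields that frame, and a recording without the person yields the frame at t = 3.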

以上のように本実施形態によれば、前述したように家庭用ビデオカメラにおいては、主となる被写体を狙ってから撮影を開始する場合が多い。このことに基づき、顔画像検出結果と撮影開始からの経過時間とを利用して、撮影の主たる被写体が検出対象となる人物であるのかまたは別の被写体であるのかを判定し、撮影意図を反映したサムネール画像作成が実現できる。 As described above, with a home video camera, shooting often starts after the camera has been aimed at the main subject. Based on this, the present embodiment uses the face image detection result together with the elapsed time from the start of shooting to determine whether the main subject is the person to be detected or some other subject, so that a thumbnail image reflecting the shooting intention can be created.

また、本実施形態では、映像を記録する時に行うことも可能であり、記録済みのビデオデータから処理を行うことも可能である。特に、後者の場合はリアルタイム処理が必要でないため、ソフトウェアによる処理も可能である。また、特定人物の検出と表情検出とを組み合わせて、「まばたき」をしている等の望ましくない映像フレームを除外して、より最適な代表画像を選択してサムネール画像データを作成することが可能である。 In the present embodiment, this processing can be performed while video is being recorded, or it can be applied to already recorded video data. In the latter case in particular, real-time processing is not required, so the processing can also be done in software. In addition, by combining detection of the specific person with facial expression detection, undesirable video frames such as those in which the person is blinking can be excluded, and a more suitable representative image can be selected for creating the thumbnail image data.

（本発明に係る他の実施形態） (Other embodiments according to the present invention)
前述した本発明の実施形態における画像処理装置を構成する各手段、並びに画像処理方法の各工程は、コンピュータのＲＡＭやＲＯＭなどに記憶されたプログラムが動作することによって実現できる。このプログラム及び前記プログラムを記録（記憶）したコンピュータ読み取り可能な記録媒体（記憶媒体）は本発明に含まれる。 Each means constituting the image processing apparatus and each step of the image processing method in the embodiments of the present invention described above can be realized by running a program stored in the RAM, ROM, or the like of a computer. This program and a computer-readable recording medium (storage medium) on which the program is recorded (stored) are included in the present invention.

また、本発明は、例えば、システム、装置、方法、プログラムもしくは記録媒体（記憶媒体）等としての実施形態も可能であり、具体的には、複数の機器から構成されるシステムに適用してもよいし、また、一つの機器からなる装置に適用してもよい。 In addition, the present invention can be implemented as a system, apparatus, method, program, recording medium (storage medium), or the like, and can be applied to a system including a plurality of devices. It may also be applied to an apparatus consisting of a single device.

なお、本発明は、前述した実施形態の機能を実現するソフトウェアのプログラム（実施形態では図２、６−１，６−２、８−１，８−２、１０、１２に示すフローチャートに対応したプログラム）をシステムまたは装置に直接、または遠隔から供給する場合も含む。そして、そのシステムまたは装置のコンピュータが前記供給されたプログラムコードを読み出して実行することによっても達成される場合を含む。 Note that the present invention also covers the case where a software program that implements the functions of the above embodiments (in the embodiments, a program corresponding to the flowcharts shown in FIGS. 2, 6-1, 6-2, 8-1, 8-2, 10, and 12) is supplied directly or remotely to a system or apparatus, and the invention is achieved by a computer of that system or apparatus reading and executing the supplied program code.

したがって、本発明の機能処理をコンピュータで実現するために、前記コンピュータにインストールされるプログラムコード自体も本発明を実現するものである。つまり、本発明は、本発明の機能処理を実現するためのコンピュータプログラム自体も含まれる。 Accordingly, since the functions of the present invention are implemented by computer, the program code installed in the computer also implements the present invention. In other words, the present invention includes a computer program itself for realizing the functional processing of the present invention.

その場合、プログラムの機能を有していれば、オブジェクトコード、インタプリタにより実行されるプログラム、ＯＳに供給するスクリプトデータ等の形態であってもよい。 In that case, as long as it has the function of a program, it may be in the form of object code, a program executed by an interpreter, script data supplied to the OS, and the like.

プログラムを供給するための記録媒体（記憶媒体）としては、例えば、フレキシブルディスク、ハードディスク、光ディスク、光磁気ディスクなどがある。さらに、ＭＯ、ＣＤ−ＲＯＭ、ＣＤ−Ｒ、ＣＤ−ＲＷ、磁気テープ、不揮発性のメモリカード、ＲＯＭ、ＤＶＤ（ＤＶＤ−ＲＯＭ、ＤＶＤ−Ｒ）などもある。 Examples of the recording medium (storage medium) for supplying the program include a flexible disk, a hard disk, an optical disk, and a magneto-optical disk. Further, there are MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, DVD (DVD-ROM, DVD-R) and the like.

その他、プログラムの供給方法としては、クライアントコンピュータのブラウザを用いてインターネットのホームページに接続する方法がある。そして、前記ホームページから本発明のコンピュータプログラムそのもの、もしくは圧縮され自動インストール機能を含むファイルをハードディスク等の記録媒体（記憶媒体）にダウンロードすることによっても供給できる。 As another method of supplying the program, a browser on a client computer can be used to connect to a homepage on the Internet. The program can then be supplied by downloading, from the homepage to a recording medium (storage medium) such as a hard disk, either the computer program of the present invention itself or a compressed file that includes an automatic installation function.

また、本発明のプログラムを構成するプログラムコードを複数のファイルに分割し、それぞれのファイルを異なるホームページからダウンロードすることによっても実現可能である。つまり、本発明の機能処理をコンピュータで実現するためのプログラムファイルを複数のユーザに対してダウンロードさせるＷＷＷサーバも、本発明に含まれるものである。 It can also be realized by dividing the program code constituting the program of the present invention into a plurality of files and downloading each file from a different homepage. That is, a WWW server that allows a plurality of users to download a program file for realizing the functional processing of the present invention on a computer is also included in the present invention.

また、その他の方法として、本発明のプログラムを暗号化してＣＤ−ＲＯＭ等の記録媒体に格納してユーザに配布し、所定の条件をクリアしたユーザに対し、インターネットを介してホームページから暗号化を解く鍵情報をダウンロードさせる。そして、その鍵情報を使用することにより暗号化されたプログラムを実行してコンピュータにインストールさせて実現することも可能である。 As another method, the program of the present invention may be encrypted, stored on a recording medium such as a CD-ROM, and distributed to users, and users who satisfy predetermined conditions are allowed to download key information for decryption from a homepage via the Internet. By using the key information, the encrypted program can then be executed and installed on a computer.

また、コンピュータが、読み出したプログラムを実行することによって、前述した実施形態の機能が実現される。さらに、そのプログラムの指示に基づき、コンピュータ上で稼動しているＯＳなどが、実際の処理の一部または全部を行い、その処理によっても前述した実施形態の機能が実現され得る。 Further, the functions of the above-described embodiments are realized by the computer executing the read program. Furthermore, based on the instructions of the program, an OS or the like running on the computer performs part or all of the actual processing, and the functions of the above-described embodiments can be realized by the processing.

さらに、その他の方法として、まず記録媒体（記憶媒体）から読み出されたプログラムが、コンピュータに挿入された機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに書き込まれる。そして、そのプログラムの指示に基づき、その機能拡張ボードや機能拡張ユニットに備わるＣＰＵなどが実際の処理の一部または全部を行い、その処理によっても前述した実施形態の機能が実現される。 Furthermore, as another method, a program read from a recording medium (storage medium) is first written in a memory provided in a function expansion board inserted into the computer or a function expansion unit connected to the computer. Then, based on the instructions of the program, the CPU or the like provided in the function expansion board or function expansion unit performs part or all of the actual processing, and the functions of the above-described embodiments are also realized by the processing.

FIG. 1: 本発明の第１の実施形態における画像処理装置の映像処理回路の構成例を示すブロック図である。 A block diagram showing a configuration example of the video processing circuit of the image processing apparatus in the first embodiment of the present invention.
FIG. 2: 本発明の第１の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the first embodiment of the present invention.
FIG. 3: 本発明の第１の実施形態における代表画像を選択する具体例を示す図である。 A diagram showing a specific example of selecting a representative image in the first embodiment of the present invention.
FIG. 4: 本発明の第１の実施形態における代表画像を選択する他の具体例を示す図である。 A diagram showing another specific example of selecting a representative image in the first embodiment of the present invention.
FIG. 5: 本発明の第２の実施形態における画像処理装置の映像処理回路の構成例を示すブロック図である。 A block diagram showing a configuration example of the video processing circuit of the image processing apparatus in the second embodiment of the present invention.
FIG. 6-1: 本発明の第２の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the second embodiment of the present invention.
FIG. 6-2: 本発明の第２の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the second embodiment of the present invention.
FIG. 7: 本発明の第３の実施形態における代表画像を選択する具体例を示す図である。 A diagram showing a specific example of selecting a representative image in the third embodiment of the present invention.
FIG. 8-1: 本発明の第３の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the third embodiment of the present invention.
FIG. 8-2: 本発明の第３の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the third embodiment of the present invention.
FIG. 9: 本発明の第４の実施形態における画像処理装置の映像処理回路の構成例を示すブロック図である。 A block diagram showing a configuration example of the video processing circuit of the image processing apparatus in the fourth embodiment of the present invention.
FIG. 10: 本発明の第４の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the fourth embodiment of the present invention.
FIG. 11: 本発明の第５の実施形態における画像処理装置の映像処理回路の構成例を示すブロック図である。 A block diagram showing a configuration example of the video processing circuit of the image processing apparatus in the fifth embodiment of the present invention.
FIG. 12: 本発明の第５の実施形態におけるサムネール画像データを作成する処理手順の一例を示すフローチャートである。 A flowchart showing an example of the processing procedure for creating thumbnail image data in the fifth embodiment of the present invention.
FIG. 13: 本発明の第５の実施形態における代表画像を選択する具体例を示す図である。 A diagram showing a specific example of selecting a representative image in the fifth embodiment of the present invention.
FIG. 14: 従来における代表画像を選択する処理手順の一例を示すフローチャートである。 A flowchart showing an example of a conventional processing procedure for selecting a representative image.

Explanation of symbols

１００画像処理装置
１０１第１の端子
１０２第２の端子
１０３システムコントローラ
１０４動画像符号化回路
１０５マルチプレクサ
１０６メディア記録回路
１０７記録メディア
１０８フレームメモリ
１０９第１のスイッチ
１１０代表画像メモリ
１１１サムネール画像生成回路
１１２特徴量検出回路
１１３第２のスイッチ
１１４特徴量記憶メモリ
１１５特徴量比較回路
１１６代表画像生成制御回路
１１７記録時間計数回路
１１８第３のスイッチ

100 Image processing apparatus
101 First terminal
102 Second terminal
103 System controller
104 Moving image encoding circuit
105 Multiplexer
106 Media recording circuit
107 Recording medium
108 Frame memory
109 First switch
110 Representative image memory
111 Thumbnail image generation circuit
112 Feature amount detection circuit
113 Second switch
114 Feature amount storage memory
115 Feature amount comparison circuit
116 Representative image generation control circuit
117 Recording time counting circuit
118 Third switch

Claims

An image processing apparatus comprising: feature amount detection means for detecting a feature amount of the face of a person included in input video and recognizing whether the person is a specific person;
feature amount storage means for storing, in a storage unit, the feature amount of the face of the specific person to be recognized by the feature amount detection means;
recording means for recording video on a recording medium; and
thumbnail image generation means for generating, when the specific person is recognized by the feature amount detection means from the recorded video, based on the feature amount of the face of the specific person stored in advance in the storage unit, within a first predetermined period that begins at the recording start of the video recorded by the recording means and is shorter than the period up to the recording end, a thumbnail image as a representative image of the recorded video from a video frame in which the specific person was recognized within the first predetermined period, and for generating, when the specific person is not recognized by the feature amount detection means from the recorded video within the first predetermined period, the thumbnail image from the video frame at the time when a second predetermined period, shorter than the first predetermined period, has elapsed from the recording start.

The image processing apparatus according to claim 1, further comprising period changing means for changing the first predetermined period according to the time from the recording start to the recording end of the recorded video.

The image processing apparatus according to claim 1 or 2, wherein the feature amount detection means stops detecting the feature amount of the face of the person after the first predetermined period has elapsed.

The image processing apparatus according to any one of claims 1 to 3, wherein the thumbnail image generation means adds, to the thumbnail image, additional information about the specific person recognized by the feature amount detection means.

The image processing apparatus according to claim 1, further comprising facial expression determination means for determining the facial expression of a person included in the input video,
wherein, when there are a plurality of video frames in which the specific person is recognized by the feature amount detection means from the video within the first predetermined period, the thumbnail image generation means generates the thumbnail image from a video frame in which the facial expression determined by the facial expression determination means corresponds to a predetermined facial expression.

An image processing method comprising: a feature amount detection step of detecting a feature amount of the face of a person included in input video and recognizing whether the person is a specific person;
a feature amount storage step of storing, in a storage unit, the feature amount of the face of the specific person to be recognized in the feature amount detection step;
a recording step of recording video on a recording medium; and
a thumbnail image generation step of generating, when the specific person is recognized in the feature amount detection step from the recorded video, based on the feature amount of the face of the specific person stored in advance in the storage unit, within a first predetermined period that begins at the recording start of the video recorded in the recording step and is shorter than the period up to the recording end, a thumbnail image as a representative image of the recorded video from a video frame in which the specific person was recognized within the first predetermined period, and of generating, when the specific person is not recognized in the feature amount detection step from the recorded video within the first predetermined period, the thumbnail image from the video frame at the time when a second predetermined period, shorter than the first predetermined period, has elapsed from the recording start.

A program causing a computer to execute: a feature amount detection step of detecting a feature amount of the face of a person included in input video and recognizing whether the person is a specific person;
a feature amount storage step of storing, in a storage unit, the feature amount of the face of the specific person to be recognized in the feature amount detection step;
a recording step of recording video on a recording medium; and
a thumbnail image generation step of generating, when the specific person is recognized in the feature amount detection step from the recorded video, based on the feature amount of the face of the specific person stored in advance in the storage unit, within a first predetermined period that begins at the recording start of the video recorded in the recording step and is shorter than the period up to the recording end, a thumbnail image as a representative image of the recorded video from a video frame in which the specific person was recognized within the first predetermined period, and of generating, when the specific person is not recognized in the feature amount detection step from the recorded video within the first predetermined period, the thumbnail image from the video frame at the time when a second predetermined period, shorter than the first predetermined period, has elapsed from the recording start.

A computer-readable recording medium, wherein the program according to claim 7 is recorded.