JP2004297305A

JP2004297305A - System and program for configuring data base, system and program for retrieving image, and image recorder/reproducer

Info

Publication number: JP2004297305A
Application number: JP2003084906A
Authority: JP
Inventors: Koji Minami; 功治南
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2003-03-26
Filing date: 2003-03-26
Publication date: 2004-10-21
Anticipated expiration: 2023-03-26
Also published as: JP4334898B2

Abstract

PROBLEM TO BE SOLVED: To provide an image recorder/reproducer that automatically acquires scene information from moving image data whose main subject is people, makes it easy to build a database of scene information, readily detects local scene changes, and allows image retrieval (scene retrieval) that follows a person's movement.

SOLUTION: Under the control of a CPU 6, still image data are acquired from the moving image data of a first recording medium 1 reproduced by a first drive 11 and stored in a buffer memory 3. The CPU 6 searches the still image data stored in the buffer memory 3 for a person, using an image processing program or the like held in a memory 4, and extracts a region including the person if one is found. Under the control of the CPU 6, the extracted partial image data are recorded on a second recording medium 2 by a second drive 12 in association with the address, on the first recording medium 1, of the still image data from which the region was extracted.

COPYRIGHT: (C) 2005, JPO & NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、記録媒体に記録された動画像データから特定のシーンの検索を容易に行えるようにする技術にかかり、データベース構築装置、データベース構築プログラム、画像検索装置、画像検索プログラム、及び画像記録再生装置に関する。
【０００２】
【従来の技術】
まず、ここで、本明細書で用いる言葉について定義する。動画像は静止画像の列で構成され、動画像全体の一部を構成する任意の静止画像の列をシーンとする。そして、シーン情報とは、シーンを構成する静止画像の画像データそのものとは区別され、個々のシーンに付与され、シーンを特定し得る情報である。また、動画像の画像データを動画像データ、静止画像の画像データを静止画像データとし、動画像データ、静止画像データと区別する必要がない場合は、単に画像データと称する。
【０００３】
今日、記録媒体への動画像の記録が簡単にできるようになり、記録した動画像の編集を効率よく行い得る技術の開発が望まれている。しかしながら、動画像を再生して、ユーザ自らが画面を見つめながら動画像に含まれている特定の対象物や事象を含んだシーンを探し当てるのは冗長であり、かつ目的のシーンを短時間で探し当てることは困難である。
【０００４】
そこで、従来から、動画像を構成する各シーンに対して、シーンに関する情報（以下、シーン情報）を与え、これをデータベース化しておき、シーン情報を用いて動画像データを検索することが行われている。
【０００５】
シーン情報として、例えば特許文献１には、動画像中での位置（例えば開始・終了フレーム番号、タイム・コード）、シーンの意味内容（例えばキーワード、属性、代表フレーム）、シーン相互の関係（例えば親または子のシーンの識別子）、シーン変化の情報（例えば変化点の動画像中での位置、変化のタイプ、確からしさ）等の情報が挙げられている。
【０００６】
図８に、該文献に記載されている、シーン情報の入力システムの構成図を示す。これにおいて、シーン情報エディタ１１２は、複数のシーンについて、それを代表する静止画像データを代表フレームファイル１２２から取り出し、静止画像を時間軸とともに、かつその時間軸に沿って時間順に、ディスプレイ１１４の画面に表示する。そして、入力装置１１５を通してユーザからの指示を受けると、指示されたところの時間軸の一部に対応する期間について、動画像データをＬＤ１１６から取り出し、ＴＶモニター１１８に表示する。
【０００７】
また、シーン情報エディタ１１２は、それらのシーンに与えられた情報であるシーン情報をシーン情報ファイル１２１から取り出し、ディスプレイ１１４の画面に同時に図形的に表示する。そして、ユーザが入力装置１１５を通じて編集コマンドを入力したときには、それを実行し、シーン情報ファイル１２１を編集する。
【０００８】
図９に、該文献に記載されているシーン情報を構築していくための手順を示す。これは、時間的に連続するフレーム列で構成される動画像を、部分フレーム列であるところのシーンに分割し、シーンへの分割を確認・修正し、シーン情報を入力し、入力した情報を動画像データベースに登録するまでの処理の流れを示している。
【０００９】
シーン変化点を検出するステップａ）、検出されたシーン変化点に基づいてシーン情報を生成し、シーン情報ファイルに格納するステップｂ）、シーン情報をディスプレイに図形的に表示するステップｃ）、入力手段を使ってユーザより指示された編集コマンドを実行して、シーン情報ファイルを編集するステップｄ）、およびシーン情報ファイルのシーン情報をデータベース手段に登録するステップｅ）よりなる。
【００１０】
【特許文献１】
特開平５−３３４３７４号公報（１９９３年１２月１７日）
【００１１】
【発明が解決しようとする課題】
動画像を検索するにおいて、シーン情報のデータベース化は有意な中核技術であるが、上記した文献に記載されている技術は、ユーザ自らがシーン情報を入力してデータベースを構築するものであるため、やはりユーザにとって困難な作業と言わざるを得ない。
【００１２】
そこで、シーン情報を自動で入力できるシステムが求められるが、自動によるシーン情報の完全な情報としての入力は非常に難しいものとなる。それは、シーン情報の内容が、人物、そこに存在する物全ての名称、構成要素それぞれの配置場所、構成要素それぞれの明るさなど非常に多岐にわたり、これら全てが入力対象になるからである。
【００１３】
上記した文献の技術は、シーン情報のデータベースを構築する助けとなる発明ではあるが、シーン情報として、画像データの内容に関する全ての情報を扱う必要があるという点では、その文献の従来技術と同じであり、自動化は難しい。
【００１４】
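Furthermore, the technique of the above document divides a moving image into scenes by detecting scene changes. However, in video with little change it is difficult to judge whether a change has occurred, and when a change occurs only locally (in part of the screen), a very demanding change-detection capability is required once the search for where the change occurred is included.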
【００１５】
このような理由で、上記した文献のシステムで構築されたデータベース（動画像データベース）は、シーン情報の内容が緻密である一方で、どんな種類の画像に対しても有効なデータベースとはなり得ない。
【００１６】
例えば、人物主体の動画像で、スナップ写真的に動画像から静止画像を切り出したい要望が生じたときなど、シーンの変化が局所的であったり、小さかったりした複数のシーンは、シーン情報検索に使うキーワードの選択によっては、同時に検索されてしまうことになる。
【００１７】
本発明は、上記課題に鑑みなされたもので、動画像データを構成する静止画像データをさらに細分化し、静止画像に含まれる人物、或いは人物とその付加情報の組み合わせとしてシーン情報を取得することで、人物画像主体の動画像データに対しては、シーン情報を人物を含む画像情報としてユーザの手を煩わせることなく自動に取得できるようにして、シーン情報のデータベース化を容易にするとともに、局所的なシーン変化なども容易に判別でき、人物の動きに沿った画像検索等も可能にする、データベース構築装置、データベース構築プログラム、画像検索装置、画像検索プログラム、及び画像記録再生装置を提供することを目的としている。
【００１８】
【課題を解決するための手段】
本発明のデータベース構築装置は、上記課題を解決するために、第１のドライブにて再生された第１の記録媒体の動画像データから静止画像データを取得する静止画像データ取得手段と、上記静止画像データ取得手段にて取得された静止画像データを蓄積していく蓄積手段と、該蓄積手段が蓄積した静止画像データに含まれる人物を探索し、人物が含まれている場合は人物を含む所定領域を部分画像データとして抽出する探索抽出手段と、該探索抽出手段にて部分画像データが抽出されると、抽出された部分画像データを抽出元である静止画像データの上記第１の記録媒体上のアドレス情報と対応付けて第２のドライブを用いて第２の記録媒体に記録させる記録手段とを有することを特徴としている。
【００１９】
これによれば、静止画像データ取得手段が、第１のドライブにて再生された第１の記録媒体の動画像データから静止画像データを取得する。取得された静止画像データは、蓄積手段に蓄積されていき、探索抽出手段が、該蓄積手段が蓄積した静止画像データに含まれる人物を探索して、人物が含まれている場合は人物を含む所定領域を部分画像データとして抽出する。探索抽出手段にて抽出された部分画像データは、記録手段にて、抽出された部分画像データを抽出元である静止画像データの上記第１の記録媒体上のアドレス情報と対応付けて第２のドライブを用いて第２の記録媒体に記録されることとなる。
【００２０】
第２の記録媒体に記録される部分画像データには、その人物の行為など、そこには既にこれを抽出した静止画像データ、つまりその静止画像データを構成要素とする動画像のあるシーンを特徴付ける情報が含まれているので、部分画像データを表示させることで、部分画像データ自体がシーン情報となる。また、その部分画像データから複数の人物情報が得られる場合も、その複数人の人物構成などが重要なシーン情報となる。つまり、これらの部分画像データは、そのままでもユーザにとってはシーン変化を検出するために得られる情報と相当のシーン情報となる。
【００２１】
そして、このようなシーン情報は、人物を含む領域を切り出した部分画像データであるので、既存の技術にてユーザの手を煩わせることなく自動にて取得して、データベースを構築することができる。
【００２２】
これにより、人物画像主体の動画像データに適した、シーン情報をユーザの手を煩わせることなく自動に取得可能な、シーン情報のデータベース構築装置を提供することができる。
【００２３】
そして、このようなデータベース構築装置を画像記録再生装置に搭載させることで、局所的なシーン変化なども容易に判別でき、人物の動きに沿った画像検索等も可能な、人物を主体とした動画像データの内容編集等を容易に行うことのできる画像記録再生装置を提供することができる。
【００２４】
また、本発明のデータベース構築装置においては、さらに、上記探索抽出手段は、部分画像データを抽出する際に、当該部分画像データ内に含まれる人物の数を表す数情報を含む付加情報を併せて取得し、上記記録手段は、部分画像データと共に対応する付加情報を、部分画像データに連なるツリー構造で関係付け得るように上記第２の記録媒体に記録することを特徴とすることもできる。
【００２５】
これによれば、部分画像データ内に含まれる人物の数を表す数情報を含む付加情報が取得され、部分画像データと共に対応する付加情報が、部分画像データに連なるツリー構造で関係付け得るように第２の記録媒体に記録されるので、たとえ、人物を含む部分画像データの切り出しが、人物が二人重なっている画像として取得した場合でも、第２の記録媒体に記録されたデータ側では、二人の人物に分離して管理することができる。
【００２６】
また、上記付加情報には、さらに、各人物の色的特徴を表す色情報、及び／又は、各人物の形状的特徴を表す形状情報を含めておくこともできる。
【００２７】
色情報を利用すれば、例えば動画像全体の中で特定の人物が着ている衣装をキーに、特定のシーンを検索することが可能となる。また、別の例で、形状情報を利用すれば、その形状変化を捉えて、特定の人物の行動をキーに、特定のシーンを検索することが可能である。
【００２８】
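Therefore, when partial image data obtained by extracting a region containing a person are used as scene information and moving image data are searched on that basis, the number of items usable as search keys increases, and a database that enables more effective searches can be constructed.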
【００２９】
また、本発明のデータベース構築装置においては、上記探索抽出手段は、静止画像データに含まれる全ての人物が何れかの部分画像データに含まれるように部分画像データの抽出を行うことが好ましい。
【００３０】
これによれば、静止画像データに含まれる全人物が、独立して或いはほかの人物と共に、部分画像データとして取得されるので、シーン情報の内容としてより確度の高い情報となる。
【００３１】
また、本発明のデータベース構築装置においては、上記探索抽出手段は、人物を探索する際の探索領域を適宜変更しながら行うことを特徴とすることができる。
【００３２】
これによれば、人物を探索する際の探索領域を適宜変更しながら行うので、探索時に、人物情報の静止画像データに占める割合が任意に変化する動画像データなどに対しても、探索領域を画面最大範囲から段階的に縮小していくなど、探索領域を常に適切に定めていくことができ、探索をより効率良く行うことができる。また、最初の探索で複数の人物を含む領域として探索した際にも、より情報確度を上げるため、探索領域を変えて再度探索を行うといったことも容易に行える。
【００３３】
また、本発明のデータベース構築装置においては、上記探索抽出手段は、静止画像データを予め複数の領域に分割し、分割した領域を探索領域として探索することを特徴とすることもできる。
【００３４】
これによれば、静止画像データを予め複数の領域に分割してから分割した各領域を探索領域として探索するので、切り出した静止画像データに予め多数の人物が記録されていると予想される動画像データの場合、一回の探索で見つけて切り出すところの人物が含まれる領域に存在する人物数を適宜少なくでき、個別人物情報の数を少なくできる。また、分割した各領域にて独立して探索処理を行うようにすれば、静止画像データからの人物の抽出をより短時間で行うことができる。
【００３５】
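When still image data are divided in advance in this way, it is more preferable to provide division number determination means that judges the state of the number of persons contained in the still image data and determines the number of divisions of the still image data based on that judgment, and to configure the search and extraction means to divide the image data by the number of divisions so determined.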
【００３６】
これにより、静止画像データの分割が、静止画像データに含まれる人物の数の状態に応じて行われるので、静止画像データの分割数が静止画像データに含まれる人物の数の状態にあったものとなり、分割数が固定されている構成よりも、個別人物情報の数をより的確にできる。
【００３７】
また、本発明のデータベース構築装置では、上記第１のドライブにおける上記第１の記録媒体の動画像データの再生が初めてか否かを判定し、初めてである場合は、上記静止画像データ取得手段による静止画像データの取得を開始させる開始指示手段を備えている構成とすることもできる。
【００３８】
これによれば、第１のドライブにおける上記第１の記録媒体の動画像データの再生が初めてである場合、開始指示手段にて、静止画像データ取得手段による静止画像データの取得が開始されるので、ユーザは第１のドライブで第１の記録媒体を再生させるだけで特別な指示を行うことなく、シーン情報のデータベースを取得することができる。
【００３９】
本発明の画像記録再生装置は、上記課題を解決するために、第１の記録媒体に記録されている情報を再生する第１のドライブと、第２の記録媒体に情報を記録・再生する第２のドライブとを備えた画像記録再生装置において、上記請求項１〜９に記載のデータベース構築装置を備えたことを特徴としている。
【００４０】
既にデータベース構築装置として説明したように、本発明のデータベース構築装置は、人物画像主体の動画像データに適した、シーン情報をユーザの手を煩わせることなく自動に取得し得るシーン情報のデータベース構築装置である。
【００４１】
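Therefore, the image recording/reproducing apparatus of the present invention equipped with such a database construction device is an excellent apparatus in which local scene changes and the like can be readily identified, image retrieval that follows a person's movement is possible, and the content of moving image data centered on people can be edited easily.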
【００４２】
本発明の画像検索方法は、本発明の画像記録再生装置の画像検索方法であって、上記第２の記録媒体に記録されている部分画像データをシーン情報として用いて、上記第１の記録媒体に記録されている動画像データにおける任意のシーンの検索を行うことを特徴としている。
【００４３】
上述したように、第２の記録媒体に記録される部分画像データには、その人物の行為など、そこには既にこれを抽出した静止画像データ、つまりその静止画像データを構成要素とする動画像のあるシーンを特徴付ける情報が含まれているので、部分画像データを表示させることで、部分画像データ自体がシーン情報となる。また、その部分画像データから複数の人物情報が得られる場合も、その複数人の人物構成などが重要なシーン情報となる。つまり、これらの部分画像データは、そのままでもユーザにとってはシーン変化を検出するために得られる情報と相当のシーン情報となる。
【００４４】
したがって、画像に含まれる人物の動作まで含めた細かいレベルでのシーン情報を用いて検索するので、人物が中心の動画データ再生時の検索においてなど、より効率の良い画像検索が可能になる。
【００４５】
また、本発明の画像検索装置は、上記画像記録再生装置に備えられる画像検索装置であって、上記第２ドライブを用いて上記第２の記録媒体に記録されている部分画像データ群を再生し、表示手段に表示させる部分画像データ表示手段と、ユーザからの表示されている部分画像データに対する選択を受けつける入力手段と、上記入力手段にて選択された部分画像データをもとに、第１の記録媒体に記録されている動画像データに対して、該部分画像データの抽出元となる静止画像データの検索を行う検索手段とからなることを特徴としている。
【００４６】
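According to this, the partial image data display means reproduces the group of partial image data recorded on the second recording medium using the second drive and causes the display means to display them. When the user selects one of the displayed partial image data using the input means, the retrieval means searches the moving image data for the still image data from which the selected partial image data were extracted.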
【００４７】
したがって、第２の記録媒体に記録されている人物の動作、表情などを直接に見て必要な画像データの検索（シーン検索）を行うことができるので、例えば動画像検索を、静止画像全体を表示して行う場合よりもきめ細かい検索が可能になる。例えば、元の静止画像内で人物が記録された領域の占める割合が全体の中で小さい場合にも、人物情報をキーにした画像検索ができる。つまり、画面内を絞って表示するので、ユーザにとってはより見やすく、かつ情報確度の高い検索が可能になる。
【００４８】
本発明の画像検索装置では、さらに、部分画像データ再生表示手段は、上記第１の記録媒体に記録されている動画像データの再生が指示された場合、動画像データの再生前に部分画像データの一部を表示することを特徴とすることもできる。
【００４９】
これによれば、第１の記録媒体の動画像データの部分画像データ群が第２の記録媒体にある場合は、動画像データの再生を前にして部分画像データの一部が自動的に再生されるので、例えばユーザはその部分画像データをみて、見たいシーンのみ選択的に再生させるといった使い方が可能となる。
【００５０】
また、本発明のデータベース構築プログラム及び記録媒体は、上記した本発明のデータベース構築装置における各手段としてコンピュータを機能させるプログラム及びそれを記録した記録媒体である。
【００５１】
また、本発明の画像検索プログラム及び記録媒体は、上記した本発明の画像検索装置における各手段としてコンピュータを機能させるプログラム及びそれを記録した記録媒体である。
【００５２】
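Accordingly, if the database construction program or the image retrieval program described above is executed by a computer, the database construction device, image retrieval device, and image recording/reproducing apparatus of the present invention can be realized not only on a specific database construction device, image retrieval device, or image recording/reproducing apparatus but also on an unspecified image recording/reproducing apparatus.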
【００５３】
そしてまた、本発明は、以下のように表現することもできる。つまり、本発明の画像記録再生装置は、第１の記録媒体を記録再生可能な第１のドライブと、第２の記録媒体を記録再生可能な第２のドライブと、第１のドライブから再生された画像データを一時的に格納するためのバッファメモリと、画像処理のためのプログラムを記憶するメモリと、第１ドライブで再生されたデータ、および第２ドライブで再生されたデータの少なくとも一方を表示する表示装置とからなり、第１の記録媒体、第２の記録媒体を同時に少なくとも記録、再生の一方が可能な再生装置において、第１の記録媒体からの再生画像データを少なくとも一つのバッファメモリに蓄積し、前記バッファメモリに蓄積された画像データから人物が含まれる領域を探索し、かつ、人物が含まれる領域が少なくとも一つ以上探索できた場合に、前記領域を少なくとも一つ以上の部分画像データとして抜き出し、前記部分画像データを、前記部分画像データを抜き出した元画像データの第１の記録媒体におけるアドレス情報と結び付け、第２の記録媒体に記録することを特徴としている。
【００５４】
また、ここで、前記第１の記録媒体から人物が含まれる領域を探索して得られた前記部分画像データに対し、前記探索時に得られた、人数情報、色情報、形状情報の三つの情報を少なくとも含む、前記第１の記録媒体の画像データに関する複数の情報を、前記個々の部分画像データに連なるツリー構造で関係付けられるように、前記第２の記録媒体に記録することが好ましい。
【００５５】
また、第１の記録媒体からの再生画像データを少なくとも一つのバッファメモリに蓄積する動作と、前記バッファメモリに蓄積された画像データから人物が含まれる領域を探索し、かつ、人物が含まれる領域が少なくとも一つ以上探索できた場合に、前記領域を少なくとも一つ以上の部分画像データとして抜き出し、前記部分画像データを、前記部分画像データを抜き出した元画像データの第１の記録媒体におけるアドレス情報と結び付け、第２の記録媒体に記録する動作とを独立して実行可能な上記画像記録再生装置であって、かつ、人物が含まれる領域の探索を画像データ内の探索領域を順次変えていくことで行うことを特徴とすることもできる。
【００５６】
また、第１の記録媒体からの再生画像データを少なくとも一つのバッファメモリに蓄積する動作と、前記バッファメモリに蓄積された画像データから人物が含まれる領域を探索し、かつ、人物が含まれる領域が少なくとも一つ以上探索できた場合に、前記領域を少なくとも一つ以上の部分画像データとして抜き出し、前記部分画像データを、前記部分画像データを抜き出した元画像データの第１の記録媒体におけるアドレス情報と結び付け、第２の記録媒体に記録する動作とを独立して実行可能な請求項１記載の画像記録再生装置であって、かつ、第１の記録媒体から再生した画像データを予め複数の探索領域に分けてから、人物が含まれる領域の探索を行うことを特徴とすることもできる。
【００５７】
また、本発明の画像記録再生装置における画像検索システムは、前記第２記録媒体に記録した人物を含む部分画像データを、前記第１の記録媒体の画像データのシーン情報として用いることを特徴としている。
【００５８】
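Further, the image retrieval system in the image recording/reproducing apparatus of the present invention is characterized in that partial image data are automatically recorded on the second recording medium when the image data of the first recording medium are reproduced; before the second and subsequent reproductions of the first recording medium, the second recording medium is also reproduced and part of the group of partial image data is displayed; and, when one of the partial image data is selected, it is used as a search key for the image data recorded on the first recording medium.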
【００５９】
【発明の実施の形態】
〔実施の形態１〕
本発明の実施の一形態について図１ないし図５に基づいて説明すれば、以下の通りである。
【００６０】
図１は、本発明のデータベース構築装置並びに画像検索装置を具備する本実施の形態の画像記録再生装置の構成図である。
【００６１】
本画像記録再生装置は、第１及び第２の２つのドライブ１１・１２と、バッファメモリ３と、表示装置５と、ＣＰＵ６と、メモリ４とを少なくとも備えている。
【００６２】
第１のドライブ１１は、第１の記録媒体１を記録再生可能な装置であり、第２のドライブ１２は、第２の記録媒体２を記録再生可能な装置である。
【００６３】
バッファメモリ（蓄積手段）３は、第１のドライブ１１から再生された動画像データ（静止画像データの列からなる）より、所定のタイミングで取得された静止画像データを一時的に格納するものである。サンプリングされた静止画像データは、必要に応じて、後述のようにさらに細分化されて第２のドライブ１２へ転送される。
【００６４】
表示装置（表示手段）５は、第１のドライブ１１で再生された画像データ、及び第２のドライブ１２で再生された画像データを表示するものである。再生された画像データの表示は独立して行われ、表示装置５は、第１及び第２のドライブ１１・１２で再生された何れか一方の画像データを表示する。
【００６５】
メモリ４は、ハードディスク等からなり、画像処理のためのプログラムを始め、各種のアプリケーションプログラムを格納している。
【００６６】
ＣＰＵ６は、図示しないＲＡＭを作業領域として備えており、上記した第１及び第２のドライブ１１・１２、バッファメモリ３、表示装置５の各種動作を制御する制御中枢である。また、上記メモリ４よりアプリケーションプログラムを読み出して実行するものであり、本発明のデータベース構築装置並びに画像検索装置を具現化するものでもある。つまり、ＣＰＵ６とメモリ４にて、静止画像データ取得手段、探索抽出手段、記録手段、開始指示手段、部分画像データ表示手段、検索手段等の機能を有している。
【００６７】
また、上記した第１のドライブ１１と第２のドライブ１２とは、対応する記録媒体である記録媒体１或いは第２の記録媒体２に対して、同時に少なくとも記録、再生の一方が可能となっている。
【００６８】
上記第１及び第２の記録媒体として、特に大きな制約はないが、本実施の形態では、一例として、第１の記録媒体１にはＤＶＲディスク（大容量の相変化光ディスク）を、第２の記録媒体２にはＤＶＤ−ＲＷをそれぞれ用いている。
【００６９】
そして、本画像記録再生装置では、第１のドライブ１１にて第１の記録媒体１の動画像データを再生する際に、動画像データの構成要素である静止画像データを所定のタイミングでサンプリングし、静止画像データのさらに一部である部分画像データを抽出して第２のドライブ１２にて第２の記録媒体２に記録するようになっている。
【００７０】
そしてさらに、本画像記録再生装置では、例えば、第１の記録媒体１に記録されている動画像データの内容編集等の目的で、動画像データに含まれる特定のシーン或いは画像を検索する必要が生じた場合は、第２の記録媒体２に記録されている上記した部分画像データ群を、画像検索のための情報として用いるようになっている。
【００７１】
動画像データのシーンを検索するために用いる情報は、シーン情報と称される。本発明では、動画像データを構成する静止画像データの一部である部分画像データをシーン情報とする。より詳細には、静止画像データにおける人物とその周辺の画像とからなる部分画像データをシーン情報とする。
【００７２】
前述した従来の構成では、シーン情報を、動画像中での位置、シーンの意味内容、シーン相互の関係、シーン変化の情報等の情報からなる構成としていた。そのため、シーン情報を自動にて完全な情報として取得することは難しかった。しかしながら、動画像データを構成する静止画像データにおける人物とその周辺画像とからなるシーン情報であれば、既存の技術にて自動的に取得していくことが可能となる。
【００７３】
また、シーン情報を、人物とその周辺画像とに絞ることで、人物主体の動画像データにおける検索情報を、その人物の動作も含めてより具体的なものにすることができる。
【００７４】
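That is, as shown in FIGS. 2(a) and 2(b), person information is extracted together with a certain amount of the surrounding image. When person information is extracted, the person alone is not cut out; as indicated by the broken lines, the person is extracted together with the peripheral image lying within a certain distance around the person.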
【００７５】
また、シーン情報の内容としてより確度の高い情報とするために、その静止画像データに含まれる人物全てが何れかの部分画像データに含まれるように、部分画像データは複数であることが望ましい。つまり、その静止画像データから取得した部分画像データ全体で、その静止画像データに登場する全ての人物を網羅していることが望ましい。
【００７６】
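As shown in FIG. 2(a), when persons do not overlap in the image, acquiring the partial image data is comparatively easy. For example, starting from a large search area, a single person will often fall within the search area and the person search will succeed. On the other hand, as shown in FIG. 2(b), when persons overlap in the image, the shape and size of the search area are varied. With a large search area, for example, partial image data containing the two overlapping persons are extracted and acquired as information on both persons; with a small search area, partial image data containing a single person can be extracted and acquired.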
【００７７】
静止画像データ内に含まれる複数の画像の中で、画像が人物か否かの判定は、例えば以下のように、人物情報を基本フレームの組み合わせとして捉えることで行うことができる。
【００７８】
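The basic frames referred to here consist of a face frame and body-part frames (hand, torso, leg, and so on). For the face frame, use is made of the fact that the planar shape of a face changes with the viewing direction: a plurality of face-frame patterns corresponding to these changes in shape are stored in the memory 4 as judgment reference information. How many face-frame patterns to prepare as judgment reference information, that is, their number N, depends on the conditions of the system. When the desired angular resolution of the viewing direction is (Δφ, Δθ), where Δφ is the tilt and Δθ is the rotation, the number of face frames is N = (180/Δφ) × (360/Δθ). The face frame represents its constituent elements (eyes, nose, mouth, and so on) as a set of line segments.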
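The face-frame count N = (180/Δφ) × (360/Δθ) given above can be illustrated with a short sketch. The function name `face_frame_count` and the choice to round each factor up when the resolution does not divide the range evenly are assumptions made for illustration; the text does not say how such cases are handled.

```python
import math


def face_frame_count(delta_phi_deg: float, delta_theta_deg: float) -> int:
    """Number of face-frame patterns N = (180 / d_phi) * (360 / d_theta).

    delta_phi_deg:   angular resolution of the tilt, in degrees
    delta_theta_deg: angular resolution of the rotation, in degrees
    Rounding each factor up is an assumption of this sketch.
    """
    if delta_phi_deg <= 0 or delta_theta_deg <= 0:
        raise ValueError("angular resolutions must be positive")
    tilt_steps = math.ceil(180 / delta_phi_deg)
    rotation_steps = math.ceil(360 / delta_theta_deg)
    return tilt_steps * rotation_steps


# A 30-degree tilt step and a 45-degree rotation step give 6 * 8 patterns.
print(face_frame_count(30, 45))  # 48
```

A finer resolution increases N quickly (10 degrees in both directions already gives 648 patterns), which is presumably why the text says N depends on the conditions of the system.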
【００７９】
人物かどうかの判定にあたり、まず、画像処理が行われ、静止画像データの輪郭画像（画像をいくつかの複数の線情報に加工した画像、既存技術で形成される）を形成する。そして、その輪郭画像に含まれるいくつかの輪郭群の中のいずれかと、その判定基準となるＮ個の顔フレームを形成するための輪郭モデルとの間で、顔フレームの各構成要素の配置を比較することで行う。Ｎ個の顔フレームの中に近い輪郭があるかどうかをチェックする。これが、人物であるかどうかの第１段階の判定となる。
【００８０】
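Next, given the result of the first-stage judgment, the final judgment is made according to whether a body-part frame exists near the face frame. The body-part frames used for this judgment include arm frames and torso frames. For example, when several arm frames are used, one method is to judge whether the contour model of an arm frame lies within a predetermined distance range of the face frame and, if this distance condition is satisfied, to judge that the image information obtained in the face-frame judgment is a person. Whether something is a person can thus be judged by this essentially two-stage method.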
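The second stage of the judgment (a face-frame candidate is accepted as a person only if a body-part frame lies within a given distance of it) might be sketched as follows. This is a minimal sketch, not the patent's implementation: the `Frame` type, the use of centre-to-centre distance, and the `max_ratio` threshold expressed as a multiple of the face size are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    kind: str    # "face", "arm", "torso", ... (matched contour model)
    cx: float    # centre x of the matched contour, in pixels
    cy: float    # centre y of the matched contour, in pixels
    size: float  # characteristic size of the matched contour, in pixels


def is_person(face: Frame, body_parts: List[Frame], max_ratio: float = 3.0) -> bool:
    """Stage 2: accept the face if some body-part frame is close enough.

    max_ratio is a hypothetical parameter: the allowed distance is
    max_ratio times the face size.
    """
    limit = max_ratio * face.size
    return any(
        math.hypot(part.cx - face.cx, part.cy - face.cy) <= limit
        for part in body_parts
    )


def detect_persons(frames: List[Frame], max_ratio: float = 3.0) -> List[Frame]:
    """Return the face frames that pass both stages of the judgment."""
    faces = [f for f in frames if f.kind == "face"]       # stage 1 results
    parts = [f for f in frames if f.kind != "face"]
    return [f for f in faces if is_person(f, parts, max_ratio)]
```

In this sketch a face-like contour with no nearby arm or torso contour (for example a face on a poster) is rejected, which is the purpose of the second stage.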
【００８１】
なお、このような人物判定の方法については、例えば、『ＬａｂｅｌｅｄＧｒａｐｈＭａｔｃｈｉｎｇを用いた動画像に対する人物頭部及び表情変化を伴う部位の抽出』電子情報通信学会論文誌Ｖｏｌ．Ｊ８５−Ｄ−ＩＩＮｏ．１１ｐｐ．１６５６−１６６３２００２年１１月等に記載されている。
【００８２】
また、これらの部分画像データを、人物を含む画像データ情報として抜き出す際に、部分画像データに関する付加情報として、少なくとも人数情報を含む情報を取得することが好ましい。人数情報は、部分画像データに含まれる人物の数を表す情報である。また、より好ましくは、付加情報に、各人物の色を特徴付ける色情報或いは各人物の形状を特徴付ける形状情報の一方或いは両方（より好い）を含めておくことである。
【００８３】
このように付加情報を部分画像データと共に取得させておくことで、画像に人物の重なりがあり、抜き出した１つの部分画像データに複数の人物が含まれていても、得られた部分画像データに含まれる個別人物情報を、図３に示すようなツリー構造で情報管理される付加データ（付加情報）を同時に取得することで、部分画像データの中に含まれる複数の人物情報を取得でき、シーン情報として活用することができるようになる。
【００８４】
色情報及び形状情報は、より詳細に言えば、部分画像データを取り込んで、取り込んだ画像の中で、人物毎に、後述する顔フレームや身体パーツフレームなど、顔フレーム群、身体フレーム群におけるどのパターンかを判別した後で、さらにこれらを特徴付けるために用いる情報である。例えば、色情報は、画像を取り込んで判断用に適合させたフレーム（顔フレームや身体パーツフレームなど）全体の色調を表現する情報のことで、形状情報は、判断用に適合させたフレームのまさにその形状を現す情報のことである。
【００８５】
また、付加情報に、動きの情報を含めてもよい。動きの情報とは、判断用に適合させたフレーム（顔フレームや、身体パーツフレームなど）が、次の静止画像データの取り込み操作で、同フレーム群の先とは異なるフレーム群の別のフレームに適合するようになったことを示す情報である。さらに、顔フレームの中の構成要素（目や鼻、口等）を別のフレーム群化することで、表情変化などにも対応した処理が可能になる。
【００８６】
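In an image in which several persons overlap, the face frames and body-part frames used to judge whether something is a person exist in incomplete form and close together in large numbers, and these serve as the basis for the judgment. Because the color information and shape information of each face frame and body-part frame are specific to a person, information on each of the several persons can be obtained from the color and shape information of these frames. Furthermore, the information characterizing the partial image data can be managed as an amount of information given by <number-of-persons information> × <number of person-judgment frames> × <shape information + color information>, which makes the image information easy to characterize. Managing it in a tree structure also makes retrieval easier.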
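One way to picture the tree of additional information hanging off each partial image (partial image, then persons, then judgment frames, each with color and shape information) is the sketch below. The class and field names are hypothetical, and the actual layout of FIG. 3 on the second recording medium is not specified in this text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FrameInfo:
    kind: str                     # "face", "arm", "torso", ...
    color: Tuple[int, int, int]   # overall color tone of the fitted frame
    shape: str                    # identifier of the fitted shape pattern


@dataclass
class PersonInfo:
    frames: List[FrameInfo] = field(default_factory=list)


@dataclass
class PartialImageNode:
    source_address: int           # address of the source still image
    persons: List[PersonInfo] = field(default_factory=list)

    @property
    def person_count(self) -> int:
        """Number-of-persons information for this partial image."""
        return len(self.persons)

    def attribute_count(self) -> int:
        """Persons x frames x (shape + color): two attributes per frame."""
        return sum(2 * len(p.frames) for p in self.persons)

    def persons_with_color(self, kind: str, color: Tuple[int, int, int]) -> List[PersonInfo]:
        """Walk the tree to find persons whose given frame has this color."""
        return [
            p for p in self.persons
            if any(f.kind == kind and f.color == color for f in p.frames)
        ]
```

With two overlapping persons, each described by a face frame and a torso frame, `attribute_count()` gives 2 × 2 × 2 = 8, matching the product form in the text when every person has the same number of frames.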
【００８７】
このような色情報、形状情報は、人物かどうかを判断するのに用いた、顔フレームや身体フレームに近いと判断した輪郭モデルを、輪郭モデル専用の一時記憶装置（図示せず）が情報として取り込み、輪郭モデルのもとになっている人物かどうかの判定のために取り込んだ画像データと参照して、その輪郭モデルの位置に相当する部分で、もとの画像での色や形状といった情報を、色情報専用の一時記憶装置（図示せず）、形状情報専用の一時記憶装置（図示せず）が情報として取り込むことで取得できる。
【００８８】
また、人数情報は、抜き出した部分画像データの中に、人物と判断された人物情報がいくつあったかの情報であるから、これも人数情報専用の一時記憶装置（図示せず）が、同一の部分画像データの中で何度輪郭モデルを取得したかを、カウントしてこれを記憶することで取得できる。
【００８９】
また、静止画像データより人物を含む領域を探索する際、探索の単位となる探索領域を、静止画像データの最大サイズ以下の範囲で適宜変更することが好ましい。探索領域を変更するとはつまり、探索領域の形状（通常は矩形）と大きさとを変更することである。探索領域を適宜切り換えることで、探索領域を人物画像に合った適切な形とでき、人物に関する確度の高い情報を短時間で取得することができる。
【００９０】
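One way of setting the search area is to use the face frame. When the size of the same person changes between the still image data acquired at one timing and those acquired at the next, the change in size of the face frame is detected and the search area is reduced in accordance with the rate of change. For example, if the face frame shrinks to 70% of its original size, the search area is also reduced to 70% of its original size.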
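The face-frame-based adjustment (a face frame at 70% of its previous size gives a search area at 70% of its previous size) can be sketched as below. The function name, the rounding to whole pixels, and the clamping to the full still-image size are assumptions; the text only describes shrinking, whereas this sketch also scales up when the face grows.

```python
from typing import Tuple


def rescale_search_area(
    area: Tuple[int, int],
    prev_face_size: float,
    curr_face_size: float,
    max_area: Tuple[int, int],
) -> Tuple[int, int]:
    """Scale the (width, height) search area by the face-frame size ratio.

    max_area is the full still-image size; the result never exceeds it
    and never drops below one pixel (both limits are assumptions).
    """
    if prev_face_size <= 0 or curr_face_size <= 0:
        raise ValueError("face sizes must be positive")
    ratio = curr_face_size / prev_face_size
    width = min(max_area[0], max(1, round(area[0] * ratio)))
    height = min(max_area[1], max(1, round(area[1] * ratio)))
    return width, height


# Face frame shrank from 100 px to 70 px: the area shrinks to 70% as well.
print(rescale_search_area((320, 240), 100, 70, (640, 480)))  # (224, 168)
```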
【００９１】
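When a region containing several persons has been found, for example one with five face frames, resetting the region so that it contains three face frames increases the amount of information about each individual person within the search area and raises the accuracy.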
【００９２】
そして、このように、顔フレームを基準に探索領域を小さくする場合には、顔フレームの存在する場所を探索領域の対角要素の基点にする方法が有効である。
【００９３】
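Specifically, as shown in FIG. 4, the search area size is obtained by combining the sizes x1 × y1 and x2 × y2 of the face frames, the distances X and Y between the face frames, and margins provided to secure a sufficient search region, which are predetermined relative to the original face-frame sizes (m1, m2, n1, and n2 are arbitrary numbers). The search area size is given by {(m1+1)·x1 + X + (m2+1)·x2} × {(n1+1)·y1 + Y + (n2+1)·y2}.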
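As a worked instance of the search-area formula {(m1+1)·x1 + X + (m2+1)·x2} × {(n1+1)·y1 + Y + (n2+1)·y2}, the sketch below evaluates the width and height for two face frames. The function name and argument grouping are illustrative only; the symbols follow the text.

```python
from typing import Tuple


def search_area_size(
    face1: Tuple[float, float],                   # (x1, y1)
    face2: Tuple[float, float],                   # (x2, y2)
    gap: Tuple[float, float],                     # (X, Y) between the frames
    margins: Tuple[float, float, float, float],   # (m1, m2, n1, n2)
) -> Tuple[float, float]:
    """Return (width, height) of the search area spanning two face frames."""
    x1, y1 = face1
    x2, y2 = face2
    gap_x, gap_y = gap
    m1, m2, n1, n2 = margins
    width = (m1 + 1) * x1 + gap_x + (m2 + 1) * x2
    height = (n1 + 1) * y1 + gap_y + (n2 + 1) * y2
    return width, height


# Faces of 40x50 and 30x40 px, 100 px apart horizontally and 20 px vertically,
# with margins m1 = m2 = 0.5 and n1 = n2 = 1:
# width  = 1.5*40 + 100 + 1.5*30 = 205
# height = 2*50   + 20  + 2*40   = 200
print(search_area_size((40, 50), (30, 40), (100, 20), (0.5, 0.5, 1, 1)))
```

Setting all four margins to zero reduces the area to the tight box around the two face frames and the gap between them.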
【００９４】
以上に示した、人物を含む部分画像データによるシーン情報の取得により、シーンの変化様々な情報を考慮して異なるシーンであると判別して、その変化を階層構造で管理するなどの概念はなくなり、それらの人物が何をするか等にも着目した情報が管理できることになる。
【００９５】
次に、図５のフローチャートを用いて、人物を含む部分画像データを抜き出す操作を説明する。
【００９６】
図５では、静止画像データ内の人物を含む領域を部分画像データとして抜き出す際に、人物を探索する探索領域を適宜変更しながら行う場合の部分画像データの取得までの流れを示す。
【００９７】
第１のドライブ１１で、第１の記録媒体１の動画像再生中に開始信号が検出されると、第１のドライブ１１で再生される動画像データの一コマを取得するタイミング情報を発生させ、静止画像データの取得を開始する（Ｓ１）。
【００９８】
動画像データを取得するタイミング情報が与えられると、それに同期して、バッファメモリ３に、動画像データの一コマである静止画像データを、そのアドレスデータ（第１の記録媒体１上のアドレス）と共に蓄積する（Ｓ２、Ｓ３）。
【００９９】
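Next, it is judged whether processing has just started (S4); if it has not, it is then checked whether a trigger requesting image data transfer has been detected (S5). If processing has not just started and no image data transfer request trigger is detected, S1 to S5 are repeated and image accumulation continues.
【０１００】
On the other hand, if processing has just started, or if an image data transfer request trigger has been detected, the still image data captured in the buffer memory 3 are sent to the person information detection system (S7). The processing up to this point is that of the image accumulation system, which accumulates still image data.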
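The accumulation-side decisions S2 to S7 can be traced with a small deterministic simulation. This is only a sketch under stated assumptions: the buffer is treated as a first-in, first-out queue, and `trigger_at` (the sample indices at which the detection side has raised a transfer request) is a hypothetical stand-in for the trigger of S17.

```python
from collections import deque
from typing import Iterable, List, Set, Tuple

Sample = Tuple[int, str]  # (address on the first recording medium, still image)


def accumulate_and_dispatch(
    samples: Iterable[Sample], trigger_at: Set[int]
) -> Tuple[List[Sample], List[Sample]]:
    """Return (samples sent to detection, samples still buffered at the end)."""
    buffer: deque = deque()
    dispatched: List[Sample] = []
    for index, sample in enumerate(samples):
        buffer.append(sample)                     # S2, S3: store image and address
        just_started = index == 0                 # S4: first pass after the start
        if just_started or index in trigger_at:   # S5: transfer trigger detected?
            dispatched.append(buffer.popleft())   # S7: send the oldest image
    return dispatched, list(buffer)


samples = [(100 + i, f"still-{i}") for i in range(5)]
sent, pending = accumulate_and_dispatch(samples, trigger_at={2, 4})
print([addr for addr, _ in sent])     # [100, 101, 102]
print([addr for addr, _ in pending])  # [103, 104]
```

The images left in the buffer at the end illustrate why detection can still be running for a while after reproduction of the moving image has finished.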
【０１０１】
以下に人物情報検出系について説明する。転送された静止画像データに対し、まず、画像処理を行って、上述した輪郭画像を形成する（Ｓ８）。次に、探索領域の見直しを行いながら、人物を含む領域を探索していく（Ｓ９）。
【０１０２】
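In S9, the search area is shifted horizontally over the contour image obtained from the still image data by a predetermined number of pixels at a time (determined according to the resolution of the contour image), and each line segment read during the search is treated as a vector whose direction and length are detected. The search proceeds while this direction and length information is acquired. When one horizontal search scan is finished, the search area is shifted vertically by a predetermined number of pixels and horizontal scanning starts again. This search is continued over the entire contour image data until information that can be judged to be a person is obtained; whether line segments are present and, if so, their direction and length are stored in the temporary storage device as contour image information.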
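The raster scan of S9 and the treatment of each line segment as a vector can be sketched as follows. The helper names, the use of a single step size for both directions, and the representation of a segment by its two end points are assumptions made for illustration.

```python
import math
from typing import Iterator, Tuple

Point = Tuple[float, float]


def scan_positions(
    img_w: int, img_h: int, win_w: int, win_h: int, step: int
) -> Iterator[Tuple[int, int]]:
    """Yield top-left corners of the search area in raster order.

    The window moves horizontally by `step` pixels; at the end of a row it
    moves down by `step` pixels and the horizontal scan starts again.
    """
    y = 0
    while y + win_h <= img_h:
        x = 0
        while x + win_w <= img_w:
            yield x, y
            x += step
        y += step


def segment_vector(start: Point, end: Point) -> Tuple[float, float]:
    """Return (length, direction in degrees) of a contour line segment."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))


print(list(scan_positions(8, 6, 4, 4, 2)))
# [(0, 0), (2, 0), (4, 0), (0, 2), (2, 2), (4, 2)]
```

The length and direction pairs collected this way are the quantities that can then be compared numerically against the contour models of the face and body-part frames.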
【０１０３】
この探索で、輪郭画像内の線分の情報が定量的に得られるので、顔フレームや、身体パーツフレームといった、人物かどうか判断するための各フレームの輪郭モデルとの間で、その類似性を数値的に比較し、上記各フレームの輪郭モデルと一致した際には、人物が含まれると判断する（Ｓ１０）。
【０１０４】
人物が含まれると判断すると、その一致部分を含む一定の範囲を、人物を含む部分画像データとして切り出す（Ｓ１２）。また、この際、切り出した部分画像データが静止画像データのどこにあったかもわかるようにフィールド情報を併せて取得する（Ｓ１１）。フィールド情報とは、静止画像内のどの位置に人物画像が存在するかを示す情報である。
【０１０５】
切り出した部分画像データとフィールド情報とは、当該部分画像データが含まれていた静止画像データの第１の記録媒体１上の位置を示すアドレス情報と対応付けて、第２のドライブ１２を用いて第２の記録媒体２に記録する（Ｓ１３）。
【０１０６】
次に、輪郭画像全体の探索を完了していない段階では、未探索部分への探索を続行し、探索領域の見直しを行いながら、人物画像の探索を続ける（Ｓ１４、Ｓ１５）。Ｓ１６にて人物情報が他にないことを確認するまで、Ｓ１２〜Ｓ１５の処理を繰り返す。
【０１０７】
また、より望ましくは、一度得られた部分画像データの範囲内をさらにそれより小さい探索領域にて探索することである。これにより、顔の表情といった、よりそのシーンを特徴付けられる確度の高い情報を短時間で得ることができる。
【０１０８】
一方、Ｓ１０にて、静止画像データの全体を探索しても人物が含まれると判断しなかった場合、及びＳ１６にて人物情報が他にないことを確認すると、人物情報検出系の処理が終了する。
【０１０９】
この人物情報の検出系の処理が終わった時点で、次の静止画像データを人物情報検出系に送るための画像データ転送要求のトリガを発生させ（Ｓ１７）、そのトリガデータを、静止画像データを蓄積しているバッファメモリ３に送る。これにて、画像蓄積系より人物情報検出系へ、次の静止画像データの転送処理がなされる（Ｓ７）。
【０１１０】
この探索の際には、人物を含む領域を探索する人物情報検出系と、画像データを蓄積する画像蓄積系とは別の動きをしている。したがって、動画像データの再生が終了して後しばらくしてから、人物を含む領域を探索する処理が終了することになる。探索の終了に際しては、第１のドライブ１１で再生される動画像データのエンド情報が転送されたことをもとに終了動作に入る（Ｓ６）。
【０１１１】
なお、上記したＳ１における、再生された動画像データを構成する静止画像データをサンプリングし、バッファメモリ３に蓄積していく時間間隔（サンプリングのタイミング）は、人物を含む領域の探索と記録に要する時間を考慮して設定するのが好ましい。
【０１１２】
また、ユーザがサンプリングのタイミングを自由に設定できるようにしてもよい。第１の記録媒体１の動画像データを再生して、シーン情報となる部分画像データを取得しているときに、ユーザは表示装置５に映し出される動画像をもとに、その内容を把握することができるので、その内容に合わせて静止画像データを取得するタイミングを適宜設定すれば、動画像の内容に応じたサンプリングが可能となる。
【０１１３】
次に、本画像記録再生装置における動画像データの画像検索について説明する。
【０１１４】
上記のようにして取得した部分画像データは、その人物の行為など、そこには既にこれを抽出した静止画像データ、つまりその静止画像データを構成要素とする動画像のあるシーンを特徴付ける情報が含まれているので、部分画像データ自体を表示装置５に表示させることで、部分画像データ自体がシーン情報となる。
【０１１５】
また、その部分画像データから複数の人物情報が得られる場合も、その複数人の人物構成などが重要なシーン情報となる。つまり、これらの部分画像データは、そのままでもユーザにとってはシーン変化を検出するために得られる情報と相当のシーン情報を提供することになる。
【０１１６】
そこで、上述したように、本画像記録再生装置では、第１の記録媒体１に記録された動画像データの数あるシーンの中から、ある特定のシーン（画像）を検索したい場合、本動画像データより取得され、既に第２の記録媒体２に記録されている部分画像データ群を、画像検索のための情報として用いるようになっている。
【０１１７】
画像検索において、表示装置５には、部分画像データ群より部分画像データが、動画像データの時間軸に沿った順番等で表示される。このとき、部分画像データをサムネイル表示してもよい。また、全ての部分画像データを表示する必要はなく、その一部を時間的に間引いて表示することもできる。
【０１１８】
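While the partial image data are displayed in this way, a selection of one of the displayed partial image data is accepted from the user through an input device (not shown). Based on the selected partial image data, the moving image data recorded on the first recording medium 1 are searched for the still image data from which the partial image data were extracted, and reproduction of the moving image data is started from that still image data (or from a point shortly before or after it).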
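Because each partial image is recorded together with the address of its source still image, the retrieval step reduces to a lookup. The sketch below assumes a hypothetical record layout (`PartialImageRecord`) and a hypothetical `lead_in` offset for starting slightly before the source image; neither is defined in the text.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PartialImageRecord:
    record_id: int
    source_address: int                # address of the source still image
    field: Tuple[int, int, int, int]   # (x, y, w, h) within the still image


def playback_start_address(
    records: Iterable[PartialImageRecord], selected_id: int, lead_in: int = 0
) -> int:
    """Return the address on the first medium at which playback should start."""
    by_id = {record.record_id: record for record in records}
    if selected_id not in by_id:
        raise KeyError(f"no partial image with id {selected_id}")
    # Start a little before the source image if requested, never below 0.
    return max(0, by_id[selected_id].source_address - lead_in)


records = [
    PartialImageRecord(1, 1200, (40, 30, 120, 200)),
    PartialImageRecord(2, 4800, (300, 60, 100, 180)),
]
print(playback_start_address(records, selected_id=2, lead_in=300))  # 4500
```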
【０１１９】
また、本画像記録再生装置では、第１の記録媒体１の動画像データを初めて本装置で再生する際に、人物情報とその周辺情報に人物を含む部分画像データを自動的に取得しておくようになっている。そしてまた、本画像記録再生装置では、第１の記録媒体１に記録された動画像データの２回目以降の再生が指示されたとき、動画像データの再生前に、取得した部分画像データの一部を、図１の枠２０に示すように、表示装置５の画面にサムネイル表示するようになっている。
【０１２０】
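The partial image data may be displayed by simply showing the acquired partial image data (still images) along the time axis of the moving image data, a predetermined number at a time (eight in FIGS. 1 and 6) for a predetermined period, or by first selecting only some of the partial image data and then displaying them. One such selective display method is to select partial images for display at fixed time intervals across the whole of the original moving image and to display the selected partial image data.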
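Both display strategies (pages of eight thumbnails, or one thumbnail per fixed time interval) are simple list operations. In this sketch the function names are hypothetical, and the rule of keeping the first partial image found at or after each interval boundary is an assumption.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], per_page: int = 8) -> List[Sequence[T]]:
    """Split time-ordered partial images into pages shown one after another."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


def pick_at_intervals(
    timed_items: Sequence[Tuple[float, T]], interval: float
) -> List[Tuple[float, T]]:
    """Keep one partial image per `interval` seconds of the moving image.

    timed_items must be sorted by time; the first item at or after each
    interval boundary is kept.
    """
    picked: List[Tuple[float, T]] = []
    next_time = None
    for time, item in timed_items:
        if next_time is None or time >= next_time:
            picked.append((time, item))
            next_time = time + interval
    return picked


thumbs = [(0, "a"), (1, "b"), (5, "c"), (9, "d"), (10, "e"), (21, "f")]
print(pick_at_intervals(thumbs, interval=10))  # [(0, 'a'), (10, 'e'), (21, 'f')]
```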
【０１２１】
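Because such partial image data have already been narrowed down to a person's actions, expressions, and so on, they are easier for the user to view than an ordinary thumbnail display of whole still images. Retrieval based on such already narrowed-down partial image data also gives higher information accuracy and finer-grained image retrieval (scene retrieval) than an ordinary thumbnail display of whole still images.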
【０１２２】
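For example, when the region in which a person is recorded occupies only a small proportion of the original still image, even identifying the person is difficult in a thumbnail display of the whole still image. In the present image recording/reproducing apparatus, however, the person image is displayed in close-up, so image retrieval using person information as a key becomes possible.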
【０１２３】
また、部分画像は人物情報が基本に管理されるので、例えば、身体パーツフレームに関する色情報を利用すれば、例えば動画像全体の中で特定の人物が着ている衣装をキーに、部分画像データ群よりその衣装を含む部分画像データのみを選択的に表示させるといったことも可能となり、より効果的なシーン検索が可能となる。
【０１２４】
また、別の例で、身体パーツフレームに関する形状情報を利用すれば、足の形状の情報を動きの情報と複合したり、腕の形状の情報を動きの情報と複合したりして、特定の人物の行動をキーに、シーンを検索することが可能である。また、フィールド情報を利用すれば、特定の人物が移動しているシーンを、それが歩いているか、走っているかを区別して検索することができる。
【０１２５】
〔実施の形態２〕
本発明の実施の他の形態について図６、図７に基づいて説明すれば、以下の通りである。なお、説明の便宜上、実施の形態１で用いた部材と同じ機能を有する部材には同じ符号を付して説明を省略する。
【０１２６】
本実施の形態の画像記録再生装置は、図６に示すように、独立して書込み読出しが可能なバッファメモリ群３０を備えている点が、実施の形態１の画像記録再生装置（図１）と大きく異なる点である。実施の形態１の画像記録再生装置では、ＣＰＵ６は、１つのバッファメモリ３を用いて、静止画像データ内で探索領域を移動しながら静止画像データ全体に対して探索を行っていたが、ここでは、複数のバッファメモリからなるバッファメモリ群３０を用いて、静止画像データを予め複数の領域に分割し、各分割領域内で独自に人物の探索を行い、部分画像データの抽出を行うようになっている。つまり、探索領域の移動を行わない。
【０１２７】
図７のフローチャートを用いて、人物を含む部分画像データを抜き出す操作を説明する。
【０１２８】
図７では、静止画像データ内の人物を含む領域を部分画像データとして抜き出す際に、人物を探索する探索領域を予め分割しておいて探索を行い、部分画像データを取得するまでの流れを示す。
【０１２９】
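When a start signal is detected while the first drive 11 is reproducing the moving image on the first recording medium 1, timing information for capturing one frame of the moving image data reproduced by the first drive 11 is generated and acquisition of still image data starts (S21).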
【０１３０】
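After this acquisition, the number of divisions N of the still image data is determined (S22). For example, whether many persons are present on the screen is roughly examined at the level of the contour image data of the still image data, the state of the number of persons in the whole image is analyzed, and the number of divisions is determined from the result of the analysis. Although not shown in FIG. 7, this determination includes a function in which the CPU 6 starts a simple image analysis program stored in the memory 4 and performs the judgment automatically. The maximum value of N is the number of buffer memories constituting the buffer memory group 30.
【０１３１】
The still image data divided into N parts are, at the time of division, stored together in buffer memories 1 to N of the buffer memory group 30, which can be written and read independently, as a group of N image data obtained by dividing the still image data into N, together with the address data on the first recording medium 1 of the image before division (S23, S24).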
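The choice of the division number N and the split itself might be sketched as below. The text only states that N follows from a rough analysis of the number of persons and is capped by the number of buffer memories; the `persons_per_region` target and the split into equal-width vertical strips are assumptions of this sketch.

```python
import math
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height)


def decide_division_count(
    rough_person_count: int, buffer_count: int, persons_per_region: int = 2
) -> int:
    """S22: choose N from a rough person count, capped by the buffer count."""
    if buffer_count < 1 or persons_per_region < 1:
        raise ValueError("buffer_count and persons_per_region must be >= 1")
    wanted = max(1, math.ceil(rough_person_count / persons_per_region))
    return min(wanted, buffer_count)


def split_into_strips(img_w: int, img_h: int, n: int) -> List[Region]:
    """S23: divide the still image into n vertical strips (an assumed layout)."""
    edges = [round(i * img_w / n) for i in range(n + 1)]
    return [(edges[i], 0, edges[i + 1] - edges[i], img_h) for i in range(n)]


n = decide_division_count(rough_person_count=7, buffer_count=4)
print(n)                               # 4
print(split_into_strips(640, 480, n))
# [(0, 0, 160, 480), (160, 0, 160, 480), (320, 0, 160, 480), (480, 0, 160, 480)]
```

Each region would then be stored in its own buffer memory together with the address of the undivided still image, so that any partial image found later can still be linked back to its source.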
【０１３２】
次に、開始直後であるかを判断し（Ｓ２５）、開始直後以外は、続けて画像データ転送要求のトリガが検出されたか否かを確認する（Ｓ２６）。開始直後以外で、画像データ転送要求のトリガが検出されなければ、Ｓ２１〜Ｓ２６を繰り返して、分割された静止画像データ（以下、分割画像データと称する）のバッファメモリ群３０への画像蓄積を続けていく。
【０１３３】
一方、開始直後である場合、及び、画像データ転送要求のトリガが検出された場合は、バッファメモリ群３０に取り込んだ分割画像データを人物情報検出系に一括して送る（Ｓ２７）。ここまでの処理は、分割画像データを蓄積する画像蓄積系の処理である。
【０１３４】
ここで、静止画像データはＮ個に分割され、１〜Ｎまでのバッファメモリに蓄積されているので、人物情報検出系はＮ個存在することとなる。Ｎ個の各人物情報検出系ではそれぞれ、転送された分割画像データに対し、まず、画像処理を行って、上述した輪郭画像を形成する（Ｓ２８）。次に、人物を含む領域を探索する（Ｓ２９）。但し、ここでは、探索領域の見直しを行うことなく探索する。
【０１３５】
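Then, the same judgment as described in Embodiment 1 is made, and when it is judged in S30 that a person is included, a certain range including the matching portion is cut out as partial image data containing the person (S32). Here too, field information is acquired at the same time (S31) so that the position of the cut-out partial image data within the still image data is known.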
【０１３６】
切り出された部分画像データとフィールド情報とは、当該部分画像データが含まれていた静止画像データの第１の記録媒体１上の位置を示すアドレス情報と対応付けて、第２のドライブ１２を用いて第２の記録媒体２に記録される（Ｓ３６）。
【０１３７】
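However, since the N person information detection systems operate independently and in parallel here, recording is carried out after passing through S34 and S35, in which each system communicates with the CPU 6 and waits for permission for its recording request to the second drive 12. Although not shown in FIG. 6, each of the N person information detection systems has its own temporary storage function.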
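The arrangement of N detection systems that run in parallel but must obtain permission before recording can be mimicked with worker threads and a lock. This is an analogy under stated assumptions, not the apparatus itself: the lock stands in for the permission exchanged with the CPU 6, a Python list stands in for the second recording medium, and `detect` is a caller-supplied placeholder for the person search.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence, TypeVar

R = TypeVar("R")   # a divided image region
P = TypeVar("P")   # an extracted partial image


def detect_and_record(
    regions: Sequence[R],
    detect: Callable[[R], Optional[P]],
    recorder: List[P],
) -> None:
    """Run one detector per region; serialize writes to the shared recorder."""
    write_permission = threading.Lock()  # stands in for permission from CPU 6

    def worker(region: R) -> None:
        found = detect(region)           # S28 to S32: independent per region
        if found is not None:
            with write_permission:       # S34, S35: wait for permission
                recorder.append(found)   # S36: record on the second medium

    with ThreadPoolExecutor(max_workers=max(1, len(regions))) as pool:
        list(pool.map(worker, regions))  # wait for every detector to finish


recorded: List[int] = []
detect_and_record([1, 2, 3, 4], lambda r: r * 10 if r % 2 == 0 else None, recorded)
print(sorted(recorded))  # [20, 40]
```

The order in which results reach the recorder depends on thread scheduling, which is why the example sorts before printing.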
【０１３８】
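When it is judged in S30 that the divided image data contain no person, and when recording on the second recording medium 2 is completed in S36, processing proceeds to S35 for the next person information detection system.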
【０１３９】
上述したように、この実施の形態では、探索領域を変えての再探索は行わない。たとえ、複数の人物が重なる人物情報が仮に取得されても、個別の人物情報については付加情報として別途記録する方法で補うことができる。
【０１４０】
最終の人物情報検出系で、第２の記録媒体２への分割画像データの記録が完了する、或いはＳ３０にて人物が含まれていないと判断すると、Ｎ個の人物情報検出系の処理が終了する。
【０１４１】
Ｎ個の人物情報検出系の処理が終わった時点で、次の分割画像データを各人物情報検出系に送るための画像データ転送要求のトリガを発生させ（Ｓ３７）、そのトリガデータを、分割画像データを蓄積しているＮ個のバッファメモリに送る。これにて、画像蓄積系より人物情報検出系へ、次の分割画像データが一括して転送処理がなされる（Ｓ２７）。
【０１４２】
この探索の際においても、人物を含む領域を探索する人物情報検出系と、画像データを蓄積する画像蓄積系とは別の動きをしている。したがって、動画像データの再生が終了して後しばらくしてから、人物を含む領域を探索する処理が終了することになる。また、探索の終了に際しては、第１のドライブ１１で再生される元の動画像データのエンド情報が転送されたことをもとに終了動作に入る（Ｓ３８）。
【０１４３】
なお、上記したＳ２１における、再生された動画像データを構成する静止画像データをサンプリングし、分割してバッファメモリ群３０に蓄積していく時間間隔（サンプリングのタイミング）は、人物を含む領域の探索と記録に要する時間を考慮して設定するのが好ましく、また、ユーザがサンプリングのタイミングを自由に設定できるようにしてもよい。
【０１４４】
このように、本実施の形態の画像記録再生装置は、実施の形態１の画像記録再生装置と比べて、探索領域を移動させることもなく、また、探索領域を変化させることもない。したがって、繰り返し探索などの動作がない分、静止画像データが多数の人物が含まれるような動画像データの場合は、このような静止画像データを予めＮ個に分割して人物情報を検出する処理の方が、効率良い探索が可能になる。
【０１４５】
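In this image recording/reproducing apparatus as well, image retrieval using the group of partial image data on the second recording medium 2 is the same as in the image recording/reproducing apparatus of Embodiment 1, so its description is omitted.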
【０１４６】
以上説明した、実施の形態１，２の画像記録再生装置は、コンピュータ読み取り可能な記録媒体にプログラムとして記録することも可能である。例えば、コンピュータを、静止画像データ取得手段、探索抽出手段、記録手段、開始指示手段、部分画像データ表示手段、検索手段として機能させるデータベース構築プログラム、画像検索プログラムが記録された記録媒体が考えられる。
【０１４７】
本発明の目的は、このような手段をコンピュータに実現させるソフトウエアであるデータベース構築プログラム、画像検索プログラムのプログラムコード（実行形式プログラム、中間コードプログラム、ソースプログラム）を、コンピュータが読み取り得るように記録媒体に記録させ、該記録媒体を、画像記録再生装置に供給し、そのコンピュータが記録媒体に記録されているプログラムコードを読み出し実行することによっても、達成可能である。この場合、記録媒体から読み出されたプログラムコード自体が上述した手順を実現することになり、そのプログラムコードを記録した記録媒体は本発明を構成することになる。
【０１４８】
ここで、上記プログラムメディアとしての記録媒体は、本体と分離可能に構成される記録媒体であり、磁気テープやカセットテープ等のテープ系、フレキシブルディスクやハードディスク等の磁気ディスクやＣＤ−ＲＯＭ／ＭＯ／ＭＤ／ＤＶＤ等の光ディスクのディスク系、ＩＣカード（メモリカードを含む）／光カード等のカード系、あるいはマスクＲＯＭ、ＥＰＲＯＭ、ＥＥＰＲＯＭ、フラッシュＲＯＭ等による半導体メモリを含めた固定的にプログラムを担持する媒体であってもよい。
【０１４９】
なお、本発明は、上述した各実施形態に限定されるものではなく、請求項に示した範囲で種々の変更が可能であり、異なる実施形態にそれぞれ開示された技術的手段を適宜組み合わせて得られる実施形態についても本発明の技術的手段に含まれる。
【０１５０】
【発明の効果】
本発明のデータベース構築装置は、以上のように、第１のドライブにて再生された第１の記録媒体の動画像データから静止画像データを取得する静止画像データ取得手段と、上記静止画像データ取得手段にて取得された静止画像データを蓄積していく蓄積手段と、該蓄積手段が蓄積した静止画像データに含まれる人物を探索し、人物が含まれている場合は人物を含む所定領域を部分画像データとして抽出する探索抽出手段と、該探索抽出手段にて部分画像データが抽出されると、抽出された部分画像データを抽出元である静止画像データの上記第１の記録媒体上のアドレス情報と対応付けて第２のドライブを用いて第２の記録媒体に記録させる記録手段とを有することを特徴としている。
【０１５１】
これによれば、第２の記録媒体には、人物を含む領域を抽出してなる部分画像データが記録されデータベース化される。このような部分画像データは、それ自身を表示させることで、部分画像データ自体がシーン情報となる。また、その部分画像データから複数の人物情報が得られる場合も、その複数人の人物構成などが重要なシーン情報となる。つまり、これらの部分画像データは、そのままでもユーザにとってはシーン変化を検出するために得られる情報と相当のシーン情報となる。
【０１５２】
しかも、このようなシーン情報は、人物を含む領域を切り出した部分画像データであるので、既存の技術にてユーザの手を煩わせることなく自動にて取得して、データベースを構築することができる。
【０１５３】
したがって、これにより、人物画像主体の動画像データに適した、シーン情報をユーザの手を煩わせることなく自動に取得可能な、シーン情報のデータベース構築装置を提供することができるという効果を奏する。
【０１５４】
そしてまた、このようなデータベース構築装置を画像記録再生装置に搭載させることで、局所的なシーン変化なども容易に判別でき、人物の動きに沿った画像検索等も可能な、人物を主体とした動画像データの内容編集等を容易に行うことのできる画像記録再生装置を提供することができるという効果を奏する。
【０１５５】
また、本発明のデータベース構築装置においては、さらに、上記探索抽出手段は、部分画像データを抽出する際に、当該部分画像データ内に含まれる人物の数を表す数情報を含む付加情報を併せて取得し、上記記録手段は、部分画像データと共に対応する付加情報を、部分画像データに連なるツリー構造で関係付け得るように上記第２の記録媒体に記録することを特徴とすることもできる。
【０１５６】
これによれば、部分画像データ内に含まれる人物の数を表す数情報を含む付加情報が取得され、部分画像データと共に対応する付加情報が、部分画像データに連なるツリー構造で関係付け得るように第２の記録媒体に記録されるので、たとえ、人物を含む部分画像データの切り出しが、人物が二人重なっている画像として取得した場合でも、第２の記録媒体に記録されたデータ側では、二人の人物に分離して管理することができるという効果を併せて奏する。
【０１５７】
また、上記付加情報には、さらに、各人物の色的特徴を表す色情報、及び／又は、各人物の形状的特徴を表す形状情報を含めておくこともできる。
【０１５８】
色情報を利用すれば、例えば動画像全体の中で特定の人物が着ている衣装をキーに、特定のシーンを検索することが可能となる。また、別の例で、形状情報を利用すれば、その形状変化を捉えて、特定の人物の行動をキーに、特定のシーンを検索することが可能である。
【０１５９】
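Therefore, when partial image data obtained by extracting a region containing a person are used as scene information and moving image data are searched on that basis, the number of items usable as search keys increases, and a database that enables more effective searches can be constructed. This additional effect is also obtained.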
【０１６０】
また、本発明のデータベース構築装置においては、上記探索抽出手段は、静止画像データに含まれる全ての人物が何れかの部分画像データに含まれるように部分画像データの抽出を行うことが好ましい。
【０１６１】
これによれば、静止画像データに含まれる全人物が、独立して或いはほかの人物と共に、部分画像データとして取得されるので、シーン情報の内容としてより確度の高い情報となるという効果を併せて奏する。
【０１６２】
また、本発明のデータベース構築装置においては、上記探索抽出手段は、人物を探索する際の探索領域を適宜変更しながら行うことを特徴とすることができる。
【０１６３】
これによれば、人物を探索する際の探索領域を適宜変更しながら行うので、探索時に、人物情報の静止画像データに占める割合が任意に変化する動画像データなどに対しても、探索領域を画面最大範囲から段階的に縮小していくなど、探索領域を常に適切に定めていくことができ、探索をより効率良く行うことができる。また、最初の探索で複数の人物を含む領域として探索した際にも、より情報確度を上げるため、探索領域を変えて再度探索を行うといったことも容易に行えるという効果を併せて奏する。
【０１６４】
また、本発明のデータベース構築装置においては、上記探索抽出手段は、静止画像データを予め複数の領域に分割し、分割した領域を探索領域として探索することを特徴とすることもできる。
【０１６５】
これによれば、静止画像データを予め複数の領域に分割してから分割した各領域を探索領域として探索するので、切り出した静止画像データに予め多数の人物が記録されていると予想される動画像データの場合、一回の探索で見つけて切り出すところの人物が含まれる領域に存在する人物数を適宜少なくでき、個別人物情報の数を少なくできる。また、分割した各領域にて独立して探索処理を行うようにすれば、静止画像データからの人物の抽出をより短時間で行うことができるという効果を併せて奏する。
【０１６６】
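When still image data are divided in advance in this way, it is more preferable to provide division number determination means that judges the state of the number of persons contained in the still image data and determines the number of divisions of the still image data based on that judgment, and to configure the search and extraction means to divide the image data by the number of divisions so determined.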
【０１６７】
これにより、静止画像データの分割が、静止画像データに含まれる人物の数の状態に応じて行われるので、静止画像データの分割数が静止画像データに含まれる人物の数の状態にあったものとなり、分割数が固定されている構成よりも、個別人物情報の数をより的確にできるという効果を併せて奏する。
【０１６８】
また、本発明のデータベース構築装置では、上記第１のドライブにおける上記第１の記録媒体の動画像データの再生が初めてか否かを判定し、初めてである場合は、上記静止画像データ取得手段による静止画像データの取得を開始させる開始指示手段を備えている構成とすることもできる。
【０１６９】
これによれば、第１のドライブにおける上記第１の記録媒体の動画像データの再生が初めてである場合、開始指示手段にて、静止画像データ取得手段による静止画像データの取得が開始されるので、ユーザは第１のドライブで第１の記録媒体を再生させるだけで特別な指示を行うことなく、シーン情報のデータベースを取得することができるという効果を併せて奏する。
【０１７０】
本発明の画像記録再生装置は、上記課題を解決するために、第１の記録媒体に記録されている情報を再生する第１のドライブと、第２の記録媒体に情報を記録・再生する第２のドライブとを備えた画像記録再生装置において、上記請求項１〜９に記載のデータベース構築装置を備えたことを特徴としている。
【０１７１】
既にデータベース構築装置として説明したように、本発明のデータベース構築装置は、人物画像主体の動画像データに適した、シーン情報をユーザの手を煩わせることなく自動に取得し得るシーン情報のデータベース構築装置である。
【０１７２】
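Therefore, the image recording/reproducing apparatus of the present invention equipped with such a database construction device has the effect of being an excellent apparatus in which local scene changes and the like can be readily identified, image retrieval that follows a person's movement is possible, and the content of moving image data centered on people can be edited easily.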
【０１７３】
本発明の画像検索方法は、本発明の画像記録再生装置の画像検索方法であって、上記第２の記録媒体に記録されている部分画像データをシーン情報として用いて、上記第１の記録媒体に記録されている動画像データにおける任意のシーンの検索を行うことを特徴としている。
【０１７４】
上述したように、第２の記録媒体に記録される部分画像データには、その人物の行為など、そこには既にこれを抽出した静止画像データ、つまりその静止画像データを構成要素とする動画像のあるシーンを特徴付ける情報が含まれているので、部分画像データを表示させることで、部分画像データ自体がシーン情報となる。また、その部分画像データから複数の人物情報が得られる場合も、その複数人の人物構成などが重要なシーン情報となる。つまり、これらの部分画像データは、そのままでもユーザにとってはシーン変化を検出するために得られる情報と相当のシーン情報となる。
【０１７５】
したがって、画像に含まれる人物の動作まで含めた細かいレベルでのシーン情報を用いて検索するので、人物が中心の動画データ再生時の検索においてなど、より効率の良い画像検索が可能になるという効果を併せて奏する。
【０１７６】
また、本発明の画像検索装置は、上記画像記録再生装置に備えられる画像検索装置であって、上記第２ドライブを用いて上記第２の記録媒体に記録されている部分画像データ群を再生し、表示手段に表示させる部分画像データ表示手段と、ユーザからの表示されている部分画像データに対する選択を受けつける入力手段と、上記入力手段にて選択された部分画像データをもとに、第１の記録媒体に記録されている動画像データに対して、該部分画像データの抽出元となる静止画像データの検索を行う検索手段とからなることを特徴としている。
【０１７７】
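According to this, the partial image data display means reproduces the group of partial image data recorded on the second recording medium using the second drive and causes the display means to display them. When the user selects one of the displayed partial image data using the input means, the retrieval means searches the moving image data for the still image data from which the selected partial image data were extracted.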
【０１７８】
したがって、第２の記録媒体に記録されている人物の動作、表情などを直接に見て必要な画像データの検索（シーン検索）を行うことができるので、例えば動画像検索を、静止画像全体を表示して行う場合よりもきめ細かい検索が可能になる。例えば、元の静止画像内で人物が記録された領域の占める割合が全体の中で小さい場合にも、人物情報をキーにした画像検索ができる。つまり、画面内を絞って表示するので、ユーザにとってはより見やすく、かつ情報確度の高い検索が可能になるという効果を奏する。
【０１７９】
本発明の画像検索装置では、さらに、部分画像データ再生表示手段は、上記第１の記録媒体に記録されている動画像データの再生が指示された場合、動画像データの再生前に部分画像データの一部を表示することを特徴とすることもできる。
【０１８０】
これによれば、第１の記録媒体の動画像データの部分画像データ群が第２の記録媒体にある場合は、動画像データの再生を前にして部分画像データの一部が自動的に再生されるので、例えばユーザはその部分画像データをみて、見たいシーンのみ選択的に再生させるといった使い方が可能となるという効果を併せて奏する。
【０１８１】
また、本発明のデータベース構築プログラム及び記録媒体は、上記した本発明のデータベース構築装置における各手段としてコンピュータを機能させるプログラム及びそれを記録した記録媒体である。
【０１８２】
また、本発明の画像検索プログラム及び記録媒体は、上記した本発明の画像検索装置における各手段としてコンピュータを機能させるプログラム及びそれを記録した記録媒体である。
【０１８３】
これにより、上記したデータベース構築プログラム、或いは画像検索プログラムをコンピュータによって実行させれば、特定のデータベース構築装置、画像検索装置、画像記録再生装置ではなく、不特定の画像記録再生装置に対しても本発明のデータベース構築装置、画像検索装置、画像記録再生装置を実現させることが可能となるという効果を併せて奏する。
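[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of an image recording/reproducing apparatus according to one embodiment of the present invention.
[FIG. 2] FIGS. 2(a) and 2(b) are explanatory diagrams showing examples of scene information acquisition in the image recording/reproducing apparatus.
[FIG. 3] An explanatory diagram showing the additional-information structure used in person information retrieval in the image recording/reproducing apparatus.
[FIG. 4] An explanatory diagram showing a method of changing the search area during person search in the image recording/reproducing apparatus.
[FIG. 5] A flowchart showing the procedure for automatic acquisition of person information in the image recording/reproducing apparatus.
[FIG. 6] A block diagram showing the configuration of an image recording/reproducing apparatus according to another embodiment of the present invention.
[FIG. 7] A flowchart showing the procedure for automatic acquisition of person information in the image recording/reproducing apparatus.
[FIG. 8] A diagram showing the scene information input system of the invention disclosed in the prior-art publication.
[FIG. 9] A flow diagram showing the scene information input procedure of the invention disclosed in the prior-art publication.
[Description of reference numerals]
1 First recording medium
2 Second recording medium
3 Buffer memory (storage means)
4 Memory (still image data acquisition means, search and extraction means, recording means, start instruction means, partial image data display means, retrieval means)
5 Display means
6 CPU (still image data acquisition means, search and extraction means, recording means, start instruction means, partial image data display means, retrieval means)
11 First drive
12 Second drive
30 Buffer memory group (storage means)
[0001]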
[Technical field of the invention]
The present invention relates to a technique for easily searching for a specific scene in moving image data recorded on a recording medium, and more particularly to a database construction device, a database construction program, an image retrieval device, an image retrieval program, and an image recording/reproducing apparatus.
[0002]
[Prior art]
First, terms used in this specification are defined. A moving image is composed of a sequence of still images, and any sequence of still images forming part of the whole moving image is called a scene. Scene information is information that is distinct from the image data of the still images constituting a scene, is assigned to each individual scene, and can identify that scene. The image data of a moving image are referred to as moving image data and the image data of a still image as still image data; when there is no need to distinguish between the two, they are simply referred to as image data.
[0003]
Today, recording a moving image on a recording medium has become easy, and there is a demand for technology that allows recorded moving images to be edited efficiently. However, having the user play back the moving image and watch the screen to find a scene containing a specific object or event is tedious, and finding the target scene in a short time is difficult.
[0004]
Therefore, conventionally, scene-related information (hereinafter, scene information) is given to each scene constituting a moving image and stored in a database, and the moving image data is searched using the scene information.
[0005]
As scene information, Patent Document 1, for example, lists information such as the position in the moving image (e.g., start/end frame number, time code), the semantic content of a scene (e.g., keyword, attribute, representative frame), the relationship between scenes (e.g., identifier of a parent or child scene), and scene change information (e.g., position of a change point in the moving image, type of change, and certainty).
[0006]
FIG. 8 shows a configuration diagram of the scene information input system described in the document. In this system, the scene information editor 112 extracts still image data representing a plurality of scenes from the representative frame file 122 and displays those still images together with a time axis, arranged in chronological order along that axis. Then, when an instruction from the user is received through the input device 115, the moving image data for the period corresponding to the indicated part of the time axis is extracted from the LD 116 and displayed on the TV monitor 118.
[0007]
Further, the scene information editor 112 extracts scene information, which is information given to those scenes, from the scene information file 121 and simultaneously graphically displays the scene information on the screen of the display 114. When the user inputs an editing command through the input device 115, the editing command is executed and the scene information file 121 is edited.
[0008]
FIG. 9 shows the procedure for constructing scene information described in the document. It shows the flow of processing in which a moving image composed of a temporally continuous frame sequence is divided into scenes, which are partial frame sequences, the division into scenes is confirmed and corrected, scene information is input, and the input information is registered in a moving image database.
[0009]
Specifically, the procedure consists of steps a) through e): detecting scene change points; generating scene information based on the detected scene change points and storing it in a scene information file; displaying the scene information graphically on a display; editing the scene information file by executing an editing command that the user issues through the input means; and registering the scene information of the scene information file in the database means.
[0010]
[Patent Document 1]
JP-A-5-334374 (December 17, 1993)
[0011]
[Problems to be solved by the invention]
In retrieving moving images, building a database of scene information is an important core technology. However, the technology described in the above-mentioned document requires the user to input the scene information and construct the database, which remains a difficult task for the user.
[0012]
Therefore, a system that can automatically input scene information is required, but it is very difficult to automatically input scene information as complete information. This is because the content of the scene information is very diverse, such as the person, the names of all the objects existing there, the location of each component, and the brightness of each component, all of which are to be input.
[0013]
Although the technique of the above-mentioned document is an invention that helps in constructing a database of scene information, it is the same as the prior art cited in that document in that all information relating to the content of the image data must be handled as scene information, and automation is therefore difficult.
[0014]
Further, the technology of the above-mentioned document divides a moving image into scenes by detecting scene changes. However, it is difficult to detect a change in an image that changes only slightly, and when a change occurs locally (in part of the screen), a very sophisticated change determination function is required, one that includes searching for the place where the change occurs.
[0015]
For these reasons, the database (moving image database) constructed by the system of the above-mentioned document, although the content of its scene information is precise, cannot be an effective database for every kind of image.
[0016]
For example, in a moving image mainly composed of a person, when the user wishes to cut out a still image from the moving image like a snapshot, a plurality of scenes whose scene changes are local or small may all be retrieved at once, depending on the keyword chosen for the scene information search.
[0017]
The present invention has been made in view of the above problems. By further subdividing the still image data constituting moving image data and obtaining scene information as a person included in a still image, or as a combination of a person and additional information about that person, scene information can be obtained automatically, as image information including a person, for moving image data mainly composed of person images, without bothering the user. This facilitates the creation of a database of scene information. An object of the present invention is thus to provide a database construction device, a database construction program, an image retrieval device, an image retrieval program, and an image recording/reproducing apparatus that can easily determine local scene changes and the like and that enable image retrieval and the like in accordance with the movement of a person.
[0018]
[Means for Solving the Problems]
In order to solve the above-mentioned problems, a database construction apparatus of the present invention includes: a still image data acquisition unit that acquires still image data from moving image data of a first recording medium reproduced by a first drive; a storage unit that stores the still image data acquired by the still image data acquisition unit; a search and extraction unit that searches for a person included in the still image data stored in the storage unit and, when a person is included, extracts a predetermined area including the person as partial image data; and a recording unit that, when partial image data has been extracted by the search and extraction unit, records the extracted partial image data on a second recording medium using a second drive, in association with the address information on the first recording medium of the still image data from which it was extracted.
[0019]
According to this, the still image data acquisition unit acquires still image data from the moving image data of the first recording medium reproduced by the first drive. The acquired still image data is stored in the storage unit, and the search and extraction unit searches for a person included in the stored still image data and, when a person is included, extracts a predetermined area including the person as partial image data. The recording unit then records the extracted partial image data on the second recording medium using the second drive, in association with the address information on the first recording medium of the still image data from which it was extracted.
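The flow described above (acquire a still image, buffer it, search for persons, extract the regions, and record them together with the source address) can be sketched as follows. This is only an illustrative sketch: the names `PartialImageRecord`, `crop`, and `build_database`, the toy pixel-grid frame type, and the caller-supplied `find_person_regions` detector are all assumptions introduced here, and the returned list merely stands in for the second recording medium.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive
Frame = List[List[int]]           # toy still image: rows of pixel values


@dataclass
class PartialImageRecord:
    address: int   # address of the source still image on the first medium
    region: Box    # area that was cut out
    pixels: Frame  # the extracted partial image data


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the given area out of a still image."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def build_database(
    frames: Iterable[Tuple[int, Frame]],
    find_person_regions: Callable[[Frame], List[Box]],
) -> List[PartialImageRecord]:
    """Return one record per person region, keyed by the source address."""
    database: List[PartialImageRecord] = []
    for address, frame in frames:
        buffered = frame  # stands in for the buffer memory (storage unit)
        for box in find_person_regions(buffered):
            database.append(PartialImageRecord(address, box, crop(buffered, box)))
    return database
```

The detector is passed in as a function because the text leaves the person search to existing image processing technology; any detector that returns bounding boxes could be plugged in.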
[0020]
The partial image data recorded on the second recording medium contains information, such as the actions of the person, that characterizes the still image data from which it was extracted, that is, a scene of the moving image having that still image data as a component. Displaying the partial image data therefore makes the partial image data itself serve as scene information. When information on a plurality of persons can be obtained from the partial image data, the composition of those persons is also important scene information. In other words, for the user, the partial image data directly serves as scene information equivalent to the information obtained by detecting scene changes.
[0021]
Since such scene information is partial image data obtained by cutting out a region including a person, it can be acquired automatically with existing technology, without manual work by the user, and the database can be constructed from it.
[0022]
Accordingly, it is possible to provide a scene information database construction apparatus suitable for moving image data mainly composed of a person image and capable of automatically acquiring scene information without bothering the user.
[0023]
By mounting such a database construction apparatus in an image recording/reproducing apparatus, it is possible to provide an image recording/reproducing apparatus that can easily determine a local scene change or the like, can perform an image search or the like according to the movement of a person, and allows the contents of moving image data and the like to be edited easily.
[0024]
Further, the database construction device of the present invention may be characterized in that, when extracting the partial image data, the search and extraction means also acquires additional information including number information indicating the number of persons included in the partial image data, and in that the recording means records the additional information on the second recording medium together with the partial image data so that the additional information can be associated with the corresponding partial image data in a tree structure.
[0025]
According to this, additional information including number information indicating the number of persons included in the partial image data is acquired, and the additional information is recorded on the second recording medium together with the partial image data so that it can be associated with the partial image data in a tree structure. Therefore, even if partial image data including a person is obtained as an image in which two persons overlap, the data recorded on the second recording medium can be managed separately for the two persons.
[0026]
Further, the additional information may further include color information indicating the color characteristics of each person and / or shape information indicating the shape characteristics of each person.
[0027]
If color information is used, it is possible to search for a specific scene using, for example, a costume worn by a specific person in the entire moving image as a key. In another example, if shape information is used, a change in the shape can be captured, and a specific scene can be searched using the behavior of a specific person as a key.
[0028]
Therefore, when the moving image data is searched based on partial image data obtained by extracting a region including a person, the number of search keys increases, and a database that allows more effective searching can be constructed.
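One way to picture the tree-structured association of additional information with partial image data is a parent node per partial image with one child per person. The sketch below is an assumption-laden illustration: the class names `PartialImageNode` and `PersonInfo`, the string-valued color and shape fields, and the helper `find_by_color` are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonInfo:
    color: Optional[str] = None  # color feature of the person, e.g. costume color
    shape: Optional[str] = None  # shape feature of the person, e.g. posture


@dataclass
class PartialImageNode:
    address: int    # address of the source still image on the first medium
    image_id: str   # identifier of the partial image data
    persons: List[PersonInfo] = field(default_factory=list)  # child nodes

    @property
    def person_count(self) -> int:
        """Number information: how many persons this partial image contains."""
        return len(self.persons)


def find_by_color(nodes: List[PartialImageNode], color: str) -> List[PartialImageNode]:
    """Return the partial images in which at least one person has the given color."""
    return [n for n in nodes if any(p.color == color for p in n.persons)]
```

Keeping each person as a separate child is what lets an image of two overlapping persons be managed per person, and lets color or shape act as an extra search key.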
[0029]
Further, in the database construction apparatus of the present invention, it is preferable that the search and extraction unit extracts the partial image data so that all the persons included in the still image data are included in any of the partial image data.
[0030]
According to this, all the persons included in the still image data are acquired as partial image data independently or together with other persons, so that the content of the scene information becomes more accurate information.
[0031]
Further, in the database construction device of the present invention, the search and extraction means may be characterized in that the search and extraction are performed while appropriately changing a search area when searching for a person.
[0032]
According to this, since the search area used when searching for a person is changed as appropriate, an appropriate search area can always be determined, for example by gradually reducing it from the maximum range of the screen, even for moving image data in which the proportion of the still image data occupied by person information varies arbitrarily, and the search can be performed more efficiently. In addition, even when the first search finds an area including a plurality of persons, another search can easily be performed with a changed search area in order to further increase the accuracy of the information.
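The idea of starting from the full screen and shrinking the search area until individual persons are isolated can be sketched as below. The halving factor, the non-overlapping tiling, the minimum window side, and the caller-supplied `count_persons` function are all assumptions made for illustration; the source does not specify how the area is changed.

```python
from typing import Callable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def window_sizes(width: int, height: int, shrink: float = 0.5,
                 min_side: int = 8) -> Iterator[Tuple[int, int]]:
    """Yield window sizes from the full frame downward."""
    if not 0 < shrink < 1:
        raise ValueError("shrink must be between 0 and 1")
    w, h = width, height
    while w >= min_side and h >= min_side:
        yield w, h
        w, h = int(w * shrink), int(h * shrink)


def windows(width: int, height: int, w: int, h: int) -> Iterator[Box]:
    """Tile the frame with non-overlapping windows of size w x h."""
    for top in range(0, height - h + 1, h):
        for left in range(0, width - w + 1, w):
            yield (left, top, left + w, top + h)


def search_single_person_regions(
    width: int, height: int, count_persons: Callable[[Box], int],
    shrink: float = 0.5, min_side: int = 8,
) -> List[Box]:
    """Shrink the search area until some window holds exactly one person."""
    for w, h in window_sizes(width, height, shrink, min_side):
        hits = [b for b in windows(width, height, w, h) if count_persons(b) == 1]
        if hits:
            return hits
    return []
```

Stopping at the first window size that isolates a single person is a simplification; a fuller version might keep searching the remaining areas at smaller sizes.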
[0033]
In the database construction device of the present invention, the search and extraction means may be characterized in that the still image data is divided into a plurality of regions in advance, and the divided regions are searched as search regions.
[0034]
According to this, the still image data is divided into a plurality of regions in advance, and each of the divided regions is searched as a search region. For image data containing many persons, the number of persons present in an area found and cut out by a single search can therefore be kept appropriately small, and the number of pieces of person information per area can be reduced. In addition, if the search processing is performed independently in each of the divided regions, persons can be extracted from the still image data in a shorter time.
[0035]
Further, when the still image data is divided in advance in this manner, it is more preferable that the apparatus further include a division number determining unit that determines the number of persons included in the still image data and determines the number of divisions of the still image data based on that determination, and that the search and extraction means divide the image data by the number of divisions so determined.
[0036]
Since the still image data is divided according to the number of persons it contains, the number of divisions matches the number of persons included in the still image data. The individual person information can thus be made more accurate than in a configuration in which the number of divisions is fixed.
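A minimal sketch of dividing a still image by the estimated number of persons follows. Splitting into equal vertical strips is an assumption chosen only because it yields exactly one region per person; the source does not say how the regions are laid out, and the function name `divide_frame` is hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def divide_frame(width: int, height: int, person_count: int) -> List[Box]:
    """Divide the frame into as many vertical strips as there are persons.

    A frame with no detected persons is kept as a single region.
    """
    n = max(1, person_count)
    return [
        (i * width // n, 0, (i + 1) * width // n, height)
        for i in range(n)
    ]
```

Each returned region could then be searched independently, which is where the shorter extraction time mentioned above would come from.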
[0037]
Further, the database construction apparatus of the present invention may include start instruction means that determines whether the reproduction of the moving image data of the first recording medium in the first drive is the first reproduction and, if it is, causes the still image data acquisition means to start acquiring still image data.
[0038]
According to this, when the moving image data of the first recording medium is reproduced in the first drive for the first time, the start instruction means causes the still image data acquisition means to start acquiring still image data. The user can therefore obtain the scene information database simply by playing the first recording medium on the first drive, without giving any special instruction.
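The first-playback trigger can be illustrated with a small sketch. How the apparatus actually recognizes a first reproduction is not stated in this passage, so the in-memory set of already-seen medium identifiers and the names `StartInstruction` and `should_start_acquisition` are purely hypothetical.

```python
from typing import Set


class StartInstruction:
    """Decide whether still image acquisition should start for a medium."""

    def __init__(self) -> None:
        # Assumption: media are told apart by some identifier string.
        self._seen: Set[str] = set()

    def should_start_acquisition(self, medium_id: str) -> bool:
        """Return True only the first time a given medium is played."""
        first_time = medium_id not in self._seen
        self._seen.add(medium_id)
        return first_time
```

In a real apparatus the record of earlier playback would presumably have to persist across power cycles, for example by checking whether partial image data for the medium already exists on the second recording medium.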
[0039]
In order to solve the above problems, an image recording/reproducing apparatus according to the present invention includes a first drive for reproducing information recorded on a first recording medium, a second drive for recording/reproducing information on a second recording medium, and the database construction apparatus according to any of the first to ninth aspects.
[0040]
As already described as the database construction apparatus, the database construction apparatus of the present invention is a database construction of scene information suitable for moving image data mainly composed of human images and capable of automatically acquiring scene information without bothering the user. Device.
[0041]
Therefore, the image recording/reproducing apparatus of the present invention equipped with such a database construction apparatus is an excellent image recording/reproducing apparatus that can easily determine a local scene change or the like, can perform an image search or the like in accordance with the movement of a person, and allows the contents of the moving image data to be edited easily.
[0042]
The image search method of the present invention is an image search method for the image recording/reproducing apparatus of the present invention, in which the partial image data recorded on the second recording medium is used as scene information to search for an arbitrary scene in the moving image data recorded on the first recording medium.
[0043]
As described above, the partial image data recorded on the second recording medium contains information, such as the actions of the person, that characterizes the still image data from which it was extracted, that is, a scene of the moving image having that still image data as a component. Displaying the partial image data therefore makes the partial image data itself serve as scene information. When information on a plurality of persons can be obtained from the partial image data, the composition of those persons is also important scene information. In other words, for the user, the partial image data directly serves as scene information equivalent to the information obtained by detecting scene changes.
[0044]
Therefore, since the search is performed using the scene information at a fine level including the motion of the person included in the image, a more efficient image search can be performed, for example, in a search at the time of reproducing moving image data centered on a person.
[0045]
The image search device of the present invention is an image search device provided in the image recording/reproducing apparatus, and includes: partial image data display means that reproduces a partial image data group recorded on the second recording medium using the second drive and displays it on the display means; input means that receives the user's selection from the displayed partial image data; and search means that, based on the partial image data selected through the input means, searches the moving image data recorded on the first recording medium for the still image data from which that partial image data was extracted.
[0046]
According to this, the partial image data display means reproduces the partial image data group recorded on the second recording medium using the second drive and causes the display means to display it. When the user selects one of the displayed pieces of partial image data using the input means, the search means searches the moving image data for the still image data from which the selected partial image data was extracted.
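The display, select, and look-up sequence described above amounts to mapping a chosen partial image back to the address of its source still image. The following is a rough sketch under assumed names (`SceneEntry`, `list_thumbnails`, `find_source_address`); an actual device would then seek the first drive to the returned address.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SceneEntry:
    thumbnail_id: str  # identifies one piece of partial image data
    address: int       # address of its source still image on the first medium


def list_thumbnails(entries: List[SceneEntry]) -> List[str]:
    """What the display means would show: every stored partial image."""
    return [e.thumbnail_id for e in entries]


def find_source_address(entries: List[SceneEntry], selected_id: str) -> Optional[int]:
    """Return the first-medium address for the partial image the user picked."""
    for entry in entries:
        if entry.thumbnail_id == selected_id:
            return entry.address
    return None  # the selection does not match any stored partial image
```

For example, with entries for `"p1"` at address 120 and `"p2"` at address 480, selecting `"p2"` yields 480 as the playback position.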
[0047]
Therefore, the user can search for the necessary image data (scene search) by directly viewing the motion, facial expression, and the like of the persons recorded on the second recording medium, and a more detailed search can be performed than when entire still images are displayed. For example, even when the area occupied by the person is a small proportion of the original still image, an image search using the person information as a key can be performed. In other words, since the display on the screen is narrowed down to the person, the search is easier for the user to view and has high information accuracy.
[0048]
In the image search device of the present invention, when reproduction of the moving image data recorded on the first recording medium is instructed, the partial image data display means may reproduce and display the partial image data before the moving image data is reproduced.
[0049]
According to this, when the partial image data group of the moving image data of the first recording medium is present on the second recording medium, a part of the partial image data is automatically reproduced before the reproduction of the moving image data. Therefore, for example, it is possible for the user to view the partial image data and selectively reproduce only a desired scene.
[0050]
Further, a database construction program and a recording medium of the present invention are a program for causing a computer to function as each means in the above-described database construction apparatus of the present invention, and a recording medium recording the program.
[0051]
Further, an image search program and a recording medium according to the present invention are a program for causing a computer to function as each unit in the above-described image search apparatus according to the present invention, and a recording medium storing the program.
[0052]
Thus, if the above-described database construction program or image search program is executed by a computer, the database construction device, image search device, and image recording/reproducing apparatus of the present invention can be realized not only on a specific database construction device, image search device, or image recording/reproducing apparatus but also on an unspecified image recording/reproducing apparatus.
[0053]
Further, the present invention can also be expressed as follows. That is, the image recording/reproducing apparatus of the present invention includes a first drive capable of recording/reproducing a first recording medium, a second drive capable of recording/reproducing a second recording medium, a buffer memory for temporarily storing image data reproduced from the first drive, a memory for storing a program for image processing, and a display device for displaying at least one of the data reproduced by the first drive and the data reproduced by the second drive, and is capable of performing at least one of recording and reproduction on the first recording medium and the second recording medium at the same time. The apparatus is characterized in that reproduced image data from the first recording medium is stored in at least one buffer memory, the image data stored in the buffer memory is searched for an area including a person, and, when at least one area including a person is found, the area is extracted as at least one piece of partial image data, and the partial image data is linked with the address information on the first recording medium of the original image data from which it was extracted and is recorded on the second recording medium.
[0054]
Here, for the partial image data obtained by searching the image data of the first recording medium for an area including a person, it is preferable that a plurality of pieces of information regarding that image data, including at least the three pieces of information obtained at the time of the search (the number of persons, color information, and shape information), be recorded on the second recording medium so as to be associated with each piece of partial image data in a tree structure.
[0055]
The image recording/reproducing apparatus may also be characterized in that it can independently execute the operation of storing the reproduced image data from the first recording medium in at least one buffer memory, the operation of searching the image data stored in the buffer memory for an area including a person and, when at least one such area is found, extracting the area as at least one piece of partial image data, and the operation of linking the partial image data with the address information on the first recording medium of the original image data from which it was extracted and recording it on the second recording medium, and in that it searches for an area including a person while sequentially changing the search area within the image data.
[0056]
The image recording/reproducing apparatus may also be characterized in that it can independently execute the operation of storing the reproduced image data from the first recording medium in at least one buffer memory, the operation of searching the image data stored in the buffer memory for an area including a person and, when at least one such area is found, extracting the area as at least one piece of partial image data, and the operation of linking the partial image data with the address information on the first recording medium of the original image data from which it was extracted and recording it on the second recording medium, and in that it divides the image data reproduced from the first recording medium into a plurality of regions in advance and then searches for an area including a person.
[0057]
Further, the image retrieval system in the image recording/reproducing apparatus according to the present invention is characterized in that partial image data including a person recorded on the second recording medium is used as scene information of the image data on the first recording medium.
[0058]
Further, the image retrieval system in the image recording/reproducing apparatus of the present invention is characterized in that a part of the partial image data group, which was automatically recorded on the second recording medium when the image data of the first recording medium was reproduced, is reproduced from the second recording medium and displayed before the second and subsequent reproductions of the first recording medium, and in that a selected piece of partial image data is used as a search key for the image data recorded on the first recording medium.
[0059]
BEST MODE FOR CARRYING OUT THE INVENTION
[Embodiment 1]
An embodiment of the present invention will be described below with reference to FIGS.
[0060]
FIG. 1 is a configuration diagram of an image recording / reproducing apparatus of the present embodiment including a database construction apparatus and an image search apparatus of the present invention.
[0061]
The image recording/reproducing apparatus includes at least two drives, namely a first drive 11 and a second drive 12, as well as a buffer memory 3, a display device 5, a CPU 6, and a memory 4.
[0062]
The first drive 11 is a device capable of recording and reproducing the first recording medium 1, and the second drive 12 is a device capable of recording and reproducing the second recording medium 2.
[0063]
The buffer memory (storage means) 3 temporarily stores still image data sampled at a predetermined timing from the moving image data (which consists of a sequence of still image data) reproduced by the first drive 11. The sampled still image data is further subdivided as necessary and transferred to the second drive 12, as described later.
[0064]
The display device (display means) 5 displays the image data reproduced by the first drive 11 and the image data reproduced by the second drive 12. The display of the reproduced image data is performed independently, and the display device 5 displays one of the image data reproduced by the first and second drives 11 and 12.
[0065]
The memory 4 is composed of a hard disk or the like, and stores various application programs including a program for image processing.
[0066]
The CPU 6 includes a RAM (not shown) as a work area and is the control center that controls the various operations of the first and second drives 11 and 12, the buffer memory 3, and the display device 5. It also reads the application programs from the memory 4 and executes them, thereby embodying the database construction device and the image search device of the present invention. That is, the CPU 6 and the memory 4 together provide the functions of the still image data acquisition unit, the search and extraction unit, the recording unit, the start instruction unit, the partial image data display unit, the search unit, and the like.
[0067]
Further, the first drive 11 and the second drive 12 can simultaneously perform at least one of recording and reproduction on their corresponding recording media, the first recording medium 1 and the second recording medium 2.
[0068]
The first and second recording media are not particularly limited, but in the present embodiment, as an example, a DVR disc (large-capacity phase-change optical disc) is used as the first recording medium 1, and a DVD-RW is used as the second recording medium 2.
[0069]
When reproducing the moving image data of the first recording medium 1 with the first drive 11, the image recording/reproducing apparatus samples the still image data, which is a component of the moving image data, at a predetermined timing, extracts partial image data, which is a part of the still image data, and records the extracted partial image data on the second recording medium 2 using the second drive 12.
[0070]
Further, in the present image recording/reproducing apparatus, when it becomes necessary to search for a specific scene or image included in the moving image data, for example in order to edit the contents of the moving image data recorded on the first recording medium 1, the partial image data group recorded on the second recording medium 2 is used as information for the image search.
[0071]
Information used to search for a scene in moving image data is referred to as scene information. In the present invention, partial image data which is a part of still image data constituting moving image data is set as scene information. More specifically, partial image data including a person and surrounding images in the still image data is set as scene information.
[0072]
In the above-described conventional configuration, the scene information is configured to include information such as the position in the moving image, the meaning of the scene, the relationship between the scenes, and the information on the scene change. Therefore, it was difficult to automatically acquire scene information as complete information. However, if it is scene information including a person and surrounding images in still image data constituting moving image data, it is possible to automatically acquire the scene information using existing technology.
[0073]
Further, by narrowing down the scene information to a person and its peripheral image, search information in moving image data mainly composed of a person can be made more specific, including the motion of the person.
[0074]
FIGS. 2(a) and 2(b) show examples. When person information is extracted, not only the person itself but also a certain amount of the surrounding image is extracted, as indicated by the broken lines.
[0075]
Further, in order to obtain more accurate information as the content of the scene information, it is desirable that there be a plurality of partial image data so that all the persons included in the still image data are included in any of the partial image data. That is, it is desirable that the entire partial image data acquired from the still image data covers all the persons appearing in the still image data.
[0076]
Accordingly, when there is no overlap between persons in the image, as shown in FIG. 2(a), acquiring partial image data is relatively easy: for example, starting from a large search area, the person search succeeds once a single person falls within the search area. On the other hand, when persons overlap in the image, as shown in FIG. 2(b), the shape and size of the search area are changed. For example, when the search area is large, partial image data containing the two overlapping persons is extracted and acquired as information on both of them. When the search area is small, partial image data including one person can be extracted and acquired.
[0077]
Whether each of the images contained in the still image data is a person can be determined, for example, by treating person information as a combination of basic frames, as described below.
[0078]
Here, the basic frames include a face frame and body part frames (such as a hand, a torso, and a foot). Because the planar form of a face changes with the observation direction, a plurality of face frames corresponding to these changes in form are stored in the memory 4 as determination reference information. The number N of face frame patterns to be prepared as determination reference information varies depending on system conditions. However, when the desired angular resolution in the observation direction is $(\Delta\phi, \Delta\theta)$, where $\Delta\phi$ is the tilt step and $\Delta\theta$ is the rotation step, $N = (180/\Delta\phi) \times (360/\Delta\theta)$. As its information form, a face frame represents its constituent elements (eyes, nose, mouth, etc.) as a plurality of pieces of line information.
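As a worked illustration of the pattern count above, the following minimal sketch evaluates N for a given angular resolution. The function name is hypothetical, and rounding up when a step does not divide the range evenly is an assumption that the text does not state.

```python
import math


def face_frame_count(delta_phi_deg: float, delta_theta_deg: float) -> int:
    """Number N of face frame patterns, N = (180 / d_phi) x (360 / d_theta).

    delta_phi_deg:   tilt step in degrees
    delta_theta_deg: rotation step in degrees
    Rounding each factor up is an assumption made for this sketch.
    """
    if delta_phi_deg <= 0 or delta_theta_deg <= 0:
        raise ValueError("angular resolutions must be positive")
    return math.ceil(180 / delta_phi_deg) * math.ceil(360 / delta_theta_deg)


# A 30-degree tilt step and a 45-degree rotation step give 6 x 8 patterns.
n_patterns = face_frame_count(30, 45)
```

A finer angular resolution increases N quickly, which is why the number of patterns depends on system conditions such as available memory.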
[0079]
To determine whether an image is a person, image processing is first performed to form a contour image of the still image data (an image in which the picture has been reduced to several pieces of line information, formed using existing technology). Then, each of the contour groups included in the contour image is compared with the contour models that form the N face frames serving as the determination reference, with respect to the arrangement of the components of the face frame, to check whether any of the N face frames has a close contour. This is the first stage of determining whether the image is a person.
[0080]
Next, based on the result of the first stage, a final determination is made according to whether a body part frame exists near the face frame. The body part frames used for determination include an arm frame, a torso frame, and the like. For example, when a plurality of arm frames are used for determination, it is determined whether the contour model of an arm frame lies within a predetermined distance from the face frame, and if so, the image information is judged to be a person. Whether an image is a person can be determined by such a two-stage determination method.
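The two-stage determination can be sketched as follows. This is only an illustration under stated assumptions: the `Frame` type, the use of frame centre coordinates, and the Euclidean distance test are hypothetical, and the contour matching that produces the frames is taken as already done.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    kind: str  # "face", "arm", "torso", ... (hypothetical labels)
    x: float   # centre position of the matched contour model
    y: float


def is_person(face: Optional[Frame], body_parts: List[Frame],
              max_distance: float) -> bool:
    """Two-stage check: a face frame match, then a nearby body part frame."""
    # Stage 1: some contour group matched one of the N face frames.
    if face is None or face.kind != "face":
        return False
    # Stage 2: at least one body part frame lies within the predetermined
    # distance from the face frame (Euclidean distance is an assumption).
    return any(
        math.hypot(part.x - face.x, part.y - face.y) <= max_distance
        for part in body_parts
    )
```

For example, `is_person(Frame("face", 0, 0), [Frame("arm", 3, 4)], 5)` accepts the candidate because the arm frame lies exactly at the distance limit.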
[0081]
Such a method of determining a person is described in, for example, “Extraction of a part of a moving image using a Labeled Graph Matching with a human head and a change in facial expression”, IEICE Transactions Vol. J85-D-II No. 11 pp. 1656-1663, November 2002, and the like.
[0082]
Further, when extracting these partial image data as image data information including a person, it is preferable to acquire information including at least the number of people as additional information regarding the partial image data. The number of people information is information indicating the number of persons included in the partial image data. More preferably, the additional information includes one or both (more preferable) of color information characterizing the color of each person or shape information characterizing the shape of each person.
[0083]
By acquiring the additional information together with the partial image data in this manner, even if persons overlap in the image and a plurality of persons are included in one extracted partial image data, the additional data (additional information) obtained at the same time, which is managed in a tree structure as shown in FIG. 3, allows the partial image data to be used as information on each of the plurality of persons it contains.
[0084]
In more detail, the color information and the shape information are used to further characterize each person in the captured partial image data after it has been determined which patterns in the face frame group and the body frame group (face frames, body part frames, and so on) the person matches. For example, color information represents the overall color tone of a frame (such as a face frame or body part frame) that was matched in the determination, and shape information represents the exact shape of the matched frame.
[0085]
Further, the additional information may include motion information. Motion information indicates that a frame (face frame, body part frame, etc.) matched in the determination has, at the next still image data capture, come to match another frame in the same frame group or in a different frame group. Furthermore, by treating the components within the face frame (eyes, nose, mouth, etc.) as separate frame groups, changes in facial expression can also be handled.
[0086]
In an image in which a plurality of people overlap, a plurality of face frames and body part frames used as determination references for a person appear in incomplete form and close to one another. Since the color information and shape information of each face frame and body part frame are unique to the person, information on each of the plurality of persons can be obtained from the color and shape information of each frame. Further, the information characterizing the partial image data can be managed as an information amount of <number of people> × <number of person determination frames> × <shape information + color information>, which makes the image information easy to characterize. Also, by managing the information in a tree structure, the search can be performed easily.
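One possible way to hold the additional information in a tree is sketched below. The class and field names are hypothetical and are not taken from FIG. 3. The sketch only illustrates the nesting partial image → persons → frames → (color, shape) and the resulting information amount.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrameInfo:
    kind: str   # e.g. "face", "arm", "torso"
    color: str  # overall color tone of the matched frame
    shape: str  # shape descriptor of the matched frame


@dataclass
class PersonInfo:
    frames: List[FrameInfo] = field(default_factory=list)


@dataclass
class PartialImageInfo:
    address: int  # address of the source still image on the first medium
    persons: List[PersonInfo] = field(default_factory=list)

    @property
    def number_of_people(self) -> int:
        return len(self.persons)

    def information_amount(self) -> int:
        # <people> x <frames per person> x <shape + color>: two items per frame.
        return sum(2 * len(person.frames) for person in self.persons)
```

With two persons and three matched frames each, the tree holds 2 × 3 × 2 = 12 items of characterizing information.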
[0087]
Such color information and shape information can be acquired as follows. The contour models determined to be close to a face frame or body frame in the person determination are held as information in a temporary storage device (not shown) dedicated to contour models. The captured image data from which each contour model originated is then referred to, and for the part corresponding to the position of the contour model, information such as the color and shape in the original image is taken into a temporary storage device (not shown) dedicated to color information and a temporary storage device (not shown) dedicated to shape information.
[0088]
Also, the number-of-persons information indicates how many pieces of person information judged to be a person exist in the extracted partial image data. It can therefore be acquired by counting the contour models acquired within the same partial image data and storing the count in a temporary storage device (not shown) dedicated to the number-of-persons information.
[0089]
Further, when searching for an area including a person from the still image data, it is preferable to appropriately change the search area, which is a unit of search, within a range equal to or less than the maximum size of the still image data. Changing the search area means changing the shape (usually a rectangle) and size of the search area. By appropriately switching the search area, the search area can be formed in an appropriate shape suitable for the human image, and highly accurate information about the human can be obtained in a short time.
[0090]
One method of setting the search area uses the face frame. If the size of the same person changes between the still image data obtained at the previous timing and that obtained at the next timing, the change in the size of the face frame is detected and the search area is reduced according to the rate of change. For example, if the face frame shrinks to 70% of its original size, the search area is also reduced to 70% of its original size.
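The proportional adjustment can be illustrated with a short sketch. The function name and the use of a single linear size for the face frame are assumptions, and the result is rounded to whole pixels for convenience.

```python
def rescale_search_area(width: int, height: int,
                        prev_face_size: float, curr_face_size: float):
    """Scale the search area by the rate of change of the face frame size."""
    if prev_face_size <= 0:
        raise ValueError("previous face frame size must be positive")
    rate = curr_face_size / prev_face_size
    return round(width * rate), round(height * rate)


# A face frame that shrinks from 50 to 35 pixels (70%) shrinks a
# 200 x 100 search area to 140 x 70.
new_area = rescale_search_area(200, 100, 50, 35)
```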
[0091]
When an area including a plurality of persons is searched and there are, for example, five face frames, the search area is reset so that the number of face frames becomes three. This increases the amount of information on each individual included and improves accuracy.
[0092]
In order to reduce the search area based on the face frame, it is effective to use the place where the face frame exists as the base point of the diagonal element of the search area.
[0093]
Specifically, as shown in FIG. 4, let the sizes of the two face frames be $x_1 \times y_1$ and $x_2 \times y_2$, and let the distances between the face frames be $X$ and $Y$. Adding margins that are predetermined relative to the size of each original face frame ($m_1$, $m_2$, $n_1$, and $n_2$ are arbitrary numbers) so that the searched portion is sufficiently covered, the search area size is $\{(m_1+1)\,x_1 + X + (m_2+1)\,x_2\} \times \{(n_1+1)\,y_1 + Y + (n_2+1)\,y_2\}$.
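The search area size given above can be evaluated directly. The sketch below simply computes the two factors of the expression. The function name is hypothetical and the numeric values in the example are arbitrary.

```python
def search_area_size(x1, y1, x2, y2, X, Y, m1, m2, n1, n2):
    """Width and height of a search area spanning two face frames.

    x1, y1 / x2, y2: sizes of the two face frames
    X, Y:            horizontal and vertical distances between the frames
    m1, m2, n1, n2:  margin factors relative to each face frame size
    """
    width = (m1 + 1) * x1 + X + (m2 + 1) * x2
    height = (n1 + 1) * y1 + Y + (n2 + 1) * y2
    return width, height


# Face frames of 10 x 12 and 8 x 10, 30 apart horizontally and 5 vertically,
# with margin factors m1 = m2 = 1 and n1 = n2 = 2.
size = search_area_size(10, 12, 8, 10, 30, 5, 1, 1, 2, 2)
```

In this example the width is 2·10 + 30 + 2·8 = 66 and the height is 3·12 + 5 + 3·10 = 71.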
[0094]
By acquiring the scene information based on partial image data including a person as described above, information can be managed with a focus on what those persons do, without the concept of judging scenes to be different from various scene information and managing the changes in a hierarchical structure.
[0095]
Next, an operation of extracting partial image data including a person will be described with reference to the flowchart of FIG.
[0096]
FIG. 5 shows a flow up to the acquisition of partial image data in a case where a region including a person in the still image data is extracted as partial image data while the search area for searching for a person is appropriately changed.
[0097]
When a start signal is detected by the first drive 11 during reproduction of a moving image on the first recording medium 1, timing information for acquiring one frame of moving image data reproduced by the first drive 11 is generated. Then, acquisition of still image data is started (S1).
[0098]
When the timing information for acquiring the moving image data is given, the still image data, which is one frame of the moving image data, is stored in the buffer memory 3 together with its address data (its address on the first recording medium 1) in synchronization with the timing information (S2, S3).
[0099]
Next, it is determined whether the process is immediately after the start (S4). If it is not, it is checked whether a trigger of an image data transfer request has been detected (S5). If it is not immediately after the start and no trigger is detected, S1 to S5 are repeated to continue storing images.
[0100]
On the other hand, if it is immediately after the start, or if the trigger of the image data transfer request is detected, the still image data fetched into the buffer memory 3 is sent to the person information detection system (S7). The processing so far is the processing of the image storage system for storing still image data.
[0101]
Hereinafter, the person information detection system will be described. First, image processing is performed on the transferred still image data to form the above-described outline image (S8). Next, while reviewing the search area, an area including a person is searched (S9).
[0102]
In S9, the search area is shifted horizontally across the contour image obtained from the still image data in steps of a predetermined number of pixels (determined according to the resolution of the contour image). Each line segment read during the search is treated as a vector, and its direction and length are detected; the search proceeds while this direction and length information is acquired. When one horizontal scan is completed, the search area is shifted vertically by a predetermined number of pixels and the horizontal scan starts again. This search is continued over the entire contour image data until information that can be judged to be a person is obtained. Information on the presence or absence of line segments and, where a line segment exists, its direction and length is stored in the temporary storage device as contour image information.
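The scanning order in S9 can be sketched as a generator of search area positions. This is a simplified illustration: the function name is hypothetical, the same step is assumed for both directions, and the line segment analysis performed at each position is omitted.

```python
def scan_positions(img_w: int, img_h: int,
                   area_w: int, area_h: int, step: int):
    """Yield the (left, top) corner of the search area in raster order.

    The area moves horizontally by `step` pixels; when a row is finished it
    moves down by `step` pixels and the horizontal scan starts again.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    for top in range(0, img_h - area_h + 1, step):
        for left in range(0, img_w - area_w + 1, step):
            yield left, top


# A 2 x 2 search area over a 4 x 4 contour image with a step of 2 pixels.
positions = list(scan_positions(4, 4, 2, 2, 2))
```

In practice the loop would stop as soon as the segments read at some position match a contour model, rather than always visiting every position.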
[0103]
Because this search yields quantitative information on the line segments in the contour image, the similarity to the contour model of each frame used for person determination, such as a face frame or body part frame, can be compared numerically. If the line segments match the contour model of a frame, it is determined that a person is included (S10).
[0104]
If it is determined that a person is included, a certain range including the matching part is cut out as partial image data including the person (S12). At this time, field information is also acquired so as to know where the extracted partial image data was in the still image data (S11). The field information is information indicating at which position in the still image the person image exists.
[0105]
The cut-out partial image data and the field information are associated with the address information indicating the position on the first recording medium 1 of the still image data containing the partial image data, and are recorded on the second recording medium 2 using the second drive 12 (S13).
[0106]
Next, if the search of the entire contour image has not been completed, the search continues over the unsearched portion, looking for person images while the search area is reviewed (S14, S15). The processes of S12 to S15 are repeated until it is confirmed in S16 that there is no other person information.
[0107]
More preferably, the search is performed in a smaller search area within the range of the partial image data obtained once. This makes it possible to obtain information with a high degree of certainty that characterizes the scene, such as facial expressions, in a short time.
[0108]
On the other hand, if it is not determined in S10 that a person is included even after the entire still image data has been searched, or if it is confirmed in S16 that there is no other person information, the processing of the person information detection system ends.
[0109]
When the processing of the person information detection system is completed, a trigger of an image data transfer request for sending the next still image data to the person information detection system is generated (S17), and the trigger is sent to the buffer memory 3 in which the still image data is stored. As a result, the next still image data is transferred from the image storage system to the person information detection system (S7).
[0110]
In this search, the person information detection system, which searches for an area including a person, and the image storage system, which stores image data, operate independently. Therefore, the process of searching for an area including a person ends some time after reproduction of the moving image data ends. The end operation is started when the end information of the moving image data reproduced by the first drive 11 is transferred (S6).
[0111]
The time interval (sampling timing) at which the still image data constituting the reproduced moving image data is sampled and stored in the buffer memory 3 in S1 described above is preferably set in consideration of the time required to search for and record an area including a person.
[0112]
Further, the user may be allowed to freely set the sampling timing. When reproducing the moving image data of the first recording medium 1 and acquiring the partial image data as the scene information, the user grasps the contents based on the moving image projected on the display device 5. Therefore, if the timing for acquiring still image data is appropriately set according to the content, sampling according to the content of the moving image can be performed.
[0113]
Next, an image search of moving image data in the present image recording / reproducing apparatus will be described.
[0114]
The partial image data obtained as described above contains information, such as the actions of the person, that characterizes the still image data from which it was extracted, that is, the scene of the moving image of which that still image data is a component. Accordingly, when the partial image data is displayed on the display device 5, the partial image data itself serves as scene information.
[0115]
Also, when a plurality of pieces of person information can be obtained from the partial image data, the configuration of the plurality of persons is important scene information. In other words, even as they are, these partial image data give the user scene information comparable to the information obtained by detecting scene changes.
[0116]
Therefore, as described above, when a specific scene (image) is to be searched for among the many scenes of the moving image data recorded on the first recording medium 1, the present image recording/reproducing apparatus uses, as information for image retrieval, the partial image data group that was acquired from that moving image data and is already recorded on the second recording medium 2.
[0117]
In the image search, the partial image data from the partial image data group is displayed on the display device 5 in an order along the time axis of the moving image data. At this time, the partial image data may be displayed as a thumbnail. Further, it is not necessary to display all the partial image data, and a part thereof can be displayed by thinning it out temporally.
[0118]
Then, with the partial image data displayed in this manner, the user's selection of a displayed partial image is received via an input device (not shown). Based on the selected partial image data, the moving image data recorded on the first recording medium 1 is searched for the still image data from which that partial image data was extracted, and reproduction of the moving image data is started from that position (or from a position shortly before or after it).
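Because each partial image is recorded together with the address of its source still image, the retrieval step amounts to an address lookup. The sketch below assumes a hypothetical `Record` type, integer addresses, and an optional `lead_in` offset for starting slightly earlier. None of these names come from the source.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Record:
    partial_image_id: int
    address: int  # address of the source still image on the first medium


def playback_start_address(records: Iterable[Record], selected_id: int,
                           lead_in: int = 0) -> int:
    """Return the address at which reproduction should start."""
    for record in records:
        if record.partial_image_id == selected_id:
            # Optionally start a little before the source still image.
            return max(0, record.address - lead_in)
    raise KeyError(f"no partial image with id {selected_id}")


database = [Record(1, 100), Record(2, 250)]
start = playback_start_address(database, selected_id=2, lead_in=50)
```

Here selecting partial image 2 with a lead-in of 50 starts reproduction at address 200.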
[0119]
Also, in the present image recording/reproducing apparatus, when the moving image data of the first recording medium 1 is reproduced by the apparatus for the first time, partial image data including person information and its surrounding information is acquired automatically. When a second or subsequent reproduction of the moving image data recorded on the first recording medium 1 is instructed, a part of the acquired partial image data is displayed as thumbnails on the screen of the display device 5 before the moving image data is reproduced, as shown in a frame 20 in FIG.
[0120]
The partial image data may be displayed simply by showing a predetermined number of the acquired partial image data (still images) at a time (eight in FIGS. 1 and 6), for a predetermined time, in the time axis order of the moving image data, or only some of them may be selected and displayed. One such selective display method picks the partial images to display at regular time intervals over the entire original moving image and displays the selected partial image data.
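Selecting partial images at regular time intervals can be sketched as follows. The function name is hypothetical, and representing each partial image by the timestamp of its source still image is an assumption.

```python
def thin_by_interval(timestamps, interval):
    """Keep one timestamp per `interval`, scanning along the time axis."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    selected = []
    next_time = None
    for t in sorted(timestamps):
        if next_time is None or t >= next_time:
            selected.append(t)
            next_time = t + interval
    return selected


# Partial images at 0, 1, 2, 5, 6 and 11 seconds, thinned to one per 5 seconds.
shown = thin_by_interval([0, 1, 2, 5, 6, 11], 5)
```

The selected items could then be shown a fixed number at a time, for example eight per screen.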
[0121]
Such partial image data is already narrowed down to a person's motion, facial expression, and the like, so it is easier for the user to view than an ordinary thumbnail display of entire still images. A search based on such narrowed-down partial image data brings out information that would be buried in a thumbnail display of the entire still image, so the information accuracy is high and a more detailed image search (scene search) becomes possible.
[0122]
For example, when the area occupied by a person is a small fraction of the original still image, it is difficult to identify the person in a thumbnail display of the entire still image. With partial image data, the person image is displayed in close-up, so an image search can be performed using the person information as a key.
[0123]
Further, since the partial image data is basically managed on the basis of person information, using, for example, the color information on a body part frame makes it possible to take the costume worn by a specific person in the moving image as a key and selectively display only the partial image data that includes that costume from the partial image data group, enabling a more effective scene search.
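A costume-keyed selection of this kind can be sketched as a filter over the recorded additional information. The dictionary layout and the function name are hypothetical, and colors are represented as plain labels for simplicity.

```python
def select_by_frame_color(records, frame_kind, color):
    """Return records in which some person has a matching colored frame."""
    selected = []
    for record in records:
        for person in record["persons"]:
            if any(f["kind"] == frame_kind and f["color"] == color
                   for f in person["frames"]):
                selected.append(record)
                break  # one matching person is enough for this record
    return selected


records = [
    {"address": 10, "persons": [{"frames": [{"kind": "torso", "color": "red"}]}]},
    {"address": 20, "persons": [{"frames": [{"kind": "torso", "color": "blue"}]}]},
]
red_costume = select_by_frame_color(records, "torso", "red")
```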
[0124]
As another example, if the shape information on body part frames is used, combining information on the shape of the feet or arms with motion information makes it possible to search for a scene using the behavior of a specific person as a key. Also, if field information is used, a scene in which a specific person is moving can be searched for while distinguishing whether the person is walking or running.
[0125]
[Embodiment 2]
Another embodiment of the present invention will be described below with reference to FIGS. For convenience of explanation, members having the same functions as the members used in Embodiment 1 are given the same reference numerals, and descriptions thereof are omitted.
[0126]
The image recording/reproducing apparatus of the present embodiment differs greatly from that of the first embodiment (FIG. 1) in that it includes a buffer memory group 30 whose buffer memories can be written and read independently. In the apparatus of the first embodiment, the CPU 6 uses the single buffer memory 3 and searches the entire still image data while moving the search area within it. In the present embodiment, by contrast, the still image data is divided into a plurality of regions in advance using the buffer memory group 30, which consists of a plurality of buffer memories; a person is searched for independently in each divided region, and partial image data is extracted. That is, the search area is not moved.
[0127]
An operation of extracting partial image data including a person will be described with reference to the flowchart of FIG.
[0128]
FIG. 7 shows the flow up to the acquisition of partial image data in a case where, when a region including a person in the still image data is extracted as partial image data, the search region for searching for a person is divided in advance before searching.
[0129]
When a start signal is detected by the first drive 11 during reproduction of a moving image on the first recording medium 1, timing information for acquiring one frame of moving image data reproduced by the first drive 11 is generated, The acquisition of still image data is started (S21).
[0130]
After this acquisition, the division number N of the still image data is determined (S22). For example, whether a large number of people appear on the screen is roughly searched at the level of the contour image data of the still image data, the distribution of people over the entire image is analyzed, and N is decided accordingly. Although not shown in FIG. 7, this determination includes a function in which the CPU 6 starts a simple image analysis program stored in the memory 4 and performs the determination automatically. Note that the division number N is at most the number of buffer memories constituting the buffer memory group 30.
[0131]
The still image data is divided into N pieces, and simultaneously with the division, the resulting N still image data pieces are stored together in buffer memories 1 to N of the buffer memory group 30, which supports N independent write and read operations, along with the address data on the first recording medium 1 of the image before division (S23, S24).
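The division step can be illustrated as follows. The source does not specify how the image is partitioned, so splitting into N vertical strips of nearly equal width is purely an assumption, as are the function name and the representation of the image as a list of pixel rows.

```python
def split_into_strips(image, n):
    """Split an image (a list of pixel rows) into n vertical strips."""
    if n <= 0:
        raise ValueError("n must be positive")
    width = len(image[0])
    # Integer boundaries so that the strips cover every column exactly once.
    bounds = [i * width // n for i in range(n + 1)]
    return [
        [row[bounds[i]:bounds[i + 1]] for row in image]
        for i in range(n)
    ]


# Each strip would be written to its own buffer memory, together with the
# address of the undivided still image on the first recording medium.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
buffers = split_into_strips(image, 2)
```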
[0132]
Next, it is determined whether the process is immediately after the start (S25). If it is not, it is checked whether a trigger of an image data transfer request has been detected (S26). If it is not immediately after the start and no trigger is detected, S21 to S26 are repeated to continue storing the divided still image data (hereinafter referred to as divided image data) in the buffer memory group 30.
[0133]
On the other hand, if it is immediately after the start or if the trigger of the image data transfer request is detected, the divided image data taken into the buffer memory group 30 is sent to the person information detection system in a lump (S27). The processing up to this point is the processing of the image storage system that stores the divided image data.
[0134]
Here, since the still image data is divided into N pieces and stored in buffer memories 1 to N, there are N pieces of person information detection systems. In each of the N person information detection systems, image processing is first performed on the transferred divided image data to form the above-described outline image (S28). Next, an area including a person is searched (S29). Here, the search is performed without reviewing the search area.
[0135]
Then, in the same manner as described in the first embodiment, when it is determined that a person is included in S30, a certain range including the matching part is cut out as partial image data including the person (S32). Also at this time, field information is also acquired so that the location of the cut-out partial image data in the still image data can be known (S31).
[0136]
The cut-out partial image data and the field information are associated with the address information indicating the position on the first recording medium 1 of the still image data including the partial image data, and are recorded on the second recording medium 2 using the second drive 12 (S36).
[0137]
However, in this case, since the N person information detection systems are independent and operate in parallel, each system communicates with the CPU 6 through the processing of S34 and S35 and waits for permission for its recording request. Although not shown individually in FIG. 6, each of the N person information detection systems has a temporary storage function.
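The arrangement of parallel detection with serialized recording can be sketched with threads. This is an analogy rather than the apparatus itself: a lock stands in for the recording permission granted through S34 and S35, and `detect`, `medium`, and the function name are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def detect_and_record(regions, detect, medium):
    """Run detection on every region in parallel; record one at a time."""
    record_permission = threading.Lock()  # stands in for S34/S35

    def worker(index, region):
        result = detect(region)           # independent search per region
        if result is not None:
            with record_permission:       # wait for permission to record
                medium.append((index, result))

    with ThreadPoolExecutor(max_workers=max(1, len(regions))) as pool:
        futures = [pool.submit(worker, i, r) for i, r in enumerate(regions)]
        for future in futures:
            future.result()               # propagate any worker error
    return sorted(medium)


# Regions 0 and 2 contain a "person"; region 1 does not.
recorded = detect_and_record(["p1", "", "p3"], lambda r: r or None, [])
```

Holding each result locally until the lock is acquired plays the role of the temporary storage function of each detection system.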
[0138]
If the search determines in S30 that the divided image data does not include a person, or when the recording on the second recording medium 2 is completed in S36, the process proceeds to S35 of the next person information detection system.
[0139]
As described above, in this embodiment, no re-search with a changed search area is performed. Even if, for example, person information in which a plurality of persons overlap is obtained, the information on each individual can be supplemented by separately recording it as additional information.
[0140]
When the recording of the divided image data on the second recording medium 2 is completed in the final person information detection system, or when it is determined in S30 that no person is included, the processing of the N person information detection systems ends.
[0141]
When the processing of the N person information detection systems is completed, a trigger of an image data transfer request for sending the next divided image data to each person information detection system is generated (S37), and the trigger is sent to the N buffer memories in which the divided image data is stored. Thus, the next divided image data is collectively transferred from the image storage system to the person information detection systems (S27).
[0142]
Also in this search, the person information detection systems, which search for an area including a person, and the image storage system, which stores image data, operate independently. Therefore, the process of searching for an area including a person ends some time after reproduction of the moving image data ends. The end operation is started when the end information of the original moving image data reproduced by the first drive 11 is transferred (S38).
[0143]
Note that the time interval (sampling timing) at which the still image data constituting the reproduced moving image data is sampled, divided, and stored in the buffer memory group 30 in S21 described above is preferably set in consideration of the time required to search for and record an area including a person. The user may also be allowed to set the sampling timing freely.
[0144]
As described above, unlike the image recording/reproducing apparatus of the first embodiment, the image recording/reproducing apparatus of the present embodiment neither moves nor changes the search area. Since there is no operation such as repeated searching, when the still image data belongs to moving image data that includes a large number of persons, dividing the still image data into N pieces in advance and performing the person information detection processing on each piece allows the search to be performed more efficiently.
[0145]
In the present image recording/reproducing apparatus, image retrieval using the partial image data group recorded on the second recording medium 2 is the same as in the image recording/reproducing apparatus of the first embodiment, and its description is omitted.
[0146]
The functions of the image recording/reproducing apparatuses according to the first and second embodiments described above can also be recorded as a program on a computer-readable recording medium. One conceivable example is a recording medium storing a database construction program and an image retrieval program that cause a computer to function as a still image data acquisition unit, a search and extraction unit, a recording unit, a start instruction unit, a partial image data display unit, a retrieval unit, and the like.
[0147]
An object of the present invention can also be achieved by recording, on a recording medium in a computer-readable form, the program code (executable program, intermediate code program, and source program) of the database construction program and the image search program, which are software that realizes these means on a computer, supplying the recording medium to an image recording/reproducing apparatus, and having the computer read and execute the program code recorded on the recording medium. In this case, the program code itself read from the recording medium implements the above-described procedure, and the recording medium on which the program code is recorded constitutes the present invention.
[0148]
Here, the recording medium serving as the program medium is a recording medium configured to be separable from the main body, and may be a medium that carries the program in a fixed manner, including tape systems such as magnetic tapes and cassette tapes; disk systems such as magnetic disks (flexible disks, hard disks) and optical disks (CD-ROM, MO, MD, DVD); card systems such as IC cards (including memory cards) and optical cards; and semiconductor memories such as mask ROM, EPROM, EEPROM, and flash ROM.
[0149]
It should be noted that the present invention is not limited to the above-described embodiments, and various changes can be made within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.
[0150]
[Effects of the invention]
As described above, the database construction apparatus of the present invention includes: still image data acquisition means for acquiring still image data from moving image data of a first recording medium reproduced by a first drive; accumulation means for accumulating the still image data acquired by the still image data acquisition means; search and extraction means for searching for a person included in the still image data accumulated by the accumulation means and, if a person is included, extracting a predetermined area including the person as partial image data; and recording means for, when partial image data is extracted by the search and extraction means, recording the extracted partial image data on a second recording medium using a second drive in association with the address information, on the first recording medium, of the still image data from which it was extracted.
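The flow through the means listed above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: `Frame`, `PartialImage`, `find_person_region`, and the in-memory `database` list are hypothetical names that do not appear in the specification, a frame is modeled as a small grid in which 1 marks a person pixel, and the toy detector simply returns one bounding box around all person pixels in place of whatever existing person detection technique is actually used.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


@dataclass
class Frame:
    address: int              # address of the still image on the first recording medium
    pixels: List[List[int]]   # toy image: 1 marks a person pixel, 0 is background


@dataclass
class PartialImage:
    source_address: int       # address information of the source still image
    box: Box
    pixels: List[List[int]]


def find_person_region(frame: Frame) -> Optional[Box]:
    """Toy stand-in for the person search: bounding box of all person pixels."""
    rows = [r for r, row in enumerate(frame.pixels) if any(row)]
    cols = [c for row in frame.pixels for c, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def build_database(frames: List[Frame]) -> List[PartialImage]:
    """Extract a person region from each frame and record it with its address."""
    database: List[PartialImage] = []   # stands in for the second recording medium
    for frame in frames:                # frames acquired from the moving image data
        box = find_person_region(frame)
        if box is None:
            continue                    # no person found: nothing is recorded
        top, left, bottom, right = box
        crop = [row[left:right] for row in frame.pixels[top:bottom]]
        database.append(PartialImage(frame.address, box, crop))
    return database


frames = [
    Frame(100, [[0, 0, 0], [0, 1, 1], [0, 1, 0]]),
    Frame(200, [[0, 0, 0], [0, 0, 0], [0, 0, 0]]),
]
db = build_database(frames)
```

In this sketch only the frame at address 100 contains a person, so the database ends up with a single entry that keeps both the cut-out region and the address needed to return to the source frame.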
[0151]
According to this, partial image data obtained by extracting an area including a person is recorded on the second recording medium and made into a database. When such partial image data is displayed, the partial image data itself serves as scene information. Also, when information on a plurality of persons can be obtained from the partial image data, the composition of those persons is important scene information. In other words, these pieces of partial image data serve, as they are, as scene information from which the user can detect scene changes.
[0152]
In addition, since such scene information is partial image data obtained by cutting out a region including a person, the scene information can be acquired automatically using existing technology, and the database can be constructed without bothering the user.
[0153]
This therefore has the effect of providing a database construction apparatus for scene information that is suited to moving image data consisting mainly of images of people and that can acquire scene information automatically without bothering the user.
[0154]
Also, by mounting such a database construction apparatus in an image recording / reproducing apparatus, it is possible to provide an image recording / reproducing apparatus that can easily determine local scene changes and the like, can perform image searches that follow the movement of a person, and allows the contents of moving image data to be edited easily.
[0155]
Further, in the database construction apparatus of the present invention, the search and extraction means may additionally acquire, when extracting the partial image data, additional information including number information indicating the number of persons included in the partial image data, and the recording means may record the additional information corresponding to the partial image data on the second recording medium together with the partial image data so that the additional information can be associated with the partial image data in a tree structure.
[0156]
According to this, additional information including number information indicating the number of persons included in the partial image data is acquired, and the additional information is recorded on the second recording medium together with the partial image data so that it can be associated with the partial image data in a tree structure. Therefore, even if the cut-out partial image data is obtained as an image in which two persons overlap, the data recorded on the second recording medium can still be managed separately for the two persons.
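One way to picture the tree-structured association is the following minimal Python sketch. The class names `PartialImageNode` and `PersonNode` and the `label` field are hypothetical and chosen only for illustration; the specification does not prescribe a concrete data layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PersonNode:
    label: str                      # hypothetical identifier for one person


@dataclass
class PartialImageNode:
    source_address: int             # address of the source still image
    persons: List[PersonNode] = field(default_factory=list)

    @property
    def person_count(self) -> int:
        """Number information: how many persons this partial image contains."""
        return len(self.persons)


# One cut-out region in which two persons overlap.
node = PartialImageNode(source_address=100)
node.persons.append(PersonNode("person A"))
node.persons.append(PersonNode("person B"))

# The single partial image can still be managed per person through its children.
labels = [p.label for p in node.persons]
```

Here the partial image is the parent node and each person is a child node, so a region that was cut out once can still be handled person by person.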
[0157]
Further, the additional information may further include color information indicating the color characteristics of each person and / or shape information indicating the shape characteristics of each person.
[0158]
If color information is used, it is possible to search for a specific scene using, for example, a costume worn by a specific person in the entire moving image as a key. In another example, if shape information is used, a change in the shape can be captured, and a specific scene can be searched using the behavior of a specific person as a key.
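A search keyed on such additional information might look like the sketch below. It assumes, purely for illustration, that the color and shape features have already been reduced to simple labels (`"red"`, `"sitting"`); the names `PersonInfo`, `Entry`, and `search` are hypothetical, and real features would more likely be numeric descriptors compared with a similarity measure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PersonInfo:
    color: str   # simplified color feature, e.g. dominant costume color
    shape: str   # simplified shape feature, e.g. a posture label


@dataclass
class Entry:
    source_address: int          # address of the source still image
    persons: List[PersonInfo]    # additional information, one item per person


def search(entries: List[Entry], color: Optional[str] = None,
           shape: Optional[str] = None) -> List[int]:
    """Return addresses of entries in which some person matches every given key."""
    def matches(p: PersonInfo) -> bool:
        return ((color is None or p.color == color) and
                (shape is None or p.shape == shape))
    return [e.source_address for e in entries if any(matches(p) for p in e.persons)]


entries = [
    Entry(100, [PersonInfo("red", "standing")]),
    Entry(250, [PersonInfo("blue", "sitting"), PersonInfo("red", "sitting")]),
    Entry(400, [PersonInfo("blue", "standing")]),
]
red_scenes = search(entries, color="red")
sitting_scenes = search(entries, shape="sitting")
```

Searching by color alone corresponds to using a costume as the key, while searching by shape corresponds to using a person's behavior as the key.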
[0159]
Therefore, when the moving image data is searched based on partial image data obtained by extracting a region including a person, the number of key items available for the search increases, which also has the effect that a database allowing more effective searches can be constructed.
[0160]
Further, in the database construction apparatus of the present invention, it is preferable that the search and extraction unit extracts the partial image data so that all the persons included in the still image data are included in any of the partial image data.
[0161]
According to this, every person included in the still image data is captured as partial image data, either alone or together with other persons, which also has the effect that the scene information becomes more accurate.
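The condition that every person falls inside at least one partial image can be checked as in the following sketch. It assumes, hypothetically, that each detected person is represented by a single (x, y) position and each extracted region by a rectangle; neither representation is taken from the specification.

```python
from typing import List, Tuple

Point = Tuple[int, int]           # (x, y) position of a detected person
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive


def contains(box: Box, point: Point) -> bool:
    """True if the point lies inside the rectangle."""
    left, top, right, bottom = box
    x, y = point
    return left <= x < right and top <= y < bottom


def uncovered_persons(persons: List[Point], boxes: List[Box]) -> List[Point]:
    """Return the persons that are not inside any extracted region."""
    return [p for p in persons if not any(contains(b, p) for b in boxes)]


persons = [(50, 60), (300, 200), (600, 400)]
boxes = [(0, 0, 100, 100), (250, 150, 350, 250)]
missing = uncovered_persons(persons, boxes)
```

An empty result would mean that the extraction already covers everyone; a non-empty result, as here, indicates that a further region needs to be extracted.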
[0162]
Further, in the database construction device of the present invention, the search and extraction means may be characterized in that the search and extraction are performed while appropriately changing a search area when searching for a person.
[0163]
According to this, the search area used when searching for a person is changed as appropriate. Therefore, even for moving image data in which the proportion of the still image data occupied by person information varies arbitrarily, the search area can always be set appropriately at search time, for example by gradually reducing it from the maximum range of the screen, and the search can be performed more efficiently. In addition, even when the first search yields a region that includes a plurality of persons, another search can easily be performed with a changed search area in order to further increase the accuracy of the information.
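The idea of starting from the maximum range of the screen and gradually reducing the search area can be sketched as follows. The centered placement of the areas, the fixed `step`, and the `min_size` stopping rule are assumptions made for this illustration and are not specified in the text.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def shrinking_search_areas(width: int, height: int,
                           step: int, min_size: int) -> Iterator[Box]:
    """Yield centered search areas, starting at the full screen and shrinking."""
    margin = 0
    while width - 2 * margin >= min_size and height - 2 * margin >= min_size:
        yield (margin, margin, width - margin, height - margin)
        margin += step


areas = list(shrinking_search_areas(width=640, height=480, step=80, min_size=100))
```

Each yielded rectangle would be passed to the person search in turn, so that a later, smaller area can separate persons that the first, full-screen search found only as one group.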
[0164]
In the database construction device of the present invention, the search and extraction means may be characterized in that the still image data is divided into a plurality of regions in advance, and the divided regions are searched as search regions.
[0165]
According to this, the still image data is divided into a plurality of regions in advance, and each divided region is searched as a search region. As a result, the number of persons present in a region that is found and cut out by a single search can be kept appropriately small, and the number of pieces of individual person information can be reduced. In addition, if the search processing is performed independently in each divided region, persons can be extracted from the still image data in a shorter time.
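Dividing a frame in advance and searching each region on its own can be sketched as below. The grid layout, the function names `divide` and `persons_per_region`, and the use of point positions for persons are assumptions for illustration; the per-region loop is written sequentially, although the regions are independent and could be processed in parallel.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def divide(width: int, height: int, cols: int, rows: int) -> List[Box]:
    """Split a width x height frame into a cols x rows grid of search regions."""
    regions = []
    for r in range(rows):
        for c in range(cols):
            regions.append((c * width // cols, r * height // rows,
                            (c + 1) * width // cols, (r + 1) * height // rows))
    return regions


def persons_per_region(regions: List[Box],
                       persons: List[Tuple[int, int]]) -> List[int]:
    """Count, for each region independently, the person positions inside it."""
    counts = []
    for left, top, right, bottom in regions:
        counts.append(sum(1 for x, y in persons
                          if left <= x < right and top <= y < bottom))
    return counts


regions = divide(640, 480, cols=2, rows=2)
counts = persons_per_region(regions, [(100, 100), (120, 90), (500, 400)])
```

With three persons in the frame, no single region in this example holds more than two of them, which illustrates how division keeps the number of persons per cut-out region small.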
[0166]
Further, when the still image data is divided in advance in this manner, it is more preferable that the apparatus further includes division number determination means that determines the state of the number of persons included in the still image data and determines the number of divisions of the still image data based on that determination, and that the search and extraction means divides the image data by the number of divisions determined by the division number determination means.
[0167]
Since the still image data is divided according to the number of persons it contains, the number of divisions corresponds to the number of persons included in the still image data. This also has the effect that the number of pieces of individual person information can be obtained more accurately than with a configuration in which the number of divisions is fixed.
[0168]
Further, the database construction apparatus of the present invention may include start instruction means that determines whether or not the reproduction of the moving image data of the first recording medium in the first drive is the first time and, if it is the first time, starts the acquisition of still image data by the still image data acquisition means.
[0169]
According to this, when the reproduction of the moving image data of the first recording medium in the first drive is the first time, the start instruction means starts the acquisition of still image data by the still image data acquisition means. This has the effect that the user can obtain a database of scene information simply by playing the first recording medium on the first drive, without giving any special instruction.
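The behavior of the start instruction means can be sketched as follows. How the apparatus recognizes a first playback is not restated in this passage, so the sketch assumes a hypothetical medium identifier and a set of identifiers for which a database has already been started; `StartInstruction` and `on_play` are illustrative names only.

```python
class StartInstruction:
    """Decides whether database construction should start when a medium is played."""

    def __init__(self) -> None:
        # Hypothetical record of media for which construction was already started.
        self._known_media = set()

    def on_play(self, medium_id: str) -> bool:
        """Return True only the first time the given medium is played."""
        if medium_id in self._known_media:
            return False
        self._known_media.add(medium_id)
        return True


starter = StartInstruction()
first = starter.on_play("disc-001")    # first playback: start acquiring still images
second = starter.on_play("disc-001")   # later playback: database already exists
other = starter.on_play("disc-002")    # a different medium is treated as new
```

The user only presses play; the decision to build the database is made by the apparatus.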
[0170]
In order to solve the above problems, an image recording / reproducing apparatus according to the present invention includes a first drive for reproducing information recorded on a first recording medium, a second drive for recording / reproducing information on a second recording medium, and the database construction apparatus according to any one of the first to ninth aspects.
[0171]
As already described, the database construction apparatus of the present invention is a database construction apparatus for scene information that is suited to moving image data consisting mainly of images of people and that can acquire scene information automatically without bothering the user.
[0172]
Therefore, the image recording / reproducing apparatus of the present invention equipped with such a database construction apparatus can easily discriminate a local scene change and the like, and can perform an image search or the like in accordance with the movement of the person. This makes it possible to provide an excellent image recording / reproducing apparatus which can easily edit the contents of moving image data.
[0173]
The image search method of the present invention is an image search method for the image recording / reproducing apparatus of the present invention, in which a search for an arbitrary scene in the moving image data recorded on the first recording medium is performed using the partial image data recorded on the second recording medium as scene information.
[0174]
As described above, the partial image data recorded on the second recording medium contains information, such as the actions of a person, that characterizes the still image data from which it was extracted, that is, a scene of the moving image that has that still image data as a component. Displaying the partial image data therefore makes the partial image data itself scene information. Also, when information on a plurality of persons can be obtained from the partial image data, the composition of those persons is important scene information. In other words, these pieces of partial image data serve, as they are, as scene information from which the user can detect scene changes.
[0175]
Therefore, since the search uses fine-grained scene information that includes even the motion of the persons in the image, this also has the effect that image search can be performed more efficiently, for example when searching during reproduction of moving image data centered on people.
[0176]
The image search device of the present invention is an image search device provided in the image recording / reproducing apparatus, and includes: partial image data display means for reproducing the partial image data group recorded on the second recording medium using the second drive and displaying it on display means; input means for receiving a selection by the user from the displayed partial image data; and search means for searching the moving image data recorded on the first recording medium for the still image data from which the partial image data selected via the input means was extracted.
[0177]
According to this, the partial image data display means reproduces the partial image data group recorded on the second recording medium using the second drive and causes the display means to display it. When the user selects one of the displayed pieces of partial image data using the input means, the search means searches the moving image data for the still image data from which the selected partial image data was extracted.
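The interaction among the display means, input means, and search means can be sketched as follows. The sketch assumes that each recorded partial image carries the address of its source still image, as described for the recording means; the text `label` standing in for the displayed image and the function names `display_list` and `search` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartialImage:
    label: str            # hypothetical caption standing in for the displayed image
    source_address: int   # address of the source still image on the first medium


def display_list(database: List[PartialImage]) -> List[str]:
    """What the partial image data display means would show, as numbered captions."""
    return [f"{i}: {p.label}" for i, p in enumerate(database)]


def search(database: List[PartialImage], selected_index: int) -> int:
    """Return the address of the still image the selected partial image came from."""
    return database[selected_index].source_address


database = [
    PartialImage("two people talking", 1200),
    PartialImage("one person waving", 4800),
]
menu = display_list(database)
address = search(database, selected_index=1)
```

Because the address was recorded together with the partial image, the search in this sketch reduces to a lookup, after which the first drive can be positioned at the returned address.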
[0178]
Therefore, the necessary image data can be searched for (scene search) by looking directly at the motion, facial expression, and the like of the persons recorded on the second recording medium, so a more detailed search can be performed than when the entire still image is displayed. For example, even when the area occupied by a person is only a small part of the original still image, an image search using the person information as a key can be performed. In other words, because the display on the screen is narrowed down to the person, the search is easier for the user to view and has high information accuracy.
[0179]
In the image search device of the present invention, when the reproduction of the moving image data recorded on the first recording medium is instructed, the partial image data display means may display a part of the partial image data before the moving image data is reproduced.
[0180]
According to this, when the partial image data group of the moving image data on the first recording medium is present on the second recording medium, a part of the partial image data is automatically reproduced before reproducing the moving image data. Therefore, for example, the user can view the partial image data and selectively reproduce only the scenes desired to be viewed.
[0181]
Further, a database construction program and a recording medium of the present invention are a program for causing a computer to function as each means in the above-described database construction apparatus of the present invention, and a recording medium recording the program.
[0182]
Further, an image search program and a recording medium according to the present invention are a program for causing a computer to function as each unit in the above-described image search apparatus according to the present invention, and a recording medium storing the program.
[0183]
Thus, if the above-described database construction program or image search program is executed by a computer, the database construction apparatus, image search device, and image recording / reproducing apparatus of the present invention can be realized not only on a specific apparatus but also on an unspecified image recording / reproducing apparatus.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of an image recording / reproducing apparatus according to an embodiment of the present invention.
FIGS. 2A and 2B are explanatory diagrams showing an example of acquiring scene information in the image recording / reproducing apparatus.
FIG. 3 is an explanatory diagram showing an information addition structure in a person information search in the image recording / reproducing apparatus.
FIG. 4 is an explanatory diagram showing a method of changing a search area when searching for a person in the image recording / reproducing apparatus.
FIG. 5 is a flowchart showing a procedure for automatically acquiring personal information in the image recording / reproducing apparatus.
FIG. 6 is a block diagram illustrating a configuration of an image recording / reproducing apparatus according to another embodiment of the present invention.
FIG. 7 is a flowchart showing a procedure for automatically acquiring personal information in the image recording / reproducing apparatus.
FIG. 8 is a diagram showing a scene information input system of the invention disclosed in the conventional publication.
FIG. 9 is a flowchart showing a procedure for inputting scene information according to the invention disclosed in the above-mentioned publication.
[Explanation of symbols]
1 First recording medium
2 Second recording medium
3 Buffer memory (accumulation means)
4 Memory (still image data acquisition means, search and extraction means, recording means, start instruction means, partial image data display means, search means)
5 Display means
6 CPU (still image data acquisition means, search and extraction means, recording means, start instruction means, partial image data display means, search means)
11 First drive
12 Second drive
30 Buffer memory group (accumulation means)

Claims

1. A database construction apparatus comprising:
still image data acquisition means for acquiring still image data from moving image data of a first recording medium reproduced by a first drive;
accumulation means for accumulating the still image data acquired by the still image data acquisition means;
search and extraction means for searching for a person included in the still image data accumulated by the accumulation means and, when a person is included, extracting a predetermined area including the person as partial image data; and
recording means for, when the partial image data is extracted by the search and extraction means, recording the extracted partial image data on a second recording medium using a second drive in association with the address information, on the first recording medium, of the still image data from which it was extracted.

2. The database construction apparatus according to claim 1, wherein the search and extraction means, when extracting the partial image data, additionally obtains additional information including number information indicating the number of persons included in the partial image data, and
the recording means records the additional information corresponding to the partial image data together with the partial image data on the second recording medium so that the additional information can be associated with the partial image data in a tree structure.

3. The database construction apparatus according to claim 2, wherein the additional information includes color information indicating a color characteristic of each person.

4. The database construction apparatus according to claim 2, wherein the additional information includes shape information representing a shape characteristic of each person.

5. The database construction apparatus according to claim 1, wherein the search and extraction means extracts the partial image data such that every person included in the still image data is included in one of the pieces of partial image data.

6. The database construction apparatus according to any one of claims 1 to 4, wherein the search and extraction means performs the search while appropriately changing a search area when searching for a person.

7. The database construction apparatus according to any one of claims 1 to 4, wherein the search and extraction means divides the still image data into a plurality of regions in advance and searches the divided regions as search regions.

8. The database construction apparatus according to claim 7, further comprising division number determination means that determines the state of the number of persons included in the still image data and determines the number of divisions of the still image data based on that determination,
wherein the search and extraction means divides the image data by the number of divisions determined by the division number determination means.

9. The database construction apparatus according to claim 1, further comprising start instruction means that determines whether or not the reproduction of the moving image data of the first recording medium in the first drive is the first time and, if it is the first time, starts the acquisition of the still image data by the still image data acquisition means.

10. An image recording / reproducing apparatus comprising:
a first drive for reproducing information recorded on a first recording medium;
a second drive for recording / reproducing information on / from a second recording medium; and
the database construction apparatus according to claim 1.

11. An image search method for the image recording / reproducing apparatus according to claim 10, wherein
a search for an arbitrary scene in the moving image data recorded on the first recording medium is performed using the partial image data recorded on the second recording medium as scene information.

12. An image search device provided in the image recording / reproducing apparatus according to claim 10, comprising:
partial image data display means for reproducing the partial image data group recorded on the second recording medium using the second drive and displaying the partial image data group on display means;
input means for receiving a selection by the user from the displayed partial image data; and
search means for searching the moving image data recorded on the first recording medium for the still image data from which the partial image data selected via the input means was extracted.

13. The image search device according to claim 11, wherein, when the reproduction of the moving image data recorded on the first recording medium is instructed, the partial image data display means displays a part of the partial image data before the moving image data is reproduced.

14. A database construction program for causing a computer to function as each means in the database construction apparatus according to any one of claims 1 to 9.

15. A computer-readable recording medium on which the database construction program according to claim 14 is recorded.

16. An image search program for causing a computer to function as each means in the image search device according to claim 12.

17. A computer-readable recording medium on which the image search program according to claim 16 is recorded.