JP4379491B2

JP4379491B2 - Face data recording device, playback device, imaging device, image playback system, face data recording method and program

Info

Publication number: JP4379491B2
Application number: JP2007134948A
Authority: JP
Inventors: 修伊達; 敏弥石坂
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2007-04-04
Filing date: 2007-05-22
Publication date: 2009-12-09
Anticipated expiration: 2027-05-22
Also published as: CN101282446B; JP2008276707A; CN101282446A

Description

The present invention relates to a face data recording apparatus, and more particularly to a face data recording apparatus that records face data, a reproducing apparatus, an imaging apparatus, an image reproduction system, processing methods for these, and a program for causing a computer to execute the methods.

Conventionally, many techniques have been proposed for recording content such as still images and moving images in association with metadata, that is, data accompanying the content, and for using this metadata to facilitate various operations.

In recent years, techniques for detecting a human face included in content such as a still image or a moving image have become available, and techniques for registering information about the detected face as metadata have been proposed. Identification processing that determines whether a detected face is the face of a specific person is also possible.

For example, a metadata registration method has been proposed in which a face is detected from a captured image, and a rectangular area including the detected face and personal information such as the person's name are written into the image file and registered as metadata in a tag format (see, for example, Patent Document 1).
Patent Document 1: JP 2004-336466 A (FIG. 2)

In the above-described prior art, metadata including the rectangular area that contains the detected face and the personal information is stored in the image file in a tag format. Therefore, while this image file is being viewed, clicking a given face, for example, allows an operation that uses the metadata registered for that face.

Now consider searching for an image file. When image files are searched using metadata registered by the above-described prior art, the metadata is written into each image file in a tag format, so each of these tags must be detected and checked. Detecting and checking each tag takes time, which increases the search time for image files. As a result, the content cannot be used quickly.

Accordingly, an object of the present invention is to enable content to be used quickly by means of metadata.

The present invention has been made to solve the above problem. A first aspect of the invention is an image reproduction system, a processing method thereof, and a program for causing a computer to execute the method, the system comprising: an image input unit that inputs an image; a face detection unit that detects a face of a subject included in the input image; a first control unit that, based on the detection result of the face detection unit, creates face data relating to the detected face and composed of a plurality of pieces of element information, creates face data management information that manages the face data and that has both data structure information, which is a bit string assigned in correspondence with the recording order of the pieces of element information and records the presence or absence of each piece, and attribute information about the input image at the time the face was detected, and causes the face data and the face data management information to be recorded on a recording medium; a comparison unit that compares attribute information about the input image with the attribute information included in the face data management information; and a second control unit that, when the attribute information compared by the comparison unit matches, checks the presence or absence of the element information constituting the face data based on the data structure information, calculates the recording offset value, from the head of the face data, of one piece of element information among the plurality, reads that piece of element information from the face data based on the calculated recording offset value, and reproduces the input image using it. With this configuration, a face of a subject included in the input image is detected; face data and face data management information are created based on the detection result and recorded on the recording medium; the attribute information about the input image is compared with the attribute information included in the face data management information; and, when they match, the presence or absence of the element information constituting the face data is checked, the recording offset value of one piece of element information from the head of the face data is calculated, that piece is read based on the calculated offset, and the input image is reproduced using it.
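The offset calculation described in the first aspect can be sketched as follows. This is a minimal illustration only: the element names, their byte sizes, and the assignment of bit 0 to the first recorded element are assumptions made for the example and are not taken from the specification.

```python
# Hypothetical element layout, listed in recording order: (name, size in bytes).
# Names and sizes are illustrative assumptions, not values from the patent.
ELEMENTS = [
    ("detection_time", 4),
    ("position_size", 8),
    ("face_score", 1),
    ("smile_score", 1),
    ("importance", 1),
]


def element_offset(structure_bits, wanted):
    """Return the byte offset of `wanted` from the head of one face-data
    record, or None when its presence flag is not set."""
    offset = 0
    for bit, (name, size) in enumerate(ELEMENTS):
        present = (structure_bits >> bit) & 1
        if name == wanted:
            return offset if present else None
        if present:
            # Only elements that were actually recorded occupy space.
            offset += size
    raise KeyError(wanted)


def read_element(face_data, structure_bits, wanted):
    """Read one element directly, without parsing the elements before it."""
    offset = element_offset(structure_bits, wanted)
    if offset is None:
        return None
    size = dict(ELEMENTS)[wanted]
    return face_data[offset:offset + size]
```

Because the presence flags follow the recording order, a reader under these assumptions needs only the bit string and the fixed element sizes to jump to the wanted element.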

A second aspect of the present invention is a face data recording apparatus, a processing method thereof, and a program for causing a computer to execute the method, the apparatus comprising: an image input unit that inputs an image; a face detection unit that detects a face of a subject included in the input image; a first control unit that, based on the detection result of the face detection unit, creates face data relating to the detected face and composed of a plurality of pieces of element information, creates face data management information that manages the face data and that has both data structure information, which is a bit string assigned in correspondence with the recording order of the pieces of element information and records the presence or absence of each piece, and attribute information about the input image at the time the face was detected, and causes the face data and the face data management information to be recorded on a recording medium; a comparison unit that compares attribute information about the input image with the attribute information included in the face data management information; and a second control unit that, for an image for which the attribute information compared by the comparison unit is determined not to match, causes the face detection unit to detect a face of a subject included in that image, creates the face data and the face data management information based on the detection result, and causes the created face data and face data management information to be recorded on the recording medium. With this configuration, a face of a subject included in the input image is detected; face data and face data management information are created based on the detection result and recorded on the recording medium; the attribute information about the input image is compared with the attribute information included in the face data management information; and, for an image whose attribute information is determined not to match, the face of the subject included in that image is detected again, and face data and face data management information are created from that detection result and recorded on the recording medium.
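The compare-then-regenerate behavior of the second aspect can be sketched as below. The function name, the use of a dictionary for attribute information, and the `detect_and_record` callback are hypothetical and serve only to illustrate the control flow.

```python
from typing import Callable, Optional


def load_or_rebuild(
    image_attrs: dict,
    stored_attrs: Optional[dict],
    detect_and_record: Callable[[], dict],
) -> tuple:
    """Return (attributes now stored with the face data, whether detection was rerun).

    `image_attrs` describes the image as it is now; `stored_attrs` is the copy
    kept in the face data management information when faces were detected.
    """
    if stored_attrs is not None and stored_attrs == image_attrs:
        # Attributes match: the recorded face data still describes this image.
        return stored_attrs, False
    # Attributes differ (or nothing was recorded): detect faces again and
    # record fresh face data and management information.
    return detect_and_record(), True
```

A mismatch here is taken to mean that the image was modified after its face data was recorded, so the stale face data is replaced rather than used.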

In the second aspect, the data structure information may be a data structure having a contiguously allocated bit string in which a predetermined flag is assigned, following the recording order, to each piece of element information recorded in that order, and each flag may indicate the presence or absence in the face data of the element information corresponding to that flag. This creates face data management information whose data structure information has a contiguously allocated bit string in which flags indicating the presence or absence of element information in the face data are assigned according to the recording order of the face data.
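On the recording side, the flag assignment described above can be sketched as follows. The field names, sizes, and bit numbering are assumptions chosen for the example.

```python
# Hypothetical fields in recording order: (name, size in bytes).
FIELDS = [
    ("detection_time", 4),
    ("position_size", 8),
    ("face_score", 1),
    ("smile_score", 1),
]


def encode_face(elements):
    """Build (structure_bits, payload) for one face.

    `elements` maps field names to already-serialized bytes. Fields that are
    missing are simply not recorded, and their flag stays 0.
    """
    bits = 0
    payload = bytearray()
    for bit, (name, size) in enumerate(FIELDS):
        value = elements.get(name)
        if value is None:
            continue
        if len(value) != size:
            raise ValueError(f"{name} must be {size} bytes")
        bits |= 1 << bit   # flag position follows the recording order
        payload += value   # data is appended in the same order
    return bits, bytes(payload)
```

Writing the flags and the data in the same order is what lets a reader compute offsets from the bit string alone.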

In the second aspect, the data structure information may have a reserved bit string for allocation to extended face data other than the element information. This creates face data management information whose data structure information has a reserved bit string that can be allocated to extended face data other than the element information.

In the second aspect, the first control unit may refrain from creating face data for any face detected by the face detection unit that does not satisfy a predetermined condition. Thus, no face data is created for a detected face that does not satisfy the predetermined condition.

In the second aspect, the face data management information may include data capacity information indicating the data capacity of the corresponding face data and version information indicating the version of that face data. This creates face data management information that includes data capacity information indicating the data capacity of the corresponding face data and version information indicating the version of that face data.
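One way to picture face data management information that carries version and data capacity information is a small fixed-size header. The field order, field widths, and little-endian byte order below are assumptions for illustration, not the recorded format.

```python
import struct

# Hypothetical header: version (uint16), structure bits (uint16),
# face data size in bytes (uint32), little-endian, no padding.
HEADER = struct.Struct("<HHI")


def pack_header(version, structure_bits, data_size):
    """Serialize the illustrative management header to 8 bytes."""
    return HEADER.pack(version, structure_bits, data_size)


def unpack_header(raw):
    """Return (version, structure_bits, data_size) from the first 8 bytes."""
    return HEADER.unpack(raw[:HEADER.size])
```

With the data size in the header, a reader could skip a face data block it does not need, and with the version it could decide whether it understands the block at all.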

In the second aspect, the face data may include at least one of the following: the position of the face detected by the face detection unit, its size, a face score indicating how face-like it is, a smile score indicating the degree of smiling, its detection time, and the importance of the face in the input image. This creates face data that includes at least one of these items.

In the second aspect, the image input unit may input a moving image as the image, and the face detection unit may detect faces included in the moving image at predetermined intervals. Thus, faces included in the moving image are detected at predetermined intervals. In this case, the first control unit may record the face data and the face data management information relating to the detected face in the moving image file corresponding to the moving image in which the face was detected. Thus, the face data and the face data management information relating to the detected face are recorded in the moving image file in which the face was detected.

In the second aspect, the image input unit may input, as the image, a moving image encoded with the AVC codec, and the face detection unit may detect a face in an IDR picture or I picture included in an AU (access unit) to which an SPS (sequence parameter set) is added. Thus, a face is detected in an IDR picture or I picture included in an AU to which an SPS is added. In this case, the first control unit may record the face data and the face data management information relating to the detected face in the SEI (supplemental enhancement information) of the AU that includes the IDR picture or I picture in which the face was detected. Thus, the face data and the face data management information relating to the detected face are recorded in the SEI of the AU that includes the IDR picture or I picture in which the face was detected.

In the second aspect, the image input unit may input a still image as the image, and the first control unit may record the face data and the face data management information relating to the detected face in the still image file corresponding to the still image in which the face was detected. Thus, the face data and the face data management information relating to the detected face are recorded in the still image file in which the face was detected.

A third aspect of the present invention is a reproducing apparatus, a processing method thereof, and a program for causing a computer to execute the method, the apparatus comprising: an input unit that inputs face data, which is data relating to a face included in an image and is composed of a plurality of pieces of element information, and face data management information that manages the face data and that has both data structure information, which is a bit string assigned in correspondence with the recording order of the pieces of element information and records the presence or absence of each piece, and attribute information about the image at the time the face was detected; a comparison unit that compares attribute information about the image with the attribute information included in the face data management information; and a control unit that, when the attribute information compared by the comparison unit matches, checks the presence or absence of the element information constituting the face data based on the data structure information, calculates the recording offset value, from the head of the face data, of one piece of element information among the plurality, reads that piece of element information from the face data based on the calculated recording offset value, and reproduces the image using it. With this configuration, the attribute information about the image is compared with the attribute information included in the face data management information, and, when they match, the presence or absence of the element information constituting the face data is checked, the recording offset value of one piece of element information from the head of the face data is calculated, that piece is read based on the calculated offset, and the image is reproduced using it.

In the third aspect, the attribute information may include an update date and time indicating when the image corresponding to the attribute information was updated, the face data management information may include, as the attribute information, the update date and time of the image as of when the corresponding face was detected, and the comparison unit may compare the update date and time included in the attribute information about the image with the update date and time included in the face data management information. Thus, the update date and time of the image is compared with the update date and time included in the face data management information for that image, and predetermined element information is read from the face data relating to a face included in an image whose update dates and times are determined to match.

In the third aspect, the apparatus may further comprise a face detection unit that detects a face of a subject included in an image for which the attribute information compared by the comparison unit is determined not to match, and the control unit may create the face data and the face data management information based on the detection result of the face detection unit and cause the created face data and face data management information to be recorded on a recording medium. Thus, for an image whose attribute information is determined not to match, face data is created based on the face of the subject included in that image, face data management information that manages the face data is created, and the image is recorded on the recording medium in association with the face data and the face data management information.

In the third aspect, the apparatus may further comprise a search unit that, when the attribute information compared by the comparison unit is determined not to match, searches for face data and face data management information corresponding to an image different from the image determined not to match. Thus, when the attribute information is determined not to match, face data and face data management information corresponding to a different image are searched for.

In the third aspect, the attribute information may include an image size indicating the size of the image corresponding to the attribute information, the face data management information may include, as the attribute information, the image size of the image as of when the corresponding face was detected, and the comparison unit may compare the image size included in the attribute information about the image with the image size included in the face data management information. Thus, for an image whose image sizes are determined to match, predetermined element information is read from the face data relating to a face included in that image.

In the third aspect, the attribute information may include rotation information about the image corresponding to the attribute information, and the control unit may, when the attribute information compared by the comparison unit matches, check whether the attribute information about the image includes rotation information and whether that rotation information is an invalid value, and read the one piece of element information from the face data relating to a face included in an image for which the rotation information is confirmed to be present and not to be an invalid value. Thus, for an image for which rotation information is present and is confirmed not to be an invalid value, predetermined element information is read from the face data relating to a face included in that image.

In the third aspect, the face data management information may include an error detection code value obtained from the corresponding image; the apparatus may further comprise an error detection code value calculation unit that calculates an error detection code value based on at least part of the image data corresponding to the image; the comparison unit may compare the calculated error detection code value for the image with the error detection code value included in the face data management information corresponding to that image; and the control unit may read the one piece of element information from the face data relating to a face included in an image whose error detection code values are determined to match. Thus, for an image whose error detection code values are determined to match, predetermined element information is read from the face data relating to a face included in that image.
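The error detection code check can be sketched as follows. The specification does not name a particular code here; CRC-32 (via Python's `zlib.crc32`) and the 4096-byte prefix are assumptions chosen for the example.

```python
import zlib


def partial_crc(image_bytes, prefix_len=4096):
    """CRC-32 over the first `prefix_len` bytes of the image data.

    Using only part of the data keeps the check cheap for large files.
    """
    return zlib.crc32(image_bytes[:prefix_len])


def face_data_is_current(image_bytes, stored_crc, prefix_len=4096):
    """True when the image still yields the code value stored with its face data."""
    return partial_crc(image_bytes, prefix_len) == stored_crc
```

Note the trade-off in checking only part of the data: under these assumptions a change outside the hashed region goes unnoticed, so such a check would presumably be combined with other attributes such as the update date and time or the image size.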

In the third aspect, the face data management information may include version information indicating the version of the face data, and the control unit may determine, based on the version information included in the face data management information, whether the face data corresponding to that face data management information is supported, and read the one piece of element information from face data determined to be supported. Thus, whether the face data corresponding to the face data management information is supported is determined based on the version information included in that face data management information, and predetermined element information is read from face data determined to be supported.

A fourth aspect of the present invention is an imaging apparatus, a processing method thereof, and a program for causing a computer to execute the method, the apparatus comprising: an imaging unit that captures an image of a subject; a face detection unit that detects a face of the subject included in the captured image; a first control unit that, based on the detection result of the face detection unit, creates face data relating to the detected face and composed of a plurality of pieces of element information, creates face data management information that manages the face data and that has both data structure information, which is a bit string assigned in correspondence with the recording order of the pieces of element information and records the presence or absence of each piece, and attribute information about the captured image at the time the face was detected, and causes the face data and the face data management information to be recorded on a recording medium; a comparison unit that compares attribute information about the captured image with the attribute information included in the face data management information; and a second control unit that, for an image for which the attribute information compared by the comparison unit is determined not to match, causes the face detection unit to detect a face of a subject included in that image, creates the face data and the face data management information based on the detection result, and causes the created face data and face data management information to be recorded on the recording medium. With this configuration, a face of a subject included in the captured image is detected; face data and face data management information are created based on the detection result and recorded on the recording medium; the attribute information about the captured image is compared with the attribute information included in the face data management information; and, for an image whose attribute information is determined not to match, the face of the subject included in that image is detected again, and face data and face data management information are created from that detection result and recorded on the recording medium.

According to the present invention, the excellent effect of enabling content to be used quickly can be obtained.

Next, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram illustrating the configuration of an imaging apparatus 100 according to an embodiment of the present invention. The imaging apparatus 100 broadly comprises a camera unit 110, a camera DSP (Digital Signal Processor) 120, an SDRAM (Synchronous Dynamic Random Access Memory) 121, a control unit 130, an operation unit 140, a medium interface (hereinafter, medium I/F) 150, an LCD (Liquid Crystal Display) controller 161, an LCD 162, an external interface (hereinafter, external I/F) 163, and a communication interface (hereinafter, communication I/F) 164. The recording medium 170 connected to the medium I/F 150 may be built into the imaging apparatus 100 or may be detachable from it.

Various media can be used as the recording medium 170, such as a so-called memory card using semiconductor memory, an optical recording medium such as a recordable DVD (Digital Versatile Disc) or a recordable CD (Compact Disc), a magnetic disk, or an HDD (Hard Disk Drive).

The camera unit 110 includes an optical block 111, a CCD (Charge Coupled Device) 112, a preprocessing circuit 113, an optical block driver 114, a CCD driver 115, and a timing generation circuit 116. The optical block 111 includes a lens, a focus mechanism, a shutter mechanism, an aperture (iris) mechanism, and the like.

The control unit 130 is configured by connecting a CPU (Central Processing Unit) 141, a RAM (Random Access Memory) 142, a flash ROM (Read Only Memory) 143, and a clock circuit 144 through a system bus 145. The control unit 130 consists of, for example, a general-purpose embedded microcomputer or a dedicated system LSI (Large Scale Integrated circuit), and controls each unit of the imaging apparatus 100.

The RAM 142 is used mainly as a work area, for example to temporarily store intermediate results of processing. The flash ROM 143 stores the various programs executed by the CPU 141 and the data required for processing. The clock circuit 144 provides the current date, the current day of the week, and the current time, as well as the shooting date and time.

そして、画像の撮影時においては、光学ブロック用ドライバ１１４は、制御部１３０からの制御に応じて、光学ブロック１１１を動作させるようにする駆動信号を形成し、これを光学ブロック１１１に供給して、光学ブロック１１１を動作させるようにする。光学ブロック用ドライバ１１４からの駆動信号に応じて、光学ブロック１１１のフォーカス機構、シャッター機構、および、絞り機構が制御される。光学ブロック１１１は、被写体の光学的な画像を取り込んで、これをＣＣＤ１１２に結像させる。 At the time of shooting an image, the optical block driver 114 forms a drive signal for operating the optical block 111 according to control from the control unit 130, and supplies the drive signal to the optical block 111. Then, the optical block 111 is operated. In accordance with the drive signal from the optical block driver 114, the focus mechanism, shutter mechanism, and aperture mechanism of the optical block 111 are controlled. The optical block 111 captures an optical image of a subject and forms it on the CCD 112.

ＣＣＤ１１２は、光学ブロック１１１からの光学的な画像を光電変換して、変換により得られた画像の電気信号を出力する。すなわち、ＣＣＤ１１２は、ＣＣＤ用ドライバ１１５からの駆動信号に応じて動作し、光学ブロック１１１からの光学的な被写体の画像を取り込むとともに、制御部１３０によって制御されるタイミング生成回路１１６からのタイミング信号に基づいて、取り込んだ被写体の画像（画像情報）を電気信号として前処理回路１１３に供給する。なお、ＣＣＤ１１２の代わりに、ＣＭＯＳ（Complementary Metal-Oxide Semiconductor）センサなどの光電変換デバイスを用いるようにしてもよい。 The CCD 112 photoelectrically converts the optical image from the optical block 111 and outputs the resulting electrical image signal. That is, the CCD 112 operates in accordance with the drive signal from the CCD driver 115 and captures the optical image of the subject from the optical block 111. Based on the timing signal from the timing generation circuit 116, which is controlled by the control unit 130, the CCD 112 supplies the captured image of the subject (image information) to the preprocessing circuit 113 as an electrical signal. Note that a photoelectric conversion device such as a CMOS (Complementary Metal-Oxide Semiconductor) sensor may be used instead of the CCD 112.

また、上述のように、タイミング生成回路１１６は、制御部１３０からの制御に応じて、所定のタイミングを提供するタイミング信号を形成するものである。また、ＣＣＤドライバ１１５は、タイミング生成回路１１６からのタイミング信号に基づいて、ＣＣＤ１１２に供給する駆動信号を形成するものである。 Further, as described above, the timing generation circuit 116 forms a timing signal that provides a predetermined timing in accordance with control from the control unit 130. The CCD driver 115 forms a drive signal to be supplied to the CCD 112 based on the timing signal from the timing generation circuit 116.

前処理回路１１３は、ＣＣＤ１１２から供給された電気信号の画像情報に対して、ＣＤＳ（Correlated Double Sampling）処理を行って、Ｓ／Ｎ比を良好に保つようにするとともに、ＡＧＣ（Automatic Gain Control）処理を行って、利得を制御し、そして、Ａ／Ｄ（Analog/Digital）変換を行って、デジタル信号とされた画像データを形成する。 The preprocessing circuit 113 performs CDS (Correlated Double Sampling) processing on the image information of the electrical signal supplied from the CCD 112 to maintain a good S/N ratio, performs AGC (Automatic Gain Control) processing to control the gain, and then performs A/D (Analog/Digital) conversion to form image data as a digital signal.

前処理回路１１３においてデジタル信号とされた画像データは、カメラＤＳＰ１２０に供給される。カメラＤＳＰ１２０は、これに供給された画像データに対して、ＡＦ（Auto Focus）、ＡＥ（Auto Exposure）、および、ＡＷＢ（Auto White Balance）などのカメラ信号処理を施す。このようにして種々の調整がされた画像データは、例えば、ＪＰＥＧ（Joint Photographic Experts Group）またはＪＰＥＧ２０００などの所定の符号化方式で符号化され、システムバス１４５および媒体Ｉ／Ｆ１５０を通じて記録媒体１７０に供給され、記録媒体１７０にファイルとして記録される。また、カメラＤＳＰ１２０は、ＭＰＥＧ４−ＡＶＣ規格に基づいて、データ圧縮処理およびデータ伸長処理を行う。 The image data converted into a digital signal in the preprocessing circuit 113 is supplied to the camera DSP 120. The camera DSP 120 performs camera signal processing such as AF (Auto Focus), AE (Auto Exposure), and AWB (Auto White Balance) on the image data supplied to it. The image data adjusted in this way is encoded by a predetermined encoding method such as JPEG (Joint Photographic Experts Group) or JPEG 2000, supplied to the recording medium 170 through the system bus 145 and the medium I/F 150, and recorded on the recording medium 170 as a file. The camera DSP 120 also performs data compression and data expansion processing based on the MPEG4-AVC standard.

また、記録媒体１７０に記録された画像データは、タッチパネルやコントロールキーなどからなる操作部１４０を通じて受け付けたユーザからの操作入力に応じて、目的とする画像データが媒体Ｉ／Ｆ１５０を通じて記録媒体１７０から読み出され、これがカメラＤＳＰ１２０に供給される。 For image data recorded on the recording medium 170, the target image data is read from the recording medium 170 through the medium I/F 150 in response to an operation input from the user received through the operation unit 140, which includes a touch panel, control keys, and the like, and is supplied to the camera DSP 120.

カメラＤＳＰ１２０は、記録媒体１７０から読み出され、媒体Ｉ／Ｆ１５０を通じて供給された符号化されている画像データを復号し、復号後の画像データをシステムバス１４５を通じてＬＣＤコントローラ１６１に供給する。ＬＣＤコントローラ１６１は、これに供給された画像データからＬＣＤ１６２に供給する画像信号を形成し、これをＬＣＤ１６２に供給する。これにより、記録媒体１７０に記録されている画像データに応じた画像が、ＬＣＤ１６２の表示画面に表示される。なお、カメラＤＳＰ１２０は、前処理回路１１３または記録媒体１７０から供給された画像データに含まれる顔を検出して、検出された顔に関する情報を制御部１３０に出力する。 The camera DSP 120 decodes the encoded image data read from the recording medium 170 and supplied through the medium I / F 150, and supplies the decoded image data to the LCD controller 161 through the system bus 145. The LCD controller 161 forms an image signal to be supplied to the LCD 162 from the image data supplied thereto, and supplies this to the LCD 162. As a result, an image corresponding to the image data recorded on the recording medium 170 is displayed on the display screen of the LCD 162. The camera DSP 120 detects a face included in the image data supplied from the preprocessing circuit 113 or the recording medium 170, and outputs information about the detected face to the control unit 130.

また、撮像装置１００には、外部Ｉ／Ｆ１６３が設けられている。この外部Ｉ／Ｆ１６３を通じて、例えば外部のパーソナルコンピュータと接続して、パーソナルコンピュータから画像データの供給を受けて、これを撮像装置１００に装着された記録媒体１７０に記録したり、また、撮像装置１００に装着された記録媒体１７０に記録されている画像データを外部のパーソナルコンピュータ等に供給したりすることもできるものである。 The imaging apparatus 100 is also provided with an external I/F 163. Through the external I/F 163, the imaging apparatus 100 can be connected to, for example, an external personal computer, receive image data from the personal computer, and record it on the recording medium 170 mounted in the imaging apparatus 100. It can also supply image data recorded on the recording medium 170 mounted in the imaging apparatus 100 to an external personal computer or the like.

また、通信Ｉ／Ｆ１６４は、いわゆるネットワークインターフェースカード（ＮＩＣ）などからなり、ネットワークに接続して、ネットワークを通じて種々の画像データやその他の情報を取得する。 The communication I / F 164 includes a so-called network interface card (NIC) or the like, and is connected to the network and acquires various image data and other information through the network.

また、外部のパーソナルコンピュータやネットワークを通じて取得し、記録媒体１７０に記録された画像データ等の情報についても、上述したように、撮像装置１００において読み出して再生し、ＬＣＤ１６２に表示してユーザが利用することもできる。 Information such as image data acquired through an external personal computer or a network and recorded on the recording medium 170 can likewise be read out and reproduced by the imaging apparatus 100, as described above, and displayed on the LCD 162 for use by the user.

なお、通信Ｉ／Ｆ１６４は、ＩＥＥＥ（Institute of Electrical and Electronic Engineers）１３９４またはＵＳＢ（Universal Serial Bus）などの規格に準拠した有線用インタフェースとして設けることも可能であり、また、ＩＥＥＥ８０２．１１ａ、ＩＥＥＥ８０２．１１ｂ、ＩＥＥＥ８０２．１１ｇ、または、ブルートゥースの規格に準拠した光や電波による無線インタフェースとして設けることも可能である。すなわち、通信Ｉ／Ｆ１６４は、有線または無線の何れのインタフェースであってもよい。 The communication I/F 164 can be provided as a wired interface conforming to a standard such as IEEE (Institute of Electrical and Electronics Engineers) 1394 or USB (Universal Serial Bus). It can also be provided as a wireless interface using light or radio waves that conforms to IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, or the Bluetooth standard. That is, the communication I/F 164 may be either a wired or a wireless interface.

このように、撮像装置１００は、被写体の画像を撮影して、撮像装置１００に装填された記録媒体１７０に記録することができるとともに、記録媒体１７０に記録された画像データを読み出して、これを再生し、利用することができるものである。また、外部のパーソナルコンピュータやネットワークを通じて、画像データの提供を受けて、これを撮像装置１００に装填された記録媒体１７０に記録したり、また、読み出して再生したりすることもできる。 As described above, the imaging apparatus 100 can capture an image of a subject and record it on the recording medium 170 loaded in the imaging apparatus 100, and can also read out image data recorded on the recording medium 170 and reproduce it for use. In addition, it can receive image data through an external personal computer or a network, record the data on the recording medium 170 loaded in the imaging apparatus 100, and read it out and reproduce it.

次に、本発明の実施の形態で用いる動画コンテンツファイルについて図面を詳細に説明する。 Next, the moving image content file used in the embodiment of the present invention will be described in detail with reference to the drawings.

図２は、撮像装置１００で撮影された画像データがＭＰＥＧ４−ＡＶＣ（MPEG-4 part10:AVC）で符号化された、ビデオ信号の所定フレームを模式的に示す図である。 FIG. 2 is a diagram schematically illustrating a predetermined frame of a video signal in which image data captured by the imaging apparatus 100 is encoded by MPEG4-AVC (MPEG-4 part 10: AVC).

本発明の実施の形態では、ＭＰＥＧ４−ＡＶＣで符号化されたビデオ信号の何れかのフレームに含まれる人間の顔を検出し、検出された顔に対応する顔メタデータを記録する記録方法について説明する。 The embodiment of the present invention describes a recording method that detects a human face included in any frame of a video signal encoded by MPEG4-AVC and records face metadata corresponding to the detected face.

ＭＰＥＧ４−ＡＶＣ規格では、動画像符号化処理を扱うＶＣＬ（Video Coding Layer）と、符号化された情報を伝送、蓄積する下位システムとの間にＮＡＬ（Network Abstraction Layer）が存在する。また、シーケンスやピクチャのヘッダ情報に相当するパラメータセットをＶＣＬで生成された情報と分離して扱うことができる。さらに、ＭＰＥＧ−２システムなどの下位システムへのビットストリームの対応付けは、ＮＡＬの一区切りである「ＮＡＬユニット」を単位として行われる。 In the MPEG4-AVC standard, there is a NAL (Network Abstraction Layer) between a VCL (Video Coding Layer) that handles moving image encoding processing and a lower system that transmits and stores encoded information. Also, parameter sets corresponding to sequence and picture header information can be handled separately from the information generated by the VCL. Further, the bit stream is associated with a lower system such as the MPEG-2 system in units of “NAL units” which are one segment of the NAL.

ここでは、主なＮＡＬユニットについて説明する。ＳＰＳ（Sequence Parameter Set）ＮＡＬユニットには、プロファイル、レベル情報等シーケンス全体の符号化に関わる情報が含まれる。後述するＡＵ（Access Unit）において、ＳＰＳＮＡＬユニットが挿入されているＡＵ区間が、一般的には１シーケンスとされる。そして、この１シーケンスを編集単位として、ストリームの部分消去、結合等の編集が行われる。ＰＰＳ（Picture Parameter Set）ＮＡＬユニットには、エントロピー符号化モード、ピクチャ単位の量子化パラメータ等のピクチャ全体の符号化モードに関する情報が含まれる。 Here, main NAL units will be described. An SPS (Sequence Parameter Set) NAL unit includes information relating to encoding of the entire sequence such as profile and level information. In an AU (Access Unit) described later, an AU section in which an SPS NAL unit is inserted is generally one sequence. Then, editing such as partial deletion and combination of streams is performed with this one sequence as an editing unit. A PPS (Picture Parameter Set) NAL unit includes information related to the coding mode of the entire picture such as an entropy coding mode and a quantization parameter for each picture.

ＣｏｄｅｄＳｌｉｃｅｏｆａｎＩＤＲｐｉｃｔｕｒｅＮＡＬユニットには、ＩＤＲ（Instantaneous Decoder Refresh）ピクチャの符号化データが格納される。ＣｏｄｅｄＳｌｉｃｅｏｆａｎｏｎＩＤＲｐｉｃｔｕｒｅＮＡＬユニットには、ＩＤＲピクチャでない、その他のピクチャの符号化データが格納される。 Coded slice of an IDR picture NAL unit stores encoded data of an IDR (Instantaneous Decoder Refresh) picture. Coded slice of a non IDR picture NAL unit stores encoded data of other pictures that are not IDR pictures.

ＳＥＩ（Supplemental Enhancement Information）ＮＡＬユニットには、ＶＣＬの符号に必須でない付加情報が格納される。例えば、ランダムアクセスを行うのに便利な情報、ユーザが独自に定義する情報等が格納される。ＡＵＤ（Access Unit Delimiter）ＮＡＬユニットは、後述するアクセスユニット（ＡＵ）の先頭に付加される。このＡＵＤＮＡＬユニットには、アクセスユニットに含まれるスライスの種類を示す情報が含まれる。その他、シーケンスの終了を示すＥＯＳ（End Of Sequence）ＮＡＬユニット、および、ストリームの終了を示すＥＯＳＴ（End Of Stream）ＮＡＬユニットが定義されている。 In the SEI (Supplemental Enhancement Information) NAL unit, additional information that is not essential for the VCL code is stored. For example, information useful for random access, information uniquely defined by the user, and the like are stored. An AUD (Access Unit Delimiter) NAL unit is added to the head of an access unit (AU) described later. The AUD NAL unit includes information indicating the type of slice included in the access unit. In addition, an EOS (End Of Sequence) NAL unit indicating the end of the sequence and an EOST (End Of Stream) NAL unit indicating the end of the stream are defined.

ビットストリーム中の情報をピクチャ単位にアクセスするために、いくつかのＮＡＬユニットをまとめたものをアクセスユニット（ＡＵ）と呼ぶ。アクセスユニットには、ピクチャのスライスに相当するＮＡＬユニット（Coded Slice of an IDR picture ＮＡＬユニットまたはCoded Slice of a non IDR picture ＮＡＬユニット）が必ず含まれる。本発明の実施の形態では、あるＳＰＳＮＡＬユニットを含むＡＵを始点とし、ＥＯＳＮＡＬユニットを含むＡＵを終点とした一連のＡＵの括りを１シーケンスとして定義する。さらにＳＰＳを含むＡＵは、ＩＤＲピクチャまたはＩピクチャのスライスに相当されるＮＡＬユニットを含むものとする。つまり、１シーケンスの復号化順における先頭には他のピクチャに依存せずに復号可能なＩＤＲピクチャまたはＩピクチャを有することになるため、１シーケンスをランダムアクセスの単位、または編集における編集単位とすることが可能となる。 To access information in the bitstream in units of pictures, several NAL units are grouped into what is called an access unit (AU). An access unit always includes a NAL unit corresponding to a slice of a picture (a Coded Slice of an IDR picture NAL unit or a Coded Slice of a non IDR picture NAL unit). In the embodiment of the present invention, a series of AUs that starts at an AU including an SPS NAL unit and ends at an AU including an EOS NAL unit is defined as one sequence. Furthermore, an AU that includes an SPS is assumed to include a NAL unit corresponding to a slice of an IDR picture or an I picture. That is, because the head of a sequence in decoding order is an IDR picture or I picture that can be decoded without depending on other pictures, one sequence can serve as the unit of random access or as the editing unit in editing.
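
The sequence definition above can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the numeric values are the nal_unit_type codes of the H.264/AVC standard, while the representation of an AU as a plain list of type codes, the function name split_into_sequences, and the choice to close an unterminated sequence when the next SPS arrives are assumptions made for the example.

```python
from enum import IntEnum


class NalType(IntEnum):
    """nal_unit_type codes from H.264/AVC for the NAL units named in the text."""
    NON_IDR_SLICE = 1   # Coded slice of a non-IDR picture
    IDR_SLICE = 5       # Coded slice of an IDR picture
    SEI = 6             # Supplemental enhancement information
    SPS = 7             # Sequence parameter set
    PPS = 8             # Picture parameter set
    AUD = 9             # Access unit delimiter
    END_OF_SEQ = 10     # End of sequence
    END_OF_STREAM = 11  # End of stream


def split_into_sequences(access_units):
    """Group AUs into sequences.

    A sequence starts at an AU containing an SPS and ends at an AU
    containing an end-of-sequence NAL unit. Assumption for this sketch:
    if a new SPS arrives before an end-of-sequence unit, the open
    sequence is closed at that point.
    """
    sequences = []
    current = None
    for au in access_units:
        if NalType.SPS in au:
            if current is not None:
                sequences.append(current)
            current = []
        if current is not None:
            current.append(au)
            if NalType.END_OF_SEQ in au:
                sequences.append(current)
                current = None
    if current is not None:
        sequences.append(current)
    return sequences


if __name__ == "__main__":
    stream = [
        [NalType.AUD, NalType.SPS, NalType.PPS, NalType.SEI, NalType.IDR_SLICE],
        [NalType.AUD, NalType.NON_IDR_SLICE],
        [NalType.AUD, NalType.NON_IDR_SLICE, NalType.END_OF_SEQ],
        [NalType.AUD, NalType.SPS, NalType.PPS, NalType.IDR_SLICE],
        [NalType.AUD, NalType.NON_IDR_SLICE],
    ]
    for index, seq in enumerate(split_into_sequences(stream)):
        print(index, len(seq))
```

Because every sequence begins with an independently decodable picture, each list returned here could be cut out or concatenated without touching its neighbours, which is the property the text relies on for random access and editing.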

例えば、図２に示すように、ＳＰＳを含むＡＵ１８０には、ＳＥＩＮＡＬユニット１８１が含まれ、ＳＰＳを含むＡＵ１９０には、ＳＥＩＮＡＬユニット１９１が含まれているものとする。このＳＥＩＮＡＬユニット１８１およびＳＥＩＮＡＬユニット１９１については、本発明の実施の形態の変形例において詳細に説明する。 For example, as shown in FIG. 2, it is assumed that the AU 180 including the SPS includes the SEI NAL unit 181 and the AU 190 including the SPS includes the SEI NAL unit 191. The SEI NAL unit 181 and the SEI NAL unit 191 will be described in detail in a modification of the embodiment of the present invention.

なお、本発明の実施の形態では、動画コンテンツから人間の顔を抽出する際、その検出の単位をこの１シーケンスとする。すなわち、１シーケンス内において、このシーケンスに含まれる１フレームのみから顔を検出し、他のフレームからは顔を検出しない。ただし、所定シーケンス間隔おきに顔を検出するようにしてもよく、ＩＤＲを含むシーケンスおきに顔を検出するようにしてもよい。 In the embodiment of the present invention, when a human face is extracted from moving image content, the unit of detection is defined as one sequence. That is, in one sequence, a face is detected from only one frame included in this sequence, and a face is not detected from other frames. However, a face may be detected every predetermined sequence interval, or a face may be detected every sequence including IDR.
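
As a rough illustration of this detection granularity, the following Python sketch picks the frames on which face detection would run. The text only says that one frame per sequence is used; taking the first frame of each sequence, the function name, and the interval parameter are assumptions for the example.

```python
def frames_for_face_detection(sequence_start_frames, interval=1):
    """Return the frame indices on which face detection would run.

    sequence_start_frames: index of the first frame of each sequence
    (assumed here to be the one frame examined per sequence).
    interval: examine every `interval`-th sequence; 1 means every sequence.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return [
        start
        for position, start in enumerate(sequence_start_frames)
        if position % interval == 0
    ]


if __name__ == "__main__":
    starts = [0, 15, 30, 45, 60]
    print(frames_for_face_detection(starts))      # one frame in every sequence
    print(frames_for_face_detection(starts, 2))   # one frame in every second sequence
```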

次に、記録媒体１７０に記録されている実ファイルについて図面を参照して詳細に説明する。 Next, the actual file recorded on the recording medium 170 will be described in detail with reference to the drawings.

図３は、ファイルシステム（File System）上に登録されている実ファイルのファイル構造を概略的に示す図である。本発明の実施の形態では、動画または静止画コンテンツファイルとこれらのコンテンツファイルに関する顔メタデータとについて、実ディレクトリとは異なる仮想的なエントリ構造で管理する。具体的には、動画または静止画コンテンツファイル以外に、これらのファイルと顔メタデータとを管理するコンテンツ管理ファイル３４０が記録媒体１７０に記録される。 FIG. 3 is a diagram schematically showing the file structure of an actual file registered on the file system. In the embodiment of the present invention, a moving image or still image content file and face metadata related to these content files are managed with a virtual entry structure different from the real directory. Specifically, in addition to the moving image or still image content file, a content management file 340 that manages these files and face metadata is recorded on the recording medium 170.

ルートディレクトリ３００には、動画コンテンツフォルダ３１０と、静止画コンテンツフォルダ３２０と、コンテンツ管理フォルダ３３０とが属する。 A moving image content folder 310, a still image content folder 320, and a content management folder 330 belong to the root directory 300.

動画コンテンツフォルダ３１０は、撮像装置１００で撮像された動画データである動画コンテンツファイル３１１および３１２が属する動画コンテンツフォルダである。なお、この例では、動画コンテンツファイル３１１および３１２が動画コンテンツフォルダ３１０に属するものと想定している。 The moving image content folder 310 is a moving image content folder to which the moving image content files 311 and 312 that are moving image data captured by the imaging apparatus 100 belong. In this example, it is assumed that the moving image content files 311 and 312 belong to the moving image content folder 310.

静止画コンテンツフォルダ３２０は、撮像装置１００で撮像された静止画データである静止画コンテンツファイル３２１および３２２が属する静止画コンテンツフォルダである。なお、この例では、静止画コンテンツファイル３２１および３２２が静止画コンテンツフォルダ３２０に属するものと想定している。 The still image content folder 320 is a still image content folder to which still image content files 321 and 322 that are still image data captured by the imaging apparatus 100 belong. In this example, it is assumed that the still image content files 321 and 322 belong to the still image content folder 320.

コンテンツ管理フォルダ３３０は、コンテンツ管理ファイル３４０が属するコンテンツ管理フォルダである。コンテンツ管理ファイル３４０は、動画コンテンツフォルダ３１０および静止画コンテンツフォルダ３２０に属する各コンテンツファイルを仮想的な階層エントリで管理するファイルであり、プロパティファイル４００とサムネイルファイル５００とで構成されている。プロパティファイル４００は、各コンテンツファイルを仮想的に管理するための管理情報と、各コンテンツファイルの作成日時等のコンテンツ属性情報と、顔メタデータ等の各コンテンツファイルに付随するメタデータとが記録されているファイルである。また、サムネイルファイル５００は、各コンテンツファイルの代表サムネイル画像が格納されているファイルである。なお、プロパティファイル４００およびサムネイルファイル５００の詳細については、図４乃至図８等を参照して詳細に説明する。 The content management folder 330 is the folder to which the content management file 340 belongs. The content management file 340 manages each content file belonging to the moving image content folder 310 and the still image content folder 320 by means of virtual hierarchical entries, and consists of a property file 400 and a thumbnail file 500. The property file 400 is a file that records management information for virtually managing each content file, content attribute information such as the creation date and time of each content file, and metadata accompanying each content file, such as face metadata. The thumbnail file 500 is a file that stores a representative thumbnail image of each content file. The property file 400 and the thumbnail file 500 will be described in detail with reference to FIGS. 4 to 8 and other figures.

ここで、動画コンテンツフォルダ３１０に属する各動画コンテンツファイル、および、静止画コンテンツフォルダ３２０に属する各静止画コンテンツファイルは、ユーザに可視である。すなわち、ユーザからの操作入力によって、これらのコンテンツファイルに対応する画像をＬＣＤ１６２に表示させることが可能である。 Here, each moving image content file belonging to the moving image content folder 310 and each still image content file belonging to the still image content folder 320 are visible to the user. That is, an image corresponding to these content files can be displayed on the LCD 162 by an operation input from the user.

一方、コンテンツ管理ファイル３４０については、コンテンツ管理ファイル３４０の内容がユーザに改変されることを避けるため、ユーザに不可視とする。コンテンツ管理ファイル３４０の内容を不可視とする具体的な設定方法として、例えば、ファイルシステムの対象となるコンテンツ管理フォルダ３３０を不可視にするフラグをオンにすることによってコンテンツ管理ファイル３４０の内容を不可視とすることができる。さらに、不可視にするタイミングとして、例えば、撮像装置１００がＵＳＢ（Universal Serial Bus）経由でＰＣ（パーソナルコンピュータ）と接続された場合（マスストレージ接続）において、撮像装置１００が接続を感知したとき（接続が正しく行えたという信号をＰＣ（ホスト）から受信したとき）に、上記フラグをオンにするようにしてもよい。 On the other hand, the content management file 340 is made invisible to the user to prevent its contents from being altered by the user. As a specific way of making the contents of the content management file 340 invisible, for example, the file system flag that hides the content management folder 330 can be turned on. As for the timing, for example, when the imaging apparatus 100 is connected to a PC (personal computer) via USB (Universal Serial Bus) (mass storage connection), the flag may be turned on when the imaging apparatus 100 detects the connection, that is, when it receives from the PC (host) a signal indicating that the connection was established correctly.

次に、プロパティファイル４００の仮想的なエントリ構造について図面を参照して詳細に説明する。 Next, the virtual entry structure of the property file 400 will be described in detail with reference to the drawings.

図４は、プロパティファイル４００が管理する仮想フォルダおよび仮想ファイルの構成例を示す図である。 FIG. 4 is a diagram illustrating a configuration example of virtual folders and virtual files managed by the property file 400.

プロパティファイル４００は、上述したように、記録媒体１７０に記録されている動画または静止画コンテンツファイルを管理するものであり、アプリケーションに応じた柔軟性のある管理方法が可能である。例えば、動画または静止画コンテンツファイルが撮像装置１００に記録された日時に応じて管理することができる。また、動画または静止画の種別に応じて管理することができる。ここでは、記録された日時に応じて動画コンテンツファイルを分類して管理する管理方法について説明する。また、各エントリ内に示す数字は、エントリ番号を示す数字である。なお、エントリ番号については、図７を参照して詳細に説明する。 As described above, the property file 400 manages the moving image or still image content file recorded on the recording medium 170, and a flexible management method according to the application is possible. For example, management can be performed according to the date and time when a moving image or still image content file was recorded in the imaging apparatus 100. Also, management can be performed according to the type of moving image or still image. Here, a management method for classifying and managing moving image content files according to the recorded date and time will be described. The numbers shown in each entry are numbers indicating the entry number. The entry number will be described in detail with reference to FIG.

ルートエントリ４０７は、階層型エントリ構造における最上階層のエントリである。この例では、ルートエントリ４０７には、動画フォルダエントリ４１０および静止画フォルダエントリ４０９が属する。また、プロファイルエントリ４０８（エントリ番号：＃１５０）は、各ファイルエントリのコーデック情報（符号化フォーマット、画サイズ、ビットレート等）を一括して保存するエントリである。なお、プロファイルエントリ４０８については、図７（ｃ）を参照して詳細に説明する。静止画フォルダエントリ４０９は、静止画に関する日付フォルダエントリを下位の階層で管理するエントリである。動画フォルダエントリ４１０（エントリ番号：＃１）は、日付フォルダエントリを下位の階層で管理するエントリである。この例では、動画フォルダエントリ４１０には、日付フォルダエントリ４１１および日付フォルダエントリ４１６が属する。 The root entry 407 is the entry at the top of the hierarchical entry structure. In this example, a moving image folder entry 410 and a still image folder entry 409 belong to the root entry 407. A profile entry 408 (entry number: #150) is an entry that collectively stores the codec information (encoding format, image size, bit rate, etc.) of each file entry. The profile entry 408 will be described in detail with reference to FIG. 7(c). The still image folder entry 409 is an entry that manages date folder entries for still images in the layer below it. The moving image folder entry 410 (entry number: #1) is an entry that manages date folder entries in the layer below it. In this example, a date folder entry 411 and a date folder entry 416 belong to the moving image folder entry 410.

日付フォルダエントリ４１１（エントリ番号：＃３）および日付フォルダエントリ４１６（エントリ番号：＃５）は、記録媒体１７０に記録されている動画コンテンツファイルを日付毎に分類して管理するエントリであり、分類された動画コンテンツファイルを下位の階層で管理するエントリである。この例では、日付フォルダエントリ４１１は、「２００６／１／１１」に記録された動画コンテンツファイルを管理するエントリとし、日付フォルダエントリ４１１には動画ファイルエントリ４１２および動画ファイルエントリ４１４が属する。また、日付フォルダエントリ４１６は、「２００６／７／２８」に記録された動画コンテンツファイルを管理するエントリとし、日付フォルダエントリ４１６には動画ファイルエントリ４１７および動画ファイルエントリ４１９が属する。なお、フォルダエントリの詳細については、図５を参照して詳細に説明する。 The date folder entry 411 (entry number: #3) and the date folder entry 416 (entry number: #5) are entries that classify the moving image content files recorded on the recording medium 170 by date and manage the classified files in the layer below them. In this example, the date folder entry 411 manages the moving image content files recorded on "2006/1/11", and the moving image file entry 412 and the moving image file entry 414 belong to it. The date folder entry 416 manages the moving image content files recorded on "2006/7/28", and the moving image file entry 417 and the moving image file entry 419 belong to it. Folder entries will be described in detail with reference to FIG. 5.

動画ファイルエントリ４１２（エントリ番号：＃７）、動画ファイルエントリ４１４（エントリ番号：＃２８）、動画ファイルエントリ４１７（エントリ番号：＃１４）、動画ファイルエントリ４１９（エントリ番号：＃２１）には、記録媒体１７０に記録されている各動画コンテンツファイルを仮想的に管理するための管理情報と、各動画コンテンツファイルの作成日時等のコンテンツ属性情報とが格納されている。なお、ファイルエントリの詳細については、図５を参照して詳細に説明する。 The moving image file entry 412 (entry number: #7), the moving image file entry 414 (entry number: #28), the moving image file entry 417 (entry number: #14), and the moving image file entry 419 (entry number: #21) each store management information for virtually managing a moving image content file recorded on the recording medium 170 and content attribute information such as the creation date and time of that file. File entries will be described in detail with reference to FIG. 5.

メタデータエントリ４１３（エントリ番号：＃１０）、メタデータエントリ４１５（エントリ番号：＃３１）、メタデータエントリ４１８（エントリ番号：＃１７）、メタデータエントリ４２０（エントリ番号：＃２４）は、それぞれ連結されている動画ファイルエントリが管理する動画コンテンツファイルに付随するメタデータを格納するメタデータエントリである。メタデータとして、この例では、動画コンテンツファイルから抽出された顔データが格納される。この顔データは、動画コンテンツファイルから抽出された顔に関する各種データであり、例えば、図１１に示すように、顔検出時刻情報、顔基本情報、顔スコア、笑顔スコア等のデータである。なお、メタデータエントリの詳細については、図５乃至図１６を参照して詳細に説明する。 The metadata entry 413 (entry number: #10), the metadata entry 415 (entry number: #31), the metadata entry 418 (entry number: #17), and the metadata entry 420 (entry number: #24) are metadata entries, each of which stores metadata accompanying the moving image content file managed by the moving image file entry to which it is linked. In this example, face data extracted from the moving image content file is stored as the metadata. This face data is various data on faces extracted from the moving image content file, for example, face detection time information, basic face information, a face score, and a smile score, as shown in FIG. 11. Metadata entries will be described in detail with reference to FIGS. 5 to 16.
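
The virtual hierarchy described for FIG. 4 can be pictured with a small Python sketch. The entry numbers and the folder-to-file relationships come from the text; the table layout and function names are invented for the example, the root entry 407 is omitted because its entry number is not stated here, and the pairing of metadata entries 413, 418, and 420 with file entries 412, 417, and 419 is an assumption (only the pairing of 415 with 414 is stated explicitly).

```python
# entry number -> (entry type, parent entry number)
ENTRIES = {
    1: ("movie folder entry", None),   # moving image folder entry 410
    3: ("date folder entry", 1),       # date folder entry 411 (2006/1/11)
    5: ("date folder entry", 1),       # date folder entry 416 (2006/7/28)
    7: ("movie file entry", 3),        # moving image file entry 412
    28: ("movie file entry", 3),       # moving image file entry 414
    14: ("movie file entry", 5),       # moving image file entry 417
    21: ("movie file entry", 5),       # moving image file entry 419
    10: ("metadata entry", 7),         # metadata entry 413 (pairing assumed)
    31: ("metadata entry", 28),        # metadata entry 415
    17: ("metadata entry", 14),        # metadata entry 418 (pairing assumed)
    24: ("metadata entry", 21),        # metadata entry 420 (pairing assumed)
}


def children_of(number):
    """Entry numbers whose parent is `number`, in ascending order."""
    return sorted(n for n, (_, parent) in ENTRIES.items() if parent == number)


def metadata_under(number):
    """Collect the metadata entries found anywhere below `number`."""
    found = []
    for child in children_of(number):
        if ENTRIES[child][0] == "metadata entry":
            found.append(child)
        found.extend(metadata_under(child))
    return found


if __name__ == "__main__":
    print(children_of(1))      # date folders under the movie folder
    print(metadata_under(3))   # face metadata for content recorded on 2006/1/11
```

Walking the entries this way lets an application list the face metadata for a given date without opening any of the actual content files.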

次に、コンテンツ管理ファイルとコンテンツファイルとの関係について図面を参照して詳細に説明する。 Next, the relationship between the content management file and the content file will be described in detail with reference to the drawings.

図５は、コンテンツ管理ファイル３４０を構成するプロパティファイル４００およびサムネイルファイル５００と、動画コンテンツフォルダ３１０に属する動画コンテンツファイル３１１乃至３１６との関係を概略的に示す図である。ここでは、図４に示す日付フォルダエントリ４１１、動画ファイルエントリ４１４、メタデータエントリ４１５と、代表サムネイル画像５０２と、動画コンテンツファイル３１２との関係について説明する。 FIG. 5 is a diagram schematically showing the relationship between the property file 400 and the thumbnail file 500 constituting the content management file 340 and the moving image content files 311 to 316 belonging to the moving image content folder 310. Here, the relationship between the date folder entry 411, the movie file entry 414, the metadata entry 415, the representative thumbnail image 502, and the movie content file 312 shown in FIG. 4 will be described.

日付フォルダエントリ４１１は、実コンテンツファイルの日付を仮想的に管理するフォルダエントリであり、「エントリ種別」、「親エントリリスト」、「親エントリ種別」、「子エントリリスト」、「子エントリ種別」、「スロット有効フラグ」、「スロットチェーン」等の情報が格納されている。 The date folder entry 411 is a folder entry that virtually manages the date of the actual content file, and includes “entry type”, “parent entry list”, “parent entry type”, “child entry list”, and “child entry type”. , “Slot valid flag”, “slot chain”, and the like are stored.

なお、エントリ番号は、各エントリを識別するための識別番号であり、日付フォルダエントリ４１１のエントリ番号として「＃３」が割り当てられる。なお、このエントリ番号の割り当て方法については、図７および図８を参照して説明する。 The entry number is an identification number for identifying each entry, and "#3" is assigned as the entry number of the date folder entry 411. The method of assigning entry numbers will be described with reference to FIGS. 7 and 8.

「エントリ種別」は、このエントリの種類を示すものであり、エントリの種類に応じて「動画フォルダエントリ」、「日付フォルダエントリ」、「動画ファイルエントリ」、「静止画ファイルエントリ」、「メタデータエントリ」等が格納される。例えば、日付フォルダエントリ４１１の「エントリ種別」には「日付フォルダエントリ」が格納される。 “Entry type” indicates the type of this entry, and “video folder entry”, “date folder entry”, “video file entry”, “still image file entry”, “metadata” depending on the type of entry. “Entry” and the like are stored. For example, “date folder entry” is stored in “entry type” of the date folder entry 411.

「親エントリリスト」には、このエントリが属する上位の階層エントリである親エントリに対応するエントリ番号が格納される。例えば、日付フォルダエントリ４１１の「親エントリリスト」には「＃１」が格納される。 The “parent entry list” stores an entry number corresponding to a parent entry which is an upper layer entry to which this entry belongs. For example, “# 1” is stored in the “parent entry list” of the date folder entry 411.

「親エントリ種別」は、「親エントリリスト」に格納されているエントリ番号に対応する親エントリの種類を示すものであり、親エントリの種類に応じて「動画フォルダエントリ」、「日付フォルダエントリ」、「動画ファイルエントリ」、「静止画ファイルエントリ」等が格納される。例えば、日付フォルダエントリ４１１の「親エントリ種別」には「動画フォルダエントリ」が格納される。 “Parent entry type” indicates the type of the parent entry corresponding to the entry number stored in the “Parent entry list”, and “Movie folder entry” and “Date folder entry” according to the type of the parent entry. , “Moving image file entry”, “still image file entry” and the like are stored. For example, “moving image folder entry” is stored in the “parent entry type” of the date folder entry 411.

「子エントリリスト」は、このエントリに属する下位階層のエントリである子エントリに対応するエントリ番号が記録される。例えば、日付フォルダエントリ４１１の「子エントリリスト」には「＃７」および「＃２８」が格納される。 In the “child entry list”, an entry number corresponding to a child entry that is a lower-level entry belonging to this entry is recorded. For example, “# 7” and “# 28” are stored in the “child entry list” of the date folder entry 411.

「子エントリ種別」は、「子エントリリスト」に格納されているエントリ番号に対応する子エントリの種類を示すものであり、子エントリの種類に応じて「動画フォルダエントリ」、「日付フォルダエントリ」、「動画ファイルエントリ」、「静止画ファイルエントリ」、「メタデータエントリ」等が記録される。例えば、日付フォルダエントリ４１１の「子エントリ種別」には「動画ファイルエントリ」が格納される。 “Child entry type” indicates the type of child entry corresponding to the entry number stored in the “child entry list”, and “video folder entry” and “date folder entry” according to the type of child entry. , “Moving image file entry”, “still image file entry”, “metadata entry”, and the like are recorded. For example, “moving image file entry” is stored in the “child entry type” of the date folder entry 411.

「スロット有効フラグ」は、このエントリを構成する各スロットが有効であるか無効であるかを示すフラグである。「スロットチェーン」は、このエントリを構成する各スロットに関するリンクや連結等の情報である。なお、「スロット有効フラグ」および「スロットチェーン」については、図７（ｂ）を参照して詳細に説明する。 The "slot valid flag" is a flag indicating whether each slot constituting this entry is valid or invalid. The "slot chain" is information such as links and connections between the slots constituting this entry. The "slot valid flag" and the "slot chain" will be described in detail with reference to FIG. 7(b).

動画ファイルエントリ４１４は、実コンテンツファイルを仮想的に管理するファイルエントリであり、仮想管理情報４０１およびコンテンツ属性情報４０２が格納されている。仮想管理情報４０１には、「エントリ種別」、「コンテンツ種別」、「コンテンツアドレス」、「親エントリリスト」、「親エントリ種別」、「子エントリリスト」、「子エントリ種別」、「スロット有効フラグ」、「スロットチェーン」等の情報が格納されている。なお、「エントリ種別」、「親エントリリスト」、「親エントリ種別」、「子エントリリスト」、「子エントリ種別」、「スロット有効フラグ」、「スロットチェーン」については、日付フォルダエントリ４１１で示したものと同様であるため、ここでの説明は省略する。 The moving image file entry 414 is a file entry for virtually managing an actual content file, and stores virtual management information 401 and content attribute information 402. The virtual management information 401 includes “entry type”, “content type”, “content address”, “parent entry list”, “parent entry type”, “child entry list”, “child entry type”, “slot valid flag”. "," Slot chain "and the like are stored. The “entry type”, “parent entry list”, “parent entry type”, “child entry list”, “child entry type”, “slot valid flag”, and “slot chain” are indicated by the date folder entry 411. The description is omitted here because it is similar to the above.

「コンテンツ種別」は、このファイルエントリに対応するコンテンツファイルの種類を示すものであり、ファイルエントリに対応するコンテンツファイルの種類に応じて、「動画コンテンツファイル」、「静止画コンテンツファイル」等が記録される。例えば、動画ファイルエントリ４１４の「コンテンツ種別」には「動画コンテンツファイル」が格納される。 The "content type" indicates the type of the content file corresponding to this file entry, and "moving image content file", "still image content file", or the like is recorded according to that type. For example, "moving image content file" is stored in the "content type" of the moving image file entry 414.

「コンテンツアドレス」は、記録媒体１７０に記録されている動画コンテンツファイルの記録位置を示す情報であり、この記録位置情報によって記録媒体１７０に記録されている動画コンテンツファイルへのアクセスが可能となる。例えば、動画ファイルエントリ４１４の「コンテンツアドレス」には、動画コンテンツファイル３１２のアドレスを示す「Ａ３１２」が格納される。 The “content address” is information indicating the recording position of the moving image content file recorded on the recording medium 170, and the moving image content file recorded on the recording medium 170 can be accessed by this recording position information. For example, “content address” of the moving image file entry 414 stores “A 312” indicating the address of the moving image content file 312.

コンテンツ属性情報４０２は、仮想管理情報４０１に格納されているコンテンツファイルの属性情報であり、「作成日時」、「更新日時」、「区間情報」、「サイズ情報」、「サムネイルアドレス」、「プロファイル情報」等の情報が格納されている。 The content attribute information 402 is attribute information of the content file stored in the virtual management information 401, and includes “creation date”, “update date”, “section information”, “size information”, “thumbnail address”, “profile”. Information such as “information” is stored.

「作成日時」には、このファイルエントリに対応するコンテンツファイルが作成された日時が格納される。「更新日時」には、このファイルエントリに対応するコンテンツファイルが更新された日時が格納される。なお、「更新日時」を用いて、メタデータの不整合が判別される。「区間情報」には、このファイルエントリに対応するコンテンツファイルの時間の長さを示す情報が格納される。「サイズ情報」は、このファイルエントリに対応するコンテンツファイルのサイズを示す情報が格納される。 The “date and time of creation” stores the date and time when the content file corresponding to this file entry was created. “Update date and time” stores the date and time when the content file corresponding to this file entry was updated. Note that the inconsistency of metadata is determined using “update date and time”. The “section information” stores information indicating the length of time of the content file corresponding to this file entry. “Size information” stores information indicating the size of the content file corresponding to the file entry.

「サムネイルアドレス」は、サムネイルファイル５００に格納されている代表サムネイル画像の記録位置を示す情報であり、この位置情報によってサムネイルファイル５００に格納されている代表サムネイル画像へのアクセスが可能となる。例えば、動画ファイルエントリ４１４の「サムネイルアドレス」には、動画コンテンツファイル３１２の代表画像である代表サムネイル画像５０２のサムネイルファイル５００内部におけるエントリ番号が格納される。 The “thumbnail address” is information indicating the recording position of the representative thumbnail image stored in the thumbnail file 500, and this position information enables access to the representative thumbnail image stored in the thumbnail file 500. For example, the “thumbnail address” of the moving image file entry 414 stores the entry number in the thumbnail file 500 of the representative thumbnail image 502 that is the representative image of the moving image content file 312.

「プロファイル情報」には、プロファイルエントリ４０８内部に格納されているビデオ・オーディオエントリ（video audio entry）のエントリ番号が記録されている。なお、ビデオ・オーディオエントリについては、図７（ｃ）を参照して詳細に説明する。 In the “profile information”, the entry number of a video/audio entry (video audio entry) stored in the profile entry 408 is recorded. The video/audio entry will be described in detail with reference to FIG. 7(c).

メタデータエントリ４１５には、「エントリ種別」、「親エントリリスト」、「親エントリ種別」、「スロット有効フラグ」、「スロットチェーン」、「メタデータ」等の情報が格納されている。なお、「エントリ種別」、「親エントリリスト」、「親エントリ種別」「スロット有効フラグ」、「スロットチェーン」については、日付フォルダエントリ４１１で示したものと同様であるため、ここでの説明は省略する。 The metadata entry 415 stores information such as “entry type”, “parent entry list”, “parent entry type”, “slot valid flag”, “slot chain”, and “metadata”. Since “entry type”, “parent entry list”, “parent entry type”, “slot valid flag”, and “slot chain” are the same as those described for the date folder entry 411, their description is omitted here.

「メタデータ」は、このメタデータエントリが属する上位の階層ファイルエントリである親エントリに対応するコンテンツファイルから取得された各種属性情報（メタデータ）である。この「メタデータ」に格納される各種情報については、図９乃至図１６を参照して詳細に説明する。 “Metadata” is various pieces of attribute information (metadata) acquired from a content file corresponding to a parent entry that is an upper layer file entry to which this metadata entry belongs. Various information stored in the “metadata” will be described in detail with reference to FIGS. 9 to 16.

サムネイルファイル５００は、各コンテンツファイルの代表画像である代表サムネイル画像が格納されるサムネイルファイルである。例えば、図５に示すように、動画コンテンツフォルダ３１０に属する動画コンテンツファイル３１１乃至３１６の代表画像として、代表サムネイル画像５０１乃至５０６がサムネイルファイル５００に格納されている。なお、サムネイルファイル５００に格納されている各サムネイル画像については、プロパティファイル４００に含まれるコンテンツ属性情報４０２の「サムネイルアドレス」に基づいてアクセスすることができる。また、各コンテンツファイルについては、プロパティファイル４００に含まれる仮想管理情報４０１の「コンテンツアドレス」に基づいてアクセスすることができる。 The thumbnail file 500 is a thumbnail file in which a representative thumbnail image that is a representative image of each content file is stored. For example, as shown in FIG. 5, representative thumbnail images 501 to 506 are stored in the thumbnail file 500 as representative images of the moving image content files 311 to 316 belonging to the moving image content folder 310. Each thumbnail image stored in the thumbnail file 500 can be accessed based on the “thumbnail address” of the content attribute information 402 included in the property file 400. Each content file can be accessed based on the “content address” of the virtual management information 401 included in the property file 400.

次に、プロパティファイルに格納されている各エントリの親子関係について図面を参照して詳細に説明する。 Next, the parent-child relationship of each entry stored in the property file will be described in detail with reference to the drawings.

図６は、図４に示す動画フォルダエントリ４１０と、日付フォルダエントリ４１１と、動画ファイルエントリ４１２および４１４と、メタデータエントリ４１３および４１５との親子関係を概略的に示す図である。 FIG. 6 is a diagram schematically showing a parent-child relationship among the moving picture folder entry 410, the date folder entry 411, the moving picture file entries 412 and 414, and the metadata entries 413 and 415 shown in FIG.

動画フォルダエントリ４１０（エントリ番号：＃１）には、「子エントリリスト」等の情報が格納されている。例えば、「子エントリリスト」には「＃３」および「＃５」が格納される。 The movie folder entry 410 (entry number: # 1) stores information such as “child entry list”. For example, “# 3” and “# 5” are stored in the “child entry list”.

日付フォルダエントリ４１１（エントリ番号：＃３）には、「親エントリリスト」、「子エントリリスト」等の情報が格納されている。例えば、「親エントリリスト」には「＃１」が格納され、「子エントリリスト」には「＃７」および「＃２８」が格納される。 The date folder entry 411 (entry number: # 3) stores information such as “parent entry list” and “child entry list”. For example, “# 1” is stored in the “parent entry list”, and “# 7” and “# 28” are stored in the “child entry list”.

動画ファイルエントリ４１２（エントリ番号：＃７）および４１４（エントリ番号：＃２８）には、「親エントリリスト」、「子エントリリスト」、「コンテンツアドレス」、「サムネイルアドレス」等の情報が格納されている。例えば、動画ファイルエントリ４１２において、「親エントリリスト」には「＃３」が格納され、「子エントリリスト」には「＃１０」が格納され、「コンテンツアドレス」には「Ａ３１１」が格納され、「サムネイルアドレス」には「＃１」が格納される。なお、「サムネイルアドレス」に格納される「＃１」は、サムネイルファイル５００におけるエントリ番号であり、プロパティファイル４００に格納されている各エントリのエントリ番号とは異なる。なお、「サムネイルアドレス」については、図７を参照した説明において詳細する。 The moving image file entries 412 (entry number: #7) and 414 (entry number: #28) store information such as “parent entry list”, “child entry list”, “content address”, and “thumbnail address”. For example, in the moving image file entry 412, “#3” is stored in the “parent entry list”, “#10” is stored in the “child entry list”, “A311” is stored in the “content address”, and “#1” is stored in the “thumbnail address”. Note that the “#1” stored in the “thumbnail address” is an entry number within the thumbnail file 500, and is distinct from the entry numbers of the entries stored in the property file 400. The “thumbnail address” will be described in detail in the description given with reference to FIG. 7.

メタデータエントリ４１３（エントリ番号：＃１０）および４１５（エントリ番号：＃３１）には、「親エントリリスト」等の情報が格納されている。例えば、メタデータエントリ４１３において、「親エントリリスト」には「＃７」が格納される。これらの親子関係については、図６において、各エントリの親子関係について、「親エントリリスト」または「子エントリリスト」からの矢印で示す。また、図４に示す動画フォルダエントリ４１０と、日付フォルダエントリ４１６と、動画ファイルエントリ４１７および４１９と、メタデータエントリ４１８および４２０とについても、同様の親子関係が成立している。 In the metadata entries 413 (entry number: # 10) and 415 (entry number: # 31), information such as “parent entry list” is stored. For example, in the metadata entry 413, “# 7” is stored in the “parent entry list”. These parent-child relationships are indicated by arrows from the “parent entry list” or “child entry list” in FIG. The same parent-child relationship is also established for the movie folder entry 410, the date folder entry 416, the movie file entries 417 and 419, and the metadata entries 418 and 420 shown in FIG.

なお、図４および図６に示すプロパティファイル４００においては、１つのファイルエントリに１つのメタデータエントリを関連付けた構成例を示すが、１つのファイルエントリに複数のメタデータエントリを関連付けるようにしてもよい。すなわち、１つの親ファイルエントリに複数の子メタデータエントリを対応させることができる。 The property file 400 shown in FIGS. 4 and 6 is a configuration example in which one metadata entry is associated with one file entry, but a plurality of metadata entries may be associated with one file entry. That is, a plurality of child metadata entries can be associated with one parent file entry.

例えば、動画ファイルエントリ４１２の子メタデータエントリとして、顔メタデータを格納するメタデータエントリ４１３とともに、ＧＰＳ情報を格納するメタデータエントリ（エントリ番号：＃４０）（図示せず）を対応させ、動画ファイルエントリ４１２の子エントリリストに「＃１０」および「＃４０」を記録する。この場合には、子エントリリストの格納順序をメタデータの種類に応じて予め決めておくようにする。これにより、１つのファイルエントリに複数のメタデータを格納する場合において、メタデータの数が増加した場合でも、データ管理が煩雑になることを防止して、所望のメタデータの抽出時間を短縮することができる。なお、ここでのメタデータの種類とは、単なるデータの種類（顔メタ、ＧＰＳ等の種類）でもよく、メタデータがバイナリデータかテキストデータかというコーディングの種類でもよい。 For example, as child metadata entries of the moving image file entry 412, a metadata entry (entry number: #40) (not shown) that stores GPS information is associated together with the metadata entry 413 that stores face metadata, and “#10” and “#40” are recorded in the child entry list of the moving image file entry 412. In this case, the storage order of the child entry list is determined in advance according to the type of metadata. As a result, when a plurality of metadata items are stored for one file entry, data management is kept from becoming complicated even if the number of metadata items increases, and the time needed to extract the desired metadata is shortened. Note that the type of metadata here may be simply the kind of data (face metadata, GPS, etc.) or the coding type, that is, whether the metadata is binary data or text data.

図７（ａ）は、プロパティファイル４００の基本構造の一例を示す図であり、図７（ｂ）は、各エントリを構成するスロットの構造を示す図であり、図７（ｃ）は、プロファイルエントリに含まれる情報の一例を示す図であり、図７（ｄ）は、ヘッダ部４３０に含まれる情報のうちで、コンテンツ管理ファイル３４０が管理するコンテンツの種別を示す情報の一例を示す図である。また、図８は、図４に示すプロパティファイル４００の全体構造を概略的に示す図である。 FIG. 7(a) is a diagram showing an example of the basic structure of the property file 400, FIG. 7(b) is a diagram showing the structure of the slots constituting each entry, FIG. 7(c) is a diagram showing an example of the information included in the profile entry, and FIG. 7(d) is a diagram showing an example of the information, among the information included in the header section 430, that indicates the types of content managed by the content management file 340. FIG. 8 is a diagram schematically showing the overall structure of the property file 400 shown in FIG. 4.

プロパティファイル４００は、図７（ａ）に示すように、ヘッダ部４３０およびエントリ部４４０の基本構造を有するファイルであり、これらの各エントリが１つの仮想フォルダや仮想ファイル等を示す単位となる。 As shown in FIG. 7A, the property file 400 is a file having a basic structure of a header part 430 and an entry part 440, and each of these entries is a unit indicating one virtual folder, virtual file, or the like.

エントリ部４４０を構成する各エントリは、１または複数のスロットで構成されている。なお、各エントリに格納されるデータの容量に応じて、各エントリには１または複数のスロットが割り当てられる。また、各エントリを構成するスロットは、プロパティファイルやサムネイルファイル等のファイル毎に決められた固定長のデータブロックとして定義されている。ただし、エントリによっては、構成されるスロット数が異なるため、スロットの整数倍で各エントリが可変長となる。 Each entry configuring the entry unit 440 includes one or a plurality of slots. Note that one or more slots are assigned to each entry according to the capacity of data stored in each entry. Further, the slots constituting each entry are defined as fixed-length data blocks determined for each file such as a property file and a thumbnail file. However, since the number of configured slots differs depending on the entry, each entry has a variable length by an integral multiple of the slot.
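The relationship between stored data size and slot count can be sketched as follows. This is a minimal illustration only: the names `slots_needed`, `entry_length`, and `SLOT_PAYLOAD_BYTES`, and the 64-byte payload size, are assumptions made for the example and do not appear in this description.

```python
# Hypothetical payload capacity of one fixed-length slot, in bytes (assumed value).
SLOT_PAYLOAD_BYTES = 64


def slots_needed(data_bytes: int, payload_bytes: int = SLOT_PAYLOAD_BYTES) -> int:
    """Return how many fixed-length slots an entry needs (always at least one)."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    # Ceiling division: a partly filled last slot still occupies a whole slot.
    return max(1, -(-data_bytes // payload_bytes))


def entry_length(data_bytes: int, slot_bytes: int, payload_bytes: int) -> int:
    """Total bytes occupied by the entry: an integral multiple of the slot size."""
    return slots_needed(data_bytes, payload_bytes) * slot_bytes
```

Under these assumed sizes, an entry holding 100 bytes of data occupies two slots, so its length is twice the slot size even though the second slot is only partly filled.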

例えば、図７（ａ）に示すように、動画フォルダエントリ４１０には、格納されるデータ４５１のデータ容量に応じて２つのスロット４４１および４４２が割り当てられる。また、日付フォルダエントリ４１１には、格納されるデータ４５２のデータ容量に応じて２つのスロット４４３および４４４が割り当てられる。 For example, as shown in FIG. 7A, two slots 441 and 442 are assigned to the moving image folder entry 410 according to the data capacity of the stored data 451. In addition, two slots 443 and 444 are assigned to the date folder entry 411 according to the data capacity of the stored data 452.

なお、スロットが固定長であるため、スロットの全ての領域が有効データで埋められることがない場合があり、データ的にロスが発生する場合があるものの、スロットを固定長とすることによるデータアクセス性やデータ管理性を重視するため、このような構造とすることが好ましい。 Since the slots have a fixed length, a slot may not be completely filled with valid data, so some storage capacity may be wasted. Nevertheless, this structure is preferable because it gives priority to the ease of data access and data management that fixed-length slots provide.

また、エントリ部４４０を構成する各エントリは、図４および図６で示すように、エントリ番号で管理される。このエントリ番号は、エントリを構成する先頭のスロットが、プロパティファイル４００の全体を構成するスロットの先頭から何番目のスロットに該当するかに応じて割り当てられる。例えば、図７（ａ）および図８に示すように、動画フォルダエントリ４１０は、このエントリ内の先頭のスロットが、プロパティファイル４００の全体を構成するスロットの先頭から数えて１番目のスロットとなるため、エントリ番号として「＃１」が割り当てられる。また、日付フォルダエントリ４１１は、このエントリ内の先頭のスロットが、プロパティファイル４００の全体を構成するスロットの先頭から数えて３番目のスロットとなるため、エントリ番号として「＃３」が割り当てられる。また、日付フォルダエントリ４１６は、このエントリ内の先頭のスロットが、プロパティファイル４００の全体を構成するスロットの先頭から数えて５番目のスロットとなるため、エントリ番号として「＃５」が割り当てられる。なお、他の各エントリに割り当てられるエントリ番号についても同様である。これらのエントリ番号に基づいて、各エントリが管理されるとともに各エントリの親子関係が管理される。なお、エントリをサーチする場合には、エントリ部４４０を構成するスロットを最初からカウントして対象となるエントリをサーチする。 Each entry constituting the entry unit 440 is managed by an entry number as shown in FIGS. This entry number is assigned according to the number of the slot from the top of the slots constituting the entire property file 400 corresponding to the top slot constituting the entry. For example, as shown in FIGS. 7A and 8, in the movie folder entry 410, the first slot in this entry is the first slot counted from the beginning of the slots constituting the entire property file 400. Therefore, “# 1” is assigned as the entry number. The date folder entry 411 is assigned with the entry number “# 3” because the first slot in the entry is the third slot counted from the beginning of the slots constituting the entire property file 400. The date folder entry 416 is assigned the entry number “# 5” because the first slot in this entry is the fifth slot counted from the beginning of the slots constituting the entire property file 400. The same applies to entry numbers assigned to other entries. Based on these entry numbers, each entry is managed and the parent-child relationship of each entry is managed. When searching for an entry, the slot constituting the entry unit 440 is counted from the beginning to search for the target entry.
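The numbering rule above can be sketched in a few lines. The function name `assign_entry_numbers` is hypothetical; the slot counts in the usage note for entries 410 and 411 come from FIG. 7(a) as described, while the counts for later entries are inferred here from the entry numbers #5, #7, and #10.

```python
def assign_entry_numbers(slot_counts):
    """Return the 1-based entry number of each entry.

    slot_counts lists, in file order, how many slots each entry occupies.
    An entry's number is the position of its first slot counted from the
    first slot of the whole file.
    """
    numbers = []
    next_slot = 1  # slot positions are counted from 1
    for count in slot_counts:
        numbers.append(next_slot)
        next_slot += count
    return numbers
```

For example, `assign_entry_numbers([2, 2, 2, 3, 1])` yields `[1, 3, 5, 7, 10]`, which matches the numbers given for entries 410, 411, 416, 412, and 413.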

各エントリを構成するスロットは、図７（ｂ）に示すように、スロットヘッダ部４６０および実データ部４７０の構造を有するスロットである。スロットヘッダ部４６０は、スロットが有効であるか無効であるかを示す有効／無効フラグ４６１と、チェーン４６２とで構成されている。 As shown in FIG. 7(b), each slot constituting an entry consists of a slot header portion 460 and an actual data portion 470. The slot header portion 460 consists of a valid/invalid flag 461, which indicates whether the slot is valid or invalid, and a chain 462.

有効／無効フラグ４６１には、対応するコンテンツファイルが有効に存在する場合には有効フラグが立てられ、対応するコンテンツファイルが削除された場合には無効フラグが立てられる。このように、対応するコンテンツファイルが削除された場合には有効／無効フラグ４６１に無効フラグを立てることによって、この削除されたコンテンツファイルに対応するスロット内部の情報を削除する処理を発生させずに、このスロットが見かけ上存在しないことを示すことができる。仮に、有効／無効フラグ４６１がない場合には、対応するコンテンツファイルが削除されると、この削除されたコンテンツファイルに対応するスロット内部の情報を削除する処理が必要であるとともに、削除されたスロットの物理的に後ろに存在するスロット内部の情報を前につめる必要があるため、処理が煩雑になる。 In the valid / invalid flag 461, a valid flag is set when the corresponding content file exists effectively, and an invalid flag is set when the corresponding content file is deleted. As described above, when the corresponding content file is deleted, an invalid flag is set in the valid / invalid flag 461 so that processing for deleting information in the slot corresponding to the deleted content file does not occur. , It can be shown that this slot is not apparently present. If there is no valid / invalid flag 461, when the corresponding content file is deleted, it is necessary to delete information in the slot corresponding to the deleted content file, and the deleted slot is deleted. Since it is necessary to pack the information inside the slot that is physically behind, the processing becomes complicated.

チェーン４６２には、各スロットを連結するためのリンクや連結等の情報が格納される。このチェーン４６２に格納される情報により、複数のスロットが連結されて１つのエントリが構成される。また、実データ部４７０には、各エントリの実データが格納されている。 The chain 462 stores information such as links and connections for connecting the slots. With the information stored in this chain 462, a plurality of slots are connected to form one entry. The actual data portion 470 stores the actual data of each entry.
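A minimal sketch of how the valid/invalid flag and the chain might work together is shown below. The `Slot` class, the choice to store the chain as the 1-based number of the next slot (or `None` at the end of an entry), and the function names are all assumptions for illustration; the description does not specify the encoding of the chain information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    valid: bool           # valid/invalid flag of the slot header
    chain: Optional[int]  # assumed encoding: 1-based number of the next slot, or None
    data: bytes           # actual data portion


def read_entry(slots: List[Slot], entry_number: int) -> Optional[bytes]:
    """Join the actual data of all chained slots; None if the entry is invalidated."""
    parts = []
    current: Optional[int] = entry_number
    while current is not None:
        slot = slots[current - 1]
        if not slot.valid:
            return None
        parts.append(slot.data)
        current = slot.chain
    return b"".join(parts)


def delete_entry(slots: List[Slot], entry_number: int) -> None:
    """Mark every slot of the entry invalid without moving any later slot."""
    current: Optional[int] = entry_number
    while current is not None:
        slot = slots[current - 1]
        slot.valid = False
        current = slot.chain
```

Because deletion only flips flags, the slots that follow keep their positions, so the entry numbers of the remaining entries stay unchanged.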

プロファイルエントリ４０８には、各コンテンツファイルのビデオおよびオーディオに関するコーデック情報が１対となった１００種類程度のデータが記録されている。ビデオに関するコーデック情報として、ビデオエントリ（video entry）には、「符号化フォーマット（codec type）」、「画サイズ（visual size）」、「ビットレート（bit rate）」等が格納されている。また、オーディオに関するコーデック情報として、オーディオエントリ（audio entry）には、「符号化フォーマット（codec type）」、「サンプリングレート（sampling rate）」等が格納されている。また、各ビデオ・オーディオエントリには、エントリ番号が割り当てられている。このエントリ番号として、プロファイルエントリ４０８内部における記録順序を示す番号が割り当てられる。例えば、図７（ｃ）に示すように、最初のビデオ・オーディオエントリ４７１には「＃１」が割り当てられ、２番目のビデオ・オーディオエントリ４７２には「＃２」が割り当てられる。なお、このビデオ・オーディオエントリのエントリ番号が、ファイルエントリの「プロファイル情報」（図５に示す）に記録される。そして、「プロファイル情報」に記録されているエントリ番号に基づいて、ファイルエントリに対応するコンテンツファイルのコーデック情報が読み出される。 The profile entry 408 records about 100 types of data in which codec information related to video and audio of each content file is paired. As codec information regarding video, a video entry stores “codec type”, “visual size”, “bit rate”, and the like. Also, as codec information related to audio, an audio entry stores “encoding format (codec type)”, “sampling rate”, and the like. Each video / audio entry is assigned an entry number. As this entry number, a number indicating the recording order in the profile entry 408 is assigned. For example, as shown in FIG. 7C, “# 1” is assigned to the first video / audio entry 471 and “# 2” is assigned to the second video / audio entry 472. The entry number of this video / audio entry is recorded in the “profile information” (shown in FIG. 5) of the file entry. Based on the entry number recorded in the “profile information”, the codec information of the content file corresponding to the file entry is read out.

サムネイルファイル５００（図５に示す）は、基本的な構造はプロパティファイル４００と同様であり、各エントリが１または複数のスロットで構成されている。これらの各エントリが１つの代表サムネイル画像を示す単位となる。ただし、サムネイルファイル５００にはヘッダ部が存在しない。各スロットは、ファイル内で固定長であり、この１スロットの固定長に関するスロットサイズは、プロパティファイル４００のヘッダ部４３０に記録されている。また、サムネイルファイル５００の各エントリの対応関係は、プロパティファイル４００に格納されている。なお、サムネイルファイル５００のスロットサイズは、プロパティファイル４００のスロットサイズとは異なる。 The thumbnail file 500 (shown in FIG. 5) has the same basic structure as the property file 400, and each entry is composed of one or a plurality of slots. Each of these entries is a unit indicating one representative thumbnail image. However, the thumbnail file 500 has no header part. Each slot has a fixed length in the file, and the slot size related to the fixed length of one slot is recorded in the header portion 430 of the property file 400. Also, the correspondence relationship between the entries of the thumbnail file 500 is stored in the property file 400. Note that the slot size of the thumbnail file 500 is different from the slot size of the property file 400.

サムネイルファイル５００のスロットの容量は、サムネイルファイル毎に設定することができ、この容量はプロパティファイル４００のヘッダ部４３０に記録される。また、ヘッダ部４３０にはサムネイルファイル５００のサムネイルファイル名が記録されている。 The slot capacity of the thumbnail file 500 can be set for each thumbnail file, and this capacity is recorded in the header section 430 of the property file 400. The header section 430 also records the file name of the thumbnail file 500.

サムネイルファイル５００には、コンテンツファイルの代表画像である代表サムネイル画像が、コンテンツファイルに対応するファイルエントリ毎に１枚記録されている。コンテンツファイルの代表画像は、例えば、コンテンツファイルが動画の場合には、その先頭画像である画面全体の画像とすることができる。また、通常のサムネイルファイルの場合には、１つのファイルエントリについて１つのスロットが対応する。また、サムネイルファイル５００を構成する各エントリには、エントリ番号が割り当てられている。このサムネイルファイルのエントリ番号は、サムネイルファイル内を１エントリに１スロットを対応させる構成とする場合には、スロット番号となる。また、このサムネイルファイルのエントリ番号が、各ファイルエントリの「サムネイルアドレス」（図５に示す）に格納される。 In the thumbnail file 500, one representative thumbnail image, which is a representative image of the content file, is recorded for each file entry corresponding to the content file. For example, when the content file is a moving image, the representative image of the content file may be an image of the entire screen that is the top image. In the case of a normal thumbnail file, one slot corresponds to one file entry. An entry number is assigned to each entry constituting the thumbnail file 500. The entry number of the thumbnail file is a slot number when one slot corresponds to one entry in the thumbnail file. The entry number of this thumbnail file is stored in the “thumbnail address” (shown in FIG. 5) of each file entry.

ヘッダ部４３０には、各エントリを管理する各種情報が記録されている。例えば、図７（ｄ）に示すように、コンテンツ管理ファイル３４０が管理するコンテンツファイルの種別を示す情報がヘッダ部４３０に格納されている。なお、図７（ｄ）に示す例では、コンテンツ管理ファイル３４０が管理するコンテンツファイルは、ＨＤ動画およびＳＤ動画となり、静止画は管理しないことになる。これは、動画および静止画を記録することができるコンテンツ記録装置であっても、静止画はコンテンツ管理ファイル３４０で管理しない場合があるからである。図７（ｄ）に示すようにヘッダ部４３０に記録されている場合には、静止画は、通常のファイルシステムに基づいて管理されることになる。なお、動画についても、通常のファイルシステムで管理されているため、コンテンツ管理ファイルを理解することができないコンテンツ再生装置等では、ファイルシステムの情報に基づいてコンテンツの再生が実行される。また、撮像装置１００を他のコンテンツ再生装置に接続する場合や、着脱可能な記録媒体を他のコンテンツ再生装置に移動させて再生する場合等において、他のコンテンツ再生装置がコンテンツ管理ファイルを理解することができる場合には、コンテンツ管理ファイルに基づいてコンテンツファイルの読み出し等が実行される。また、ヘッダ部４３０には、プロファイルエントリ４０８（エントリ番号：＃１５０）のエントリ番号が記録されている。これにより、エントリ部４４０を構成する各エントリの中からプロファイルエントリの位置を特定することができる。 In the header section 430, various information for managing each entry is recorded. For example, as shown in FIG. 7(d), information indicating the types of content file managed by the content management file 340 is stored in the header section 430. In the example shown in FIG. 7(d), the content files managed by the content management file 340 are HD moving images and SD moving images, and still images are not managed. This is because even a content recording apparatus capable of recording both moving images and still images may not manage still images with the content management file 340. When the header section 430 is recorded as shown in FIG. 7(d), still images are managed based on the normal file system. Since moving images are also managed by the normal file system, a content reproduction apparatus that cannot interpret the content management file reproduces content based on the file system information. When the imaging apparatus 100 is connected to another content reproduction apparatus, or when a removable recording medium is moved to another content reproduction apparatus for playback, and that apparatus can interpret the content management file, content files are read out based on the content management file. The header section 430 also records the entry number of the profile entry 408 (entry number: #150). This makes it possible to locate the profile entry among the entries constituting the entry section 440.

図８には、図４に示すプロパティファイル４００を構成する各エントリと、各エントリに対応するスロットと、各スロットに格納されるデータとの関係を概略的に示す。なお、各エントリの名称については省略してエントリ番号を記載する。 FIG. 8 schematically shows the relationship between each entry constituting the property file 400 shown in FIG. 4, a slot corresponding to each entry, and data stored in each slot. Note that the entry number is described with the name of each entry omitted.

図９は、メタデータエントリ６００の内部構成を概略的に示す図である。なお、メタデータエントリ６００は、図４または図６等に示すメタデータエントリ４１３、４１５、４１８、４２０に対応する。また、本発明の実施の形態では、１つの動画コンテンツファイル毎に顔メタデータが記録されるものとする。 FIG. 9 is a diagram schematically showing the internal configuration of the metadata entry 600. The metadata entry 600 corresponds to the metadata entries 413, 415, 418, and 420 shown in FIG. 4, FIG. 6, and elsewhere. In the embodiment of the present invention, face metadata is recorded for each moving image content file.

メタデータエントリ６００は、１または複数のメタデータユニット（Meta_Data_Unit）から構成されている。また、メタデータユニット６１０は、データユニットサイズ（data_unit_size）６１１と、言語（language）６１２と、符号化形式（encoding_type）６１３と、メタデータの種類（data_type_ID）６１４と、メタデータ６１５とから構成されている。 The metadata entry 600 is composed of one or a plurality of metadata units (Meta_Data_Unit). The metadata unit 610 includes a data unit size (data_unit_size) 611, a language (language) 612, an encoding format (encoding_type) 613, a metadata type (data_type_ID) 614, and metadata 615. ing.

データユニットサイズ６１１には、メタデータユニット６１０に格納されているメタデータのサイズが記録される。言語６１２には、メタデータユニット６１０に格納されているメタデータの言語が記録される。符号化形式６１３には、メタデータユニット６１０に格納されているメタデータの符号化形式が記録される。メタデータの種類６１４には、個々のメタデータの種類を識別するための識別情報が記録される。 In the data unit size 611, the size of metadata stored in the metadata unit 610 is recorded. In the language 612, the language of metadata stored in the metadata unit 610 is recorded. In the encoding format 613, the encoding format of metadata stored in the metadata unit 610 is recorded. In the metadata type 614, identification information for identifying individual metadata types is recorded.

なお、メタデータ６１５には、顔メタデータ６２０が記録されるとともに、顔メタデータ以外のメタデータである他のメタデータ６５０が記録される。例えば、他のメタデータ６５０として、コンテンツファイルのタイトル情報やジャンル情報等の情報が格納される。 The metadata 615 includes face metadata 620 and other metadata 650 that is metadata other than the face metadata. For example, information such as title information and genre information of the content file is stored as other metadata 650.

顔メタデータ６２０は、ヘッダ部６３０と顔データ部６４０とから構成されている。ヘッダ部６３０には、顔メタデータを管理する情報が格納される。また、ヘッダ部６３０は動画コンテンツ毎に固定長とする。顔データ部６４０には、動画コンテンツファイルから検出された顔について顔メタデータとして記録される顔毎に顔データが記録される。例えば、顔データ部６４０には、顔データ６２１乃至６２３等が格納される。この顔データは、図１１に示すように、顔検出時刻情報、顔基本情報、顔スコア、笑顔スコア等のデータである。また、顔データ部６４０は、１つの動画コンテンツファイルで固定長とする。このように、ヘッダ部６３０および顔データ部６４０が固定長であるため、顔データへのアクセスを容易に行うことができる。 The face metadata 620 includes a header part 630 and a face data part 640. The header part 630 stores information for managing face metadata. The header portion 630 has a fixed length for each moving image content. In the face data section 640, face data is recorded for each face recorded as face metadata for the face detected from the moving image content file. For example, the face data section 640 stores face data 621 to 623 and the like. As shown in FIG. 11, the face data is data such as face detection time information, face basic information, face score, smile score, and the like. The face data portion 640 has a fixed length for one moving image content file. Thus, since the header part 630 and the face data part 640 are fixed length, access to face data can be performed easily.

また、他のメタデータ６５０の構成についても、顔メタデータ６２０の構成と同様である。 The configuration of the other metadata 650 is the same as the configuration of the face metadata 620.

なお、本発明の実施の形態においては、１フレーム内において検出された顔のうちで、顔データ部に記録すべき顔データの値を規定する。例えば、１フレーム内において検出された顔の大きさや顔スコアの上位の顔等の所定の条件に基づいて、顔データ部に記録する顔データの最大値を規定して制限することができる。このように制限することによって、１フレーム内において不必要な顔（条件の悪い顔、顔らしくない顔等）を顔データ部に記録することによる記録媒体１７０の容量圧迫を防止することができる。 In the embodiment of the present invention, a limit is defined on how much face data is recorded in the face data portion for the faces detected in one frame. For example, the maximum amount of face data recorded in the face data portion can be limited based on predetermined conditions, such as the size of the faces detected in one frame or the faces with the highest face scores. This limit prevents the capacity of the recording medium 170 from being consumed by recording unnecessary faces (faces captured under poor conditions, faces that do not look like faces, and so on) in the face data portion for one frame.
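One way to apply such a per-frame limit is sketched below. The dictionary keys `score`, `width`, and `height`, the tie-breaking by face area, and the function name `select_faces` are assumptions for illustration; the description only states that conditions such as face size or face score may be used.

```python
def select_faces(faces, max_faces):
    """Keep at most max_faces faces from one frame.

    Faces are ranked by face score, and by face area when scores are equal
    (an assumed tie-break), and only the top-ranked faces are kept.
    """
    ranked = sorted(
        faces,
        key=lambda f: (f["score"], f["width"] * f["height"]),
        reverse=True,
    )
    return ranked[:max_faces]
```

With three detected faces scored 40, 90, and 70 and a limit of two, only the faces scored 90 and 70 would be written to the face data portion.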

また、動画コンテンツファイルを記録媒体１７０に記録する場合において、顔検出エンジンにより検出された全ての顔毎に顔データが作成された場合には、作成された顔データの容量が莫大なものになる。また、顔を検出する時間間隔が小さい場合には、さらに容量が増加する。このため、例えば、時刻ｔ０のフレームに対して記録すべき顔の顔データの個数が、次の検出時刻である時刻ｔ１のフレームに対して記録すべき顔の顔データの個数が同数である場合には、時刻ｔ１で検出した顔に対する顔データを顔データ部に記録しないようにする。これは、検出された顔の個数が同数であるため、同じ顔に関するメタデータが記録される可能性が高いためである。つまり、顔を検出する時刻の前後で記録すべき顔データの個数に変化がある場合にのみ、顔データを記録することによって記録媒体に不必要な重複顔データの記録を防ぐことができる。このように、本発明の実施の形態においては、１フレーム内において検出された顔の全てについて顔データを作成する必要はない。 In addition, when moving image content files are recorded on the recording medium 170, if face data is created for every face detected by the face detection engine, the volume of the created face data becomes enormous. . Further, when the time interval for detecting a face is small, the capacity further increases. Therefore, for example, when the number of face data to be recorded for the frame at time t0 is the same as the number of face data to be recorded for the frame at time t1, which is the next detection time. The face data for the face detected at time t1 is not recorded in the face data portion. This is because there is a high possibility that metadata relating to the same face is recorded because the number of detected faces is the same. That is, unnecessary face data can be prevented from being recorded on the recording medium by recording face data only when the number of face data to be recorded changes before and after the face detection time. As described above, in the embodiment of the present invention, it is not necessary to create face data for all faces detected in one frame.
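The count-based suppression described above can be sketched as follows. The function name `detections_to_record` and the assumption that the count before the first detection time is zero are illustrative choices, not details from this description.

```python
def detections_to_record(face_counts):
    """Return the indices of detection times at which face data is recorded.

    face_counts[i] is the number of faces to be recorded at detection time i.
    Face data is recorded only when the count differs from the count at the
    previous detection time (assumed to be 0 before the first detection).
    """
    recorded = []
    previous = 0
    for index, count in enumerate(face_counts):
        if count != previous:
            recorded.append(index)
        previous = count
    return recorded
```

For the counts `[1, 1, 2, 2, 1]`, face data would be recorded only at indices 0, 2, and 4, so the two repeated detections add nothing to the face data portion.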

図１０は、ヘッダ部６３０に格納される各種情報の概略を示す図である。 FIG. 10 is a diagram showing an outline of various types of information stored in the header part 630.

ヘッダ部６３０には、ヘッダサイズ６３１と、メタデータバージョン６３２と、コンテンツ更新日時６３３と、顔データ構造フラグ６６０と、タイムスケール６３４と、顔データ個数６３５と、顔データサイズ６３６と、顔検出エンジンバージョン６３７と、コンテンツ画像サイズ６３８と、誤り検出符号値６３９とが格納される。なお、これらの格納単位は、図１０の「サイズ」に示すように、バイトで規定される。 The header portion 630 includes a header size 631, a metadata version 632, a content update date and time 633, a face data structure flag 660, a time scale 634, a face data number 635, a face data size 636, and a face detection engine. A version 637, a content image size 638, and an error detection code value 639 are stored. These storage units are defined in bytes as shown in “Size” in FIG.

ヘッダサイズ６３１には、ヘッダ部６３０のデータサイズが格納される。このヘッダサイズ６３１によって、顔データ部６４０にアクセスする場合に、ヘッダ部６３０をジャンプして即座にアクセスすることが可能である。また、データサイズとして２バイトが規定されている。 The header size 631 stores the data size of the header portion 630. With this header size 631, when accessing the face data portion 640, the header portion 630 can be jumped and accessed immediately. In addition, 2 bytes are defined as the data size.

メタデータバージョン６３２には、ヘッダ部６３０に対応する顔データ部６４０に記録されている顔メタデータのバージョン情報が格納される。コンテンツ再生装置でコンテンツファイルを再生する場合には、メタデータバージョン６３２に格納されている内容を確認することによって、そのコンテンツ再生装置が対応可能なデータであるか否かを装置自体が確認することが可能となる。本発明の実施の形態では、例えば、「１．００」が記録されるものとする。また、データサイズとして２バイトが規定され、上位８ビットがメジャーバージョンを示し、下位８ビットがマイナーバージョンを示す。なお、将来、顔メタデータフォーマットが拡張された場合には、更新されたバージョン情報が格納される。 The metadata version 632 stores version information of face metadata recorded in the face data part 640 corresponding to the header part 630. When a content file is played back by a content playback device, the device itself checks whether the content playback device can handle the data by checking the contents stored in the metadata version 632. Is possible. In the embodiment of the present invention, for example, “1.00” is recorded. Also, 2 bytes are defined as the data size, the upper 8 bits indicate the major version, and the lower 8 bits indicate the minor version. If the face metadata format is expanded in the future, updated version information is stored.
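The 2-byte version layout can be sketched as below. The function names, the reading of “1.00” as major version 1 and minor version 0, the big-endian byte order used when serializing, and the compatibility rule in `is_supported` are assumptions for illustration; only the split into upper and lower 8 bits comes from the description.

```python
import struct


def pack_version(major: int, minor: int) -> int:
    """Combine major (upper 8 bits) and minor (lower 8 bits) into a 16-bit value."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def unpack_version(value: int):
    """Split a 16-bit version value into (major, minor)."""
    return (value >> 8) & 0xFF, value & 0xFF


def is_supported(value: int, supported_major: int) -> bool:
    """Assumed rule: a player handles the data when the major versions match."""
    major, _minor = unpack_version(value)
    return major == supported_major


# Serialized form of version 1.00, assuming big-endian byte order.
VERSION_1_00_BYTES = struct.pack(">H", pack_version(1, 0))
```

A playback device could read these two bytes first and skip the face metadata entirely when `is_supported` returns false.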

コンテンツ更新日時６３３には、動画コンテンツファイルに記録される更新日時が格納される。例えば、撮像装置１００で撮影された動画コンテンツファイルが他の装置に移動して編集された後に、この編集された動画コンテンツファイルが撮像装置１００に再度記録されたような場合には、編集後の動画コンテンツファイルと顔メタデータと間で不整合が発生する。具体的には、以下で示す（１）乃至（３）のステップで動画コンテンツファイルが移動する場合が考えられる。このような場合に、これらの不整合を検出して、動画コンテンツファイルＢから顔メタデータを再検出させ、編集後の動画コンテンツファイルと顔メタデータと間で発生した不整合を修正することが可能となる。 The content update date and time 633 stores the update date and time recorded in the moving image content file. For example, when the edited moving image content file is recorded on the image capturing apparatus 100 again after the moving image content file captured by the image capturing apparatus 100 is moved to another apparatus and edited, Inconsistency occurs between video content file and face metadata. Specifically, a case where a moving image content file moves in the following steps (1) to (3) can be considered. In such a case, it is possible to detect these inconsistencies, re-detect the face metadata from the video content file B, and correct the inconsistency generated between the edited video content file and the face metadata. It becomes possible.

（１）ステップ１ (1) Step 1
コンテンツ記録装置Ａで動画コンテンツファイルＡが記録され、動画コンテンツファイルＡに対応する顔メタデータが生成される。この場合には、動画コンテンツファイルＡの作成日時および更新日時と、顔メタデータのコンテンツ更新日時とが同じ値となる。 The moving image content file A is recorded by the content recording device A, and face metadata corresponding to the moving image content file A is generated. In this case, the creation date and time and the update date and time of the moving image content file A have the same value as the content update date and time of the face metadata.

（２）ステップ２ (2) Step 2
動画コンテンツファイルＡがコンテンツ再生装置Ｂに移動された後に、コンテンツ再生装置Ｂで編集されて、動画コンテンツファイルＢとなる。この場合には、動画コンテンツファイルＢの更新日時が編集時の日時に更新される。 After the moving image content file A is moved to the content reproduction device B, it is edited by the content reproduction device B and becomes the moving image content file B. In this case, the update date and time of the moving image content file B is changed to the date and time of editing.

（３）ステップ３ (3) Step 3
動画コンテンツファイルＢがコンテンツ記録装置Ａに戻される。この場合には、動画コンテンツファイルＢと、顔メタデータのコンテンツ更新日時との値が異なる。 The moving image content file B is returned to the content recording device A. In this case, the update date and time of the moving image content file B differs from the content update date and time of the face metadata.
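The inconsistency check implied by these three steps can be sketched as a timestamp comparison. The function name `face_metadata_is_stale` and the example dates are hypothetical; the description only states that the two update dates are compared.

```python
from datetime import datetime


def face_metadata_is_stale(file_update: datetime, content_update: datetime) -> bool:
    """Return True when the face metadata no longer matches the content file.

    file_update is the update date and time of the content file itself;
    content_update is the content update date and time stored in the header
    of the face metadata when that metadata was generated.
    """
    return file_update != content_update


# Step 1: recorded on device A, both timestamps are equal (example values).
recorded_at = datetime(2007, 5, 22, 10, 0, 0)
# Step 2: edited on device B, only the file's update timestamp changes.
edited_at = datetime(2007, 5, 23, 9, 30, 0)
```

When the check returns true after step 3, the device can detect faces again from the edited file and regenerate the face metadata.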

顔データ構造フラグ６６０には、顔データ部６４０に格納される顔データで定義されたメタデータの有無を示すフラグが格納される。なお、顔データ構造フラグ６６０については、図１２乃至図１６を参照して詳細に説明する。 The face data structure flag 660 stores flags indicating the presence or absence of each metadata item defined for the face data stored in the face data portion 640. The face data structure flag 660 will be described in detail with reference to FIGS. 12 to 16.

タイムスケール６３４には、顔データ部で使用される時刻情報のタイムスケール（１秒あたりのユニット数を表す値）が格納される。すなわち、動画コンテンツファイルから顔が検出された時刻を示す情報（顔検出時刻情報）が顔データとして顔データ部に記録されるが、その時刻情報のタイムスケールがタイムスケール６３４に格納される。なお、単位はＨｚである。 The time scale 634 stores the time scale of time information (a value representing the number of units per second) used in the face data portion. That is, information (face detection time information) indicating the time when a face is detected from the moving image content file is recorded as face data in the face data portion, and the time scale of the time information is stored in the time scale 634. The unit is Hz.
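Converting a recorded face detection time into seconds is then a single division. In the sketch below, the function name `detection_time_seconds` and the example time scale of 90000 are assumptions for illustration.

```python
def detection_time_seconds(time_value: int, time_scale: int) -> float:
    """Convert face detection time information into seconds.

    time_scale is the number of time units per second (in Hz), as stored
    in the header; time_value is the detection time in those units.
    """
    if time_scale <= 0:
        raise ValueError("time_scale must be positive")
    return time_value / time_scale
```

With an assumed time scale of 90000, a detection time of 180000 units corresponds to 2.0 seconds from the start of the content.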

顔データ個数６３５は、ヘッダ部６３０に続いて記録される顔データの個数を示す情報が格納される。顔を検出しなかった場合には、「０」が記録される。 The face data number 635 stores information indicating the number of face data recorded following the header portion 630. If no face is detected, “0” is recorded.

顔データサイズ６３６には、ヘッダ部６３０に続いて記録される１つの顔データのデータサイズを示す情報が格納される。この顔データサイズ６３６に格納される情報に基づいて個々の顔データ間をジャンプすることが可能となる。なお、顔が検出されなかった場合には、「０」が記録される。 In the face data size 636, information indicating the data size of one face data recorded following the header portion 630 is stored. Based on information stored in the face data size 636, it is possible to jump between individual face data. If no face is detected, “0” is recorded.
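Because the header and each face data item have known sizes, the position of any face data item can be computed directly. The following sketch assumes the face metadata is held in memory as a byte string with the face data portion immediately following the header; the function names are hypothetical.

```python
def face_data_offset(header_size: int, face_data_size: int, index: int) -> int:
    """Byte offset of the index-th face data item (0-based) within the face metadata."""
    return header_size + index * face_data_size


def read_face_data(face_metadata: bytes, header_size: int, face_data_size: int,
                   face_count: int, index: int) -> bytes:
    """Return the raw bytes of one face data item without scanning the others."""
    if not 0 <= index < face_count:
        raise IndexError("face data index out of range")
    start = face_data_offset(header_size, face_data_size, index)
    return face_metadata[start:start + face_data_size]
```

For an 8-byte header followed by 4-byte face data items, the third item starts at offset 16, so it can be read without parsing the first two.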

The face detection engine version 637 records information about the face detection engine used to detect faces from the moving image content file. When face metadata is played back and the playback device recognizes that the metadata was produced by a face detection engine with lower performance than its own, this information serves as an indicator of whether to detect the face metadata again. The information about the face detection engine is written in ASCII code, for example.

For example, when the metadata version is “1.00”, the data are recorded in the face data section 640 in the order shown in FIG. 11. Because each item has a fixed length and is placed at a predetermined position, a content reproduction device that recognizes the metadata version as “1.00” can quickly access the position of the desired data in the face data section 640.

The content image size 638 records information indicating the height and width of the image in which the face was detected. The error detection code value 639 records information indicating an error detection code value (error correction code value) calculated over a predetermined range of the image in which the face was detected. For example, the error detection code value 639 records a checksum calculated from the corresponding image data when the face metadata is created. Besides a checksum, a CRC (Cyclic Redundancy Check) value or a hash value obtained with a hash function can be used as the error detection code value.

Like the content update date/time 633, the content image size 638 and the error detection code value 639 are used to detect inconsistencies that arise between a moving image content file and its face metadata. The mechanism by which such an inconsistency arises is the same as in (1) Step 1 to (3) Step 3 described above. For example, many still image editing programs exist for still image content files, and some of them do not update the content date/time information inside the content even when the still image is edited. In such a case, comparing the content image size in addition to the content update date/time makes it possible to detect an inconsistency more reliably.
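The consistency check described above might be sketched as follows. This is only an illustration: the class and field names, the representation of the update date/time as an integer, and the choice of CRC-32 (via Python's zlib.crc32) as the error detection code are all assumptions, since the text allows a checksum, a CRC, or a hash value and does not fix the byte range.

```python
import zlib
from dataclasses import dataclass


@dataclass
class FaceMetadataHeader:
    # Hypothetical subset of the header section 630.
    content_update_time: int   # assumed integer timestamp
    image_width: int
    image_height: int
    error_detection_code: int  # assumed CRC-32 of a chosen byte range


def is_consistent(header: FaceMetadataHeader, file_update_time: int,
                  width: int, height: int, image_bytes: bytes) -> bool:
    """Return True only if all three header values match the content."""
    if header.content_update_time != file_update_time:
        return False
    if (header.image_width, header.image_height) != (width, height):
        return False
    return header.error_detection_code == (zlib.crc32(image_bytes) & 0xFFFFFFFF)
```

Checking the cheap fields first means the code value only needs to be recomputed when the date/time and image size already agree.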

図１１は、顔データ部６４０に格納される顔データの概略を示す図である。なお、顔データ部６４０には、ヘッダ部６３０の顔データ構造フラグ６６０でビットアサインされた順序で各顔データが格納される。 FIG. 11 is a diagram showing an outline of face data stored in the face data unit 640. The face data portion 640 stores each face data in the order of bit assignment by the face data structure flag 660 of the header portion 630.

顔データ部６４０には、顔検出時刻情報６４１と、顔基本情報６４２と、顔スコア６４３と、笑顔スコア６４４と、顔重要度６４５とが記録される。なお、これらの格納単位は、バイトで規定される。ここでは、上述したように、メタデータバージョンが「１．００」の場合における顔データとして定義されるメタデータを例にして説明する。 In the face data portion 640, face detection time information 641, basic face information 642, a face score 643, a smile score 644, and a face importance degree 645 are recorded. Note that these storage units are defined in bytes. Here, as described above, the metadata defined as face data when the metadata version is “1.00” will be described as an example.

顔検出時刻情報６４１には、対応する動画コンテンツファイルの先頭を「０」として、この顔データが検出されたフレームの時刻が記録される。なお、顔検出時刻情報６４１には、ヘッダ部６３０のタイムスケール６３４に格納されたタイムスケールの整数倍の値が格納される。 In the face detection time information 641, the time of the frame in which the face data is detected is recorded with the top of the corresponding moving image content file being “0”. The face detection time information 641 stores a value that is an integer multiple of the time scale stored in the time scale 634 of the header section 630.
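As a small worked example of how the time scale relates to the stored time, the sketch below converts a stored value into seconds. It assumes that the stored value counts time-scale units from the start of the file, so that dividing by the time scale (units per second) gives seconds; the function name is hypothetical.

```python
from fractions import Fraction


def detection_time_seconds(stored_value: int, time_scale: int) -> Fraction:
    """Convert a stored face detection time into seconds.

    `time_scale` is the number of units per second from the header.
    A Fraction is returned to avoid floating-point rounding.
    """
    if time_scale <= 0:
        raise ValueError("time scale must be positive")
    return Fraction(stored_value, time_scale)
```

With a time scale of 90000, for instance, a stored value of 45000 would correspond to half a second under this assumption.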

The face basic information 642 stores the position and size of a face detected in each frame of the moving image content file. Within the face basic information 642, the face position information occupies the upper 4 bytes and the face size information occupies the lower 4 bytes. The face position information is, for example, the offset from the upper-left corner of the image in which the face was detected to the upper-left corner of the detected face; the upper 16 bits hold the horizontal position and the lower 16 bits hold the vertical position. The face size information is, for example, a value indicating the image size of the detected face; the upper 16 bits hold the face width and the lower 16 bits hold the face height. The face basic information 642 is the most important metadata for applications that use face metadata.
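The 8-byte layout of the face basic information can be illustrated with a short decoding sketch. The byte order (big-endian, so that the "upper" bytes come first) and the use of unsigned 16-bit values are assumptions; the text specifies only which bits hold which value. The type and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class FaceBasicInfo(NamedTuple):
    x: int       # horizontal offset of the face's upper-left corner
    y: int       # vertical offset of the face's upper-left corner
    width: int   # face width
    height: int  # face height


def parse_face_basic_info(raw: bytes) -> FaceBasicInfo:
    """Decode 8 bytes: position (4 bytes) followed by size (4 bytes)."""
    if len(raw) != 8:
        raise ValueError("face basic information must be exactly 8 bytes")
    # Assumed big-endian unsigned 16-bit fields: x, y, width, height.
    return FaceBasicInfo(*struct.unpack(">HHHH", raw))
```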

顔スコア６４３には、検出された顔の顔らしさを表すスコアに関する情報が格納される。 The face score 643 stores information related to the score representing the face likeness of the detected face.

The smile score 644 stores score information indicating how much the detected face is smiling.

The face importance 645 stores information indicating the priority (importance) of the face images detected at the same time. For example, when a plurality of faces are detected in one frame, a higher priority can be assigned to faces closer to the center of the screen, or to the face that is in focus. The stored value can be defined, for example, so that a smaller value means greater importance, with “1” as the maximum importance. As a result, even on a mobile device with a small display, only the faces with high priority can be displayed at a large size instead of all face images being displayed at a small size.
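A display application could use the importance value roughly as sketched below to pick which faces to enlarge. The tuple representation of a face and the function name are assumptions for illustration; the only rule taken from the text is that a smaller value means higher importance, with 1 as the maximum.

```python
from typing import List, Tuple

# A face is represented here as (importance, label); the label stands in
# for whatever data the application keeps per face.
Face = Tuple[int, str]


def most_important_faces(faces: List[Face], limit: int = 1) -> List[Face]:
    """Return up to `limit` faces from one frame, most important first."""
    return sorted(faces, key=lambda face: face[0])[:limit]
```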

In the embodiment of the present invention, face data are recorded in order of detection time, which speeds up searches in time order. Further, within the same moving image content file, all face data contain the same types of metadata, and the face data are recorded in the order shown in FIG. 11. Not all of the data shown in FIG. 11 need to be recorded, but the same types of metadata are recorded throughout the same moving image content file. As a result, all face data have a fixed length, which improves access to the face data. In addition, because the same types of metadata are stored throughout the same moving image content file, access to a given piece of metadata is also improved.

FIG. 12 is a diagram showing the data structure of the face data structure flag 660 of the header section 630 shown in FIG. 10. FIGS. 13 to 16 are diagrams showing the relationship between the bits stored in the face data structure flag 660 and the face data stored in the face data section 640.

In the embodiment of the present invention, five pieces of metadata are defined in the face data section 640, as shown in FIG. 11. Bits 0 to 4 of the face data structure flag 660 are therefore assigned to these pieces of data, starting from the LSB (Least Significant Bit) and following the order of the face data section 640. Each bit of the face data structure flag 660 indicates whether data is present in the corresponding data field of the face metadata: “1” is stored when data is present in the field, and “0” is stored when it is not. The sixth and subsequent bits are a reserved area for future extension of the data inside the face data.
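The bit assignment described above can be written down as a small flag type. The member names are hypothetical labels for the five fields of FIG. 11; the bit positions follow the stated rule (LSB first, in the order of the face data section).

```python
from enum import IntFlag
from typing import List


class FaceDataFlag(IntFlag):
    DETECTION_TIME = 1 << 0  # face detection time information 641
    BASIC_INFO = 1 << 1      # face basic information 642
    FACE_SCORE = 1 << 2      # face score 643
    SMILE_SCORE = 1 << 3     # smile score 644
    IMPORTANCE = 1 << 4      # face importance 645


def present_fields(flag_value: int) -> List[str]:
    """List the names of the fields whose bit is set, in record order."""
    return [member.name for member in FaceDataFlag if flag_value & member]
```

Bits above bit 4 are simply ignored by this sketch, which mirrors their role as a reserved area.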

Specifically, suppose for example that the face data section 640 stores the data defined for metadata version “1.00”, as shown in FIG. 13A. In this case, “1” is stored in each of bits 0 to 4 from the LSB, as shown in FIG. 13B. The content recording device does not need to record all of the defined data and may record only the data it needs. This allows face metadata to be recorded flexibly according to the application that uses it and also reduces the amount of data.

Further, suppose that another content recording device has stored three of the five pieces of data defined for metadata version “1.00” in the face data section 640, as shown in FIG. 14A. In this case, the face data are recorded in the order shown in FIG. 11, and the data are packed so that no space is left for the items that are not recorded. FIG. 14B shows an example of the actual data of the face data structure flag 660 recorded by that other content recording device; “1” is stored in the flags assigned to the data fields that are present as face data. In this way, a content recording device can record any of the metadata within the range defined for metadata version “1.00”. Moreover, even when another content recording device has recorded a different set of metadata, a content reproduction device that plays back the face metadata can check which metadata are present in the face data by referring to the information in the header section. In addition, because the face data have a fixed length, the desired metadata can be accessed at high speed.
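The packed layout can be made concrete with a sketch that derives each field's offset inside one face data record from the flag bits. Only the 8-byte size of the face basic information comes from the text; the other field sizes, the field names, and the example flag value are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# (field name, size in bytes) in the order of FIG. 11. Sizes other than
# the 8-byte face basic information are assumed for this example.
FIELDS_V1_00: List[Tuple[str, int]] = [
    ("face_detection_time", 4),
    ("face_basic_info", 8),
    ("face_score", 1),
    ("smile_score", 1),
    ("face_importance", 1),
]


def record_layout(flag_value: int) -> Tuple[Dict[str, int], int]:
    """Return (offset of each present field, total record size in bytes)."""
    offsets: Dict[str, int] = {}
    position = 0
    for bit, (name, size) in enumerate(FIELDS_V1_00):
        if flag_value & (1 << bit):
            offsets[name] = position  # absent fields take up no space
            position += size
    return offsets, position
```

Since the flag is stored once in the header, the layout is computed once per file and then reused for every record.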

次に、本発明の実施の形態における顔データ部６４０に格納される顔データの拡張方法について図面を参照して説明する。 Next, a method for expanding the face data stored in the face data unit 640 according to the embodiment of the present invention will be described with reference to the drawings.

If face detection technology improves in the future, or if the results of face detection are used by new applications, the face metadata defined for metadata version “1.00” alone may not be sufficient.

FIG. 15A shows an example of extended face data. Here, a “gender score” indicating the degree of gender of the detected face and “angle information” indicating the degree of tilt of the face in the frame are shown as the extended face data. The metadata version of the face metadata with these additions is defined as “1.10”, and “1.10” is recorded in the metadata version field of the header section. Metadata is extended by appending the new metadata below the data defined in the previous version. Specifically, when data is recorded on the recording medium 170, the data defined for version “1.10” is recorded, for each face data unit, starting from the physical address that follows the physical address at which the data defined for version “1.00” is recorded. Recording of the metadata for the next face data unit then begins, in the same way, at the address that follows the physical address at which the version “1.10” metadata is recorded.

FIG. 16B shows, among the metadata defined for version “1.10”, the metadata recorded by a certain recorder. For example, even when the extended face data shown in FIG. 15A is recorded, not all of the face data shown in FIG. 15A needs to be recorded. When some face data is not recorded in this way, the selected face data from among the face data shown in FIG. 15A is recorded in the order shown in FIG. 16A, and the data are packed so that no space is left for the fields that are not recorded.

Further, the face data structure flag is also extended with the upgrade to version “1.10”. Bits that were a reserved area in version “1.00” are newly assigned according to the field order defined in FIG. 15A, and “1” is set in the bits whose data is present in the face data section, as shown in FIG. 15B. As a result, a player that supports version “1.10” can determine the data structure of the face data section by checking the bit string of the face data structure flag in the header section, and because each face data has a fixed length, it can quickly access the desired metadata.

Further, consider a case where a recorder that supports version “1.10” records face metadata on a removable recording medium, and the recording medium is then moved to a player that supports only version “1.00”. In this case, the player can recognize bits 0 to 4 of the face data structure flag in the header section. Moreover, because the specification of the face data size has not changed, the player can recognize the face data defined for version “1.00” even if face data not anticipated in version “1.00” is stored. In the example shown in FIG. 16, the player can interpret the “face detection time information”, “face basic information”, “face score”, and “face importance”, and can therefore access these pieces of metadata. In this way, the metadata entry has a data structure that offers good accessibility and can also accommodate a change in the version of the recorder or the player.
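The compatibility argument above can be illustrated with a sketch of a version “1.00” reader that walks records written by a version “1.10” recorder. The reader uses the face data size from the header as the stride and looks only at bits 0 to 4, so appended fields are skipped without being understood. The field encodings (big-endian, 4-byte time, 1-byte scores) and all names are assumptions; only the 8-byte face basic information is taken from the text.

```python
import struct
from typing import Dict, List

# Fields known to a version "1.00" reader, in record order, with assumed
# struct formats (big-endian).
KNOWN_FIELDS = [
    ("time", ">I"),
    ("basic", ">HHHH"),
    ("score", ">B"),
    ("smile", ">B"),
    ("importance", ">B"),
]


def read_known_fields(blob: bytes, flag_value: int,
                      face_data_size: int, count: int) -> List[Dict]:
    """Read only the version "1.00" fields from `count` packed records."""
    records = []
    for index in range(count):
        position = index * face_data_size  # stride comes from the header
        record = {}
        for bit, (name, fmt) in enumerate(KNOWN_FIELDS):
            if flag_value & (1 << bit):
                values = struct.unpack_from(fmt, blob, position)
                record[name] = values[0] if len(values) == 1 else values
                position += struct.calcsize(fmt)
        # Any bytes left in the record belong to fields added by later
        # versions and are ignored here.
        records.append(record)
    return records
```

This works because new fields are appended after the version “1.00” fields, so the offsets of the known fields depend only on bits 0 to 4.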

次に、本発明の実施の形態における撮像装置１００の機能構成例について図面を参照して説明する。 Next, a functional configuration example of the imaging apparatus 100 according to the embodiment of the present invention will be described with reference to the drawings.

図１７は、本発明の実施の形態における撮像装置１００の機能構成例を示すブロック図である。この撮像装置１００は、コンテンツ管理ファイル記憶部２１０と、コンテンツ入力部２１１と、顔検出部２１２と、顔メタデータ作成部２１３と、仮想管理情報作成部２１４と、代表サムネイル画像抽出部２１５と、コンテンツ属性情報作成部２１６と、記録制御部２１７とを備える。 FIG. 17 is a block diagram illustrating a functional configuration example of the imaging apparatus 100 according to the embodiment of the present invention. The imaging apparatus 100 includes a content management file storage unit 210, a content input unit 211, a face detection unit 212, a face metadata creation unit 213, a virtual management information creation unit 214, a representative thumbnail image extraction unit 215, A content attribute information creation unit 216 and a recording control unit 217 are provided.

The content management file storage unit 210 stores the content management file 340, which records hierarchical entries organized in a virtual hierarchical structure. Details of the content management file 340 are shown in FIGS. 3 to 9 and elsewhere.

The content input unit 211 receives a content file and outputs the received content file to the face detection unit 212, the face metadata creation unit 213, the virtual management information creation unit 214, the representative thumbnail image extraction unit 215, and the content attribute information creation unit 216. Specifically, frames captured by the camera unit 110 are input sequentially through the content input unit 211.

顔検出部２１２は、コンテンツ入力部２１１から入力されたコンテンツファイルに含まれる顔を検出するものであり、検出された顔の出現時刻および位置等を顔メタデータ作成部２１３に出力する。なお、同一時刻の画像から複数の顔が検出された場合には、検出された各顔についての出現時刻および位置等を顔メタデータ作成部２１３に出力する。 The face detection unit 212 detects a face included in the content file input from the content input unit 211, and outputs the detected appearance time and position of the face to the face metadata creation unit 213. When a plurality of faces are detected from images at the same time, the appearance time and position of each detected face are output to the face metadata creation unit 213.

顔メタデータ作成部２１３は、コンテンツ入力部２１１から入力されたコンテンツファイルに基づいて顔メタデータを作成するものであり、作成された顔メタデータを記録制御部２１７に出力する。この顔メタデータ作成部２１３は、顔データ作成部２１８およびヘッダ情報作成部２１９を含む。顔データ作成部２１８は、顔検出部２１２により検出された顔の出現時刻および位置等に基づいてその顔に関する顔データ（図１１の顔データ部６４０の各データ）を作成するものである。また、ヘッダ情報作成部２１９は、顔データ作成部２１８により作成された顔データを管理するヘッダ情報（図１０のヘッダ部６３０の各情報）を作成するものである。これら顔データ作成部２１８により作成された顔データおよびヘッダ情報作成部２１９により作成されたヘッダ情報は、記録制御部２１７に出力される。また、顔データ作成部２１８は、所定間隔で検出された顔のうちで所定条件を満たさない顔については、顔に関する顔データを作成しないようにしてもよい。 The face metadata creation unit 213 creates face metadata based on the content file input from the content input unit 211, and outputs the created face metadata to the recording control unit 217. The face metadata creation unit 213 includes a face data creation unit 218 and a header information creation unit 219. The face data creation unit 218 creates face data related to the face (each data of the face data unit 640 in FIG. 11) based on the appearance time and position of the face detected by the face detection unit 212. The header information creation unit 219 creates header information (each information of the header part 630 in FIG. 10) for managing the face data created by the face data creation unit 218. The face data created by the face data creation unit 218 and the header information created by the header information creation unit 219 are output to the recording control unit 217. The face data creation unit 218 may not create face data related to a face that does not satisfy a predetermined condition among faces detected at a predetermined interval.

仮想管理情報作成部２１４は、コンテンツ入力部２１１から入力されたコンテンツファイルを仮想的に管理するための仮想管理情報４０１（図５）を、そのコンテンツファイルに基づいて作成するものであり、作成された仮想管理情報を記録制御部２１７に出力する。 The virtual management information creation unit 214 creates virtual management information 401 (FIG. 5) for virtually managing the content file input from the content input unit 211 based on the content file. The virtual management information is output to the recording control unit 217.

The representative thumbnail image extraction unit 215 extracts the representative thumbnail images 501 to 506 (FIG. 5) of a content file from the content file input from the content input unit 211, and outputs the extracted representative thumbnail images to the content attribute information creation unit 216 and the recording control unit 217.

The content attribute information creation unit 216 creates the content attribute information 402 (FIG. 5) for the content file input from the content input unit 211, based on that content file, and outputs the created content attribute information to the recording control unit 217. The content attribute information creation unit 216 also creates the attribute information so that the content attribute information for the content file corresponding to a representative thumbnail image extracted by the representative thumbnail image extraction unit 215 includes the recording position (thumbnail address) of that representative thumbnail image in the thumbnail file 500.

The recording control unit 217 records the moving image file entry 414, which includes the virtual management information 401 created by the virtual management information creation unit 214 and the content attribute information 402 created by the content attribute information creation unit 216, in the content management file storage unit 210 as the property file 400. The recording control unit 217 also records the metadata entry 415, which includes the face metadata created by the face metadata creation unit 213, in the content management file storage unit 210 as a lower layer, within the property file 400, of the moving image file entry 414 corresponding to the content file for which the face metadata was created. Further, the recording control unit 217 records the representative thumbnail images extracted by the representative thumbnail image extraction unit 215 in the content management file storage unit 210 as the thumbnail file 500.

図１８は、本発明の実施の形態における撮像装置１００の機能構成例を示すブロック図である。この撮像装置１００は、コンテンツ管理ファイル記憶部２１０と、操作受付部２２１と、コンテンツ記憶部２２３と、選択部２２４と、抽出部２２５と、描画部２２６と、表示部２２７とを備える。 FIG. 18 is a block diagram illustrating a functional configuration example of the imaging apparatus 100 according to the embodiment of the present invention. The imaging apparatus 100 includes a content management file storage unit 210, an operation reception unit 221, a content storage unit 223, a selection unit 224, an extraction unit 225, a drawing unit 226, and a display unit 227.

コンテンツ管理ファイル記憶部２１０は、記録制御部２１７（図１７）によって記録されたコンテンツ管理ファイル３４０を記憶するものである。そして、コンテンツ管理ファイル３４０に記録されている各エントリを選択部２２４および抽出部２２５に出力する。 The content management file storage unit 210 stores the content management file 340 recorded by the recording control unit 217 (FIG. 17). Then, each entry recorded in the content management file 340 is output to the selection unit 224 and the extraction unit 225.

操作受付部２２１は、各種入力キーを備え、これらの入力キーから操作入力を受け付けると、受け付けた操作入力の内容を選択部２２４に出力するものである。なお、操作受付部２２１の少なくとも一部と表示部２２７とをタッチパネルとして一体化して構成するようにしてもよい。 The operation accepting unit 221 includes various input keys, and when accepting an operation input from these input keys, outputs the content of the accepted operation input to the selection unit 224. Note that at least a part of the operation receiving unit 221 and the display unit 227 may be integrated as a touch panel.

コンテンツ記憶部２２３は、動画や静止画等のコンテンツファイルを記憶するものであり、記憶されているコンテンツファイルを抽出部２２５および描画部２２６に出力する。 The content storage unit 223 stores content files such as moving images and still images, and outputs the stored content files to the extraction unit 225 and the drawing unit 226.

The selection unit 224 executes a selection process according to the operation input received from the operation reception unit 221 and outputs the selection result to the extraction unit 225. Specifically, when the selection unit 224 receives from the operation reception unit 221 an operation input for selecting one of the representative thumbnail images displayed on the display unit 227, it selects the file entry corresponding to the selected representative thumbnail image and outputs the entry number of that file entry to the extraction unit 225. Likewise, when the selection unit 224 receives from the operation reception unit 221 an operation input for selecting one of the face thumbnail images displayed on the display unit 227, it selects the face data corresponding to the selected face thumbnail image and outputs the face detection time information 641 of that face data to the extraction unit 225. In other words, the selection unit 224 selects a desired file entry from among the file entries recorded in the content management file stored in the content management file storage unit 210, and selects desired face data from among the face data of the face metadata included in a metadata entry.

The extraction unit 225 extracts a content file stored in the content storage unit 223 based on the entry number of the file entry input from the selection unit 224. The extraction unit 225 also extracts the face data included in the metadata entry recorded in the layer below the file entry corresponding to the entry number input from the selection unit 224, and, based on the time, position, and other information of the face contained in that face data, extracts the face thumbnail image corresponding to the face data from the content file. Further, the extraction unit 225 extracts a content file based on the file entry recorded in the layer above the metadata entry that contains the face detection time information 641 of the selected face data input from the selection unit 224. The extraction unit 225 also extracts, from the content file stored in the content storage unit 223, the moving image recorded from the recording time corresponding to the face detection time information 641 input from the selection unit 224 onward. The extraction unit 225 outputs these extraction results to the drawing unit 226. The selection and extraction will be described in detail with reference to FIGS. 19 and 20.

The extraction unit 225 also checks whether an image corresponding to a content file stored in the content storage unit 223 and the face data corresponding to that image satisfy a predetermined condition. For the face data of a face included in an image that satisfies the condition, it calculates the recording offset of the desired element information from the beginning of each face data and reads the desired element information from the face data based on the calculated recording offset. When the predetermined condition is not satisfied, the extraction unit 225 searches for the face data and face data management information corresponding to an image other than the one determined not to satisfy the condition. Reading of the element information will be described in detail with reference to FIGS. 26, 27, 32, and 33.

Based on the extraction result input from the extraction unit 225, the drawing unit 226 draws the face thumbnail images extracted from the content file stored in the content storage unit 223, the moving image extracted from the content file stored in the content storage unit 223, and the like. The drawing unit 226 also draws the representative thumbnail images stored in the thumbnail file 500 of the content management file storage unit 210.

表示部２２７は、描画部２２６により描画された画像を表示するものである。 The display unit 227 displays the image drawn by the drawing unit 226.

次に、プロパティファイルと、サムネイルファイルと、動画コンテンツファイルとの関係について図面を参照して詳細に説明する。 Next, the relationship among the property file, thumbnail file, and moving image content file will be described in detail with reference to the drawings.

図１９は、動画ファイルエントリ４１４と、メタデータエントリ４１５と、サムネイルファイル５００と、動画コンテンツファイル３１２との関係を概略的に示す図である。 FIG. 19 is a diagram schematically illustrating a relationship among the moving image file entry 414, the metadata entry 415, the thumbnail file 500, and the moving image content file 312.

For example, as shown in FIG. 19, the moving image file entry 414 stores “A312”, which indicates the content address of the moving image content file 312, and “#2”, which indicates the thumbnail address of the representative thumbnail image 502 corresponding to the moving image content file 312. The child entry list of the moving image file entry 414 stores the entry number “#31” of the metadata entry 415, in which the metadata for the moving image content file 312 is stored. The parent entry list of the metadata entry 415 stores the entry number “#28” of the moving image file entry 414. Further, the face metadata of the metadata entry 415 stores various face metadata about the detected faces, as shown in FIGS. 9 and 11. Based on the face detection time information and the face basic information in this face metadata, one frame can be identified from among the frames of the moving image content file 312. These relationships are indicated by arrows.

このように各エントリの内容を関連付けて管理することによって、コンテンツファイルのサーチを迅速に行うことができる。 As described above, by managing the contents of each entry in association with each other, it is possible to quickly search for a content file.

例えば、２００６年１月１１日に撮影された動画像の一覧を表示する場合には、プロパティファイル４００の各エントリの中で、動画コンテンツファイルを管理する動画フォルダエントリ４１０がサーチされ、サーチされた動画フォルダエントリ４１０の中の子エントリリストに格納された日付フォルダエントリ４１１および４１６の中から、２００６年１月１１日の日付に対応するファイルを管理する日付フォルダエントリ４１１がサーチされる。続いて、サーチされた日付フォルダエントリ４１１の子エントリリストに格納された動画ファイルエントリ４１２および４１４がサーチされ、各動画ファイルエントリ４１２および４１４に記録されたサムネイルファイル５００のサムネイルアドレス（エントリ参照情報）が抽出される。続いて、サムネイルファイル５００がオープンされ、抽出されたサムネイルアドレスに基づいてサムネイルファイル５００から代表サムネイル画像が抽出され、抽出された代表サムネイル画像が表示される。 For example, to display a list of moving images shot on January 11, 2006, the entries of the property file 400 are searched for the moving image folder entry 410, which manages moving image content files. Then, from the date folder entries 411 and 416 stored in the child entry list of the moving image folder entry 410, the date folder entry 411, which manages the files corresponding to the date January 11, 2006, is found. Next, the moving image file entries 412 and 414 stored in the child entry list of the date folder entry 411 are found, and the thumbnail address (entry reference information) of the thumbnail file 500 recorded in each of the moving image file entries 412 and 414 is extracted. The thumbnail file 500 is then opened, the representative thumbnail images are extracted from it based on the extracted thumbnail addresses, and the extracted representative thumbnail images are displayed.
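The date-based search described above can be sketched as a walk over child entry lists. This is only an illustrative sketch: the `Entry` class, the `kind` labels, the date string format, and every entry number other than #28 and #31 are hypothetical, as is the second date folder's date.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    number: int                               # entry number in the property file
    kind: str                                 # hypothetical label: "video_folder", "date_folder", "video_file", "metadata"
    date: Optional[str] = None                # used only by date folder entries
    thumbnail_address: Optional[int] = None   # used only by file entries
    children: List[int] = field(default_factory=list)  # child entry list


def thumbnails_for_date(entries: Dict[int, Entry], date: str) -> List[int]:
    """Collect thumbnail addresses of all video files under the matching date folder."""
    root = next(e for e in entries.values() if e.kind == "video_folder")
    addresses: List[int] = []
    for folder_number in root.children:
        folder = entries[folder_number]
        if folder.kind != "date_folder" or folder.date != date:
            continue
        for file_number in folder.children:
            file_entry = entries[file_number]
            if file_entry.kind == "video_file":
                addresses.append(file_entry.thumbnail_address)
    return addresses


# Hypothetical property file contents; only #28, #31 and thumbnail address 2 come from the text.
entries = {
    1: Entry(1, "video_folder", children=[3, 5]),
    3: Entry(3, "date_folder", date="2006-01-11", children=[7, 28]),
    5: Entry(5, "date_folder", date="2006-07-05"),
    7: Entry(7, "video_file", thumbnail_address=1),
    28: Entry(28, "video_file", thumbnail_address=2, children=[31]),
    31: Entry(31, "metadata"),
}

addresses = thumbnails_for_date(entries, "2006-01-11")
```

Only the property file's entries are touched during this walk; the thumbnail file is opened once afterwards, and no content file is opened at all.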

なお、コンテンツ管理ファイル３４０を用いずに、２００６年１月１１日に撮影された動画像の一覧を表示する場合には、各コンテンツファイルをサーチするために、全ての実コンテンツファイルのオープンおよびクローズが必要となり、処理に時間を要する。さらに、代表サムネイル画像を表示する場合には、実コンテンツファイルに対応する画像を縮小して表示するという処理が必要になるため、さらに処理時間を要することになる。 Note that if a list of moving images shot on January 11, 2006 is displayed without using the content management file 340, every actual content file must be opened and closed in order to search the content files, which takes time. Furthermore, displaying a representative thumbnail image requires reducing the image corresponding to the actual content file before displaying it, which takes additional processing time.

また、例えば、２００６年１月１１日に記録された動画像に登場する人物の顔を表示する場合には、表示されている代表サムネイル画像５０２に基づいて、動画ファイルエントリ４１４およびメタデータエントリ４１５が抽出され、動画ファイルエントリ４１４が管理する動画コンテンツファイル３１２にアクセスされ、メタデータエントリ４１５に記憶されている顔メタデータ（顔検出時刻情報６４１、顔基本情報６４２）に基づいて動画コンテンツファイル３１２から顔画像が抽出され、抽出された顔画像を表示させることができる。 Further, for example, to display the faces of persons appearing in a moving image recorded on January 11, 2006, the moving image file entry 414 and the metadata entry 415 are extracted based on the displayed representative thumbnail image 502, and the moving image content file 312 managed by the moving image file entry 414 is accessed. Face images are then extracted from the moving image content file 312 based on the face metadata (face detection time information 641 and face basic information 642) stored in the metadata entry 415, and the extracted face images can be displayed.

図２０は、コンテンツ管理ファイル３４０を用いたアプリケーションの一例を示す図である。ここでは、動画コンテンツファイル３１２に関する各種画像をＬＣＤ１６２に表示させ、動画コンテンツファイル３１２に対応する画像を所望の時刻から再生する場合について説明する。 FIG. 20 is a diagram illustrating an example of an application using the content management file 340. Here, a case where various images related to the moving image content file 312 are displayed on the LCD 162 and an image corresponding to the moving image content file 312 is reproduced from a desired time will be described.

最初に、図１９で示したように、サムネイルファイル５００がオープンされ、サムネイルファイル５００に格納されている代表サムネイル画像５０１乃至５０６の一覧がＬＣＤ１６２に表示される。例えば、表示画面７１０に示すように、代表サムネイル画像５０１乃至５０３が表示される。また、選択マーク７１５が付されている代表サムネイル画像５０２の右側には、代表サムネイル画像５０２に対応する動画コンテンツファイル３１２の記録日時７１４が表示されている。また、上ボタン７１１または下ボタン７１２を押下することによって、スクロールバー７１３を上下に移動させ、表示画面７１０に表示される代表サムネイル画像を上下に移動させ、他の代表サムネイル画像を表示させることができる。また、代表サムネイル画像は、例えば、記録日時の順番で上から表示させることができる。 First, as shown in FIG. 19, the thumbnail file 500 is opened, and a list of the representative thumbnail images 501 to 506 stored in the thumbnail file 500 is displayed on the LCD 162. For example, as shown in the display screen 710, the representative thumbnail images 501 to 503 are displayed. The recording date and time 714 of the moving image content file 312 corresponding to the representative thumbnail image 502 is displayed to the right of the representative thumbnail image 502, to which the selection mark 715 is attached. Pressing the up button 711 or the down button 712 moves the scroll bar 713 up or down, which scrolls the representative thumbnail images on the display screen 710 so that other representative thumbnail images can be displayed. The representative thumbnail images can be displayed, for example, from the top in order of recording date and time.

表示画面７１０において、代表サムネイル画像５０２を選択する旨の操作入力がされると、代表サムネイル画像５０２に対応する動画ファイルエントリ４１４に格納されているコンテンツアドレスに基づいて、動画ファイルエントリ４１４に対応する動画コンテンツファイル３１２が抽出される。そして、動画ファイルエントリ４１４に格納されている子エントリリストに基づいて、動画ファイルエントリ４１４に対応するメタデータエントリ４１５が抽出される。続いて、メタデータエントリ４１５に格納されている顔メタデータに基づいて、動画コンテンツファイル３１２から顔サムネイル画像が抽出され、抽出された顔サムネイル画像の一覧がＬＣＤ１６２に表示される。この顔サムネイル画像は、例えば、表示画面７２０に示すように、一人の顔を含む矩形画像である。また、例えば、表示画面７２０に示すように、表示画面７１０で選択された代表サムネイル画像５０２が左側に表示されるとともに、右側の顔サムネイル画像表示領域７２５には、抽出された顔サムネイル画像７３０乃至７３２が表示される。また、選択されている顔サムネイル画像には、選択マーク７２６が付される。また、表示画面７１０で選択された代表サムネイル画像５０２に対応する動画コンテンツファイル３１２の記録日時７２４が表示されている。また、左ボタン７２１または右ボタン７２２を押下することによって、スクロールバー７２３を左右に移動させ、表示画面７２０に表示される顔サムネイル画像を左右に移動させ、他の顔サムネイル画像を表示させることができる。また、顔サムネイル画像は、例えば、記録日時の順番で左から表示させることができる。 When an operation input for selecting the representative thumbnail image 502 is made on the display screen 710, the moving image content file 312 corresponding to the moving image file entry 414 is extracted based on the content address stored in the moving image file entry 414, which corresponds to the representative thumbnail image 502. Then, based on the child entry list stored in the moving image file entry 414, the metadata entry 415 corresponding to the moving image file entry 414 is extracted. Subsequently, based on the face metadata stored in the metadata entry 415, face thumbnail images are extracted from the moving image content file 312, and a list of the extracted face thumbnail images is displayed on the LCD 162. Each face thumbnail image is, for example, a rectangular image containing one person's face, as shown in the display screen 720. Also, as shown in the display screen 720, for example, the representative thumbnail image 502 selected on the display screen 710 is displayed on the left side, and the extracted face thumbnail images 730 to 732 are displayed in the face thumbnail image display area 725 on the right side. A selection mark 726 is attached to the currently selected face thumbnail image. In addition, the recording date and time 724 of the moving image content file 312 corresponding to the representative thumbnail image 502 selected on the display screen 710 is displayed. Pressing the left button 721 or the right button 722 moves the scroll bar 723 left or right, which scrolls the face thumbnail images on the display screen 720 so that other face thumbnail images can be displayed. The face thumbnail images can be displayed, for example, from the left in order of recording date and time.

表示画面７２０において、例えば、顔サムネイル画像７３１を選択する旨の操作入力がされると、メタデータエントリ４１５に格納されている顔メタデータの顔検出時刻情報の中から、顔サムネイル画像７３１に対応する顔検出時刻情報が抽出される。この場合に、選択された顔サムネイル画像７３１についての先頭から順番に基づいて、メタデータエントリ４１５に格納されている顔メタデータから、顔サムネイル画像７３１に対応する顔データが特定され、この顔データに含まれる顔検出時刻情報が抽出される。続いて、抽出された顔検出時刻情報に基づいて、動画コンテンツファイル３１２のうちの顔検出時刻情報に対応する時刻からの再生画像がＬＣＤ１６２に表示される。例えば、図１９に示すように、動画コンテンツファイル３１２のフレーム７０４から動画が再生される。そして、表示画面７４０に示すように、その再生画像が表示されるとともに、右上部分には再生画像の記録日時７４１が表示される。このように、所定の人物（例えば、本人）が出現する時刻から動画を再生させたい場合には、その人物に関する顔サムネイル画像を選択することによって、その時刻からの再生を容易に行うことができる。なお、同一時刻の画像から複数の顔が検出された場合には、同一時刻の複数の顔データが作成される。この場合には、それぞれの顔データに基づいて顔サムネイル画像が抽出される。このため、同一時刻の顔サムネイル画像が複数表示される場合がある。このように、同一時刻の顔サムネイル画像が複数表示されている場合においては、同一時刻の顔サムネイル画像の何れかが選択された場合でも、同一時刻からの動画が再生される。 When, for example, an operation input for selecting the face thumbnail image 731 is made on the display screen 720, the face detection time information corresponding to the face thumbnail image 731 is extracted from the face detection time information of the face metadata stored in the metadata entry 415. In this case, the face data corresponding to the face thumbnail image 731 is identified in the face metadata stored in the metadata entry 415 based on the position of the selected face thumbnail image 731 counted from the top, and the face detection time information contained in that face data is extracted. Subsequently, based on the extracted face detection time information, the moving image content file 312 is played back on the LCD 162 starting from the time corresponding to the face detection time information. For example, as shown in FIG. 19, the moving image is played back from the frame 704 of the moving image content file 312. Then, as shown in the display screen 740, the playback image is displayed, and the recording date and time 741 of the playback image is displayed in the upper right part. In this way, when the user wants to play back a moving image from the time at which a particular person (for example, the user) appears, playback from that time can easily be started by selecting a face thumbnail image of that person. Note that when a plurality of faces are detected in an image at the same time, a plurality of pieces of face data with the same time are created. In this case, a face thumbnail image is extracted based on each piece of face data, so a plurality of face thumbnail images with the same time may be displayed. When a plurality of face thumbnail images with the same time are displayed, the moving image is played back from the same time regardless of which of those face thumbnail images is selected.
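The mapping from a selected face thumbnail to a playback start time can be sketched as follows. This is a minimal sketch under stated assumptions: the `FaceData` class, its field names, and the millisecond time unit are hypothetical, and the face data is assumed to be stored in the same order in which the face thumbnails are displayed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FaceData:
    detection_time_ms: int             # face detection time information (unit is an assumption)
    rect: Tuple[int, int, int, int]    # hypothetical face rectangle: x, y, width, height


def playback_start_time(face_records: List[FaceData], selected_index: int) -> int:
    """Return the playback start time for the face thumbnail at the given display position."""
    return face_records[selected_index].detection_time_ms


# Two faces were detected in the same frame at 5000 ms, so two records share one time.
faces = [
    FaceData(0, (10, 10, 40, 40)),
    FaceData(5000, (20, 30, 48, 48)),
    FaceData(5000, (200, 40, 52, 52)),
    FaceData(12000, (90, 60, 44, 44)),
]

start = playback_start_time(faces, 1)
```

Selecting either of the two thumbnails detected at 5000 ms yields the same start time, which matches the behavior described for faces detected at the same time.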

このように、仮想ファイル構造であるエントリから実ファイル構造への連結情報（コンテンツアドレス）が格納されているため、ファイルエントリ内の何らかの情報（例えば、記録日時に関する情報）からコンテンツファイルを検索して再生する場合には、その日時が記録されているファイルエントリを検索し、そのファイルエントリ内のコンテンツアドレスに基づいてコンテンツファイルを再生することができる。このように、全てのコンテンツファイルをオープンさせずにプロパティファイルのみをオープンさせればよく、さらに、スロットによる固定長管理（エントリ番号管理）であるため、迅速な処理が可能となる。 In this way, since the connection information (content address) from the entry having the virtual file structure to the real file structure is stored, the content file is searched from some information in the file entry (for example, information on the recording date and time). In the case of reproduction, the file entry in which the date and time is recorded can be searched, and the content file can be reproduced based on the content address in the file entry. In this way, it is only necessary to open the property file without opening all the content files. Furthermore, since the fixed length management (entry number management) is performed by the slot, rapid processing is possible.
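Why fixed-length slot management is fast can be illustrated with a direct seek. The sketch below assumes a hypothetical slot size of 512 bytes, one slot per entry, and no file header before the first slot; none of these values is taken from the description.

```python
import io

SLOT_SIZE = 512  # hypothetical slot size in bytes


def read_entry(property_file, entry_number: int, slot_size: int = SLOT_SIZE) -> bytes:
    """Read one entry by seeking straight to its slot (assumes one slot per entry)."""
    property_file.seek(entry_number * slot_size)
    return property_file.read(slot_size)


# Demo property file: four slots, each filled with a byte equal to its own entry number.
demo_file = io.BytesIO(b"".join(bytes([n]) * SLOT_SIZE for n in range(4)))
entry = read_entry(demo_file, 3)
```

Because the byte position follows directly from the entry number, no scan over preceding entries is needed.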

仮に、仮想ファイル管理をしない場合において、同様の検索を行う場合には、実際にコンテンツファイルをオープンさせた後に、その内部の情報（例えば、記録日時情報）を読み出し、ファイルクローズし、さらに次のコンテンツファイルをオープンするという処理が必要となり検索に莫大な時間を要する。また、記録媒体の記録容量が大きくなれば、記録されるコンテンツ数も増加するため、問題がさらに顕著になる。 If the same search is performed when virtual file management is not performed, after the content file is actually opened, the internal information (for example, recording date and time information) is read, the file is closed, and the next The process of opening the content file is required, and the search takes an enormous amount of time. In addition, as the recording capacity of the recording medium increases, the number of contents to be recorded increases, and the problem becomes even more pronounced.

次に、本発明の実施の形態における撮像装置１００の動作について図面を参照して説明する。 Next, the operation of the imaging apparatus 100 according to the embodiment of the present invention will be described with reference to the drawings.

図２１は、撮像装置１００によるプロパティファイル４００の記録処理の処理手順を示すフローチャートである。なお、ここでは、コンテンツファイルとして、撮像された画像データに対応する動画コンテンツファイルが入力された場合について説明する。 FIG. 21 is a flowchart illustrating a processing procedure for recording the property file 400 by the imaging apparatus 100. Here, a case where a moving image content file corresponding to captured image data is input as the content file will be described.

最初に、カメラ部１１０で撮像された画像が符号化され、符号化された画像データであるストリームがコンテンツ入力部２１１に入力される（ステップＳ９０１）。 First, an image captured by the camera unit 110 is encoded, and a stream that is encoded image data is input to the content input unit 211 (step S901).

続いて、入力されたストリームを構成するフレームが、シーケンスの先頭のＩピクチャまたはＩＤＲピクチャであるか否かが順次判断される（ステップＳ９０２）。入力されたストリームを構成するフレームが、ＩピクチャおよびＩＤＲピクチャの何れでもなければ（ステップＳ９０２）、ストリームの入力が継続される（ステップＳ９０１）。 Subsequently, it is sequentially determined whether or not the frame constituting the input stream is the first I picture or IDR picture of the sequence (step S902). If the frame constituting the input stream is neither an I picture nor an IDR picture (step S902), the stream input is continued (step S901).

一方、入力されたストリームを構成するフレームが、ＩピクチャまたはＩＤＲピクチャであれば（ステップＳ９０２）、そのフレームから顔検出部２１２が顔を検出する（ステップＳ９０３）。続いて、検出された顔が所定条件の範囲内の顔であるか否かが判断される（ステップＳ９０４）。顔が検出されなかった場合、または、検出された顔が所定条件の範囲内の顔でなかった場合には（ステップＳ９０４）、ステップＳ９０３に戻り、フレームからの顔の検出を繰り返す。 On the other hand, if the frame constituting the input stream is an I picture or IDR picture (step S902), the face detection unit 212 detects a face from the frame (step S903). Subsequently, it is determined whether or not the detected face is a face within a predetermined condition range (step S904). If no face is detected, or if the detected face is not within the range of the predetermined condition (step S904), the process returns to step S903, and the face detection from the frame is repeated.

一方、検出された顔が所定条件の範囲内の顔であった場合には（ステップＳ９０４）、検出された顔に基づいて顔データが作成され、作成された顔データがメモリに記録される（ステップＳ９０５）。続いて、１つのフレーム内において顔の検出が終了したか否かが判断される（ステップＳ９０６）。つまり、１フレーム内の全ての領域で顔検出を行う。１つのフレーム内において顔の検出が終了していなければ（ステップＳ９０６）、ステップＳ９０３に戻り、フレームからの顔の検出を繰り返す。 On the other hand, if the detected face is a face within the range of the predetermined condition (step S904), face data is created based on the detected face, and the created face data is recorded in the memory ( Step S905). Subsequently, it is determined whether or not face detection has been completed within one frame (step S906). That is, face detection is performed in all areas within one frame. If face detection is not completed within one frame (step S906), the process returns to step S903, and face detection from the frame is repeated.

一方、１つのフレーム内において顔の検出が終了していれば（ステップＳ９０６）、ストリームの入力が終了したか否かが判断される（ステップＳ９０７）。つまり、１つのまとまった画像コンテンツの入力が終了したか否かが判断される（ステップＳ９０７）。ストリームの入力が終了していなければ（ステップＳ９０７）、ステップＳ９０１に戻り、ストリームの入力を継続する。 On the other hand, if face detection has been completed within one frame (step S906), it is determined whether or not the input of the stream has been completed (step S907). That is, it is determined whether or not input of a single image content has been completed (step S907). If the input of the stream is not completed (step S907), the process returns to step S901, and the input of the stream is continued.

ストリームの入力が終了していれば（ステップＳ９０７）、メモリに記録されている顔データに基づいて顔メタデータのヘッダ部６３０（図１０）に記録されるヘッダ情報が作成される（ステップＳ９０８）。 If the input of the stream has been completed (step S907), the header information to be recorded in the header portion 630 (FIG. 10) of the face metadata is created based on the face data recorded in the memory (step S908).
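Steps S901 to S908 can be sketched as a loop that examines only I and IDR pictures. This is an illustrative sketch only: the frame dictionaries stand in for a real decoder and face detector, `MIN_FACE_SIZE` is a hypothetical stand-in for the "predetermined condition", and the header here holds only a face count.

```python
from typing import Dict, List

MIN_FACE_SIZE = 20  # hypothetical stand-in for the "predetermined condition" of step S904


def collect_face_data(frames: List[Dict]) -> List[Dict]:
    """Gather face data from I/IDR pictures only, keeping faces that meet the condition."""
    face_data = []
    for frame in frames:                           # S901: stream input
        if frame["type"] not in ("I", "IDR"):      # S902: skip other picture types
            continue
        for face in frame["faces"]:                # S903/S906: every face in the frame
            if face["size"] >= MIN_FACE_SIZE:      # S904: condition check
                face_data.append({"time": frame["time"], **face})  # S905: record in memory
    return face_data


def build_header(face_data: List[Dict]) -> Dict:
    """S908: derive header information from the collected face data."""
    return {"face_count": len(face_data)}


# Hypothetical decoded stream; "faces" holds what a detector would have returned.
frames = [
    {"type": "IDR", "time": 0, "faces": [{"size": 32}, {"size": 8}]},
    {"type": "P", "time": 33, "faces": [{"size": 40}]},
    {"type": "I", "time": 500, "faces": [{"size": 25}]},
]

face_data = collect_face_data(frames)
header = build_header(face_data)
```

The face in the P picture is never examined, and the 8-pixel face in the first frame is dropped by the condition check.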

続いて、作成されたヘッダ情報を記録するヘッダ部と、検出された顔に対応する顔データを記録する顔データ部とを含むメタデータエントリが作成される（ステップＳ９０９）。続いて、入力されたストリームに対応する動画コンテンツファイルを管理するファイルエントリが作成される（ステップＳ９１０）。 Subsequently, a metadata entry including a header portion that records the created header information and a face data portion that records face data corresponding to the detected face is created (step S909). Subsequently, a file entry for managing the moving image content file corresponding to the input stream is created (step S910).

続いて、プロパティファイル４００がオープンされ（ステップＳ９１１）、作成されたメタデータエントリおよびファイルエントリについてのエントリ番号が計算され、この計算された結果に基づいて、作成されたメタデータエントリおよびファイルエントリがプロパティファイル４００に割り当てられる（ステップＳ９１２）。つまり、複数のエントリがスロット番号順にプロパティファイル４００に割り当てられる。 Subsequently, the property file 400 is opened (step S911), entry numbers are calculated for the created metadata entry and file entry, and the created metadata entry and file entry are assigned to the property file 400 based on the calculated result (step S912). That is, the entries are assigned to the property file 400 in slot number order.

続いて、プロパティファイル４００に割り当てられたファイルエントリの子エントリリストに、このファイルエントリに属するメタデータエントリのエントリ番号が記録され、また、このメタデータエントリの親エントリリストに、このメタデータエントリが属するファイルエントリのエントリ番号が記録される（ステップＳ９１３）。 Subsequently, the entry number of the metadata entry belonging to the file entry is recorded in the child entry list of the file entry assigned to the property file 400, and the metadata entry is recorded in the parent entry list of the metadata entry. The entry number of the file entry to which it belongs is recorded (step S913).

続いて、プロパティファイル４００に割り当てられたファイルエントリが属するフォルダエントリの子エントリリストに、このファイルエントリのエントリ番号が記録され、また、このファイルエントリの親エントリリストに、このフォルダエントリのエントリ番号が記録される（ステップＳ９１４）。続いて、プロパティファイル４００がクローズされて（ステップＳ９１５）、プロパティファイル４００の記録処理の処理手順が終了する。 Subsequently, the entry number of this file entry is recorded in the child entry list of the folder entry to which the file entry assigned to the property file 400 belongs, and the entry number of this folder entry is recorded in the parent entry list of this file entry (step S914). The property file 400 is then closed (step S915), and the recording procedure for the property file 400 ends.
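The two-way linking of steps S912 to S914 can be sketched as below. The dictionary layout and the `register_content` helper are hypothetical, and taking "highest number in use plus one" as the next entry number is an assumption made for illustration; the description only says that entry numbers are calculated.

```python
from typing import Dict, Tuple


def register_content(entries: Dict[int, dict], folder_number: int,
                     content_address: str, thumbnail_address: int,
                     face_metadata: dict) -> Tuple[int, int]:
    """Add a file entry and its metadata entry, recording parent/child links both ways."""
    file_number = max(entries) + 1       # assumed numbering rule (see lead-in)
    meta_number = file_number + 1
    # S912: assign both entries to the property file.
    entries[file_number] = {
        "kind": "video_file",
        "content_address": content_address,
        "thumbnail_address": thumbnail_address,
        "parents": [folder_number],      # S914: folder is the parent of the file entry
        "children": [meta_number],       # S913: metadata entry is the child of the file entry
    }
    entries[meta_number] = {
        "kind": "metadata",
        "face_metadata": face_metadata,
        "parents": [file_number],        # S913: file entry is the parent of the metadata entry
        "children": [],
    }
    entries[folder_number]["children"].append(file_number)  # S914
    return file_number, meta_number


entries = {0: {"kind": "date_folder", "parents": [], "children": []}}
file_no, meta_no = register_content(entries, 0, "A312", 2, {"face_count": 3})
```

Recording each link in both directions lets playback go from a file entry to its face metadata and from a metadata entry back to the content address without any search.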

なお、ステップＳ９０１で入力されたストリームを構成するフレームが、先頭のフレームである場合には、代表画像である代表サムネイル画像が抽出され（ステップＳ９０３）、この代表サムネイル画像がサムネイルファイル５００に格納されるとともに、この代表サムネイル画像のサムネイルアドレスが、対応するファイルエントリのサムネイルアドレスに記録される（ステップＳ９１２）。また、入力されたストリームに対応するコンテンツファイルのコンテンツアドレスが、対応するファイルエントリのコンテンツアドレスに格納される（ステップＳ９１２）。 If the frame constituting the stream input in step S901 is the first frame, a representative thumbnail image that is a representative image is extracted (step S903), and the representative thumbnail image is stored in the thumbnail file 500. At the same time, the thumbnail address of the representative thumbnail image is recorded in the thumbnail address of the corresponding file entry (step S912). Further, the content address of the content file corresponding to the input stream is stored in the content address of the corresponding file entry (step S912).

次に、動画コンテンツファイルを再生する場合に、所望する撮影時刻から再生させる場合における動作について図面を参照して説明する。 Next, an operation in the case of reproducing a moving image content file from a desired shooting time will be described with reference to the drawings.

図２２乃至図２４は、撮像装置１００による動画コンテンツファイルの再生処理の処理手順を示すフローチャートである。 FIG. 22 to FIG. 24 are flowcharts showing the processing procedure of the moving image content file playback processing by the imaging apparatus 100.

操作部１４０からの操作入力を監視して、動画コンテンツファイルの一覧表示を指示する旨の操作入力がされたか否かが判断される（ステップＳ９２１）。コンテンツ一覧表示を指示する旨の操作入力がされなければ（ステップＳ９２１）、操作入力の監視を継続する。 The operation input from the operation unit 140 is monitored, and it is determined whether or not an operation input for instructing to display a list of moving image content files has been made (step S921). If there is no operation input for instructing content list display (step S921), the monitoring of the operation input is continued.

コンテンツ一覧表示を指示する旨の操作入力がされると（ステップＳ９２１）、プロパティファイル４００をオープンさせ（ステップＳ９２２）、プロパティファイル４００から動画コンテンツファイルを管理するフォルダエントリが抽出される（ステップＳ９２３）。続いて、抽出されたフォルダエントリに記録されている子エントリリストから、日付フォルダエントリのエントリ番号が抽出され、抽出されたエントリ番号に基づいて日付フォルダエントリが抽出される（ステップＳ９２４）。 When an operation input for instructing content list display is made (step S921), the property file 400 is opened (step S922), and a folder entry for managing the video content file is extracted from the property file 400 (step S923). . Subsequently, the entry number of the date folder entry is extracted from the child entry list recorded in the extracted folder entry, and the date folder entry is extracted based on the extracted entry number (step S924).

続いて、抽出された日付フォルダエントリに記録されている子エントリリストから、動画ファイルエントリのエントリ番号が抽出され、抽出されたエントリ番号に基づいて動画ファイルエントリが抽出される（ステップＳ９２５）。続いて、抽出されたファイルエントリのエントリ番号がメモリに順次記録される（ステップＳ９２６）。続いて、メモリに記録されたエントリ番号に対応するファイルエントリに記録されているサムネイルアドレスがメモリに順次記録される（ステップＳ９２７）。 Subsequently, the entry number of the moving image file entry is extracted from the child entry list recorded in the extracted date folder entry, and the moving image file entry is extracted based on the extracted entry number (step S925). Subsequently, the entry numbers of the extracted file entries are sequentially recorded in the memory (step S926). Subsequently, the thumbnail addresses recorded in the file entry corresponding to the entry number recorded in the memory are sequentially recorded in the memory (step S927).

続いて、１つの日付フォルダエントリに属するファイルエントリに記録されているサムネイルアドレスの抽出が全て終了したか否かが判断される（ステップＳ９２８）。終了していなければ、ステップＳ９２７に戻り、抽出処理を繰り返す。 Subsequently, it is determined whether or not extraction of all thumbnail addresses recorded in the file entry belonging to one date folder entry is completed (step S928). If not completed, the process returns to step S927 to repeat the extraction process.

一方、サムネイルアドレスの抽出が全て終了していれば（ステップＳ９２８）、全ての日付フォルダエントリについての抽出が終了したか否かが判断される（ステップＳ９２９）。全ての日付フォルダエントリについての抽出が終了していなければ（ステップＳ９２９）、ステップＳ９２５に戻り、抽出処理を繰り返す。 On the other hand, if the extraction of all thumbnail addresses has been completed (step S928), it is determined whether the extraction for all date folder entries has been completed (step S929). If extraction has not been completed for all date folder entries (step S929), the process returns to step S925 to repeat the extraction process.

全ての日付フォルダエントリについての抽出が終了していれば（ステップＳ９２９）、プロパティファイル４００をクローズさせ（ステップＳ９３０）、サムネイルファイル５００をオープンさせる（ステップＳ９３１）。続いて、ステップＳ９２７においてメモリに記録されたサムネイルアドレスに基づいて、サムネイルファイル５００から代表サムネイル画像が読み出され、読み出された代表サムネイル画像がメモリに順次記録される（ステップＳ９３２）。続いて、サムネイルファイル５００をクローズさせる（ステップＳ９３３）。続いて、ステップＳ９３２においてメモリに記録された代表サムネイル画像がＬＣＤ１６２に表示される（ステップＳ９３４）。例えば、図２０の表示画面７１０に示すように表示される。 If extraction for all date folder entries has been completed (step S929), the property file 400 is closed (step S930), and the thumbnail file 500 is opened (step S931). Subsequently, based on the thumbnail address recorded in the memory in step S927, the representative thumbnail image is read from the thumbnail file 500, and the read representative thumbnail images are sequentially recorded in the memory (step S932). Subsequently, the thumbnail file 500 is closed (step S933). Subsequently, the representative thumbnail image recorded in the memory in step S932 is displayed on the LCD 162 (step S934). For example, it is displayed as shown in a display screen 710 in FIG.

続いて、ＬＣＤ１６２に表示されているサムネイル画像の中から、１つのサムネイル画像を選択する旨の操作入力が操作部１４０からされたか否かが判断される（ステップＳ９３５）。この操作入力がなければ（ステップＳ９３５）、操作入力の監視を継続する。 Subsequently, it is determined whether or not an operation input for selecting one thumbnail image from the thumbnail images displayed on the LCD 162 is made from the operation unit 140 (step S935). If there is no operation input (step S935), monitoring of the operation input is continued.

代表サムネイル画像を選択する旨の操作入力がされると（ステップＳ９３５）、選択された代表サムネイル画像の順番に基づいて、ステップＳ９２６においてメモリに記録されたファイルエントリのエントリ番号が抽出される（ステップＳ９３６）。続いて、プロパティファイル４００がオープンされ（ステップＳ９３７）、抽出されたエントリ番号に対応するファイルエントリがプロパティファイル４００から抽出される（ステップＳ９３８）。 When an operation input for selecting a representative thumbnail image is made (step S935), the entry number of the file entry recorded in the memory in step S926 is extracted based on the position of the selected representative thumbnail image in the display order (step S936). Subsequently, the property file 400 is opened (step S937), and the file entry corresponding to the extracted entry number is extracted from the property file 400 (step S938).

続いて、抽出されたファイルエントリに記録されている子エントリリストからメタデータエントリのエントリ番号が抽出され、抽出されたメタデータエントリのエントリ番号がメモリに記録される（ステップＳ９３９）。続いて、メモリに記録されたエントリ番号に対応するメタデータエントリがプロパティファイルから抽出される（ステップＳ９４０）。続いて、抽出されたメタデータエントリから顔メタデータが抽出され（ステップＳ９４１）、抽出された顔メタデータのヘッダ部の情報が確認される（ステップＳ９４２）。 Subsequently, the entry number of the metadata entry is extracted from the child entry list recorded in the extracted file entry, and the entry number of the extracted metadata entry is recorded in the memory (step S939). Subsequently, a metadata entry corresponding to the entry number recorded in the memory is extracted from the property file (step S940). Subsequently, face metadata is extracted from the extracted metadata entry (step S941), and information of the header portion of the extracted face metadata is confirmed (step S942).

続いて、ヘッダ部の情報に基づいて顔データが順次読み出され（ステップＳ９４３）、読み出された顔データに含まれる顔基本情報がメモリに順次記録される（ステップＳ９４４）。続いて、全ての顔データの読み出しが終了したか否かが判断される（ステップＳ９４５）。全ての顔データの読み出しが終了していなければ（ステップＳ９４５）、顔データの読み出しおよびメモリへの記録を継続する（ステップＳ９４３およびステップＳ９４４）。全ての顔データの読み出しが終了していれば（ステップＳ９４５）、プロパティファイル４００をクローズさせ（ステップＳ９４６）、ステップＳ９４４においてメモリに記録された顔基本情報に基づいて、動画コンテンツファイルから顔サムネイル画像が作成され、作成された顔サムネイル画像がメモリに順次記録される（ステップＳ９４７）。続いて、ステップＳ９４７においてメモリに記録された顔サムネイル画像がＬＣＤ１６２に表示される（ステップＳ９４８）。例えば、図２０の表示画面７２０に示すように表示される。 Subsequently, the face data is sequentially read based on the information in the header portion (step S943), and the face basic information contained in the read face data is sequentially recorded in the memory (step S944). It is then determined whether all the face data has been read (step S945). If not all the face data has been read (step S945), reading of the face data and recording to the memory continue (steps S943 and S944). If all the face data has been read (step S945), the property file 400 is closed (step S946), face thumbnail images are created from the moving image content file based on the face basic information recorded in the memory in step S944, and the created face thumbnail images are sequentially recorded in the memory (step S947). Subsequently, the face thumbnail images recorded in the memory in step S947 are displayed on the LCD 162 (step S948), for example as shown in the display screen 720 of FIG. 20.

続いて、ＬＣＤ１６２に表示されている顔サムネイル画像の中から、１つの顔サムネイル画像を選択する旨の操作入力が操作部１４０からされたか否かが判断される（ステップＳ９４９）。この操作入力がなければ（ステップＳ９４９）、操作入力の監視を継続する。 Subsequently, it is determined whether or not an operation input for selecting one face thumbnail image from the face thumbnail images displayed on the LCD 162 is made from the operation unit 140 (step S949). If there is no operation input (step S949), monitoring of the operation input is continued.

顔サムネイル画像を選択する旨の操作入力がされると（ステップＳ９４９）、選択された顔サムネイル画像の表示順に応じた番号がメモリに記録される（ステップＳ９５０）。続いて、プロパティファイル４００がオープンされ（ステップＳ９５１）、ステップＳ９３９においてメモリに記録されたメタデータエントリのエントリ番号に基づいて、このメタデータエントリがプロパティファイル４００から抽出される（ステップＳ９５２）。 When an operation input for selecting a face thumbnail image is made (step S949), a number corresponding to the display order of the selected face thumbnail image is recorded in the memory (step S950). Subsequently, the property file 400 is opened (step S951), and the metadata entry is extracted from the property file 400 based on the entry number of the metadata entry recorded in the memory in step S939 (step S952).

続いて、抽出されたメタデータエントリから顔メタデータが抽出され（ステップＳ９５３）、抽出された顔メタデータから、ステップＳ９５０においてメモリに記録された番号に対応する顔データが抽出される（ステップＳ９５４）。続いて、抽出された顔データから顔検出時刻情報が抽出され、抽出された顔検出時刻情報がメモリに記録される（ステップＳ９５５）。 Subsequently, face metadata is extracted from the extracted metadata entry (step S953), and the face data corresponding to the number recorded in the memory in step S950 is extracted from the extracted face metadata (step S954). Face detection time information is then extracted from the extracted face data and recorded in the memory (step S955).

続いて、メモリにエントリ番号が記録されているメタデータエントリの親エントリリストに対応するファイルエントリのエントリ番号が抽出され（ステップＳ９５６）、抽出されたエントリ番号に対応するファイルエントリがプロパティファイル４００から抽出される（ステップＳ９５７）。続いて、抽出されたファイルエントリに記録されているコンテンツアドレスが抽出され、抽出されたコンテンツアドレスがメモリに記録される（ステップＳ９５８）。そして、プロパティファイル４００がクローズされる（ステップＳ９５９）。 Subsequently, the entry number of the file entry listed in the parent entry list of the metadata entry whose entry number is recorded in the memory is extracted (step S956), and the file entry corresponding to the extracted entry number is extracted from the property file 400 (step S957). The content address recorded in the extracted file entry is then extracted and recorded in the memory (step S958). Then, the property file 400 is closed (step S959).

続いて、ステップＳ９５７において抽出されたコンテンツアドレスに対応するコンテンツファイルについて、ステップＳ９５５においてメモリに記録された顔検出時刻情報に対応する時刻から再生を開始させる（ステップＳ９６０）。 Subsequently, the reproduction of the content file corresponding to the content address extracted in step S957 is started from the time corresponding to the face detection time information recorded in the memory in step S955 (step S960).

図２５は、図９に示すメタデータエントリ６００に含まれる顔メタデータ６２０の構成を概略的に示す図である。ここでは、顔データに記録されているデータをデータ１乃至６として、顔データの読出処理における顔データのオフセット値の計算方法について説明する。 FIG. 25 is a diagram schematically showing a configuration of face metadata 620 included in the metadata entry 600 shown in FIG. Here, a method for calculating the offset value of the face data in the face data reading process will be described using the data recorded in the face data as data 1 to 6.

顔メタデータ６２０のヘッダサイズａは、顔メタデータ６２０のヘッダ部６３０のヘッダサイズ６３１に記録されている。また、顔メタデータ６２０の顔データサイズｂは、顔メタデータ６２０のヘッダ部６３０の顔データサイズ６３６に記録されている。ｃは、１つの顔データの所望データまでの距離を示す。顔メタデータ６２０から必要なデータを読み出す場合には、各顔データの先頭からのオフセット値を、以下に示す式１を用いて計算し、計算して求められたオフセット値を用いてデータを読み出す。これにより、顔データに記録されているデータから所望のデータを読み出す場合に、読出処理を迅速に行うことができる。例えば、図２５には、必要なデータ（所望データ）がデータ３である場合を示す。 The header size a of the face metadata 620 is recorded in the header size 631 of the header portion 630 of the face metadata 620. The face data size b of the face metadata 620 is recorded in the face data size 636 of the header portion 630 of the face metadata 620. c denotes the distance from the start of one piece of face data to the desired data within it. When necessary data is read from the face metadata 620, the offset value of the desired data in each piece of face data is calculated using Equation 1 below, and the data is read using the calculated offset value. This allows the desired data to be read quickly from the data recorded in the face data. For example, FIG. 25 shows the case where the necessary data (desired data) is data 3.
ａ＋ｃ＋ｎ×ｂ（ｎ：０以上の整数）［ｂｙｔｅ］……（式１） a + c + n × b (n: an integer of 0 or more) [bytes] ... (Equation 1)
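Equation 1 can be exercised on a small byte buffer. The sketch below is illustrative only: the two-field header layout, the 4-byte little-endian fields, and the helper names `field_offset` and `read_field` are assumptions; the actual header portion 630 holds more fields than shown here.

```python
import struct

FIELD_SIZE = 4                    # hypothetical: each of data 1 to 6 is a 4-byte integer
HEADER_SIZE = 8                   # a: hypothetical header holding only two 4-byte fields
FACE_DATA_SIZE = 6 * FIELD_SIZE   # b: one piece of face data = data 1 to 6


def field_offset(a: int, b: int, c: int, n: int) -> int:
    """Equation 1: byte offset of the desired data in the n-th piece of face data."""
    if n < 0:
        raise ValueError("n must be an integer of 0 or more")
    return a + c + n * b


def read_field(blob: bytes, n: int, field_index: int) -> int:
    """Read data `field_index` (1-based) of the n-th face data without scanning earlier data."""
    a, b = struct.unpack_from("<II", blob, 0)   # header size and face data size from the header
    c = (field_index - 1) * FIELD_SIZE          # distance to the desired data inside one face data
    (value,) = struct.unpack_from("<I", blob, field_offset(a, b, c, n))
    return value


# Demo face metadata: the header followed by three pieces of face data.
# Data k of face data n holds the value 100 * n + k so that reads are easy to check.
header = struct.pack("<II", HEADER_SIZE, FACE_DATA_SIZE)
records = b"".join(
    struct.pack("<6I", *(100 * n + k for k in range(1, 7))) for n in range(3)
)
blob = header + records

value = read_field(blob, 2, 3)    # data 3 of the third piece of face data
```

Because a and b are read from the header rather than hard-coded, a reader written this way keeps working if a later metadata version enlarges the header or appends fields to each piece of face data.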

図２６は、撮像装置１００による顔データの読出処理の処理手順を示すフローチャートである。この処理手順は、例えば、図２３に示すステップＳ９４１乃至ステップＳ９４３に対応する。ここでは、図１０に示すヘッダ部６３０を参照して説明する。 FIG. 26 is a flowchart illustrating a processing procedure of face data reading processing by the imaging apparatus 100. This processing procedure corresponds to, for example, steps S941 to S943 shown in FIG. Here, description will be made with reference to the header portion 630 shown in FIG.

最初に、メタデータエントリから顔メタデータが読み出される（ステップＳ９７１）。続いて、読み出された顔メタデータのヘッダ部６３０の情報が読み出される（ステップＳ９７２）。続いて、読み出されたヘッダ部６３０のメタデータバージョン６３２に記録されている顔メタデータのバージョン情報に基づいて、撮像装置１００が対応可能な顔メタデータのバージョンであるか否かが判断される（ステップＳ９７３）。なお、ここでは、所望するデータが存在する顔メタデータのバージョンであるか否かも判断される。例えば、バージョン「１．１０」から付加された顔メタデータを使用する場合において、バージョン「１．００」が確認された場合には、ステップＳ９８０に進む。 First, face metadata is read from the metadata entry (step S971). Subsequently, information of the header part 630 of the read face metadata is read (step S972). Subsequently, based on the face metadata version information recorded in the metadata version 632 of the read header section 630, it is determined whether or not the face metadata version is compatible with the imaging apparatus 100. (Step S973). Here, it is also determined whether the version of the face metadata includes the desired data. For example, when the face metadata added from version “1.10” is used and the version “1.00” is confirmed, the process proceeds to step S980.

対応可能な顔メタデータのバージョンではない場合には（ステップＳ９７３）、ステップＳ９８０に進み、コンテンツ記憶部２２３に記憶されている全てのコンテンツについて顔データの読出処理が終了したか否かが判断される（ステップＳ９８０）。 If the face metadata version is not compatible (step S973), the process proceeds to step S980, where it is determined whether or not the face data reading process has been completed for all the contents stored in the content storage unit 223. (Step S980).

対応可能な顔メタデータのバージョンである場合には（ステップＳ９７３）、対応する動画コンテンツファイルの更新日時と、ヘッダ部６３０のコンテンツ更新日時６３３に記録されている更新日時とが同じであるか否かが判断される（ステップＳ９７４）。 If the face metadata version is compatible (step S973), it is determined whether the update date and time of the corresponding moving image content file is the same as the update date and time recorded in the content update date and time 633 of the header portion 630 (step S974).

If the update date and time of the moving image content file does not match the update date and time recorded in the content update date and time 633 of the header section 630 (step S974), it is determined whether face re-detection is enabled (step S982). If face re-detection is enabled, the property file recording process of step S900 is executed for the moving image content file whose update date and time was found not to match (step S900), and the process returns to step S971. Face metadata is then read from the metadata entry corresponding to the moving image content file for which the property file recording process was executed (step S971).

If the update date and time of the moving image content file matches the update date and time recorded in the content update date and time 633 of the header section 630 (step S974), it is determined whether the image size of the corresponding moving image content file matches the image size recorded in the content image size 638 of the header section 630 (step S975). If the two image sizes do not match (step S975), the process proceeds to step S982 and the processing described above is repeated.

If the image size of the moving image content file matches the image size recorded in the content image size 638 of the header section 630 (step S975), it is determined whether "0" is recorded in the face data count 635 of the header section 630 (step S976). If "0" is recorded in the face data count 635 (step S976), no face was detected in the target moving image content file and no face data exists, so the process proceeds to step S980.

If "0" is not recorded in the face data count 635 (step S976), it is determined, based on what is recorded in the face data structure flag 660 of the header section 630, whether the required data is recorded as face data (step S977). This check is made because the required data may be absent even when the version is the same. If the required data is not recorded as face data (step S977), the process proceeds to step S980.

If the required data is recorded as face data (step S977), the offset to the required data within the face data is calculated using Expression 1, based on what is recorded in the face data structure flag 660 (step S978). The calculation determines at which byte from the beginning of the face data the required data starts, and also serves to find out how the face data is structured. Next, the face data is read based on the calculated offset (step S979). It is then determined whether the face data reading process has been completed for all content stored in the content storage unit 223 (step S980). If it has been completed for all content stored in the content storage unit 223 (step S980), the face data reading process ends.

On the other hand, if the face data reading process has not been completed for all content stored in the content storage unit 223 (step S980), face metadata is selected from the metadata entry corresponding to content, among the content stored in the content storage unit 223, for which the face data reading process has not yet been completed (step S981), and the face data reading process is repeated (steps S971 to S979). Although this example describes executing the face data reading process for all content stored in the content storage unit 223, it can also be applied to executing the process only for desired content among the content stored in the content storage unit 223.

In this way, comparing the content image size in addition to the content update date and time makes it possible to detect inconsistencies more reliably.
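The header checks of FIG. 26 (steps S973 to S977) can be sketched as follows. This is a minimal illustration, not the implementation described in the specification: the class names, field names, the `Action` values, and the representation of the face data structure flag as a bit mask are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    SKIP = auto()        # proceed to step S980 (move on to the next content)
    REVALIDATE = auto()  # proceed to step S982 (check the re-detection setting)
    READ = auto()        # proceed to steps S978 and S979 (compute offset, read)


@dataclass
class Header:
    """Hypothetical in-memory view of header section 630."""
    metadata_version: str        # metadata version 632
    content_update_time: int     # content update date and time 633
    content_image_size: tuple    # content image size 638, as (width, height)
    face_data_count: int         # face data count 635
    structure_flags: int         # face data structure flag 660 (assumed bit mask)


@dataclass
class Content:
    """Hypothetical view of the moving image content file's own attributes."""
    update_time: int
    image_size: tuple


def check_header(header, content, supported_versions, required_flags):
    """Apply the checks of steps S973 to S977 in order."""
    if header.metadata_version not in supported_versions:       # S973
        return Action.SKIP
    if header.content_update_time != content.update_time:       # S974
        return Action.REVALIDATE
    if header.content_image_size != content.image_size:         # S975
        return Action.REVALIDATE
    if header.face_data_count == 0:                              # S976
        return Action.SKIP
    if header.structure_flags & required_flags != required_flags:  # S977
        return Action.SKIP
    return Action.READ
```

Ordering the cheap header comparisons before any face data access means that stale or unusable metadata is rejected without reading the face data section at all.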

FIG. 27 is a flowchart showing the procedure of the face data reading process performed by the imaging apparatus 100. This procedure detects inconsistencies using a checksum; it is the procedure shown in FIG. 26 with steps S983 and S984 performed in place of steps S974 and S975. Steps S983 and S984 are therefore described in detail, and the description of the other steps is omitted. The description here again refers to the header section 630 shown in FIG. 10.

Based on the face metadata version information recorded in the metadata version 632 of the header section 630 read in step S972, it is determined whether the face metadata is of a version that the imaging apparatus 100 supports (step S973). If it is of a supported version (step S973), a checksum is calculated from the image data of the corresponding moving image content file (step S983). Calculating the checksum over all of the corresponding image data is likely to take a long time. For this reason, a portion of the image data small enough not to interfere with recording and playback may be extracted, and the checksum may be calculated from that extracted portion. For example, the checksum can be calculated from the first 100 bytes of the image data. In that case, the checksum value recorded in the error detection code value 639 of the header section 630 has also been calculated from the first 100 bytes of the image data.

Next, it is determined whether the calculated checksum value matches the checksum value recorded in the error detection code value 639 of the header section 630 (step S984).

If the calculated checksum value matches the checksum value recorded in the error detection code value 639 of the header section 630 (step S984), the face metadata can be judged reliable, so the process proceeds to step S976. If the two values do not match (step S984), the process proceeds to step S982. The same procedure can be applied when a CRC or a hash value produced by a hash function is used as the error detection code value. Inconsistencies may also be detected by combining at least two of the checks shown in FIG. 26 and FIG. 27: the comparison of the content update date and time (step S974), the comparison of the content image size (step S975), and the comparison of checksums (steps S983 and S984).
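The checksum comparison of steps S983 and S984 can be sketched as follows. The specification does not state which checksum algorithm is used, so the byte sum modulo 2^32 below is an assumption chosen for simplicity, and the function names are hypothetical; the 100-byte prefix follows the example in the text.

```python
def prefix_checksum(image_data: bytes, length: int = 100) -> int:
    """Checksum over the first `length` bytes of the image data (step S983).

    The algorithm (sum of byte values modulo 2**32) is an assumption.
    """
    return sum(image_data[:length]) & 0xFFFFFFFF


def metadata_is_consistent(image_data: bytes, recorded_checksum: int,
                           length: int = 100) -> bool:
    """Compare against the value recorded in error detection code value 639
    (step S984). The recorded value must have been computed over the same
    prefix length when the metadata was written."""
    return prefix_checksum(image_data, length) == recorded_checksum
```

Because only a prefix is examined, a change confined to bytes after the prefix is not detected by this check alone, which is one reason to combine it with the update date and time and image size comparisons.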

Next, modifications of the embodiment of the present invention will be described in detail with reference to the drawings.

This example describes the case where the content file is a moving image content file, and a metadata entry containing the face metadata created from that file is recorded in the content management file 340 and is also recorded inside the moving image content file itself. In this example, the face metadata is recorded as additional information in the SEI NAL unit included in the AU (access unit) shown in FIG. 2.

As described with reference to FIG. 2, in the embodiment of the present invention, faces contained in a moving image stream encoded with MPEG4-AVC are detected at the timing at which an IDR AU or a non-IDR-I AU appears. Therefore, when a face is detected in the frame corresponding to an IDR AU, for example, the face metadata for the detected face is recorded as additional information in the SEI NAL unit included in that IDR AU. For example, as shown in FIG. 2, when a face is detected in the frame corresponding to the AU 180, the face metadata for the detected face is recorded as additional information in the SEI NAL unit 181 included in the AU 180. Likewise, when a face is detected in the frame corresponding to the AU 190, the face metadata for the detected face is recorded as additional information in the SEI NAL unit 191 included in the AU 190.

The face metadata recorded in the SEI NAL unit (hereinafter referred to as the SEI) is, for example, the face metadata 620 consisting of the header section 630 shown in FIG. 10 and the face data section 640 shown in FIG. 11. As described with reference to FIGS. 13 to 16 and elsewhere, the face data section 640 may contain only the necessary information.

The predetermined conditions that face data recorded in the SEI should satisfy are now described in detail with reference to FIG. 28. As described above, when face data values are recorded in the face data section of the content management file 340, the face data to be recorded is restricted, among the faces detected in one frame, according to predetermined conditions (for example, the size and position of the face, and whether the number of faces increased or decreased relative to the immediately preceding detection). In contrast, when face data is recorded in the SEI, face metadata for the faces detected in one frame is recorded as far as possible. That is, face data is recorded in the SEI under conditions that are more relaxed than the predetermined conditions used when recording face data in the content management file 340.

For example, an upper limit on the number of faces stored in the SEI is determined in advance, and the face metadata recorded in the SEI is restricted, based on the size, position, and so on of the detected faces, only when the number of detected faces exceeds that upper limit. An example of the face data recording method is described below with reference to FIG. 28.

FIG. 28 shows the relationship between the faces detected in frames 823 to 828 of a moving image content file and the face data 811 to 822 recorded in the face data section 640. In FIG. 28, the faces detected in frames 823 to 828 are shown enclosed in rectangular frames. One face is detected in each of frames 823 and 824, two faces are detected in each of frames 825 and 827, and three faces are detected in each of frames 826 and 828.

For example, even though the number of faces detected in frame 823 at detection time t1 is the same as the number detected in frame 824 at detection time t2, the face data of the faces detected in both frame 823 and frame 824 is recorded in the face data section 640 as long as the number of detected faces does not exceed the upper limit. Similarly, although the number of faces detected in frame 827 at detection time t5 is smaller than the number detected in frame 826 at detection time t4, the face data of the faces detected in both frame 826 and frame 827 is recorded in the face data section 640 as long as the number of detected faces does not exceed the upper limit.

In contrast, the predetermined condition used when recording face data in the content management file 340 is, for example, that when the number of faces detected in the frame at one detection time equals the number detected in the frame at the next detection time, the face data for the faces detected at the next detection time is not recorded in the face data section. The reason is that, since the number of detected faces is the same, metadata for the same faces would very likely be recorded again. Likewise, when fewer faces are detected in the frame at the next detection time than in the frame at the preceding detection time, the face data for the faces detected at the next detection time can also be left unrecorded.

For example, as shown in FIG. 28, the number of faces detected in frame 823 at detection time t1 is the same as the number detected in frame 824 at detection time t2, so when face data is recorded in the content management file 340, the face data of the faces detected in frame 824 at detection time t2 is not recorded in the face data section 640. In addition, the number of faces detected in frame 827 at detection time t5 is smaller than the number detected in frame 826 at detection time t4. The face data of the faces detected in frame 827 at detection time t5 is therefore not recorded in the face data section 640.
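The difference between the two recording policies can be sketched as follows, using the face counts of FIG. 28 (1, 1, 2, 3, 2, 3 for frames 823 to 828). This is an illustration under stated assumptions: the function names are hypothetical, faces are represented as dictionaries with a `size` key, and ranking by size alone when the upper limit is exceeded is a simplification of the "size, position, and so on" criterion in the text.

```python
def frames_for_management_file(face_counts):
    """Stricter policy for the content management file: record a frame's face
    data only when more faces were detected than at the preceding detection
    time. Returns the indices of the frames whose face data is recorded."""
    recorded = []
    previous = 0
    for index, count in enumerate(face_counts):
        if count > previous:
            recorded.append(index)
        previous = count  # always compare against the preceding detection
    return recorded


def faces_for_sei(detected_faces, upper_limit):
    """Relaxed policy for the SEI: record every detected face unless the
    count exceeds the upper limit, in which case keep the largest faces
    (ranking by size only is an assumption)."""
    if len(detected_faces) <= upper_limit:
        return list(detected_faces)
    ranked = sorted(detected_faces, key=lambda face: face["size"], reverse=True)
    return ranked[:upper_limit]
```

With the counts of FIG. 28, `frames_for_management_file([1, 1, 2, 3, 2, 3])` returns `[0, 2, 3, 5]`, that is, frames 823, 825, 826, and 828; frames 824 and 827 are skipped, as in the text. Under the relaxed policy all twelve detected faces are kept as long as no frame exceeds the upper limit, which is consistent with the twelve face data items 811 to 822.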

In this way, when face data is recorded in the SEI, whether to record face metadata is decided under conditions more relaxed than those used when recording face metadata in the content management file. As a result, even when a content file containing an SEI in which face data is recorded is moved from the recording device that recorded the face data to another device, the content file can support a wide range of applications on the destination device.

For example, when face metadata for detected faces is recorded according to the predetermined conditions of the recording device, the face metadata recorded under the conditions of that source device is not necessarily useful on the destination device. Therefore, so that a wide range of applications on the destination device can be supported, the conditions for recording face data in the SEI are relaxed and a relatively large amount of face data is recorded. This leaves the destination device room to select which face metadata to use.

However, when face metadata is recorded in both the content management file and the moving image stream, the same face metadata need not be recorded in each. For example, even when face detection time information is recorded in the content management file, it can be omitted from the SEI, because time information is already recorded in other NAL units of the AU that contains the SEI. This reduces the size of the face metadata. In addition, an AU in which a face was detected is normally an AU that can serve as an edit point. For this reason, the face detection time information retains its correct value even when part of the moving image stream is deleted. Furthermore, when the face metadata in the content management file is maintained during editing of the moving image stream, the time information recorded in the other NAL units of the AU that contains the SEI can be used.

Furthermore, in a recording device that has a content management file, recording face metadata inside the stream means that if the content management file is destroyed for some reason, the face metadata in the content management file can be rebuilt from the face metadata inside the stream. This rebuilds the face metadata of the content management file more quickly than repairing it by detecting faces again in every stream.

In a recording device that has no content management file, face metadata is recorded only in the SEI NAL units of predetermined AUs in the moving image stream. In this case, an application can run quickly by using the face metadata recorded inside the moving image stream. In contrast, when an application that uses face metadata is run on a moving image stream in which no face metadata is recorded, faces must first be detected in the stream, so the application may not be able to run quickly.

Next, an example is described in which the content file is a still image content file and the face metadata created from that file is recorded inside the still image content file without being recorded in the content management file 340.

FIG. 29 schematically shows the file structure of a still image file recorded according to the DCF (Design rule for Camera File system) standard. DCF is a file system standard for exchanging images between devices such as digital still cameras and printers via a recording medium. Building on Exif (Exchangeable image file format), it specifies file naming and folder structure for recording on a recording medium. Exif is a standard for adding image data and camera information to an image file, and it defines the format (file format) in which image files are recorded.

The still image file 800 is a still image file recorded according to the DCF standard and, as shown in FIG. 29(a), consists of attached information 801 and image information 802. The image information 802 is, for example, image data of a subject captured by the camera unit 110.

As shown in FIG. 29(b), the attached information 801 consists of attribute information 803 and a maker note 804. The attribute information 803 is attribute information about the still image file 800 and includes, for example, the shooting/update date and time, image size, color space information, and manufacturer name. The attribute information 803 also includes rotation information indicating whether the image is rotated (TAG ID = 274, Orientation). The device can be set not to record image rotation information as Exif information (that is, not to record rotation information in the tag). When this setting is in effect, "0" is recorded as an invalid value.

The maker note 804 is an area in which user-specific data is generally recorded, and it is an extension area in which each manufacturer can freely record information (TAG ID = 37500, MakerNote). In this example, as shown in FIG. 29(c), face metadata is recorded in the maker note 804. That is, the maker note 804 consists of a face metadata recording area 805, in which one or more items of face metadata such as the face metadata 807 are recorded, and a recording area 806, in which other proprietary metadata and the like is recorded. Thus, when face metadata is recorded in a still image file, it is recorded inside the maker note 804 defined by Exif.

The face metadata recorded in the maker note 804 is described next. It is, for example, the face metadata 620 consisting of the header section 630 shown in FIG. 10 and the face data section 640 shown in FIG. 11. As described with reference to FIGS. 13 to 16 and elsewhere, the face data section 640 may contain only the necessary information. For a still image, the time scale 635 is not needed among the items recorded in the header section 630, but "0" is recorded in the time scale 635 of a still image. The reason is that using the same amount of metadata for moving images and still images, rather than different amounts, allows the header section 630 to have a fixed length, which makes the data easier to access. In addition, recording different metadata for moving images and still images places a heavy load on the recording device as a system. Using the same kind of metadata when creating face metadata for faces detected in either a moving image or a still image therefore reduces that load.
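The benefit of keeping the header fixed-length for both content types can be sketched as follows. The byte layout is entirely hypothetical (a 4-byte time scale followed by a 2-byte face data count, big-endian); the specification's actual field widths and the remaining header fields are not reproduced here.

```python
import struct

# Hypothetical layout for illustration only: time scale (4 bytes) and
# face data count (2 bytes), big-endian. Real field widths are not given here.
HEADER_FORMAT = ">IH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def pack_header(time_scale: int, face_data_count: int) -> bytes:
    """Serialize the two illustrative header fields into a fixed-length block."""
    return struct.pack(HEADER_FORMAT, time_scale, face_data_count)


def pack_still_image_header(face_data_count: int) -> bytes:
    """A still image has no use for the time scale, so 0 is written instead of
    omitting the field; the header keeps the same length as for a moving image."""
    return pack_header(0, face_data_count)


def unpack_header(block: bytes):
    """The same parser works for both content types because the size is fixed."""
    return struct.unpack(HEADER_FORMAT, block[:HEADER_SIZE])
```

Because both headers have the same length, a reader can locate the face data section at a constant offset without first branching on whether the content is a moving image or a still image.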

FIG. 30 is a block diagram showing an example of the functional configuration of the imaging apparatus 100 in a modification of the embodiment of the present invention. The imaging apparatus 100 includes a content management file storage unit 210, a content input unit 211, a face detection unit 212, a face metadata creation unit 213, a virtual management information creation unit 214, a representative thumbnail image extraction unit 215, a content attribute information creation unit 216, a recording control unit 230, and a content storage unit 223. For the content management file storage unit 210, the content input unit 211, the recording control unit 230, and the content storage unit 223, the parts that differ from those shown in FIG. 17 are described in detail; the description of the other components is omitted.

The content management file storage unit 210 stores the content management file 340, which records hierarchical entries organized in a virtual hierarchical structure. In this modification, hierarchical entries for still images are not recorded in the content management file storage unit 210.

The content input unit 211 receives content files and outputs each received content file to the face detection unit 212, the face metadata creation unit 213, the virtual management information creation unit 214, the representative thumbnail image extraction unit 215, the content attribute information creation unit 216, and the recording control unit 230. Specifically, for a moving image, the frames captured by the camera unit 110 are input sequentially from the content input unit 211. For still images, the images captured by the camera unit 110 are input sequentially from the content input unit 211.

The recording control unit 230 records the face metadata created by the face metadata creation unit 213 in the content file corresponding to that face metadata. For a moving image content file, the recording control unit 230 records the face metadata created for each IDR picture or I picture in the SEI of the AU that contains the corresponding IDR picture or I picture. Furthermore, when face metadata created at predetermined intervals for a moving image content file is recorded in that file, the recording control unit 230 records it using recording conditions more relaxed than those used when recording face metadata in the content management file 340. The recording control unit 230 does not record face metadata for still images in the content management file 340.

The content storage unit 223 stores content files, such as moving images and still images, in which face metadata is recorded.

The environments in which still images and moving images are used are briefly described here.

In general, still images are often recorded on a recording medium and moved between devices, and they are considered more portable than moving images. When a still image is moved in this way, the destination device is very likely to be running commercially available image management application software that cannot interpret the content management file. For this reason, still images need not be managed with the content management file.

Also, for still images, there is a large amount of PC application software that can edit still image files on a PC. Much of this software does not correctly maintain camera information other than the Exif maker note (update date and time, rotation information, and so on) even when a still image has been trimmed or rotated. A still image file edited with such PC application software may be returned to the recording device on which the face was detected. In that case, for example, a process that cuts out a face from the still image using face data indicating the face position may fail to cut out the face correctly.

To avoid such cases as far as possible, image size information and the like can be used together with the update date and time information in the still image content, which increases the likelihood of detecting an inconsistency.

On the other hand, for moving images, at a stage where playback environments such as AVCHD (Advanced Video Codec High Definition) and BD (Blu-ray Disc) are not yet widely available, it is highly likely that a moving image can be played back only with the PC application software bundled with the imaging apparatus that captured it. For this reason, a user is highly likely to use PC application software that can interpret the content management file, and moving image content is therefore managed by the content management file in view of advantages such as accessibility to metadata. The metadata of the moving image content is also recorded in the content management file.

When a moving image file is edited, as described above, if few editing applications support the moving image format, the update date and time information recorded in the proprietary content management file and in the moving image file is highly likely to be reliably maintained by the PC application software that supports those proprietary files.

As described above, because still images and moving images are used in different environments, in this modification a moving image content file and the metadata detected from it (not limited to face metadata) are managed by the content management file. A still image content file, on the other hand, is not managed by the content management file but by the ordinary file system, and the metadata relating to the still image content file is recorded inside the still image file itself (that is, in the Exif maker note).
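As a minimal sketch of the recording policy described above, the destinations for metadata could be selected by content type as follows. The names `ContentType` and `metadata_destinations`, and the destination labels, are hypothetical and are introduced here only for illustration.

```python
from enum import Enum
from typing import List


class ContentType(Enum):
    STILL_IMAGE = "still_image"
    MOVING_IMAGE = "moving_image"


def metadata_destinations(content_type: ContentType) -> List[str]:
    """Return hypothetical labels for where metadata would be recorded."""
    if content_type is ContentType.MOVING_IMAGE:
        # Moving images: managed by the content management file; face
        # metadata is also recorded in the SEI of the moving image file.
        return ["content_management_file", "sei_in_moving_image_file"]
    # Still images: kept out of the content management file and recorded
    # only inside the still image file (Exif maker note).
    return ["exif_maker_note"]
```

For a still image the function returns only the maker note label, reflecting that a still image file must remain self-describing when it is moved to a device that cannot interpret the content management file.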

Next, face data reading processing in the modification of the embodiment of the present invention will be described in detail with reference to the drawings.

FIG. 31 and FIG. 32 are flowcharts showing the processing procedure of face data reading processing by the imaging apparatus 100. This example shows a procedure that, for a still image whose face metadata is recorded in the maker note 804, detects an inconsistency between the still image and the metadata using the content update date and time, the content image size, and the content rotation information. This procedure is the procedure shown in FIG. 26 with the processing of step S985 added between steps S975 and S976. Therefore, step S985 is described in detail and descriptions of the other processing are omitted. The description here refers to the header section 630 shown in FIG. 10.

Based on the face metadata version information recorded in the metadata version 632 of the header section 630 read in step S972, it is determined whether the face metadata is of a version that the imaging apparatus 100 can handle (step S973). If the version can be handled (step S973), and either the update date and time of the corresponding still image content file differs from the update date and time recorded in the content update date and time 633 of the header section 630 (step S974), or the image size of the corresponding still image content file differs from the image size recorded in the content image size 638 of the header section 630 (step S975), the image of the corresponding still image content file has most likely undergone processing such as trimming or resolution conversion. The process therefore proceeds to step S982, and the above-described processing is repeated.

On the other hand, if the update date and time of the corresponding still image content file is the same as that recorded in the content update date and time 633 of the header section 630 (step S974), and the image size of the corresponding still image content file is the same as that recorded in the content image size 638 of the header section 630 (step S975), it is determined whether rotation information for the corresponding still image content file exists and contains no invalid value (step S985). If the rotation information exists and no invalid value is recorded in it (step S985), the process proceeds to step S976.

On the other hand, if no rotation information exists for the corresponding still image content file, or if an invalid value is recorded in it (step S985), the image has most likely been rotated, so the process proceeds to step S982 and the above-described processing is repeated. In this way, the likelihood of detecting an inconsistency can be increased by taking into account image rotation, trimming, resolution conversion, and other operations that are used relatively often when editing still image files. Note that an inconsistency may be detected by combining at least two of the following, shown in FIG. 26, FIG. 27, and FIG. 31: the comparison of content update dates and times, the comparison of content image sizes, the comparison of checksums, and the confirmation of rotation information.
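The checks in steps S973, S974, S975, and S985 can be sketched as follows. This is only an illustration under stated assumptions: the class and field names, the set of supported versions, and the set of valid rotation values are hypothetical, and an unsupported version is simply treated here as making the face metadata unusable.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SUPPORTED_VERSIONS = {1}             # assumption: versions this reader handles
VALID_ROTATIONS = {0, 90, 180, 270}  # assumption: values regarded as valid


@dataclass
class FaceMetadataHeader:
    metadata_version: int                # cf. metadata version 632
    content_update_time: int             # cf. content update date and time 633
    content_image_size: Tuple[int, int]  # cf. content image size 638


@dataclass
class StillImageFile:
    update_time: int
    image_size: Tuple[int, int]
    rotation: Optional[int]  # None when no rotation information exists


def face_metadata_is_usable(header: FaceMetadataHeader,
                            image: StillImageFile) -> bool:
    """Return True only if no inconsistency is detected."""
    if header.metadata_version not in SUPPORTED_VERSIONS:   # step S973
        return False
    if image.update_time != header.content_update_time:     # step S974
        return False
    if image.image_size != header.content_image_size:       # step S975
        return False
    # Step S985: rotation information must exist and hold a valid value.
    if image.rotation is None or image.rotation not in VALID_ROTATIONS:
        return False
    return True
```

A caller would use the face data only when the function returns True, and would otherwise skip it, which corresponds to proceeding to step S982.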

Next, an example of executing an application that uses face metadata will be described in detail with reference to the drawings.

FIG. 33 is a diagram showing a display example when a slide show of still image content files is executed. FIG. 33(a) shows a state in which an image including a face 851 is displayed on the display unit 850. For the face 851, face data is recorded in the maker note in the still image file, and the region 852 containing the face 851 can be identified from this face data.

Conventionally, when a single image is displayed in a slide show, a transition effect such as the following is used: the image is split into upper and lower parts at its middle, the upper part enters from the right side of the screen while the lower part enters from the left side, and the single image is thereby reproduced.

For example, when the image shown in FIG. 33(a) is displayed in a slide show with this transition effect, the image is split into upper and lower parts at the middle indicated by the dotted line 853 in FIG. 33(a). As shown in FIG. 33(b), the upper part is moved progressively in the direction of arrow 855 and the lower part in the direction of arrow 856 until the entire image is displayed. However, when the image is split at the dotted line 853, the face 851 in the image is divided between the upper and lower parts, so the whole face 851 cannot be viewed until the two parts have been joined.

Therefore, when an image including a face is displayed in a slide show with a transition effect, the position of the face can be determined before the transition is applied, based on the basic face information included in the face metadata recorded in the maker note, and the position at which the image is split into upper and lower parts can be adjusted accordingly. This prevents the face 851 contained in the region 852 from being split. For example, the image can be split at the position indicated by the dotted line 854 in FIG. 33(a) so that the face 851 contained in the region 852 is not split. As a result, as shown in FIG. 33(c), the whole face 851 can be viewed even while the upper and lower parts are moving.
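One way to adjust the split position is sketched below. It is a simplified illustration that assumes a single face region given as a range of pixel rows. The function name `choose_split_row` and the rule of moving the split to the nearer edge of the face region are assumptions and are not taken from the embodiment.

```python
def choose_split_row(image_height: int, face_top: int, face_bottom: int) -> int:
    """Return a row at which to split the image without cutting the face.

    The face region is assumed to occupy rows face_top (inclusive) to
    face_bottom (exclusive). Splitting at row r yields an upper part with
    rows [0, r) and a lower part with rows [r, image_height).
    """
    middle = image_height // 2
    # The conventional middle split is kept when it does not cut the face.
    if not (face_top < middle < face_bottom):
        return middle
    # Otherwise move the split to the nearer edge of the face region.
    if middle - face_top <= face_bottom - middle:
        return face_top
    return face_bottom
```

For an image 480 rows high with a face occupying rows 200 to 300, the middle row 240 would cut the face, so the split moves to row 200 and the face stays entirely in the lower part.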

Also, an image for which face data is recorded in the maker note, as shown in FIG. 33(a), can be displayed in a slide show with a transition effect different from the one described above. For example, for an image that includes a face, a transition effect that does not split the face can be used, such as zooming in on the face and then returning to the normal size. By switching transitions between images that include a face and images that do not, a slide show that displays the faces in the images effectively can be executed.

Next, an example in which a playback device such as a video player uses face metadata added to image data captured by a recording device such as a digital still camera or digital video camera will be described in detail with reference to the drawings.

FIG. 34 is a diagram showing an image recording device 830 and an image reproduction device 834 to which a removable recording medium 831 can be connected. Here, an example of using face metadata when the face metadata is contained in the content file is described. The configurations of the image recording device 830 and the image reproduction device 834 are substantially the same as that of the imaging apparatus 100 shown in FIG. 17, FIG. 18, and FIG. 30.

As shown in FIG. 34(a), a subject is imaged with the recording medium 831 connected to the image recording device 830, and the captured image data and the face metadata created from it are recorded on the recording medium 831 as a content file 832. To play back the content file 832 on the image reproduction device 834, the recording medium 831 is removed from the image recording device 830 as shown in FIG. 34(b) and connected to the image reproduction device 834 as shown in FIG. 34(c), so that the content file 832 recorded on the recording medium 831 can be input to the image reproduction device 834 and played back.

Because the image reproduction device 834 can use the metadata added by the image recording device 830 in this way, it can perform playback that uses face metadata even if it has no face detection function. This makes advanced playback applications possible even on devices, such as mobile devices, that cannot bear a heavy playback load. Also, even on a playback device that has a face detection function, faces need not be searched for at playback time, so the time required for playback processing can be greatly reduced.

FIG. 35 is a system configuration diagram showing an outline of an image processing system 860 composed of an image recording device 870 and an image reproduction device 880. The image recording device 870 and the image reproduction device 880 are connected by an inter-device interface such as a USB cable.

The image recording device 870 is an image recording device such as a digital still camera or a digital video camera. It stores captured image data as a content file in the content file storage unit 872 and records face metadata relating to this content file in the content management file 871.

The image reproduction device 880 includes a transmission request output unit 881, a reproduction control unit 882, and a display unit 883. It reads a content file stored in the content file storage unit 872 of the image recording device 870, to which it is connected by the inter-device interface, and plays back the read content file by displaying it on the display unit 883. Because the configuration of the image recording device 870 is substantially the same as that of the imaging apparatus 100 shown in FIG. 17, FIG. 18, and FIG. 30, illustration and description of its other components are omitted.

The transmission request output unit 881 outputs, to the signal line 884, a transmission request for extracting desired metadata from the metadata included in the metadata entries recorded in the content management file 871 of the image recording device 870. In response to the transmission request output to the signal line 884, the desired metadata is extracted from the metadata included in the metadata entries recorded in the content management file 871, and the content file recorded in the content file storage unit 872 is extracted based on the virtual management information included in the file entry recorded in the layer above the metadata entry that contains the extracted metadata. The metadata extracted from the content management file 871 is then output to the signal line 885, and the content file extracted from the content file storage unit 872 is output to the signal line 886.

The reproduction control unit 882 uses the metadata output from the content management file 871 to the signal line 885 to control playback of the content file that is output from the content file storage unit 872 to the signal line 886 and displayed on the display unit 883.

In this way, the image reproduction device 880 reads the content management file 871 recorded in the image recording device 870, extracts the necessary metadata from it, and uses the extracted metadata when playing back a content file. Thus, for example, as described with reference to FIG. 33, a content file stored in the content file storage unit 872 can be displayed on the display unit 883 using the metadata in the content management file 871 recorded in the image recording device 870.
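The handling of a transmission request on the recording device side can be sketched roughly as follows. The data model is an assumption made for illustration: `FileEntry.content_path` stands in for the virtual management information, a dictionary stands in for the content file storage unit, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class FileEntry:
    content_path: str  # stand-in for the virtual management information


@dataclass
class MetadataEntry:
    parent: FileEntry  # file entry in the layer above this metadata entry
    metadata: dict


def handle_transmission_request(
    entries: List[MetadataEntry],
    wanted: Callable[[dict], bool],
    storage: Dict[str, bytes],
) -> List[Tuple[dict, bytes]]:
    """Return (metadata, content file) pairs for the requested metadata."""
    results = []
    for entry in entries:
        if wanted(entry.metadata):
            # Follow the parent file entry to locate the content file.
            content = storage[entry.parent.content_path]
            results.append((entry.metadata, content))
    return results
```

In this sketch the returned metadata and content correspond to what would be output to the playback side, which can then use the metadata to control how the content is displayed.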

Although an example using an inter-device interface such as a USB cable to connect the image recording device 870 and the image reproduction device 880 has been described here, other connection means, such as a network using a wired or wireless line, may be used.

As described above, according to the embodiment of the present invention, desired metadata can be retrieved quickly, and the corresponding content file can be found quickly. A desired application can therefore be executed quickly. In addition, metadata relating to content files can be used quickly.

Many applications that use face metadata have already been developed, and such applications are expected to diversify further. The format of face metadata is therefore expected to be extended in the future. Even when the face metadata format is extended, according to the embodiment of the present invention, playback devices can remain compatible with the extension, so metadata relating to content files can be used quickly.

Thus, according to the embodiment of the present invention, content files can be used quickly.

In the embodiment of the present invention, face metadata relating to human faces has been described as an example of metadata, but the embodiment can also be applied to other metadata. For example, the embodiment can be applied to metadata corresponding to information on the face or the like of an animal detected in an image using an animal or pet recognition algorithm. For example, a pet detection engine can be provided in place of the face detection engine, and the embodiment can be applied using metadata relating to pets detected by this pet detection engine. The embodiment can also be applied to metadata in which the behavior of a person, animal, or the like is recognized and recorded in a predetermined description. Also, although an imaging apparatus has been described as an example of the content recording device, the embodiment can be applied to other content recording devices that record content files, such as portable terminal devices. Likewise, although an imaging apparatus has been described as an example of the content reproduction device, the embodiment can be applied to other content reproduction devices that play back content, such as DVD (Digital Versatile Disc) recorders.

The embodiment of the present invention is one example of embodying the present invention and corresponds to the invention-specifying matters in the claims as shown below. However, the present invention is not limited to this, and various modifications can be made without departing from the gist of the present invention.

That is, in claim 1, the image reproduction system corresponds to, for example, the image processing system 860. In claims 2 to 12, the face data recording device corresponds to, for example, the imaging apparatus 100. In claims 13 to 20, the reproduction device corresponds to, for example, the imaging apparatus 100. In claim 21, the imaging apparatus corresponds to, for example, the imaging apparatus 100.

In claims 1, 2, 8, 10, and 12, the image input unit corresponds to, for example, the content input unit 211.

In claims 1, 2, 8, 10, 15, 21, and 22, the face detection unit corresponds to, for example, the face detection unit 212.

In claims 1, 2, 5, 9, 11, 12, 21, and 22, the first control unit corresponds to, for example, the face metadata creation unit 213 and the recording control units 217 and 230.

In claims 1, 2, 13, 14, 17, 19, 21, and 22, the comparison unit corresponds to, for example, the extraction unit 225.

In claim 1, the second control unit corresponds to, for example, the extraction unit 225 and the drawing unit 226.

In claims 2, 21, and 22, the second control unit corresponds to, for example, the CPU 141 and the recording control units 217 and 230.

In claim 13, the input unit corresponds to, for example, the extraction unit 225.

In claims 13, 15, and 18 to 20, the control unit corresponds to, for example, the extraction unit 225 and the drawing unit 226.

In claim 16, the search unit corresponds to, for example, the extraction unit 225.

In claim 19, the error detection code value calculation unit corresponds to, for example, the extraction unit 225.

In claim 21, the imaging unit corresponds to, for example, the camera unit 110.

In claims 22 and 23, the face detection procedure corresponds to, for example, step S903. The first control procedure corresponds to, for example, step S905, step S908, and steps S912 to S914. The comparison procedure corresponds to, for example, steps S974 and S975. The second control procedure corresponds to, for example, step S900.

The processing procedures described in the embodiment of the present invention may be regarded as a method having this series of procedures, as a program for causing a computer to execute this series of procedures, or as a recording medium storing that program.

FIG. 1 is a block diagram showing the configuration of the imaging apparatus 100.
FIG. 2 is a diagram schematically showing a predetermined frame of a video signal in which image data captured by the imaging apparatus 100 is encoded with MPEG4-AVC.
FIG. 3 is a diagram schematically showing the file structure of real files registered on the file system.
FIG. 4 is a diagram showing a configuration example of the virtual folders and virtual files managed by the property file 400.
FIG. 5 is a diagram schematically showing the relationship between the property file 400 and thumbnail file 500 and the moving image content files 311 to 316.
FIG. 6 is a diagram schematically showing the parent-child relationships among the moving image folder entry 410, the date folder entry 411, moving image file entries, and metadata entries.
FIG. 7 is a diagram showing an example of the basic structure of the property file 400.
FIG. 8 is a diagram schematically showing the overall structure of the property file 400.
FIG. 9 is a diagram schematically showing the internal configuration of the metadata entry 600.
FIG. 10 is a diagram showing an outline of the various information stored in the header section 630.
FIG. 11 is a diagram showing an outline of the face data stored in the face data section 640.
FIG. 12 is a diagram showing the data structure of the face data structure flag 660 in the header section 630.
FIG. 13 is a diagram showing the relationship between bits stored in the face data structure flag 660 and face data stored in the face data section 640.
FIG. 14 is a diagram showing the relationship between bits stored in the face data structure flag 660 and face data stored in the face data section 640.
FIG. 15 is a diagram showing the relationship between bits stored in the face data structure flag 660 and face data stored in the face data section 640.
FIG. 16 is a diagram showing the relationship between bits stored in the face data structure flag 660 and face data stored in the face data section 640.
FIG. 17 is a block diagram showing a functional configuration example of the imaging apparatus 100.
FIG. 18 is a block diagram showing a functional configuration example of the imaging apparatus 100.
FIG. 19 is a diagram schematically showing the relationship among the moving image file entry 414, the metadata entry 415, the thumbnail file 500, and the moving image content file 312.
FIG. 20 is a diagram showing an example of an application that uses the content management file.
FIG. 21 is a flowchart showing the processing procedure of recording processing of the property file 400 by the imaging apparatus 100.
FIG. 22 is a flowchart showing the processing procedure of playback processing of a moving image content file by the imaging apparatus 100.
FIG. 23 is a flowchart showing the processing procedure of playback processing of a moving image content file by the imaging apparatus 100.
FIG. 24 is a flowchart showing the processing procedure of playback processing of a moving image content file by the imaging apparatus 100.
FIG. 25 is a diagram schematically showing the configuration of the face metadata 620 included in the metadata entry 600.
FIG. 26 is a flowchart showing the processing procedure of face data reading processing by the imaging apparatus 100.
FIG. 27 is a flowchart showing the processing procedure of face data reading processing by the imaging apparatus 100.
FIG. 28 is a diagram showing the relationship between the faces detected in frames 823 to 828 and the face data 811 to 822.
FIG. 29 is a diagram showing an outline of the file structure of a still image file recorded according to the DCF standard.
FIG. 30 is a block diagram showing a functional configuration example of the imaging apparatus 100 in the modification of the embodiment of the present invention.
FIG. 31 is a flowchart showing the processing procedure of face data reading processing by the imaging apparatus 100.
FIG. 32 is a flowchart showing the processing procedure of face data reading processing by the imaging apparatus 100.
FIG. 33 is a diagram showing a display example when a slide show of still image content files is executed.
FIG. 34 is a diagram showing an image recording device 830 and an image reproduction device 834 to which a removable recording medium 831 can be connected.
FIG. 35 is a system configuration diagram showing an outline of an image processing system 860 composed of an image recording device 870 and an image reproduction device 880.

Explanation of symbols

100 Imaging apparatus
110 Camera unit
111 Optical block
112 CCD
113 Pre-processing circuit
114 Optical block driver
115 CCD driver
116 Timing generation circuit
120 Camera DSP
121 SDRAM
130 Control unit
140 Operation unit
141 CPU
142 RAM
143 Flash ROM
144 Clock circuit
145 System bus
150 Medium I/F
161 LCD controller
162 LCD
163 External I/F
164 Communication I/F
170 Recording medium
210 Content management file storage unit
211 Content input unit
212 Face detection unit
213 Face metadata creation unit
214 Virtual management information creation unit
215 Representative thumbnail image extraction unit
216 Content attribute information creation unit
217 Recording control unit
218 Face data creation unit
219 Header information creation unit
221 Operation reception unit
223 Content storage unit
224 Selection unit
225 Extraction unit
226 Drawing unit
227 Display unit

Claims

An image reproducing system comprising:
an image input unit that inputs an image;
a face detection unit that detects a face of a subject included in the input image;
a first control unit that, based on the detection result of the face detection unit, creates face data relating to the detected face and composed of a plurality of pieces of element information, and face data management information for managing the face data, the face data management information having data structure information, which is composed of a bit string assigned in correspondence with the recording order of the plurality of pieces of element information and records the presence or absence of the plurality of pieces of element information, and attribute information about the input image at the time the face was detected, and that records the face data and the face data management information on a recording medium;
a comparison unit that compares the attribute information about the input image with the attribute information included in the face data management information; and
a second control unit that, when the attribute information compared by the comparison unit matches, confirms the presence or absence of the element information constituting the face data based on the data structure information, calculates a recording offset value, from the beginning of the face data, of one piece of element information among the plurality of pieces of element information, reads the one piece of element information from the element information constituting the face data based on the calculated recording offset value, and reproduces the input image using the one piece of element information.

A face data recording apparatus comprising:
an image input unit that inputs an image;
a face detection unit that detects a face of a subject included in the input image;
a first control unit that, based on the detection result of the face detection unit, creates face data relating to the detected face and composed of a plurality of pieces of element information, and face data management information for managing the face data, the face data management information having data structure information, which is composed of a bit string assigned in correspondence with the recording order of the plurality of pieces of element information and records the presence or absence of the plurality of pieces of element information, and attribute information about the input image at the time the face was detected, and that records the face data and the face data management information on a recording medium;
a comparison unit that compares the attribute information about the input image with the attribute information included in the face data management information; and
a second control unit that causes the face detection unit to detect the face of the subject included in an image for which the attribute information compared by the comparison unit is determined not to match, creates the face data and the face data management information based on the detection result, and records the created face data and face data management information on the recording medium.

The face data recording apparatus according to claim 2, wherein
the data structure information is a data structure having a consecutively assigned bit string in which a predetermined flag is assigned, in accordance with the recording order, to each piece of element information recorded in the recording order, and
the flag indicates the presence or absence, in the face data, of the element information corresponding to the flag.

The face data recording apparatus according to claim 2, wherein the data structure information has a reserved bit string for assigning extended face data other than the element information.
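
The presence flags and the recording offset recited in the claims above can be illustrated with a minimal sketch. The element names follow the claims, but the element sizes, the choice of the least significant bit for the first element, and the function names `element_offset` and `read_element` are assumptions made for illustration only; they are not taken from the specification.

```python
# Hypothetical element layout, listed in recording order: (name, size in bytes).
# The sizes and the bit assignment (bit 0 = first element) are assumptions.
ELEMENTS = [
    ("detection_time", 4),
    ("position_and_size", 4),
    ("face_score", 1),
    ("smile_score", 1),
    ("importance", 1),
]


def element_offset(flags, name):
    """Return the byte offset of `name` from the start of one face data
    record, or None if the flag for that element is 0 (element absent)."""
    offset = 0
    for bit, (elem, size) in enumerate(ELEMENTS):
        present = (flags >> bit) & 1
        if elem == name:
            return offset if present else None
        if present:
            # Only elements that are actually recorded occupy space.
            offset += size
    raise KeyError(name)


def read_element(face_data, flags, name):
    """Read one element from a face data record using the computed offset."""
    offset = element_offset(flags, name)
    if offset is None:
        return None
    size = dict(ELEMENTS)[name]
    return face_data[offset:offset + size]


if __name__ == "__main__":
    # Flags 0b01101: detection_time, face_score and smile_score are present.
    record = bytes(range(6))
    print(element_offset(0b01101, "smile_score"))
    print(read_element(record, 0b01101, "smile_score"))
```

Because absent elements contribute nothing to the offset, a reader that only needs one element can skip directly to it without parsing the others, and unassigned high-order bits remain available as a reserved area.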

The face data recording apparatus according to claim 2, wherein the first control unit does not create face data for any face, among the faces detected by the face detection unit, that does not satisfy a predetermined condition.

The face data recording apparatus according to claim 2, wherein the face data management information includes data capacity information indicating the data capacity of the corresponding face data and version information indicating the version of the face data.

The face data recording apparatus according to claim 2, wherein the face data includes at least one of the following for the face detected by the face detection unit in the input image: its position, its size, a face score indicating the likelihood of being a face, a smile score indicating the degree of smiling, its detection time, and its importance.

The face data recording apparatus according to claim 2, wherein
the image input unit inputs a moving image as the image, and
the face detection unit detects faces included in the moving image at predetermined intervals.

The face data recording apparatus according to claim 8, wherein the first control unit records the face data and the face data management information relating to the detected face in a moving image file corresponding to the moving image in which the face was detected.

The face data recording apparatus according to claim 2, wherein
the image input unit inputs, as the image, a moving image encoded with the AVC codec, and
the face detection unit detects a face in an IDR picture or an I picture included in an AU to which an SPS is added.

The face data recording apparatus according to claim 10, wherein the first control unit records the face data and the face data management information relating to the detected face in the SEI in the AU containing the IDR picture or I picture in which the face was detected.

The face data recording apparatus according to claim 2, wherein
the image input unit inputs a still image as the image, and
the first control unit records the face data and the face data management information relating to the detected face in a still image file corresponding to the still image in which the face was detected.

A playback apparatus comprising:
an input unit that inputs face data relating to a face included in an image and composed of a plurality of pieces of element information, and face data management information for managing the face data, the face data management information having data structure information, which is composed of a bit string assigned in correspondence with the recording order of the plurality of pieces of element information and records the presence or absence of the plurality of pieces of element information, and attribute information about the image at the time the face was detected;
a comparison unit that compares the attribute information about the image with the attribute information included in the face data management information; and
a control unit that, when the attribute information compared by the comparison unit matches, confirms the presence or absence of the element information constituting the face data based on the data structure information, calculates a recording offset value, from the beginning of the face data, of one piece of element information among the plurality of pieces of element information, reads the one piece of element information from the element information constituting the face data based on the calculated recording offset value, and reproduces the image using the one piece of element information.

The playback apparatus according to claim 13, wherein
the attribute information includes an update date and time indicating the date and time when the image corresponding to the attribute information was updated,
the face data management information includes, as the attribute information, an update date and time indicating the date and time when the image was updated as of the time the corresponding face was detected, and
the comparison unit compares the update date and time included in the attribute information about the image with the update date and time included in the face data management information.

The playback apparatus according to claim 13, further comprising
a face detection unit that detects the face of the subject included in an image for which the attribute information compared by the comparison unit is determined not to match, wherein
the control unit creates the face data and the face data management information based on the detection result of the face detection unit, and records the created face data and face data management information on a recording medium.

The playback apparatus according to claim 13, further comprising a search unit that, when the attribute information compared by the comparison unit is determined not to match, searches for face data and face data management information corresponding to an image different from the image determined not to match.

The playback apparatus according to claim 13, wherein
the attribute information includes an image size indicating the size of the image corresponding to the attribute information,
the face data management information includes, as the attribute information, the image size of the image at the time the corresponding face was detected, and
the comparison unit compares the image size included in the attribute information about the image with the image size included in the face data management information.
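
The attribute comparisons recited in the claims above (update date and time, image size) can be sketched as a simple consistency check between the current image and the attributes stored with the face data. The `Attributes` type, its field names, the integer timestamp representation, and the function `face_data_is_consistent` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attributes:
    # Hypothetical representation: timestamp as an integer, size in pixels.
    update_time: int
    width: int
    height: int


def face_data_is_consistent(image_attrs, recorded_attrs):
    """Return True when the attributes recorded at face detection time
    still match the current image, so the stored face data can be used."""
    same_time = image_attrs.update_time == recorded_attrs.update_time
    same_size = (image_attrs.width, image_attrs.height) == (
        recorded_attrs.width,
        recorded_attrs.height,
    )
    return same_time and same_size


if __name__ == "__main__":
    recorded = Attributes(update_time=1179792000, width=1920, height=1080)
    edited = Attributes(update_time=1179800000, width=1280, height=720)
    print(face_data_is_consistent(recorded, recorded))
    print(face_data_is_consistent(edited, recorded))
```

A mismatch suggests the image was edited after detection, in which case the stored face positions may no longer be valid and detection would need to be run again, as in the claims that recite re-detection.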

The playback apparatus according to claim 13, wherein
the attribute information includes rotation information about the image corresponding to the attribute information, and
when the attribute information compared by the comparison unit matches, the control unit checks whether the rotation information is present in the attribute information about the image and whether the rotation information is an invalid value, and reads the one piece of element information for face data relating to a face included in an image for which the rotation information is confirmed to be present and not to be an invalid value.

The playback apparatus according to claim 13, further comprising
an error detection code value calculation unit that calculates an error detection code value based on at least a part of the image data corresponding to the image, wherein
the face data management information includes an error detection code value obtained from the corresponding image,
the comparison unit compares the calculated error detection code value for the image with the error detection code value included in the face data management information corresponding to the image, and
the control unit reads the one piece of element information for face data relating to a face included in an image for which the error detection code values are determined to match.
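
As an illustration of the error detection code comparison recited above, the following minimal sketch uses CRC-32 from Python's standard `zlib` module. The claim only recites "an error detection code value"; the choice of CRC-32, the 4096-byte prefix, and the function names `checksum` and `checksum_matches` are assumptions.

```python
import zlib


def checksum(image_data, prefix_len=4096):
    """Compute a CRC-32 over the first `prefix_len` bytes of the image data.

    Using only part of the data mirrors "at least a part of the image data"
    in the claim; the prefix length here is an arbitrary assumption.
    """
    return zlib.crc32(image_data[:prefix_len])


def checksum_matches(image_data, recorded_value, prefix_len=4096):
    """Return True when the image still yields the value stored alongside
    the face data, meaning the stored face data may be read."""
    return checksum(image_data, prefix_len) == recorded_value


if __name__ == "__main__":
    original = b"example image bytes" * 10
    stored = checksum(original)            # recorded at face detection time
    modified = b"X" + original[1:]         # image changed afterwards
    print(checksum_matches(original, stored))
    print(checksum_matches(modified, stored))
```

Compared with a timestamp or size check, a code computed from the image data itself can catch edits that leave the file metadata unchanged, at the cost of reading part of the image.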

The playback apparatus according to claim 13, wherein
the face data management information includes version information indicating the version of the face data, and
the control unit determines, based on the version information included in the face data management information, whether the face data corresponding to the face data management information can be handled, and reads the one piece of element information for face data determined to be capable of being handled.

An imaging apparatus comprising:
an imaging unit that captures an image of a subject;
a face detection unit that detects a face of the subject included in the captured image;
a first control unit that, based on the detection result of the face detection unit, creates face data relating to the detected face and composed of a plurality of pieces of element information, and face data management information for managing the face data, the face data management information having data structure information, which is composed of a bit string assigned in correspondence with the recording order of the plurality of pieces of element information and records the presence or absence of the plurality of pieces of element information, and attribute information about the captured image at the time the face was detected, and that records the face data and the face data management information on a recording medium;
a comparison unit that compares the attribute information about the captured image with the attribute information included in the face data management information; and
a second control unit that causes the face detection unit to detect the face of the subject included in an image for which the attribute information compared by the comparison unit is determined not to match, creates the face data and the face data management information based on the detection result, and records the created face data and face data management information on the recording medium.

A face data recording method comprising:
a face detection procedure in which a face detection unit detects a face of a subject included in an input image;
a first control procedure in which a first control unit, based on the detection result in the face detection procedure, creates face data relating to the detected face and composed of a plurality of pieces of element information, and face data management information for managing the face data, the face data management information having data structure information, which is composed of a bit string assigned in correspondence with the recording order of the plurality of pieces of element information and records the presence or absence of the plurality of pieces of element information, and attribute information about the input image at the time the face was detected, and records the face data and the face data management information on a recording medium;
a comparison procedure in which a comparison unit compares the attribute information about the input image with the attribute information included in the face data management information; and
a second control procedure in which a second control unit causes the face detection unit to detect the face of the subject included in an image for which the attribute information compared in the comparison procedure is determined not to match, creates the face data and the face data management information based on the detection result, and records the created face data and face data management information on the recording medium.

A program that causes a computer to execute:
a face detection procedure of detecting a face of a subject included in an input image;
a first control procedure of creating, based on the detection result in the face detection procedure, face data relating to the detected face and composed of a plurality of pieces of element information, and face data management information for managing the face data, the face data management information having data structure information, which is composed of a bit string assigned in correspondence with the recording order of the plurality of pieces of element information and records the presence or absence of the plurality of pieces of element information, and attribute information about the input image at the time the face was detected, and recording the face data and the face data management information on a recording medium;
a comparison procedure of comparing the attribute information about the input image with the attribute information included in the face data management information; and
a second control procedure of causing a face detection unit to detect the face of the subject included in an image for which the attribute information compared in the comparison procedure is determined not to match, creating the face data and the face data management information based on the detection result, and recording the created face data and face data management information on the recording medium.