JP2011076551A - Electronic apparatus, access control method and program

Info

Publication number: JP2011076551A
Application number: JP2009229999A
Authority: JP
Inventors: Hiroshi Okada; 浩岡田; Masahiko Enari; 正彦江成; Noboru Murabayashi; 昇村林
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2009-10-01
Filing date: 2009-10-01
Publication date: 2011-04-14

Abstract

PROBLEM TO BE SOLVED: To readily authenticate access rights to content that was not created by the user of the accessing apparatus.

SOLUTION: A CPU 12 of a PVR 100 accepts registration of face images of the users of respective apparatuses from those apparatuses over a network 50, and detects face images from respective contents on the network 50. When a registered face image and a face image detected from a content coincide, the CPU 12 generates access permission data that permits access to that content.

COPYRIGHT: (C) 2011, JPO & INPIT

Description

The present invention relates to an electronic device capable of transmitting content to another device on a network in response to a reproduction request from that device, to an access control method for the electronic device, and to a program.

Conventionally, in a home AV (Audio/Visual) network system, a device in one room can be used to play back content stored in a device in another room without the user having to be aware of where the content is stored.

Beyond content stored in devices within the same home, systems are also conceivable in which a device in one home accesses and plays back content held by a device in another home. In such a system, the users of the devices usually differ from home to home, so authenticating the authority to access content becomes important.

As a technique for authenticating access authority to content, Patent Document 1 below, for example, describes using a user ID and a password to authenticate access from a user terminal to content that is owned by the user of that terminal and stored in a server.

Patent Document 1: JP 2007-317178 A

However, when access is authenticated with a user ID and a password as described in Patent Document 1, the user has to keep track of the user ID and password, and loses access if they are forgotten. In addition, the user has to go through the cumbersome step of entering the user ID and password on every access.

Furthermore, with the technique described in Patent Document 1, content cannot be fully played back unless the user created (recorded) it and uploaded it to the server. Therefore, even when an acquaintance of the user has uploaded content that could naturally be shared with the user, the user must go through cumbersome steps to play it back, such as asking the acquaintance for a user ID and password.

In view of the circumstances described above, an object of the present invention is to provide an electronic device capable of easily authenticating access authority to content that was not created by the user of the access-source device, an access control method for the electronic device, and a program.

In order to achieve the above object, an electronic device according to an embodiment of the present invention includes a storage unit, a communication unit, and a control unit. The storage unit stores content. The communication unit receives first face image data from another device on a network. The control unit extracts first face feature data from the received first face image data, detects second face image data from the stored content, and extracts second face feature data from the second face image data. The control unit also determines whether the first face feature data matches the second face feature data and, when it determines that they match, generates access permission data that permits the other device to access the content.
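The matching and permission-generation steps described above can be sketched as follows. This is only an illustrative sketch: the patent does not specify how face feature data is represented or compared, so the feature vectors (lists of floats), the cosine-similarity comparison, the `MATCH_THRESHOLD` value, and the names `AccessPermission` and `authorize` are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed similarity threshold; the source does not define a matching criterion.
MATCH_THRESHOLD = 0.9


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass(frozen=True)
class AccessPermission:
    """Hypothetical stand-in for the 'access permission data'."""
    device_id: str
    content_id: str


def authorize(device_id: str,
              first_features: Sequence[float],
              content_id: str,
              second_features_list: Sequence[Sequence[float]],
              threshold: float = MATCH_THRESHOLD) -> Optional[AccessPermission]:
    """Return permission data if any face found in the content matches the requester."""
    for second_features in second_features_list:
        if cosine_similarity(first_features, second_features) >= threshold:
            return AccessPermission(device_id, content_id)
    return None  # no matching face: no permission data is generated
```

A real implementation would obtain both feature vectors from a face detector and feature extractor; here they are passed in directly so that only the decision logic is shown.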

With this configuration, when the content includes a second face image that matches the first face image transmitted from another device, the electronic device can permit that device to access the content even if the content was not created by the user of that device. That is, the electronic device can authenticate access authority to content using a face image as a key. Here, the first face image data is face image data of the user of the other device and is captured, for example, by a camera of the other device. In other words, the user of the other device is permitted to access content in which the user appears. To guarantee that the first face image data is a face image of the user of the other device, the first face image data may be data that can be created only by a specific application installed on the other device. The content may be recorded on a portable recording medium such as an optical disc (BD, DVD) inserted into a disc drive of the electronic device. The electronic device may also generate the access permission data for content stored in a device other than the electronic device itself.

The control unit may detect the second face image data and extract the second face feature data when storing the content in the storage unit, and may store the second face feature data in the storage unit together with the content.

This allows the electronic device to authenticate access authority to the content quickly when another device requests access to it.
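The idea of extracting face features once, at storage time, can be illustrated with a small sketch. The class name `ContentStore`, its methods, and the injected `extract_faces` callable are hypothetical; the source does not describe a software interface, only that the feature data is stored together with the content.

```python
from typing import Callable, Dict, List

FeatureVector = List[float]


class ContentStore:
    """Stores content and, at the same time, the face features found in it."""

    def __init__(self, extract_faces: Callable[[bytes], List[FeatureVector]]):
        # extract_faces is an assumed callable: content bytes -> face feature vectors.
        self._extract_faces = extract_faces
        self._contents: Dict[str, bytes] = {}
        self._face_features: Dict[str, List[FeatureVector]] = {}

    def store(self, content_id: str, data: bytes) -> None:
        """Save the content and run face detection/extraction exactly once."""
        self._contents[content_id] = data
        self._face_features[content_id] = self._extract_faces(data)

    def face_features(self, content_id: str) -> List[FeatureVector]:
        """Return the precomputed features; no extraction happens at request time."""
        return self._face_features[content_id]
```

Because the costly extraction runs during `store`, an access request only has to compare feature vectors, which is the source of the speed-up the paragraph above describes.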

The storage unit may store, together with the content, third face image data indicating the creator of the stored content and third face feature data extracted from the third face image data. In this case, the control unit may determine whether the first face feature data matches the third face feature data and, when it determines that they match, generate the access permission data.

This allows the electronic device to permit the creator of the content to access it even when the content does not show the face of the user of the other device. A situation in which the creator accesses the content from another device and is denied access is therefore prevented.
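The creator check can be combined with the in-content face check as in the following sketch. The `StoredContent` structure and the `matches` predicate are assumptions for illustration; the source only states that a match against either the second or the third face feature data leads to access permission data being generated.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Features = Sequence[float]


@dataclass
class StoredContent:
    content_id: str
    # "Second" face feature data: faces detected inside the content.
    face_features: List[Features] = field(default_factory=list)
    # "Third" face feature data: the creator's face, if registered.
    creator_features: Optional[Features] = None


def is_access_permitted(first_features: Features,
                        content: StoredContent,
                        matches: Callable[[Features, Features], bool]) -> bool:
    """Permit access if the requester is the creator or appears in the content."""
    if content.creator_features is not None and matches(first_features,
                                                         content.creator_features):
        return True
    return any(matches(first_features, f) for f in content.face_features)
```

Passing `matches` in as a parameter keeps the sketch independent of any particular face-comparison method.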

The control unit may generate a list of the contents for which the access permission data has been generated and transmit the list to the other device via the communication unit. The control unit may then receive, from the other device, a reproduction request signal requesting reproduction of one content on the list, and transmit that content to the other device in response to the reproduction request signal.

This allows the electronic device to let the user of the other device identify and view only those contents, among the contents stored in the storage unit, that the user is allowed to access.
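The list generation and playback handling described above might look like the following sketch. The data layout (a mapping from content ID to the set of permitted device IDs) and both function names are assumptions; the source does not define how access permission data is stored or how the list is encoded.

```python
from typing import Dict, List, Optional, Set


def build_permitted_list(permissions: Dict[str, Set[str]], device_id: str) -> List[str]:
    """List the content IDs for which permission data exists for this device."""
    return sorted(cid for cid, devices in permissions.items() if device_id in devices)


def handle_playback_request(permissions: Dict[str, Set[str]],
                            contents: Dict[str, bytes],
                            device_id: str,
                            content_id: str) -> Optional[bytes]:
    """Return the content only if the requesting device holds permission for it."""
    if device_id in permissions.get(content_id, set()):
        return contents.get(content_id)
    return None  # not on the device's list: nothing is sent
```

Re-checking the permission when the playback request arrives is a design choice of this sketch, so that a request for content outside the transmitted list is simply not served.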

An access control method according to another aspect of the present invention includes storing content and receiving first face image data from another device on a network. First face feature data is extracted from the received first face image data. Second face image data is detected from the stored content, and second face feature data is extracted from the second face image data. It is then determined whether the first face feature data matches the second face feature data. When they are determined to match, access permission data that permits the other device to access the content is generated.

A program according to still another aspect of the present invention causes an electronic device to execute a storage step, a reception step, a first extraction step, a detection step, a second extraction step, a determination step, and a generation step. In the storage step, content is stored. In the reception step, first face image data is received from another device on a network. In the first extraction step, first face feature data is extracted from the received first face image data. In the detection step, second face image data is detected from the stored content. In the second extraction step, second face feature data is extracted from the second face image data. In the determination step, it is determined whether the first face feature data matches the second face feature data. In the generation step, when the first face feature data and the second face feature data are determined to match, access permission data that permits the other device to access the content is generated.

As described above, according to the present invention, access authority to content that was not created by the user of the access-source device can be authenticated easily.

FIG. 1 is a diagram schematically showing an example of an AV (Audio/Visual) network system assumed in an embodiment of the present invention.
FIG. 2 is a diagram showing the hardware configuration of a PVR that has a content metadata generation function in an embodiment of the present invention.
FIG. 3 is a diagram showing the hardware configuration of a PVR that does not have a content metadata generation function in an embodiment of the present invention.
FIG. 4 is a flowchart showing the flow of default metadata generation processing by a PVR in an embodiment of the present invention.
FIG. 5 is a flowchart showing the detailed flow of processing in which a PVR searches for another device capable of generating metadata in an embodiment of the present invention.
FIG. 6 is a diagram showing an example of the data structure of test data returned from the content storage device.
FIG. 7 is a diagram showing details of the operation mode data included in the test data shown in FIG. 6.
FIG. 8 is a diagram showing details of the processing data included in the test data shown in FIG. 6.
FIG. 9 is a diagram showing the processing required to execute each operation mode when the content genre is broadcast content and the content is a news program, in an embodiment of the present invention.
FIG. 10 is a diagram showing the processing required to execute each operation mode when the content genre is broadcast content and the content is a sports program, in an embodiment of the present invention.
FIG. 11 is a diagram showing the processing required to execute each operation mode when the content genre is broadcast content and the content is a music program, in an embodiment of the present invention.
FIG. 12 is a diagram showing the processing required to execute each operation mode when the content genre is private content, in an embodiment of the present invention.
FIG. 13 is a diagram conceptually showing camera shake feature extraction processing in an embodiment of the present invention.
FIG. 14 is a diagram conceptually showing key frame detection processing in an embodiment of the present invention.
FIG. 15 is a diagram conceptually showing default metadata generation processing in an embodiment of the present invention.
FIG. 16 is a flowchart showing the flow of manual metadata generation processing by a PVR in an embodiment of the present invention.
FIG. 17 is a diagram conceptually showing manual metadata generation processing in an embodiment of the present invention.
FIG. 18 is a flowchart showing the flow of content classification processing in an embodiment of the present invention.
FIG. 19 is a diagram showing a display example of a content list generated in an embodiment of the present invention.
FIG. 20 is a flowchart showing the flow of content list display control processing according to traffic conditions and the like in an embodiment of the present invention.
FIG. 21 is a diagram showing how the state of thumbnails changes as a result of the content list display control processing in an embodiment of the present invention.
FIG. 22 is a flowchart showing the flow of access control processing based on face identification metadata in an embodiment of the present invention.
FIG. 23 is a table showing an overview of access control processing according to the operation mode in an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

[Outline of AV network system]
FIG. 1 is a diagram schematically showing an example of an AV network system assumed in an embodiment of the present invention.

As shown in the figure, in this AV network system, devices installed in different houses (Mr. A's house and Mr. B's house), for example, are connected via a network 50 such as the Internet. In Mr. A's house, for example, two PVRs (Personal Video Recorders) 100 and 200 are installed in different rooms, each with a TV 60 connected to it. In Mr. B's house, for example, one PVR 300 is installed in one room, and two TVs 60 connected to the PVR 300 are installed, one in the same room as the PVR 300 and one in a different room.

The figure shows an example in which four devices in Mr. A's house and three devices in Mr. B's house are connected via the network 50. Of course, the number and types of devices installed in each house are not limited to this, and devices in other houses can also be connected to the network 50. The following description may also assume that devices other than those shown in the figure are connected to the network 50.

In such a configuration, the devices in Mr. A's house and the devices in Mr. B's house can access one another via the network 50 once predetermined authentication has succeeded. Thus, for example, Mr. B can access the PVRs 100 and 200 in Mr. A's house from the PVR 300 in his own house via the network 50 and view the content stored in the PVRs 100 and 200.

Of the PVRs 100, 200, and 300, the PVR 100 has a function of generating metadata used for content editing, special playback such as digest playback, content classification, and the like, whereas the PVRs 200 and 300 have no such function. Therefore, even if Mr. B wants, from his own house, to perform special playback or classification of content stored in the PVR 200, the PVR 300 alone cannot do so.

In this embodiment, therefore, a device that has no content metadata generation function, such as the PVR 300, can have the metadata generated by another device on the network 50 that does have the function, such as the PVR 100.

[Hardware configuration of PVR]
FIG. 2 is a diagram showing the hardware configuration of the PVR 100, which has the metadata generation function. As shown in the figure, the PVR 100 includes a digital tuner 1, a demodulator 2, a demultiplexer 3, a decoder 4, a recording/reproducing unit 5, an HDD (Hard Disk Drive) 8, an optical disc drive 9, and a communication unit 11. The PVR 100 also includes a CPU (Central Processing Unit) 12, a flash memory 13, and a RAM (Random Access Memory) 14. The PVR 100 further includes an operation input unit 15, a graphic control unit 16, a video D/A (Digital/Analog) converter 17, an audio D/A (Digital/Analog) converter 18, an external interface 19, and a feature extraction circuit 20.

Under the control of the CPU 12, the digital tuner 1 selects a specific digital broadcast channel via an antenna 22 and receives a broadcast signal containing program data. The broadcast signal is, for example, an MPEG stream encoded in the MPEG-2 TS format (TS: transport stream), but is not limited to this format. The demodulator 2 demodulates the modulated broadcast signal.

The demultiplexer 3 separates the multiplexed broadcast signal into a video signal, an audio signal, a caption signal, an SI (Service Information) signal, and the like, and supplies them to the decoder 4. The demultiplexer 3 can also supply the separated, not-yet-decoded signals to the feature extraction circuit 20. The SI signal carries, among other things, data for displaying an EPG (Electronic Program Guide). The EPG information is used to determine the genre of content, as described later.

The decoder 4 decodes each of the video signal, audio signal, caption signal, and SI signal separated by the demultiplexer 3. The decoded signals are supplied to the recording/reproducing unit 5.

The recording/reproducing unit 5 includes a recording unit 6 and a reproducing unit 7. The recording unit 6 temporarily buffers the video and audio signals decoded by and input from the decoder 4, and outputs them to the HDD 8 or the optical disc drive 9 for recording while controlling timing and data volume. The recording unit 6 can also read content recorded on the HDD 8 and output it to the optical disc drive 9 for recording on an optical disc 10. The reproducing unit 7 reads the video and audio signals of video content recorded on the HDD 8 or the optical disc 10, and outputs them to the decoder 4 for playback while controlling timing and data volume.

The HDD 8 stores, on its built-in hard disk, programs received via the digital tuner 1 and various contents received by the communication unit 11 via the network 50. When the stored content is played back, the HDD 8 reads the data from the hard disk and outputs it to the recording/reproducing unit 5.

The HDD 8 may also store various programs and other data. When these programs and data are executed or referenced, they are read from the HDD 8 on instruction from the CPU 12 and loaded into the RAM 14.

Like the HDD 8, the optical disc drive 9 can record various data such as the program content on the loaded optical disc 10 and can read the recorded data. The various programs may also be recorded on a portable recording medium such as the optical disc 10 and installed on the PVR 100 via the optical disc drive 9.

The communication unit 11 is a network interface that connects to the network 50 and exchanges data with other devices on the network 50 using a protocol such as TCP/IP (Transmission Control Protocol/Internet Protocol). When data received by the communication unit 11 is multiplexed, it is supplied to the demultiplexer 3. At least part of the received data is also supplied to the feature extraction circuit 20 as necessary. The EPG information may be acquired by the communication unit 11 from the network 50 rather than from the broadcast signal received by the digital tuner 1.

The external interface 19 includes, for example, a USB interface and a memory card interface. It connects to external devices such as digital video cameras and digital still cameras, or to memory cards and the like, and reads data such as content from them.

The CPU 12 accesses the RAM 14 and other components as necessary and provides overall control of the processing performed by each block of the PVR 100, such as video data reception, content recording and playback, and feature extraction (metadata generation).

The flash memory 13 is, for example, a NAND-type non-volatile memory that permanently stores firmware executed by the CPU 12, such as an OS, programs, and various parameters. The flash memory 13 also stores the program that operates together with the feature extraction circuit 20 during metadata generation, various other programs, and various data such as the EPG information.

The RAM 14 is a memory that serves as a work area for the CPU 12 and temporarily holds the OS, programs, processing data, and the like during content recording and playback and during metadata generation.

The operation input unit 15 receives various setting values and commands entered by the user from, for example, a remote controller 21 having a plurality of keys, and outputs them to the CPU 12. Of course, instead of relying on the remote controller 21, the operation input unit 15 may consist of a keyboard or mouse connected to the PVR 100, switches mounted on the PVR 100, or the like.

The graphic control unit 16 applies graphic processing such as OSD (On Screen Display) processing to the video signal output from the decoder 4 and to other video data output from the CPU 12, and generates a video signal to be displayed on a display device such as the TV 60.

The video D/A converter 17 converts the digital video signal input from the graphic control unit 16 into an analog video signal and outputs it to the display device via a video output terminal or the like.

The audio D/A converter 18 converts the digital audio signal input from the decoder 4 into an analog audio signal and outputs it to the TV or the like via an audio output terminal or the like.

The feature extraction circuit 20 extracts predetermined video features and audio features from content stored on the HDD 8 or in other devices on the network 50 and, based on those features, generates various metadata used for special playback, classification, and other processing of each content. Specifically, the feature extraction circuit 20 divides the video and audio signals of the content into data of predetermined time units (frame intervals) (data segmentation processing) and temporarily stores the data in the RAM 14. From the stored data, the feature extraction circuit 20 then extracts video features from the video data, for example by detecting motion vectors, and extracts audio features from the audio data, for example by analyzing power levels.
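The segmentation and power-level analysis mentioned above can be illustrated in software as follows. This is a rough sketch only: the PVR 100 performs feature extraction in a dedicated circuit, and the source gives no frame length, power formula, or threshold. Mean squared amplitude as the "power level", the fixed `frame_len`, and the `threshold` parameter are all assumptions of this example.

```python
from typing import List, Sequence


def segment(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Split a sample sequence into consecutive frames of frame_len samples."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


def frame_power(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def loud_frames(samples: Sequence[float], frame_len: int, threshold: float) -> List[int]:
    """Indices of frames whose power reaches the assumed threshold."""
    return [i for i, frame in enumerate(segment(samples, frame_len))
            if frame_power(frame) >= threshold]
```

Runs of consecutive indices returned by `loud_frames` could then serve as a crude proxy for how long the audio power is sustained.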

The predetermined video features are, for example, camera features such as pan, tilt, and zoom, and image features indicating objects such as persons (faces), animals, and buildings. The predetermined audio features are, for example, features indicating a human voice, more specifically features indicating the duration of audio power. The metadata includes metadata on highlight scenes (key frames) in the content, generated from the camera features and audio features (scene feature metadata); metadata for classifying content according to, for example, its subject (a person or something else) (classification metadata); and metadata used to identify human faces in the content and control access to the content (face identification metadata). These types of metadata are described in detail later. The metadata may also be the feature data itself as extracted by the feature extraction processing.

FIG. 3 is a diagram showing the hardware configuration of the PVR 300, which does not have the metadata generation function. As shown in the figure, the configuration of the PVR 300 is the same as that of the PVR 100 shown in FIG. 2 except that it lacks the feature extraction circuit 20. A description of each block of the PVR 300 is therefore omitted here. The configuration of the PVR 200 is the same as that of the PVR 300.

［ＡＶネットワークシステム内の各機器の動作］ [Operation of each device in the AV network system]
次に、以上のように構成されたＡＶネットワークシステムにおける各機器の動作について説明する。以下では、特にＰＶＲ３００の動作を中心に説明する。以下では、ＰＶＲ３００のＣＰＵ４２を主な動作主体としてＰＶＲ３００の動作が説明されるが、この動作は、その他のハードウェア及びＣＰＵ４２の制御下において実行されるプログラムとも協働して行われる。 Next, the operation of each device in the AV network system configured as described above will be described, focusing mainly on the operation of the PVR 300. In the following, the operation of the PVR 300 is described with the CPU 42 of the PVR 300 as the main operating subject, but this operation is also performed in cooperation with other hardware and with programs executed under the control of the CPU 42.

（メタデータの生成処理） (Metadata generation process)
まず、本システムにおけるメタデータの生成処理について説明する。本実施形態では、ネットワーク５０上の全ての機器に記憶された全てのコンテンツについてデフォルトで自動的にメタデータが生成される場合と、特定のコンテンツについてメタデータが生成される場合とがある。 First, metadata generation processing in this system will be described. In the present embodiment, there are two cases: metadata is generated automatically by default for all contents stored in all devices on the network 50, or metadata is generated for specific content.

まず、ネットワーク５０上の各機器に記憶された各コンテンツのメタデータがデフォルトで生成される場合について説明する。図４は、ＰＶＲ３００によるデフォルトでのメタデータ生成処理の流れを示したフローチャートである。 First, a case where metadata of each content stored in each device on the network 50 is generated by default will be described. FIG. 4 is a flowchart showing a flow of default metadata generation processing by the PVR 300.

同図に示すように、ＰＶＲ３００のＣＰＵ４２は、ネットワーク５０上の各機器にアクセスを試み、各機器からアクセス認証を受ける（ステップ４１）。当該認証に通った場合（ステップ４２のＹｅｓ）、ＣＰＵ４２は、当該アクセスした機器に記憶されたコンテンツを検出する（ステップ４３）。認証に通らなかった場合（Ｎｏ）、すなわちアクセス可能な他の機器がネットワーク５０上に存在しない場合、ＣＰＵ４２は、ユーザから、例えばコンテンツの通常再生等、メタデータの生成動作以外の動作モードへ移行する指示があったか否かを判断し（ステップ４４）、当該指示があった場合（Ｙｅｓ）は他の動作モードへ移行する。 As shown in the figure, the CPU 42 of the PVR 300 attempts to access each device on the network 50 and undergoes access authentication by each device (step 41). When the authentication succeeds (Yes in step 42), the CPU 42 detects the content stored in the accessed device (step 43). When the authentication fails (No), that is, when no other accessible device exists on the network 50, the CPU 42 determines whether the user has instructed a shift to an operation mode other than metadata generation, such as normal playback of content (step 44), and if so (Yes), shifts to that operation mode.

上記ステップ４２においてアクセス可能な他の機器が存在した場合（Ｙｅｓ）、ＣＰＵ４２は、当該他の機器に記憶されたコンテンツを検出する（ステップ４３）。そしてＣＰＵ４２は、当該検出された各コンテンツについて、メタデータが生成されているか否かを判断する（ステップ４５）。 If there is another device accessible in step 42 (Yes), the CPU 42 detects the content stored in the other device (step 43). Then, the CPU 42 determines whether metadata is generated for each detected content (step 45).

上記ステップ４５においてメタデータが生成されていると判断された場合（Ｙｅｓ）、ＣＰＵ４２は、ネットワーク５０上の全ての機器に記憶された全てのコンテンツについてメタデータが生成されているか否か、すなわちその他にメタデータを生成すべきコンテンツがないかを判断する（ステップ５１）。メタデータを生成すべきコンテンツが存在しないと判断された場合（Ｙｅｓ）、処理は終了し、存在する場合（Ｎｏ）には上記ステップ４１へ戻り処理を繰り返す。 If it is determined in step 45 that metadata has been generated (Yes), the CPU 42 determines whether metadata has been generated for all contents stored in all devices on the network 50, that is, whether any content remains for which metadata should be generated (step 51). If no such content remains (Yes), the process ends; if some remains (No), the process returns to step 41 and repeats.

上記ステップ４５においてメタデータが生成されていないと判断された場合（Ｎｏ）、ＣＰＵ４２は、メタデータを生成可能な他の機器をネットワーク５０上から探索する（ステップ４６）。この探索処理の詳細は後述する。 If it is determined in step 45 that metadata has not been generated (No), the CPU 42 searches the network 50 for other devices capable of generating metadata (step 46). Details of this search process will be described later.

上記探索処理の結果、メタデータの生成に対応した他の機器が見つかった場合（ステップ４７のＹｅｓ）、ＣＰＵ４２は、当該他の機器へメタデータを生成させるためのコマンドを送信する（ステップ４８）。そしてＣＰＵ４２は、当該コマンドに応じて他の機器で生成されたメタデータを受信する（ステップ５０）。ＣＰＵ４２は、以上の処理を、ネットワーク５０上の全ての機器の全てのコンテンツについて繰り返す（ステップ５１）。 If another device corresponding to the generation of metadata is found as a result of the search process (Yes in step 47), the CPU 42 transmits a command for generating metadata to the other device (step 48). . Then, the CPU 42 receives metadata generated by another device in response to the command (step 50). The CPU 42 repeats the above processing for all contents of all devices on the network 50 (step 51).

上記探索処理の結果、メタデータの生成に対応した他の機器が見つからなかった場合（ステップ４７のＮｏ）、ＣＰＵ４２は、例えば当該ＰＶＲ３００に接続されたＴＶ６０を介して、メタデータを生成できない旨の警告表示を行う（ステップ４９）。 If the search finds no other device capable of generating metadata (No in step 47), the CPU 42 displays a warning that metadata cannot be generated, for example on the TV 60 connected to the PVR 300 (step 49).
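The default flow of FIG. 4 can be summarized in a short Python sketch. This is only an illustration under simplifying assumptions, not the patented implementation: the `Device` class, its fields, and the placeholder metadata string are hypothetical, and the search of step 46 is reduced to taking the first accessible device that can generate metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for a PVR on the network 50."""
    name: str
    grants_access: bool = True       # outcome of access authentication (step 42)
    can_generate: bool = False       # whether the device has a feature extraction circuit
    contents: dict = field(default_factory=dict)  # content id -> metadata, or None if missing


def generate_default_metadata(devices):
    """Fill in missing metadata on every accessible device; return warning messages."""
    warnings = []
    # Step 46, simplified: any accessible device that can generate metadata qualifies.
    generators = [d for d in devices if d.grants_access and d.can_generate]
    for dev in devices:                              # step 41: try each device
        if not dev.grants_access:                    # step 42: No
            continue
        for cid, meta in dev.contents.items():       # step 43: detect stored content
            if meta is not None:                     # step 45: Yes, metadata already exists
                continue
            if not generators:                       # step 47: No
                warnings.append(f"cannot generate metadata for {cid}")  # step 49
                continue
            # Steps 48 and 50: command the generator and store what it returns.
            dev.contents[cid] = f"metadata({cid}) by {generators[0].name}"
    return warnings
```

The loop over all devices and contents plays the role of the repetition controlled by step 51.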

図５は、上記メタデータを生成可能な他の機器の探索処理の詳細な流れを示したフローチャートである。 FIG. 5 is a flowchart showing the detailed flow of the search process for another device capable of generating the metadata.
同図に示すように、ＣＰＵ４２は、まず、コンテンツを記憶するネットワーク５０上の他の機器（以下、コンテンツ記憶機器と称する）に対して、当該他の機器以外の各機器へテストデータを送信するよう指示するコマンドを送信する（ステップ６１）。ここでテストデータは、メタデータを生成可能か否かを問い合わせるデータであり、かつ、そのメタデータ生成処理を最も確実に効率よく行える機器を探索するために送信されるデータである。 As shown in the figure, the CPU 42 first sends, to another device on the network 50 that stores the content (hereinafter referred to as a content storage device), a command instructing it to transmit test data to each device other than itself (step 61). Here, the test data is data that asks whether metadata can be generated, and it is also sent in order to find the device that can perform the metadata generation processing most reliably and efficiently.

続いてＣＰＵ４２は、上記コンテンツ記憶機器が他の機器から受信した上記テストデータの返信結果の転送を受ける（ステップ６２）。ここで、当該テストデータの詳細について説明する。 Subsequently, the CPU 42 receives a return result of the test data received by the content storage device from another device (step 62). Here, the details of the test data will be described.

図６は、上記他の機器からコンテンツ記憶機器へ返信されるテストデータのデータ構造の例を示した図である。同図に示すように、当該テストデータは、同期データ７１、システム用データ７２、動作モード用データ７３、処理用データ７４及びエラー処理用データ７５からなる。 FIG. 6 is a diagram showing an example of the data structure of test data returned from the other device to the content storage device. As shown in the figure, the test data includes synchronization data 71, system data 72, operation mode data 73, processing data 74, and error processing data 75.

同期データ７１は、コンテンツ記憶機器が送信したテストデータと他の機器から返信されたテストデータとの同期をとるためのデータである。システム用データ７２は、当該返信元の他の機器のＩＰアドレス等のデータである。 The synchronization data 71 is data for synchronizing the test data transmitted from the content storage device with the test data returned from another device. The system data 72 is data such as an IP address of another device of the reply source.

図７は、上記動作モード用データ７３の詳細を示した図である。同図に示すように、動作モード用データ７３は例えば16ビット（m1〜m16）からなり、複数の動作モードについて、それが当該他の機器において処理が可能か否かを０か１のビットで示したものである。同図において動作モードが定義されていない箇所（−）には、適宜動作モードを追加可能である。動作モードとしては、映像音声表示出力モード、コンテンツの通常再生モード、コンテンツのダイジェスト再生モード、コンテンツの自動編集モード、コンテンツ蓄積（記録）モード、コンテンツ撮影（カメラ機能）モード等がある。同図に示した例では、テストデータの返信元の他の機器がこれらの各モード全てに対応していることが示されている。 FIG. 7 is a diagram showing details of the operation mode data 73. As shown in the figure, the operation mode data 73 consists of, for example, 16 bits (m1 to m16), each bit indicating with 0 or 1 whether the other device can handle the corresponding operation mode. Operation modes can be added as appropriate at the positions (−) where no operation mode is defined in the figure. The operation modes include a video/audio display output mode, a normal content playback mode, a content digest playback mode, a content automatic editing mode, a content storage (recording) mode, a content shooting (camera function) mode, and the like. The example in the figure shows that the device returning the test data supports all of these modes.

図４及び図５のフローチャートでは、他の機器に生成させるメタデータは、予め決まっており、当該特定のメタデータが生成可能な否かが上記動作モード用データから判断されることとなる。例えば、ダイジェスト再生用のメタデータの生成が意図されている場合、上記動作モード用データから、他の機器がダイジェスト再生に対応しているか否かが判断されることとなる。 In the flowcharts of FIGS. 4 and 5, metadata to be generated by other devices is determined in advance, and it is determined from the operation mode data whether the specific metadata can be generated. For example, when generating metadata for digest playback is intended, it is determined from the operation mode data whether or not another device supports digest playback.

図８は、上記処理用データ７４の詳細を示した図である。同図に示すように、処理用データ７４は、例えば12ビット（g1〜g4, c1〜c4, a1〜a4）からなり、上記ダイジェスト再生モードや自動編集モード等において必要となる各種映像音声特徴抽出処理の可否を０か１のビットで示したものである。同図において処理機能が定義されていない箇所（−）には、適宜処理機能を追加可能である。処理用データは、画像特徴、カメラ特徴及び音声特徴からなる。画像特徴の処理機能としては、顔検出機能やテロップ検出機能がある。カメラ特徴としては、ズーム、パン、チルト、手振れの各検出機能がある。音声特徴としては、音声の盛り上がり（レベル大）検出機能、音楽検出機能、人の声検出機能がある。同図に示した例では、テストデータの返信元の他の機器がこれらの各処理機能全てに対応していることが示されている。 FIG. 8 is a diagram showing details of the processing data 74. As shown in the figure, the processing data 74 is composed of, for example, 12 bits (g1 to g4, c1 to c4, a1 to a4), and various video and audio feature extraction required in the digest playback mode, the automatic editing mode, and the like. Whether or not processing is possible is indicated by 0 or 1 bit. In the figure, a processing function can be added as appropriate to a location (-) where the processing function is not defined. The processing data includes an image feature, a camera feature, and an audio feature. Image feature processing functions include a face detection function and a telop detection function. Camera features include zoom, pan, tilt, and camera shake detection functions. The voice features include a voice swell (high level) detection function, a music detection function, and a human voice detection function. In the example shown in the figure, it is shown that other devices from which the test data is returned correspond to all these processing functions.
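A capability word like the processing data 74 maps naturally onto bit flags. The sketch below assumes a particular bit order (g1 in the least significant bit, then g2 to g4, c1 to c4, a1 to a4); the actual layout is defined by FIG. 8, which is not reproduced in the text, so the positions and the names `ProcessingFeature` and `supports` are illustrative only.

```python
from enum import IntFlag


class ProcessingFeature(IntFlag):
    """Assumed bit positions for the 12-bit processing data (g1..g4, c1..c4, a1..a4)."""
    FACE = 1 << 0         # g1: face detection
    TELOP = 1 << 1        # g2: telop detection
    # g3 and g4 are left undefined, like the (-) slots in the figure
    ZOOM = 1 << 4         # c1
    PAN = 1 << 5          # c2
    TILT = 1 << 6         # c3
    SHAKE = 1 << 7        # c4: camera shake detection
    AUDIO_PEAK = 1 << 8   # a1: high audio level (excitement) detection
    MUSIC = 1 << 9        # a2
    VOICE = 1 << 10       # a3: human voice detection
    # a4 is left undefined


def supports(processing_data: int, required: ProcessingFeature) -> bool:
    """True if every required feature bit is set in the returned processing data."""
    return (processing_data & int(required)) == int(required)
```

For example, `supports(0b011111110011, ProcessingFeature.FACE | ProcessingFeature.PAN)` checks a reply in which every defined bit is set, which corresponds to the all-features case described for the figure.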

エラー処理用データ７５は、例えば16ビットからなり、当該テストデータがコンテンツ記憶機器から送信されて返信されるまでの間に発生したエラーのエラーレートを計測するためのデータである。具体的には、例えばＣＲＣコードやリードソロモン符号等のエラー検出コードが用いられる。 The error processing data 75 is composed of, for example, 16 bits, and is data for measuring an error rate of an error that occurs until the test data is transmitted from the content storage device and returned. Specifically, for example, an error detection code such as a CRC code or a Reed-Solomon code is used.
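As a rough illustration of how a 16-bit error detection code could be used to measure an error rate, the sketch below appends a CRC-16 (CCITT polynomial, computed with Python's standard `binascii.crc_hqx`) to a payload and counts how many replies fail the check. The packet layout, the choice of this particular CRC, and the definition of the error rate as the fraction of corrupted replies are all assumptions made for the example; the text only states that a CRC or Reed-Solomon code may be used.

```python
import binascii
import struct


def attach_check(payload: bytes) -> bytes:
    """Append a 16-bit CRC (big-endian) to the payload."""
    return payload + struct.pack(">H", binascii.crc_hqx(payload, 0xFFFF))


def is_intact(packet: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the trailing 16 bits."""
    payload = packet[:-2]
    (received,) = struct.unpack(">H", packet[-2:])
    return binascii.crc_hqx(payload, 0xFFFF) == received


def error_rate(replies) -> float:
    """Fraction of replies whose CRC check fails (0.0 if there are no replies)."""
    replies = list(replies)
    if not replies:
        return 0.0
    return sum(not is_intact(r) for r in replies) / len(replies)
```

A single flipped bit anywhere in a packet is always caught by this check, so repeated test transmissions give a simple per-device estimate of link quality.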

図５のフローチャートに戻り、ＣＰＵ４２は、上記テストデータの返信結果データを受信すると、当該データから、特定のメタデータの生成に対応した他の機器（以下、単に対応機器ともいう。）が存在するか否かを判断する（ステップ６３）。対応機器が存在しないと判断された場合（Ｎｏ）、ＣＰＵ４２は、図４のステップ４９で示した警告表示を行う。 Returning to the flowchart of FIG. 5, when the CPU 42 receives the return result data of the test data, there is another device (hereinafter also simply referred to as a “corresponding device”) corresponding to generation of specific metadata from the data. Whether or not (step 63). When it is determined that the corresponding device does not exist (No), the CPU 42 performs a warning display shown in step 49 of FIG.

対応機器が存在すると判断された場合（Ｙｅｓ）、ＣＰＵ４２は、当該対応機器が複数存在するか否かを判断する（ステップ６４）。対応機器が複数存在すると判断された場合（Ｙｅｓ）、ＣＰＵ４２は、上記テストデータの返信結果データを基に、上記エラーレート及び遅延時間を検出する（ステップ６５）。エラーレートは、上記返信されたテストデータ中のエラー処理用データから算出される。遅延時間は、コンテンツ記憶装置がテストデータを送信した時刻と他の装置からそれを返信した時刻とから算出される。ＣＰＵ４２は、当該エラーレート及び遅延時間を、各対応機器について算出する。 When it is determined that there is a corresponding device (Yes), the CPU 42 determines whether or not there are a plurality of the corresponding devices (step 64). If it is determined that there are a plurality of compatible devices (Yes), the CPU 42 detects the error rate and the delay time based on the return result data of the test data (step 65). The error rate is calculated from the error processing data in the returned test data. The delay time is calculated from the time when the content storage device transmits the test data and the time when it is returned from another device. The CPU 42 calculates the error rate and delay time for each corresponding device.

そしてＣＰＵ４２は、上記複数の対応機器のうち、当該エラーレート及び遅延時間が最小となる対応機器を選択する（ステップ６６）。すなわち、ＣＰＵ４２は、メタデータ生成を行った場合にそれを最も効率的かつ高精度に行える他の機器を選択する。またＣＰＵ４２は、上記エラーレート及び遅延時間を乗じた値を基に上記選択を行ってもよい。 Then, the CPU 42 selects a corresponding device that minimizes the error rate and the delay time from among the plurality of corresponding devices (step 66). That is, the CPU 42 selects another device that can perform the metadata generation most efficiently and with high accuracy when the metadata is generated. Further, the CPU 42 may perform the selection based on a value obtained by multiplying the error rate and the delay time.
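The selection in step 66 can be sketched as a minimum search. The example follows the variant mentioned above, ranking devices by the product of error rate and delay time, and breaks ties by the shorter delay (the tie-break is an added assumption). The `Probe` tuple and the sample figures are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Probe(NamedTuple):
    """Measured result for one compatible device (hypothetical structure)."""
    device: str
    error_rate: float   # e.g. fraction of corrupted test replies
    delay_ms: float     # round-trip time of the test data


def select_device(probes: Sequence[Probe]) -> Optional[str]:
    """Pick the device with the smallest error_rate * delay; None if no candidates."""
    if not probes:
        return None
    best = min(probes, key=lambda p: (p.error_rate * p.delay_ms, p.delay_ms))
    return best.device
```

With `[Probe("PVR100", 0.01, 20.0), Probe("PVR500", 0.02, 35.0)]` the products are 0.2 and 0.7, so the function returns `"PVR100"`.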

続いてＣＰＵ４２は、コンテンツ記憶装置に記憶された各コンテンツについて、ジャンルを判定する（ステップ６７）。ここでコンテンツのジャンルが判定されるのは、当該ジャンルに応じて、上記自動編集モードやダイジェスト再生モードに必要な特徴データが異なるからである。すなわち、コンテンツのジャンルが異なると、他の機器に生成を要求するメタデータ及びそのために対応すべき動作モードも異なってくるため、ＰＶＲ３００は、当該ジャンル及び動作モードに応じて対応機器を適切に選択することとしている。 Subsequently, the CPU 42 determines the genre of each content stored in the content storage device (step 67). The genre is determined here because the feature data required for the automatic editing mode and the digest playback mode differs by genre. That is, when the genre differs, the metadata that another device is asked to generate, and the operation modes that device must support for it, also differ, so the PVR 300 selects a compatible device appropriate to the genre and operation mode.

ここで、コンテンツのジャンルと、当該コンテンツのメタデータを生成するために必要となる動作モードとの関係について説明する。 Here, the relationship between the genre of the content and the operation mode necessary for generating the metadata of the content will be described.
図９〜図１２は、コンテンツジャンルが、放送コンテンツ（テレビコンテンツ）であって、ニュース番組の場合（図９）、スポーツ番組の場合（図１０）、音楽番組の場合（図１１）及びコンテンツジャンルが一般ユーザにより撮影されたプライベートコンテンツ（図１２）である場合のそれぞれについて、各動作モードの実行に必要とされる処理を示した図である。 FIGS. 9 to 12 are diagrams showing the processing required to execute each operation mode when the content genre is broadcast content (television content), specifically a news program (FIG. 9), a sports program (FIG. 10), or a music program (FIG. 11), and when the content genre is private content shot by a general user (FIG. 12).

図９に示すように、コンテンツが放送コンテンツであってニュース番組の場合、ダイジェスト再生及び自動編集の各モードを実行するには、画像特徴抽出処理のうち顔検出及びテロップ検出の各処理が必須となり、音声特徴抽出処理のうち人の声の検出処理が必須となる。これは、ニュース番組には、ニュースキャスターの画像、当該ニュースキャスターの下部に通常表示されるテロップ等、ニュース番組特有の特徴が存在するからである。またこの場合、カメラ特徴のうちズーム、パン、チルトの各カメラ特徴の検出処理は、必須ではないが精度を良くするにはあった方がよい。これは、実際のニュース現場の映像から特徴的なシーンを抽出するためである。 As shown in FIG. 9, when the content is broadcast content and is a news program, executing the digest playback and automatic editing modes requires face detection and telop detection among the image feature extraction processes, and human voice detection among the audio feature extraction processes. This is because news programs have characteristic features such as the image of the newscaster and the telop normally displayed below the newscaster. In this case, detection of the zoom, pan, and tilt camera features is not essential but is desirable for better accuracy, because it helps extract characteristic scenes from footage shot at the actual news site.

図１０に示すように、コンテンツが放送コンテンツであってスポーツ番組の場合、ダイジェスト再生及び自動編集の各モードを実行するには、画像特徴抽出処理のうち顔検出処理が必須となり、カメラ特徴抽出処理のうちズーム、パン、チルトの各カメラ特徴の検出処理が必須となり、音声特徴抽出処理のうち音声盛り上がり検出処理が必須となる。これは、スポーツ番組では、例えばサッカーのゴールシーン等では選手の動き及びカメラの動きが活発となり、また歓声も大きくなるという特徴があるからである。またこの場合、画像特徴のうちテロップ検出処理は、必須ではないがあった方がよい。これは選手情報や試合経過等の情報がテロップとして表示される場合があるからである。 As shown in FIG. 10, when the content is broadcast content and is a sports program, executing the digest playback and automatic editing modes requires face detection among the image feature extraction processes, detection of the zoom, pan, and tilt camera features among the camera feature extraction processes, and audio excitement detection among the audio feature extraction processes. This is because in sports programs, for example in a soccer goal scene, the players and the camera move actively and the cheering grows louder. In this case, telop detection among the image features is not essential but is desirable, because information such as player details and game progress may be displayed as telops.

図１１に示すように、コンテンツが放送コンテンツであって音楽番組の場合、ダイジェスト再生及び自動編集の各モードを実行するには、画像特徴抽出処理のうち顔検出処理が必須となり、音声特徴抽出処理のうち音楽検出処理及び人の声検出処理が必須となる。これは、音楽番組では、出演している歌手及びその歌手が歌う音楽を識別する必要があるからである。またこの場合、カメラ特徴抽出処理のうちズーム、パン、チルトの各カメラ特徴の検出処理は、必須ではないがあった方がよい。これは、歌手が実際に歌を歌っているシーン等を抽出するためである。 As shown in FIG. 11, when the content is broadcast content and is a music program, executing the digest playback and automatic editing modes requires face detection among the image feature extraction processes, and music detection and human voice detection among the audio feature extraction processes. This is because in a music program it is necessary to identify the singers appearing and the music they sing. In this case, detection of the zoom, pan, and tilt camera features is not essential but is desirable, in order to extract scenes such as those in which a singer is actually singing.

図１２に示すように、コンテンツがプライベートコンテンツの場合、画像特徴抽出処理のうち顔検出処理が必須となり、ズーム、パン、チルト及び手振れの各カメラ特徴の検出処理が必須となり、音声特徴抽出処理のうち音声盛り上がり検出処理及び人の声検出処理が必須となる。これは、当該プライベートコンテンツに映っている人及びその人の言動を検出するためである。また、手振れ特徴検出処理は、当該プライベートコンテンツからキーフレームを検出する際に、見るに耐えない手振れ映像を除外するためである。 As shown in FIG. 12, when the content is private content, face detection is required among the image feature extraction processes, detection of the zoom, pan, tilt, and camera shake features is required among the camera feature extraction processes, and audio excitement detection and human voice detection are required among the audio feature extraction processes. This is to detect the people appearing in the private content and what they say and do. Camera shake detection is required so that unwatchably shaky footage can be excluded when key frames are detected from the private content.
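The genre-dependent requirements described for FIGS. 9 to 12 can be written down as a lookup table. The sketch below transcribes the required and merely desirable detections stated in the text; the string labels, the dictionary names, and the helper functions are hypothetical conveniences for illustration.

```python
# Detections that must be available for digest playback / automatic editing, per genre.
REQUIRED = {
    "news":    {"face", "telop", "voice"},
    "sports":  {"face", "zoom", "pan", "tilt", "audio_peak"},
    "music":   {"face", "music", "voice"},
    "private": {"face", "zoom", "pan", "tilt", "shake", "audio_peak", "voice"},
}

# Detections that are not essential but improve the result, per genre.
DESIRABLE = {
    "news":    {"zoom", "pan", "tilt"},
    "sports":  {"telop"},
    "music":   {"zoom", "pan", "tilt"},
    "private": set(),
}


def can_handle(genre: str, device_features) -> bool:
    """True if the device offers every detection required for the genre."""
    return REQUIRED[genre] <= set(device_features)


def desirable_hits(genre: str, device_features) -> int:
    """Number of non-essential detections the device also offers."""
    return len(DESIRABLE[genre] & set(device_features))
```

A device offering only face, telop, and voice detection would satisfy `can_handle("news", ...)` but not `can_handle("sports", ...)`, which is why the genre is determined before a device is chosen.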

ここで、上記コンテンツが放送コンテンツである場合には、各ジャンルは、各コンテンツ記憶装置においてコンテンツが記録される際に、例えば上記ＥＰＧ情報を基に当該コンテンツ自体と共に記録されたジャンル情報を基に判別することができる。一方、コンテンツが放送コンテンツかプライベートコンテンツかは、上記手振れ特徴を検出することで判別することができる。すなわち、コンテンツの映像中に手振れ映像があれば当該コンテンツは、プロではなく一般ユーザが撮影したプライベートコンテンツであると判断され得るからである。ただし、本実施形態においては、当該ＰＶＲ３００は、そもそもメタデータ生成機能を有していないため、上記カメラ特徴抽出処理に対応していない。したがってＰＶＲ３００は、上記返信されたテストデータを基に、手振れ特徴抽出処理に対応した機器（例えば上記ＰＶＲ１００等）に、当該放送コンテンツかプライベートコンテンツかの判別処理を依頼する。後述するが、この手振れ特徴に基づく判別データを含めたコンテンツのジャンルに関するデータは、上記分類用メタデータとしても用いられる。 Here, when the content is broadcast content, its genre can be determined from genre information that was recorded together with the content, for example based on the EPG information, when the content was recorded in the content storage device. Whether content is broadcast content or private content, on the other hand, can be determined by detecting the camera shake feature: if the video of the content contains shaky footage, the content can be judged to be private content shot by a general user rather than a professional. In this embodiment, however, the PVR 300 has no metadata generation function in the first place and therefore does not support the camera feature extraction process. The PVR 300 therefore uses the returned test data to ask a device that supports camera shake feature extraction (for example, the PVR 100) to determine whether the content is broadcast content or private content. As will be described later, data on the genre of the content, including the discrimination data based on the camera shake feature, is also used as the classification metadata.

また、当該放送コンテンツかプライベートコンテンツかの判別データは、例えば放送コンテンツ及びプライベートコンテンツがコンテンツ記憶装置に記録される際に、上記ＥＰＧ情報に基づくジャンル情報と共に、自動的にジャンルＩＤとして記憶されてもよい。これは例えばプライベートコンテンツの入力元となる機器（カムコーダ等）を識別することで達成し得る。 The data indicating whether content is broadcast content or private content may also be stored automatically as a genre ID, together with the genre information based on the EPG information, when broadcast content and private content are recorded in the content storage device. This can be achieved, for example, by identifying the device (such as a camcorder) that is the input source of the private content.

当該判別処理の依頼を受けた他の機器は、コンテンツ記憶装置から対象コンテンツを取得し、自身が有する手振れ特徴抽出機能を用いて、当該コンテンツから手振れ特徴を抽出する。ここで、当該手振れ特徴抽出処理の詳細について説明する。 The other device that has received the request for the determination processing acquires the target content from the content storage device, and extracts the shake feature from the content using the shake feature extraction function that the device has. Here, details of the camera shake feature extraction processing will be described.

図１３は、当該手振れ特徴抽出処理を概念的に示す図である。ＰＶＲ３００から依頼を受けた他の機器は、特徴抽出回路を用いて、コンテンツ中の基準フレームと探索フレームとの間でブロックマッチング処理により動きベクトルを検出し、当該動きベクトルを基に、重回帰分析によりアフィン係数を算出する。そして他の機器は、同図に示すように、映像コンテンツ中の所定区間（ｔ０〜ｔ１、ｔ１〜ｔ２、ｔ２〜ｔ３、ｔ３〜ｔ４）毎のアフィン係数から算出した、パン係数Ｐｘ、チルト係数Ｐｙの分散と、所定区間の平均値レベルとの交差回数とで手振れを判定することができる。所定区間としては、例えば０．５秒〜５秒程度の時間長が設定される。 FIG. 13 is a diagram conceptually illustrating the camera shake feature extraction process. The other device that receives the request from the PVR 300 uses its feature extraction circuit to detect motion vectors by block matching between a reference frame and a search frame in the content, and calculates affine coefficients from the motion vectors by multiple regression analysis. As shown in the figure, the other device can then determine camera shake from the variance of the pan coefficient Px and the tilt coefficient Py, calculated from the affine coefficients for each predetermined section of the video content (t0 to t1, t1 to t2, t2 to t3, t3 to t4), and from the number of times these coefficients cross the average level of the section. Each predetermined section is set to a length of, for example, about 0.5 to 5 seconds.

例えば、同図のｔ０〜ｔ１の区間においては、ＰｘまたはＰｙは、平均レベルの値と１２回交差している。他の機器の特徴抽出回路は、この交差回数の閾値をＴｈｃｒとし、交差回数がＴｈｃｒより大きく、かつ、ＰｘまたはＰｙの上記各所定区間での分散値が所定の閾値Ｔｈｖより大きい場合には、当該所定区間の映像は手振れ映像であると判定し、そうでない場合には手振れ映像ではないと判定する。当該判定結果は、上記放送コンテンツかパーソナルコンテンツかの判別のためにＰＶＲ３００へ送信される。 For example, in the interval from t0 to t1 in the figure, Px or Py intersects the average level value 12 times. The feature extraction circuit of another device sets the threshold of the number of intersections as Thcr, the number of intersections is larger than Thcr, and the variance value of each of the predetermined sections of Px or Py is larger than the predetermined threshold Thv, It is determined that the video in the predetermined section is a camera shake video. Otherwise, it is determined that the video is not a camera shake video. The determination result is transmitted to the PVR 300 in order to determine whether the content is broadcast content or personal content.
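The camera shake test described above (crossing count above Thcr and variance above Thv, for Px or Py) can be sketched as follows. The function names and the use of the population variance are assumptions for illustration; the input lists stand for the pan and tilt coefficients sampled within one section.

```python
from statistics import fmean, pvariance


def mean_crossings(values) -> int:
    """Count how often the series crosses its own average level."""
    mean = fmean(values)
    # Samples exactly on the mean are skipped so they do not create spurious crossings.
    sides = [v > mean for v in values if v != mean]
    return sum(a != b for a, b in zip(sides, sides[1:]))


def is_shaky(values, th_cr: int, th_v: float) -> bool:
    """True if both the crossing count and the variance exceed their thresholds."""
    return mean_crossings(values) > th_cr and pvariance(values) > th_v


def is_shaky_section(px, py, th_cr: int, th_v: float) -> bool:
    """A section counts as camera shake if either Px or Py meets the test."""
    return is_shaky(px, th_cr, th_v) or is_shaky(py, th_cr, th_v)
```

Requiring both conditions separates shake from an intentional pan: a steady pan has a large variance but crosses its mean only once, while small jitter crosses often but has little variance.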

図５に戻り、ＣＰＵ４２は、上記図９〜図１２に示した基準を用いて、各コンテンツについて判定したジャンルに応じて、他の機器に生成させるべきメタデータに必要な動作モードを検出する（ステップ６８）。そしてＣＰＵ４２は、当該動作モードの実行に必須となる、動作用特徴量ＩＤ（上記g1〜g4, c1〜c4, a1〜a4）を検出する（ステップ６９）。 Returning to FIG. 5, the CPU 42 detects an operation mode necessary for metadata to be generated by another device according to the genre determined for each content, using the criteria shown in FIG. 9 to FIG. 12 ( Step 68). Then, the CPU 42 detects an operation feature ID (g1 to g4, c1 to c4, a1 to a4) that is indispensable for executing the operation mode (step 69).

続いてＣＰＵ４２は、図４のステップ４７において、当該動作用特徴量ＩＤで示される処理に対応した他の機器が存在するか否かを判断し、当該対応機器を選択する。そしてＣＰＵ４２は、上述したように、当該選択した機器へ、メタデータの生成を指示するコマンドを送信する（ステップ４８）。 Subsequently, in step 47 in FIG. 4, the CPU 42 determines whether or not there is another device corresponding to the process indicated by the operation feature amount ID, and selects the corresponding device. Then, as described above, the CPU 42 transmits a command for instructing generation of metadata to the selected device (step 48).

上記メタデータ生成を指示された他の機器は、当該指示対象のコンテンツを、コンテンツ記憶機器から受信し、当該他の機器が有する特徴抽出回路を用いて、特徴抽出処理及びそれに基づくメタデータ生成処理を実行する。 The other device instructed to generate the metadata receives the content to be instructed from the content storage device, and uses the feature extraction circuit of the other device to perform feature extraction processing and metadata generation processing based on the feature extraction processing Execute.

ここで、当該特徴抽出処理について説明する。図１４は、当該特徴抽出処理を概念的に示した図である。同図では、例えば上記シーン特徴メタデータの生成のために、特徴量として盛り上がりシーン（キーフレーム）が検出される例を示している。 Here, the feature extraction process will be described. FIG. 14 is a diagram conceptually showing the feature extraction processing. In the figure, an example is shown in which a rising scene (key frame) is detected as a feature amount, for example, in order to generate the scene feature metadata.

同図に示すように、上記メタデータ生成を指示された他の機器の特徴抽出回路は、動画像コンテンツの映像シーンについて、上記カメラ特徴抽出処理、画像特徴抽出処理、音声特徴抽出処理を実行し、当該抽出結果を基に、所定の閾値以上の特徴が抽出された区間のフレームを、キーフレームとして抽出する。 As shown in the figure, the feature extraction circuit of another device instructed to generate the metadata executes the camera feature extraction process, the image feature extraction process, and the audio feature extraction process for the video scene of the moving image content. Based on the extraction result, a frame in a section in which a feature equal to or greater than a predetermined threshold is extracted is extracted as a key frame.

ここで、カメラ特徴や画像特徴を基にキーフレームが抽出されるのは、カメラ特徴区間はユーザが意図してカメラを操作して撮影した区間であるため、ユーザにとって重要度が高いシーンであると考えられ、また人の顔はユーザの注目の的となりやすいためである。また、音声特徴を基にキーフレームが抽出されるのも、例えば音声レベルが大きい区間はユーザの注目度が高いと考えられるからである。 Here, the key frame is extracted based on the camera feature and the image feature, because the camera feature section is a section that is shot by the user operating the camera intentionally, and thus is a scene having high importance for the user. This is because the human face is likely to be the focus of the user's attention. Moreover, the reason why key frames are extracted based on voice features is that, for example, it is considered that a user's attention is high in a section where the voice level is high.

具体的には、上記特徴抽出回路は、映像コンテンツ中の基準フレームと探索フレームとの間でブロックマッチング処理により動きベクトルを検出する。そして特徴抽出回路は、当該動きベクトルを基に、重回帰分析によりアフィン係数を算出し、当該アフィン係数から、パン係数Ｐｘ、チルト係数Ｐｙ及びズーム係数Ｐｚを算出する。当該パン係数Ｐｘ、チルト係数Ｐｙ及びズーム係数Ｐｚがそれぞれ所定の閾値を超えた区間が、キーフレームとして検出される。また、特徴抽出回路は、動画像コンテンツの音声信号について、例えば所定の閾値以上のパワーを有する区間を音声盛り上がり区間として抽出する。そして特徴抽出回路は、例えば、上記映像シーンから抽出されたキーフレーム区間と、上記音声信号から抽出された音声盛り上がり区間とが重複する区間を、当該動画像コンテンツのキーフレームとして検出する。そして、このキーフレームの情報（タイムコード等）がシーン特徴メタデータとして生成される。 Specifically, the feature extraction circuit detects a motion vector by block matching processing between a reference frame and a search frame in video content. The feature extraction circuit calculates an affine coefficient by multiple regression analysis based on the motion vector, and calculates a pan coefficient Px, a tilt coefficient Py, and a zoom coefficient Pz from the affine coefficient. A section in which the pan coefficient Px, tilt coefficient Py, and zoom coefficient Pz each exceed a predetermined threshold is detected as a key frame. In addition, the feature extraction circuit extracts a section having a power equal to or higher than a predetermined threshold, for example, as an audio excitement section from the audio signal of the moving image content. The feature extraction circuit detects, for example, a section where a key frame section extracted from the video scene and a voice excitement section extracted from the audio signal overlap as a key frame of the moving image content. The key frame information (time code or the like) is generated as scene feature metadata.
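The final step, keeping only the sections where a camera-feature section and an audio excitement section overlap, is an interval intersection. The sketch below works on per-frame sample lists and half-open index ranges; the sample values, thresholds, and function names are hypothetical, and for simplicity both signals use the same "at or above threshold" test.

```python
def runs_at_or_above(samples, threshold):
    """Return half-open (start, end) index ranges where samples stay >= threshold."""
    runs, start = [], None
    for i, value in enumerate(samples):
        if value >= threshold and start is None:
            start = i                       # a run begins
        elif value < threshold and start is not None:
            runs.append((start, i))         # the run ended just before index i
            start = None
    if start is not None:                   # run continues to the end of the data
        runs.append((start, len(samples)))
    return runs


def overlap(a, b):
    """Intersect two sorted lists of half-open intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

Given a camera-motion series and an audio-power series, `overlap(runs_at_or_above(camera, th_cam), runs_at_or_above(audio, th_audio))` yields the key frame sections whose time codes would be written into the scene feature metadata.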

図１５は、以上説明したデフォルトによるメタデータ生成処理を概念的に示す図である。同図に示すように、例えば、ネットワーク５０上に、メタデータの生成に対応しているＰＶＲ１００及び５００と、メタデータの生成に対応していないＰＶＲ２００、３００及び４００が存在しているとする。また、ＰＶＲ２００及び４００は、それぞれコンテンツＡ、コンテンツＢを記憶する上記コンテンツ記憶機器である。 FIG. 15 is a diagram conceptually showing the metadata generation processing by default described above. As shown in the figure, for example, it is assumed that PVRs 100 and 500 corresponding to generation of metadata and PVRs 200, 300, and 400 not corresponding to generation of metadata exist on the network 50. The PVRs 200 and 400 are the content storage devices that store the content A and the content B, respectively.

この例において、メタデータの生成に対応していないＰＶＲ３００が、ＰＶＲ２００に記憶されたコンテンツＡについてメタデータを生成させる場合、ＰＶＲ３００は、上記テストデータを基に、対応機器のうち、メタデータを生成させた場合にエラーレート及び遅延時間が最小となる他の機器として、ＰＶＲ１００を選択し、メタデータを生成させる。同図に示すように、対応機器であるＰＶＲ１００とＰＶＲ５００とでは、ＰＶＲ１００の方が、コンテンツ記憶機器であるＰＶＲ２００により近い距離に存在するためである。当該生成されたメタデータ（デフォルトメタデータＡ）は、ＰＶＲ２００の記録媒体に記憶され、例えば特殊再生等の必要に応じてＰＶＲ３００にダウンロードされる。 In this example, when the PVR 300, which does not support metadata generation, has metadata generated for the content A stored in the PVR 200, the PVR 300 uses the test data to select the PVR 100 as the compatible device with the smallest error rate and delay time, and has it generate the metadata. This is because, as shown in the figure, of the two compatible devices PVR 100 and PVR 500, the PVR 100 is located closer to the PVR 200, which is the content storage device. The generated metadata (default metadata A) is stored in the recording medium of the PVR 200 and downloaded to the PVR 300 as needed, for example for special playback.

同様に、ＰＶＲ３００が、ＰＶＲ４００に記憶されたコンテンツＢについてメタデータを生成させる場合、ＰＶＲ３００は、上記テストデータを基に、対応機器のうち、エラーレート及び遅延時間が最小となる他の機器として、ＰＶＲ５００を選択し、メタデータを生成させる。当該生成されたメタデータ（デフォルトメタデータＢ）は、ＰＶＲ４００の記録媒体に記憶され、必要に応じてＰＶＲ３００にダウンロードされる。 Similarly, when the PVR 300 generates metadata for the content B stored in the PVR 400, the PVR 300 is based on the test data as other devices that minimize the error rate and delay time among the corresponding devices. Select PVR 500 and generate metadata. The generated metadata (default metadata B) is stored in the recording medium of the PVR 400 and downloaded to the PVR 300 as necessary.

次に、ネットワーク５０上の特定のコンテンツについてメタデータがマニュアルで生成される場合について説明する。図１６は、ＰＶＲ３００によるマニュアルでのメタデータ生成処理の流れを示したフローチャートである。当該処理は、通常、コンテンツの再生処理（特殊再生または通常再生）に際して実行される。 Next, a case where metadata is manually generated for specific content on the network 50 will be described. FIG. 16 is a flowchart showing a manual metadata generation process performed by the PVR 300. This processing is usually executed during content playback processing (special playback or normal playback).

同図に示すように、ＰＶＲ３００のＣＰＵ４２は、上記図４のフローチャートにおけるステップ４１及び４２と同様に、ネットワーク５０上の特定の機器にアクセスを試み、当該機器からアクセス認証を受ける（ステップ１６１）。当該認証に通った場合（ステップ１６２のＹｅｓ）、ＣＰＵ４２は、当該アクセスした機器に記憶されたコンテンツから、特定のコンテンツを選択する（ステップ１６３）。認証に通らなかった場合（Ｎｏ）、ＣＰＵ４２は、ユーザから、コンテンツの再生動作以外の動作モードへ移行する指示があったか否かを判断し（ステップ１６４）、当該指示があった場合（Ｙｅｓ）は他の動作モードへ移行する。 As shown in the figure, the CPU 42 of the PVR 300 attempts to access a specific device on the network 50 and undergoes access authentication by that device (step 161), as in steps 41 and 42 of the flowchart of FIG. 4. When the authentication succeeds (Yes in step 162), the CPU 42 selects specific content from the content stored in the accessed device (step 163). When the authentication fails (No), the CPU 42 determines whether the user has instructed a shift to an operation mode other than content playback (step 164), and if so (Yes), shifts to that operation mode.

ここで、上記コンテンツの選択は、ＰＶＲ３００に接続されたＴＶ６０等の表示装置に表示されるコンテンツリストに基づいて行われるが、当該コンテンツリストについては後述する。 Here, the selection of the content is performed based on a content list displayed on a display device such as the TV 60 connected to the PVR 300. The content list will be described later.

続いてＣＰＵ４２は、上記選択したコンテンツについて特殊再生を行うか通常再生を行うかを、ユーザの操作に基づいて判断する（ステップ１６５）。通常再生を行うことが指示された場合（Ｎｏ）、ＣＰＵ４２は、当該コンテンツの通常再生モードへ移行し、通常再生させる（ステップ１６７）。 Subsequently, the CPU 42 determines whether to perform special reproduction or normal reproduction for the selected content based on a user operation (step 165). When instructed to perform normal reproduction (No), the CPU 42 shifts to the normal reproduction mode of the content and performs normal reproduction (step 167).

上記ステップ１６５において特殊再生が指示された場合（Ｙｅｓ）、ＣＰＵ４２は、上記選択したコンテンツについて、メタデータが存在するか否かを判断する（ステップ１６６）。メタデータが存在すると判断された場合（Ｙｅｓ）、ＣＰＵ４２は、当該メタデータを受信して当該メタデータを基に上記選択したコンテンツの特殊再生を実行する（ステップ１７９）。 If special playback is instructed in step 165 (Yes), the CPU 42 determines whether metadata exists for the selected content (step 166). When it is determined that the metadata exists (Yes), the CPU 42 receives the metadata and executes special reproduction of the selected content based on the metadata (step 179).

If it is determined in step 166 that no metadata exists (No), the CPU 42 shifts to the metadata generation operation mode (step 168). The CPU 42 then searches for a device that supports metadata generation for the selected content, as in the processing of FIG. 5 (step 169). If a plurality of such devices exist, the device with the lowest error rate and delay time is selected based on the test data, as described above.
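
The device selection described above can be sketched as follows. This is a minimal illustration only: the description does not state how the error rate and the delay time are combined, so the weighted sum, the `delay_weight` parameter, and the names `ProbeResult` and `select_device` are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Result of exchanging test data with one candidate device (hypothetical type)."""
    device: str
    error_rate: float  # fraction of test data lost or corrupted, 0.0 to 1.0
    delay_ms: float    # round-trip delay measured with the test data


def select_device(probes, delay_weight=0.001):
    """Return the name of the device with the lowest combined cost.

    The cost is an assumed weighted sum of error rate and delay; the source
    only says that the device minimizing both is selected.
    Returns None when no candidate device responded.
    """
    if not probes:
        return None
    best = min(probes, key=lambda p: p.error_rate + delay_weight * p.delay_ms)
    return best.device
```

With this assumed cost, a device with a slightly higher delay can still be chosen if its error rate is markedly lower; a different weighting would shift that trade-off.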

If a device that supports metadata generation exists (Yes in step 170), the CPU 42 checks whether the selected content is present in the content storage device at that time (step 172) and, if it is, instructs the selected device to generate metadata (step 178). The CPU 42 then receives the generated metadata (step 178) and executes special playback of the content using the metadata (step 179).

If it is determined in step 170 that no such device exists, the CPU 42 displays a warning similar to that described above (step 173) and prompts the user to indicate whether to play the content normally (step 174). If normal playback is instructed (Yes), the CPU 42 shifts to the normal playback mode (step 167). If normal playback is not instructed (No), the CPU 42 ends the processing.

If it is determined in step 175 that the content does not exist (No), the CPU 42 displays a warning to that effect (step 177) and ends the processing.
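
The branching of the manual flow described above can be summarized in a small decision function. This is a sketch under stated assumptions: the function name, its boolean parameters, and the returned action labels are hypothetical and do not appear in the flowchart; only the order of the decisions follows the text.

```python
def decide_playback(special_requested, has_metadata, generator_found,
                    content_present, fallback_to_normal):
    """Return an action label for the manual metadata generation flow.

    All parameter names and labels are illustrative assumptions.
    """
    if not special_requested:
        return "normal_playback"                 # normal playback mode (step 167)
    if has_metadata:
        return "special_playback"                # metadata already exists (step 179)
    if not generator_found:
        # A warning is shown, then the user decides whether to play normally.
        return "normal_playback" if fallback_to_normal else "end"
    if not content_present:
        return "warn_content_missing"            # warning display (step 177)
    return "generate_then_special_playback"      # generate metadata, then step 179
```

For example, a request for special playback of content that has no metadata and for which no generating device is found yields normal playback only if the user accepts that fallback.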

FIG. 17 is a diagram conceptually illustrating the manual metadata generation processing described above. In the figure, the configuration of the devices on the network 50 is the same as in FIG. 15.

As shown in the figure, in contrast to FIG. 15, the individual metadata A for the content A stored in the PVR 200 is not stored in the PVR 200; instead, it is received directly by the PVR 300, which instructed the metadata generation, and is used for special playback processing and the like. Similarly, the individual metadata B for the content B stored in the PVR 400 is not stored in the PVR 400 but is received directly by the PVR 300 and used for special playback processing and the like.

(Content classification processing)
Next, the processing for classifying content present on the devices on the network 50 will be described. As a premise of this classification processing, it is assumed that the classification metadata has been generated by the metadata generation processing described above and is stored, together with the content, in each device on the network 50. The classification metadata is data on the content genre described above and is assumed to have already been generated in the content genre determination processing in step 67 of FIG. 5.

FIG. 18 is a flowchart showing the flow of content classification processing by the PVR 300. As shown in the figure, the CPU 42 receives access authentication from each device on the network 50 in the same manner as described above (step 181). If the authentication succeeds (Yes in step 182), the CPU 42 detects the content stored in each device (step 183). The devices subject to this content detection include the PVR 300 itself.

If the authentication fails (No), the CPU 42 determines whether the user has instructed a shift to another operation mode (step 186). If so (Yes), the CPU 42 shifts to the other operation mode; if not (No), the processing returns to step 181.

When content is detected in step 183, the CPU 42 determines whether the detected content is new content that has not yet been subjected to classification processing (step 184). If the content has already been subjected to classification processing (No), the CPU 42 executes the processing from step 186 onward.

If the content has not yet been subjected to classification processing (Yes), the CPU 42 detects the genre of the content based on the classification metadata (step 185). If no classification metadata has been generated for the content, the CPU 42 instructs another device on the network 50 to generate it in accordance with the metadata generation processing described above. When classification metadata is generated for private content, the camera shake feature extraction processing is also executed, as described above.

Subsequently, the CPU 42 determines whether a genre identical or similar to the detected genre already exists for other content (step 187). To judge genre similarity, the PVR 300 may store in advance data describing the similarity relationships between genres.

If it is determined in step 187 that an identical or similar genre exists (Yes), the CPU 42 adds the content to the existing genre and assigns it a classification ID (step 188). If no identical or similar genre exists (No), the CPU 42 adds the content to a new genre and assigns it a new classification ID (step 189). The classification ID also includes data indicating whether the content is private content or broadcast content.
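
The assignment of classification IDs in steps 187 to 189 can be illustrated with the following sketch. The ID format (a "P" or "B" prefix followed by a running number), the `similar_to` mapping used as the pre-stored similarity data, and the function name are assumptions made for this example; the description only requires that similar genres share an ID and that the ID distinguishes private from broadcast content.

```python
def assign_classification_ids(items, similar_to=None):
    """Assign a classification ID to each content item.

    items:      iterable of (title, genre, is_private) tuples.
    similar_to: assumed similarity data mapping a genre to the existing
                genre it should be merged into.
    Returns a dict mapping title to classification ID.
    """
    similar_to = similar_to or {}
    ids = {}     # (is_private, canonical genre) -> classification ID
    result = {}
    for title, genre, is_private in items:
        canonical = similar_to.get(genre, genre)
        key = (is_private, canonical)
        if key not in ids:
            # New genre: issue a new ID whose prefix encodes private/broadcast.
            prefix = "P" if is_private else "B"
            ids[key] = f"{prefix}-{len(ids) + 1:03d}"
        # Identical or similar genre: reuse the existing ID.
        result[title] = ids[key]
    return result
```

Grouping the content list by the returned IDs then yields the per-classification display described for the content list.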

Using the classification IDs generated by the above classification processing, the CPU 42 can generate a content list covering the content on the network 50, including content stored in the PVR 300 itself, and display the list on a display device such as the TV 60.

FIG. 19 is a diagram showing a display example of the generated content list. As shown in the figure, the content list classifies a plurality of contents into private content and television content according to the classification ID and, for each classification, displays the thumbnail 91 and title 92 of each content item. For private content, information indicating its subject (an athletic meet, a trip, and so on) may be displayed. This information is generated, for example, from user input when the private content is recorded or transferred to another device. For television content, the genre or sub-genre (baseball, music program, and so on) is displayed based on the classification ID.

When the user selects specific content for playback on the content list, the CPU 42 can receive the content from the content storage device in which it is stored and play it back.

(Content list display control processing according to traffic conditions and the like)
In the present embodiment, the PVR 300 can indicate, on the displayed content list, whether each content item on the list can be played back smoothly. That is, the PVR 300 lets the user see whether smooth playback is possible, given the traffic conditions (access state) of the network 50, by changing the state of the thumbnail corresponding to each content item. Furthermore, when new content is added to the network 50 or when content is no longer present on the network 50, the PVR 300 can reflect that situation in the content list so that the user is aware of it. The details of this processing are described below.

FIG. 20 is a flowchart showing the flow of the content list display control processing according to traffic conditions and the like.
As shown in the figure, the CPU 42 first initializes a counter n for identifying each content item to 0 (step 191). The CPU 42 then receives access authentication from each device on the network 50 (step 192) and, if the authentication succeeds (Yes in step 193), detects the content stored in each device (step 194). Content stored in the PVR 300 itself is also detected. The content list therefore shows all content stored in the PVR 300 and in the other devices, so the user need not be aware of whether a given content item is stored in the device being operated or in another device.

If the authentication fails in step 193 (No), the CPU 42 determines whether the content list display operation mode is still in progress (step 198) and, if it is, proceeds to step 202. If the user has instructed the display operation mode to end (No), the CPU 42 determines whether the user has instructed a shift to another operation mode (step 199) and, if so (Yes), shifts to that operation mode. If no shift to another operation mode has been instructed (No), the CPU 42 returns to step 192 and repeats the subsequent processing.

When content is detected in step 194, the CPU 42 determines whether any of the detected content has not yet been subjected to the content list display control processing (step 195). If such content exists (Yes), the CPU 42 sets to ON the displayable flag of that content (n), which indicates whether the content can be displayed on the content list (step 196).

Subsequently, the CPU 42 determines whether all content on the network 50 has been detected (step 197). If content remains to be detected (No), the CPU 42 increments the counter n and repeats the processing from step 194 onward.

If all content has been detected (Yes), the CPU 42 determines whether the content list display operation mode is still in progress (step 201) and, if it is (Yes), initializes to 0 a counter k used for detecting the access state of each content item (step 202). If, in step 201, the user has instructed the display operation mode to end (No), the same processing as in step 199 is executed.

After initializing the counter k, the CPU 42 detects the access state of the content (k) (step 203). Specifically, the CPU 42 transmits the test data described above to each device, receives the reply, and detects the traffic conditions (error rate), thereby estimating the error rate for the content stored in each device. If the error rate exceeds a predetermined threshold, the CPU 42 determines that the content cannot be played back smoothly (that is, that it is inaccessible).

Subsequently, for content (k) determined to be accessible, the CPU 42 displays the thumbnail in its normal state on the content list (step 205); for content determined to be inaccessible (No), it displays the thumbnail in a changed state (step 208).

The CPU 42 then increments the counter k and repeats the above processing for all detected content (n items) (step 207). Next, the CPU 42 determines whether a predetermined time for authenticating the devices on the network 50 has arrived (step 209). If it has not (No), the CPU 42 returns to step 201; if it has (Yes), the CPU 42 returns to step 192 and executes the authentication processing.

Through the above processing, when content cannot be played back smoothly because of the traffic conditions on the network 50, its thumbnail is displayed in a changed state. In addition, by repeating the authentication processing at predetermined times, the CPU 42 can add or delete thumbnails on the content list when new content is added to the network 50 or when content is disconnected from it.

FIG. 21 is a diagram showing how the state of a thumbnail changes as a result of the content list display control processing. Part (A) of the figure shows the state before the change, and part (B) shows the state after the change. The thumbnail 91 of the content 2a, displayed normally in part (A), is displayed shaded (gray) in part (B) because traffic conditions have deteriorated or the content 2a has been disconnected from the network 50. The change in the thumbnail's display state is not limited to shading; it may instead be made, for example, by changing saturation or brightness.

In the flowchart of FIG. 20, a single predetermined threshold is set for the error rate. However, a plurality of thresholds may be set so that the display changes in stages according to the error rate.
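
The single-threshold and multi-threshold variants can both be expressed with one lookup, sketched below. The threshold values, the state names, and the function name are assumptions chosen for illustration; the description specifies only that the display changes when the error rate exceeds a threshold.

```python
import bisect


def thumbnail_state(error_rate, thresholds=(0.05, 0.15),
                    states=("normal", "dimmed", "gray")):
    """Map an estimated error rate to a thumbnail display state.

    thresholds must be sorted in ascending order, and states must contain
    exactly one more entry than thresholds. A state changes only when the
    error rate strictly exceeds the corresponding threshold.
    """
    if len(states) != len(thresholds) + 1:
        raise ValueError("states must have one more entry than thresholds")
    # bisect_left counts the thresholds that are strictly below error_rate.
    return states[bisect.bisect_left(thresholds, error_rate)]
```

Passing a single threshold and two states reproduces the behavior of the flowchart of FIG. 20, while additional thresholds give the staged display change.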

Further, when traffic conditions improve and the error rate falls, the CPU 42 may return the thumbnail to the normal display state and display a mark indicating the change near the thumbnail. In this case, the CPU 42 may have the user select in advance, on the content list, the content that the user wishes to play back, and display the mark only for the selected content.

(Access control processing using face identification metadata)
In the present embodiment, each device on the network 50 can authenticate the right to access content stored in that device according to whether the face of the user of the access source device appears in the content. That is, when content stored in a device is accessed, the device can permit the access if the face of the user of the access source device appears in the content and deny it otherwise. This is based on the idea that users should be permitted to access content in which they themselves appear. This access control processing is described below. It uses the face identification metadata generated in the metadata generation processing described above.

FIG. 22 is a flowchart showing the flow of access control processing based on the face identification metadata. The figure illustrates the case where the PVR 100, which supports the metadata generation processing, generates data indicating whether access is permitted or denied for all content on the network 50.

As shown in the figure, the CPU 12 of the PVR 100 accepts, from another device, registration of a face image of user A of that device (step 211). The face image may be captured by a camera of the other device, or it may be captured by user A with a digital camera or camera-equipped mobile phone owned by user A and transmitted via the other device.

When the face image is registered, the CPU 12 extracts facial feature data (a feature vector) from the face image data, covering features such as the eyes, nose, mouth, facial contour, and texture. The extraction uses, for example, feature filters corresponding to the positional relationships of the facial parts, luminance distribution information, and skin color information.

Subsequently, the CPU 12 receives access authentication from each device on the network 50 (step 212) and, if the authentication succeeds (Yes in step 213), detects content from each device (step 214). The devices subject to content detection include the PVR 100 itself. If the authentication fails (No), the CPU 12 determines whether the user of the PVR 100 has instructed a shift to another operation mode (step 219). If so (Yes), the CPU 12 shifts to that operation mode; if not (No), the processing returns to step 212.

When content is detected, the CPU 12 determines whether any content has not yet been subjected to the access control processing (step 215). If no such content exists, the processing proceeds to step 219. If such content exists, the CPU 12 detects face images and facial features from the content. Face images are detected by a known technique such as skin color detection. Facial features are detected in the same way as for the registered face image. These detection processes are unnecessary if the face identification metadata has already been generated, in which case the face identification metadata can be used as is for the access control processing. If no face image is detected (No in step 217), the processing returns to step 212 and is repeated for the other devices.

If any face image is detected in step 216 (Yes in step 217), the CPU 12 compares the facial feature data of the registered face image with the facial feature data detected from the content and determines whether the face of the registered user A appears in the content (step 218).

If it is determined that the face of user A appears in the content (Yes in step 220), the CPU 12 generates permission data that permits access to the content from user A's device (step 221). The CPU 12 then transfers, for example, a thumbnail of the face image detected from the content to user A's device (step 222). The thumbnail may be transferred as a content list. Thus, when user A's device is in the display operation mode, the accessible content is displayed as a content list, and user A's device can immediately access and play back the content on the list.
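
The comparison of feature vectors and the generation of permission data can be sketched as follows. The description does not specify how the facial feature data are compared, so the cosine similarity measure, the `threshold` value, and the function names are assumptions made purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_permissions(registered, contents, threshold=0.9):
    """Return the IDs of the content items that the registered user may access.

    registered: feature vector extracted from the registered face image.
    contents:   dict mapping a content ID to the list of feature vectors
                of the faces detected in that content.
    threshold:  hypothetical similarity above which two faces are treated
                as the same person.
    """
    return {
        content_id
        for content_id, faces in contents.items()
        if any(cosine_similarity(registered, face) >= threshold for face in faces)
    }
```

Content in which no face is detected never appears in the result, which corresponds to the branch that skips content without face images.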

The CPU 12 then determines whether analysis has been completed for all content newly subject to access control (step 223). If it has (Yes), the CPU 12 ends the processing; if not (No), the CPU 12 returns to step 212 and repeats the subsequent processing.

The processing described above is a mode in which the access permission data is generated automatically, but the permission data may also be generated manually. FIG. 23 is a table outlining the access control processing for the manual and automatic operation modes.

As shown in the figure, in the automatic mode, when a face appearing in content belongs to a user whose face is registered in the device, the CPU 12 automatically notifies the user of this, as described above. In this case, the thumbnail is transferred to, for example, a cache memory of the user's device. In the manual mode, the registering device sets, based on an operation by its user, an operation mode for detecting whether content showing the user exists on the network 50. In this case, when the face image search finds such content, the thumbnail is transferred to the cache memory of the registering device.

Through the above processing, the right to access each content item on the network 50 can be authenticated using a face image as a key, enabling more intuitive authentication than when an ID or password is used.

The detection of face images and facial feature data in each content item may be executed by the recording device when the content is recorded, and the detected facial feature data may be stored together with the content as metadata. This allows each device to authenticate an access request immediately.

Further, access may be permitted not only to users who appear in the content but also to the creator of the content. In this case, when the content is recorded (created), facial feature data detected from a face image of the creator is stored as metadata. During the analysis described above, the registered facial feature data is compared not only with the facial feature data of the face images in the content but also with the stored facial feature data of the creator's face image. This prevents a situation in which content creators cannot play back their own content.
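
The two variations above, storing feature data at recording time and also admitting the creator, can be combined in a small sketch. The `ContentRecord` type, the use of Euclidean distance, and the `max_distance` value are assumptions introduced for this example and are not taken from the description.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ContentRecord:
    """Content plus the facial feature metadata stored with it (hypothetical type)."""
    content_id: str
    face_features: list = field(default_factory=list)  # faces detected at recording time
    creator_feature: tuple = None                       # creator's face, if stored


def is_access_permitted(record, requester, max_distance=0.5):
    """Permit access if the requester matches a face in the content or the creator.

    Matching uses an assumed Euclidean distance bound between feature vectors.
    """
    candidates = list(record.face_features)
    if record.creator_feature is not None:
        candidates.append(record.creator_feature)
    return any(math.dist(requester, c) <= max_distance for c in candidates)
```

Because the feature data are read from the stored record rather than re-extracted from the content, the check involves only vector comparisons at request time.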

[Summary]
As described above, according to the present embodiment, even when a device on the network 50 cannot itself generate metadata, it can search for another device on the network that can, have that device generate the metadata, and use the metadata for various kinds of processing.

In addition, each device changes the state of the thumbnail images on the content list according to the error rate estimated for playback of the content, so that the user can intuitively grasp whether the content corresponding to each thumbnail image can be played back smoothly. Each device can therefore prevent the user from playing back content that cannot be played smoothly under poor traffic conditions and being left with an unpleasant experience.

Furthermore, when content contains a face image that matches a face image transmitted from another device, each device can permit access to the content from that other device even if the content was not created by the user of that device.

[Modifications]
The present invention is not limited to the embodiment described above, and various modifications can be made without departing from the gist of the present invention.

In the embodiment described above, the PVR 300 instructs the PVR 100 to generate metadata for content stored in another device. However, the PVR 300 may also exchange test data with the PVR 100 itself and instruct the PVR 100 to generate metadata for content stored in the PVR 300 itself.

The embodiment described above deals with content stored in a storage device such as an HDD, but the present invention can also be applied to content stored on a portable recording medium such as a BD or DVD.

The various kinds of metadata described in the above embodiment are merely examples, and the present invention can be applied to any metadata corresponding to any operation.

In the embodiment described above, the processing that changes thumbnail display on the content list according to traffic conditions was described on the premise that the content classification processing has been performed. Naturally, the display change may also be executed when the content classification processing has not been performed.

The embodiment described above applies the present invention to a PVR. However, the present invention is applicable to any electronic device, such as a PC (Personal Computer), a television device, a game device, a mobile phone, or other AV (Audio/Visual) equipment.

7, 37 ... Playback unit
8, 38 ... HDD
11, 41 ... Communication unit
12, 42 ... CPU
14, 44 ... RAM
15, 45 ... Operation input unit
16, 46 ... Graphic control unit
20 ... Feature extraction circuit
50 ... Network
60 ... TV
73 ... Operation mode data
74 ... Processing data
75 ... Error processing data
91 ... Thumbnail
100, 200, 300, 400, 500 ... PVR

Claims

An electronic device comprising:
a storage unit that stores content;
a communication unit that receives first face image data from another device on a network; and
a control unit that extracts first facial feature data from the received first face image data, detects second face image data from the stored content, extracts second facial feature data from the second face image data, determines whether the first facial feature data and the second facial feature data match, and, when they are determined to match, generates access permission data permitting access to the content from the other device.

The electronic device according to claim 1,
wherein the control unit detects the second face image data and extracts the second face feature data when storing the content in the storage unit, and stores the second face feature data in the storage unit together with the content.

The electronic device according to claim 2,
wherein the storage unit stores, together with the content, third face image data indicating a creator of the stored content and third face feature data extracted from the third face image data, and
the control unit determines whether or not the first face feature data and the third face feature data match, and generates the access permission data when it is determined that they match.

The electronic device according to claim 2,
wherein the control unit generates a list of the content for which the access permission data has been generated, transmits the list to the other device via the communication unit, receives from the other device a reproduction request signal requesting reproduction of one content item on the list, and transmits that content item to the other device in response to the reproduction request signal.
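
As an illustration only, the list generation and reproduction request handling recited in this claim might be sketched as follows. The record layout (dictionaries with "device_id" and "content_id" keys), the function names, and the in-memory storage dictionary are hypothetical and do not appear in the specification.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not taken from the specification.

def build_content_list(permissions, device_id):
    """Return the sorted IDs of content the given device is permitted to access."""
    return sorted(
        {p["content_id"] for p in permissions if p["device_id"] == device_id}
    )


def handle_playback_request(device_id, content_id, permissions, storage):
    """Return the requested content if it is on the device's list, else None."""
    if content_id not in build_content_list(permissions, device_id):
        return None  # no access permission data exists for this content
    return storage[content_id]


# Hypothetical access permission data and stored content.
permissions = [
    {"device_id": "pvr200", "content_id": "c1"},
    {"device_id": "pvr300", "content_id": "c2"},
]
storage = {"c1": b"video-1", "c2": b"video-2"}

content_list = build_content_list(permissions, "pvr200")
granted = handle_playback_request("pvr200", "c1", permissions, storage)
denied = handle_playback_request("pvr200", "c2", permissions, storage)
```

In this sketch a request for content that is not on the requesting device's list simply returns nothing; how a real device would signal refusal is not specified here.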

An access control method comprising:
storing content;
receiving first face image data from another device on a network;
extracting first face feature data from the received first face image data;
detecting second face image data from the stored content;
extracting second face feature data from the second face image data;
determining whether or not the first face feature data and the second face feature data match; and
generating access permission data for permitting access to the content from the other device when it is determined that the first face feature data and the second face feature data match.
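
As an illustration only, the matching and permission generation steps of this method might be sketched as follows. The sketch assumes that face detection and feature extraction have already produced numeric feature vectors, and it compares them by cosine similarity against a threshold. Both the similarity measure and the threshold value are assumptions made for the example, as are all names used; the claim itself only requires a determination of whether the feature data match.

```python
# Illustrative sketch only: the feature vectors, the cosine-similarity test,
# and the threshold are assumptions, not the method defined by the claims.
from dataclasses import dataclass
from math import sqrt

MATCH_THRESHOLD = 0.9  # assumed value for treating two feature vectors as a match


@dataclass
class Content:
    content_id: str
    # Second face feature data: one vector per face detected in the content.
    face_features: list


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def features_match(first, second, threshold=MATCH_THRESHOLD):
    """Decide whether two feature vectors are close enough to count as a match."""
    return cosine_similarity(first, second) >= threshold


def generate_access_permissions(device_id, first_features, contents):
    """Create one permission record per content item showing the requester's face."""
    return [
        {"device_id": device_id, "content_id": c.content_id}
        for c in contents
        if any(features_match(first_features, f) for f in c.face_features)
    ]


# Hypothetical stored content with pre-extracted face feature vectors.
contents = [
    Content("c1", [[1.0, 0.0, 0.0]]),
    Content("c2", [[0.0, 1.0, 0.0]]),
]
permissions = generate_access_permissions("pvr200", [0.99, 0.05, 0.0], contents)
```

Here the registered vector is close to the face found in "c1" and far from the one in "c2", so a permission record is generated only for "c1".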

A program causing an electronic device to execute the steps of:
storing content;
receiving first face image data from another device on a network;
extracting first face feature data from the received first face image data;
detecting second face image data from the stored content;
extracting second face feature data from the second face image data;
determining whether or not the first face feature data and the second face feature data match; and
generating access permission data for permitting access to the content from the other device when it is determined that the first face feature data and the second face feature data match.