JP4529632B2

JP4529632B2 - Content processing method and content processing apparatus

Info

Publication number: JP4529632B2
Application number: JP2004303934A
Authority: JP
Inventors: 雅美三浦; 進矢部; 功誠山下; 俊郎寺内; 曜一郎佐古
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2004-10-19
Filing date: 2004-10-19
Publication date: 2010-08-25
Anticipated expiration: 2024-10-19
Also published as: JP2006119178A

Description

The present invention relates to a method and an apparatus for processing content such as video and audio.

It has been proposed that, when a piece of content is reproduced, other content be reproduced or presented together with it.

For example, Patent Document 1 (Japanese Patent Laid-Open No. 8-76779) discloses referring to the music information of a karaoke song and controlling lighting and air conditioning to match the song, so as to liven up the atmosphere of the venue.

Patent Document 2 (Japanese Translation of PCT Application No. 2003-507808) discloses a description method for expressing the content and structure of audiovisual data.

Further, Patent Document 3 (Japanese Patent Laid-Open No. 2001-333410) discloses enabling a user to view content under optimum conditions by means of metadata describing the optimum viewing conditions for the content, and determining the content to be provided to the user from data indicating the user's preferences.

Patent Document 4 (Japanese Patent Laid-Open No. 2002-158948) discloses reproducing, in association with one another, a plurality of pieces of audiovisual data in which the same scene is recorded.

As described above, it has been proposed to control lighting and air conditioning according to the content in order to liven up the atmosphere when music content is reproduced, and to select content according to the user's preferences.

The prior art documents cited above are as follows.
Patent Document 1: Japanese Patent Laid-Open No. 8-76779
Patent Document 2: Japanese Translation of PCT Application No. 2003-507808
Patent Document 3: Japanese Patent Laid-Open No. 2001-333410
Patent Document 4: Japanese Patent Laid-Open No. 2002-158948

However, even when the atmosphere or sense of presence is enhanced as shown in Patent Document 1, the optimum atmosphere or sense of presence often differs from user to user.

In particular, with content such as a sports broadcast, a viewer can enjoy it if the atmosphere is like being in the stands of the team he or she supports, but often does not enjoy it if the atmosphere is like being in the opposing team's stands.

For this reason, it is difficult to create an optimum atmosphere or sense of presence from the content alone.

Moreover, even for the same song, depending on the user's mood at the time, there are times when the user wants to listen while clapping along and times when the user wants to listen quietly and attentively. If the atmosphere or sense of presence is created from the content alone, as described above, it cannot match the user's mood at that time.

Accordingly, the present invention makes it possible to create an optimum atmosphere and sense of presence according to not only the content but also the user's mood and the like at that time.

The content processing method of the present invention comprises:
a recommendation information generation step of generating, from main content content metadata that characterizes the content of main content and from user situation metadata, which is data indicating a user profile and the physical or mental state of the user at that time, recommendation information recommending sub-content to be reproduced together with the main content;
a feature information generation step of generating, from sub-content content metadata that characterizes the content of sub-content, feature information indicating the features of that sub-content; and
a sub-content selection step of selecting, from among a plurality of pieces of sub-content, the sub-content whose feature information is closest to the recommendation information as the sub-content to be reproduced together with the main content.
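
For illustration only, the three steps of the method might be sketched as follows. The representation of metadata as sets of tags, the union rule used to form the recommendation, the symmetric-difference measure of closeness, and all names in the code are assumptions made for this sketch; the specification does not prescribe any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubContent:
    name: str
    content_metadata: frozenset  # sub-content content metadata, modelled as tags


def generate_recommendation(main_metadata: set, user_metadata: set) -> set:
    """Recommendation information generation step (placeholder rule: union of tags)."""
    return set(main_metadata) | set(user_metadata)


def generate_features(sub: SubContent) -> set:
    """Feature information generation step (placeholder: tags pass through unchanged)."""
    return set(sub.content_metadata)


def select_sub_content(recommendation: set, candidates: list) -> SubContent:
    """Sub-content selection step: the smallest symmetric difference counts as closest."""
    return min(candidates, key=lambda s: len(recommendation ^ generate_features(s)))


candidates = [
    SubContent("hall_applause", frozenset({"live concert", "cheerful"})),
    SubContent("rain_ambience", frozenset({"quiet", "relaxed"})),
]
rec = generate_recommendation({"live concert"}, {"cheerful"})
chosen = select_sub_content(rec, candidates)
print(chosen.name)
```

In this toy case the applause recording shares every tag with the recommendation, so it is the candidate selected.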

In the content processing method described above, when content such as music or video is reproduced or recorded, another piece of content that is optimal for not only the content being reproduced (the main content) but also the user's situation at that time is selected as sub-content and is reproduced or recorded together with the main content. It is therefore possible to create an optimum atmosphere and sense of presence according to not only the content of the main content but also the user's mood and the like at that time.

As described above, according to the present invention, it is possible to create an optimum atmosphere and sense of presence according to not only the content but also the user's mood and the like at that time.

[1. Embodiment for reproducing content: FIGS. 1 to 4]
An embodiment for the case of reproducing content is described below.

Note that the terms main content and sub-content are used merely for convenience, as noted above. They do not imply that the main content is more important than the sub-content or that it has a larger amount of data.

(1-1. First example of reproduction: FIGS. 1 to 3)
FIG. 1 shows an example of the content processing apparatus of the present invention. When main content recorded on a recording medium is reproduced, the apparatus selects sub-content according to the content of the main content and the user's situation, and reproduces it together with the main content.

The content processing apparatus of this example, as a content reproduction apparatus, includes a disc reproduction unit 1. Under the control of a system controller 11, it drives a spindle motor 3 and an optical pickup 4 to read the main content recorded on an optical disc 2 and reproduce it.

In this example, the main content consists of video data, subtitle data, and audio data of a movie, drama, or the like, recorded in the MPEG (Moving Picture Experts Group) format.

Specifically, as shown in FIG. 2, the data stream of the main content consists of a header followed by video packets, subtitle packets, and audio packets. Each packet consists of a header followed by video data, subtitle data, or audio data, and the audio data of an audio packet consists of the audio data of the individual audio channels.
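
The layout described for FIG. 2 can be modelled, very loosely, with the following sketch. It is a toy data model only and does not reproduce actual MPEG stream syntax. The class names, the `kind` field, and the `demultiplex` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Packet:
    kind: str                 # "video", "subtitle", or "audio", as given by the packet header
    payload: bytes = b""      # video or subtitle data
    channels: List[bytes] = field(default_factory=list)  # per-channel data, audio packets only


@dataclass
class MainContentStream:
    header: dict
    packets: List[Packet]


def demultiplex(stream: MainContentStream) -> dict:
    """Group packets by the kind recorded in each packet header."""
    out = {"video": [], "subtitle": [], "audio": []}
    for p in stream.packets:
        out[p.kind].append(p)
    return out


stream = MainContentStream(
    header={"title": "example"},
    packets=[
        Packet("video", b"v0"),
        Packet("audio", channels=[b"L", b"R"]),
        Packet("subtitle", b"s0"),
        Packet("video", b"v1"),
    ],
)
by_kind = demultiplex(stream)
print({k: len(v) for k, v in by_kind.items()})
```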

Main content content metadata, which characterizes the content of the main content, can be recorded on the optical disc 2 together with the main content data. When it is recorded on the optical disc 2, the main content content metadata is described in a format conforming to the MPEG-7 standard or in another format, and is multiplexed for each GOP (Group of Pictures) of the MPEG stream or for each picture of the MPEG stream.

In general, the main content content metadata that characterizes the content of the main content includes: (1) information indicating the genre of the content, such as music, movie, drama, documentary, or sports; (2) information indicating the classification of a scene, such as, for music, whether a piece is being performed or it is between pieces, for a movie or drama, the stage of the story (introduction, development, turn, or conclusion), and for sports, whether it is a scoring scene; and (3) information indicating the characteristics of the signal, such as loudness for sound (audio) such as music, and screen brightness for video.

Information indicating the rights relating to a copyrighted work, as defined in MPEG-21, can also be treated as information characterizing the content. For example, conditions relating to secondary processing of the content can be used in generating the recommendation information described later.

The following description covers both the case where the main content content metadata is recorded on the optical disc 2 together with the main content data, as described above, and the case where the content processing apparatus (content reproduction apparatus) generates the main content content metadata from the main content data read from the optical disc 2, as described later.

The data read from the optical disc 2 by the optical pickup 4 is supplied through an RF amplifier 12 to a demodulation circuit 13, where it is demodulated and error-corrected.

The demodulated and error-corrected data is then supplied to a demultiplexer 14, which separates it into the video data, subtitle data, and audio data constituting the main content and, if main content content metadata is recorded on the optical disc 2, the main content content metadata.

The separated video data, subtitle data, audio data, and main content content metadata are decoded by a video decoder 15, a subtitle decoder 16, an audio decoder 17, and a metadata decoder 18, respectively, according to their decode time stamps. The decoded video data, subtitle data, and audio data are converted into a video signal, a subtitle signal, and an audio signal, respectively, according to their presentation time stamps.
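
The distinction between decode time stamps and presentation time stamps can be illustrated with a small sketch. The access-unit names and time stamp values below are invented for illustration. They mimic the common MPEG situation in which a bidirectionally predicted picture is decoded after the reference picture that follows it in display order.

```python
from dataclasses import dataclass


@dataclass
class AccessUnit:
    name: str
    dts: int  # decode time stamp: when the unit is handed to its decoder
    pts: int  # presentation time stamp: when the decoded result is output


# Illustrative values only: P3 must be decoded before B1 and B2 but is shown after them.
units = [
    AccessUnit("I0", dts=0, pts=1),
    AccessUnit("P3", dts=1, pts=4),
    AccessUnit("B1", dts=2, pts=2),
    AccessUnit("B2", dts=3, pts=3),
]

decode_order = [u.name for u in sorted(units, key=lambda u: u.dts)]
presentation_order = [u.name for u in sorted(units, key=lambda u: u.pts)]
print(decode_order)
print(presentation_order)
```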

The video signal from the video decoder 15 and the subtitle signal from the subtitle decoder 16 are supplied through a video reproduction circuit 21 and a subtitle reproduction circuit 22, respectively, to a subtitle superimposing circuit 23, which superimposes the subtitle signal on the video signal. The video with the subtitles superimposed is displayed on the screen of a video display device 24 such as a liquid crystal display.

The audio data from the audio decoder 17 is supplied to an audio data mixing circuit 27 together with audio data that, as described later, is selected as sub-content by a sub-content selection unit 46 and decoded by an audio decoder 47. The audio data mixing circuit 27 mixes the audio data that forms part of the main content with the audio data that is the sub-content, and the mixed audio is output from a sound output device 28 such as a speaker or headphones.
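
The specification does not state how the audio data mixing circuit 27 combines the two signals. As one possible reading, the sketch below mixes two sequences of 16-bit PCM samples by a weighted sum with clipping. The function name `mix`, the `sub_gain` parameter, and the sample values are assumptions made for illustration.

```python
def mix(main, sub, sub_gain=0.5):
    """Mix two 16-bit PCM sample sequences, padding the shorter one with silence."""
    n = max(len(main), len(sub))
    out = []
    for i in range(n):
        m = main[i] if i < len(main) else 0
        s = sub[i] if i < len(sub) else 0
        v = int(m + sub_gain * s)
        out.append(max(-32768, min(32767, v)))  # clip to the signed 16-bit range
    return out


# Main-content audio (3 samples) mixed with sub-content audio (4 samples).
mixed = mix([1000, -2000, 30000], [400, 400, 10000, 200])
print(mixed)
```

Attenuating the sub-content keeps it in the background of the main audio. The clipping step only guards against overflow when both signals are loud at the same time.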

Meanwhile, to generate user situation metadata characterizing the user's situation, key input information from a key input unit 31, biological information from a biosensor 33, and information read from an external storage medium 35 are supplied to a user situation metadata generation unit 41 through interfaces 32, 34, and 36, respectively.

The key input unit 31 is either provided on the panel of the content processing apparatus main body, or consists of an infrared emitter forming part of a remote controller for the content processing apparatus and an infrared receiver provided on the panel of the main body. The user enters his or her situation with the key input unit 31, either in advance or at the time of content reproduction.

The biosensor 33 detects the physical or mental state of the user. It is, for example, a sensor such as a pulse sensor or a perspiration sensor, a camera that detects the user's body movement (movement of part or all of the body), or a microphone that picks up the user's voice.

The external storage medium 35 is a removable storage medium, such as a memory card, on which information indicating the user's situation is written.

The user's situation can be broadly divided into (a) a user profile, such as the user's favorite music genres and artists and favorite teams and players, and (b) the physical or mental state of the user at that time, specifically states such as pulse rate, amount of perspiration, fingertip temperature, hearing age, visual age, and the presence and magnitude of body movement. Things that can be detected from the information in (b), such as how much the user is getting into the music, the degree of concentration on the story, and moods such as "having fun" or "wanting to relax", are also included in the physical or mental state of (b).

The user's viewing environment at that time, for example the background noise and brightness of the viewing room, is also included in the physical or mental state of (b), since it affects the user's physical or mental state.

The user profile of (a) is entered by the user with the key input unit 31 or written to the external storage medium 35 in advance. The physical or mental state of the user at that time, (b), is detected directly by the biosensor 33 or by analyzing the output signal of the biosensor 33. The user's viewing environment at that time may, like the user profile, be entered by the user with the key input unit 31 or written to the external storage medium 35.

The user situation metadata generation unit 41 comprehensively analyzes the information indicating the user's situation described above and generates user situation metadata characterizing the user's situation. The generated user situation metadata is sent to a recommendation information generation unit 42.
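
How the user situation metadata generation unit 41 combines the profile (a) with the detected state (b) is not specified. The following is a minimal sketch under invented assumptions. The field names, the 0.0 to 1.0 movement scale, and every threshold are hypothetical and serve only to show the two kinds of input being merged into one set of metadata tags.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:                 # (a) entered by key input or read from a memory card
    favorite_artists: set = field(default_factory=set)
    favorite_teams: set = field(default_factory=set)


@dataclass
class UserState:                   # (b) detected by sensors at reproduction time
    pulse_bpm: float
    body_movement: float           # hypothetical scale: 0.0 (still) to 1.0 (very active)
    room_noise_db: float           # viewing environment, treated as part of (b)


def generate_user_metadata(profile, state, current_artist=None):
    """Combine profile and state into user situation metadata tags."""
    tags = set()
    if current_artist in profile.favorite_artists:
        tags.add("favorite singer")
    # All thresholds below are invented for illustration only.
    if state.pulse_bpm > 100 and state.body_movement > 0.5:
        tags.add("cheerful mood")
    elif state.pulse_bpm < 70 and state.body_movement < 0.2:
        tags.add("wants to relax")
    if state.room_noise_db > 60:
        tags.add("noisy room")
    return tags


profile = UserProfile(favorite_artists={"Singer A"})
excited = generate_user_metadata(profile, UserState(110, 0.8, 40), "Singer A")
calm = generate_user_metadata(profile, UserState(60, 0.1, 40), "Singer B")
print(sorted(excited), sorted(calm))
```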

Meanwhile, as described above, when main content content metadata is recorded on the optical disc 2 and decoded by the metadata decoder 18, the main content content metadata from the metadata decoder 18 is sent to the recommendation information generation unit 42.

When no metadata is recorded on the optical disc 2, the video data from the video decoder 15, the subtitle data from the subtitle decoder 16, and the audio data from the audio decoder 17 are supplied to a main content content metadata generation unit 43, which generates main content content metadata and sends it to the recommendation information generation unit 42.

Specifically, the main content content metadata generation unit 43 analyzes the video data, subtitle data, and audio data and generates main content content metadata such as "dark scene", "scene in which the whole screen is red", "scene in which the whole screen is blue", "scene with a high background noise level", "scene with a wide dynamic range", "scene with a high level of low-frequency components", "scene with a high level of high-frequency components", "blue subtitles", and "yellow subtitles on a white background".
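
As a rough indication of how labels such as "dark scene" might be derived from video data, the sketch below classifies a frame from its mean color. The thresholds and the dominance rule are assumptions, not values from the specification. The luma weights are the conventional ITU-R BT.601 coefficients.

```python
def frame_tags(pixels, dark_threshold=50, dominance=2.0):
    """Label a frame given as a list of (r, g, b) tuples with components in 0-255."""
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    luma = 0.299 * mean_r + 0.587 * mean_g + 0.114 * mean_b  # BT.601 luma weights

    tags = []
    if luma < dark_threshold:
        tags.append("dark scene")
    # A channel "dominates" when it exceeds both others by the given factor (assumed rule).
    if mean_r > dominance * max(mean_g, mean_b):
        tags.append("scene in which the whole screen is red")
    if mean_b > dominance * max(mean_r, mean_g):
        tags.append("scene in which the whole screen is blue")
    return tags


red_frame = [(200, 10, 10)] * 4
dark_frame = [(10, 10, 10)] * 4
print(frame_tags(red_frame), frame_tags(dark_frame))
```

Audio labels such as "scene with a high background noise level" would need an analogous analysis of the audio signal, which is not attempted here.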

The recommendation information generation unit 42 then generates, from the user situation metadata supplied by the user situation metadata generation unit 41 and the main content content metadata supplied by the metadata decoder 18 or the main content content metadata generation unit 43, recommendation information recommending the sub-content to be combined with the main content.

Meanwhile, in this example, a plurality of pieces of audio data (audio files) are recorded in advance as sub-content on an external storage medium 37. When the external storage medium 37 is loaded into the content processing apparatus, the audio data is read from it as sub-content and supplied through an interface 38 to a sub-content content metadata generation unit 44. The external storage medium 37 on which the sub-content audio data is recorded may be the same storage medium as the external storage medium 35 on which the user profile and the like are recorded.

The sub-content content metadata generation unit 44 analyzes the audio data serving as sub-content and generates sub-content content metadata that characterizes the content of the sub-content.

The sub-content content metadata generated by the sub-content content metadata generation unit 44 is sent to a feature information generation unit 45, which generates feature information indicating the features of the audio data serving as sub-content.

Specifically, the feature information is generated by expressing the result of analyzing the sub-content in a multidimensional space. For content that is a recording of applause in a concert hall with rich reverberation, for example, the dimensions are "gives a live feeling", "applause by the audience", and "no shouts".
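
One way to read "expressing the result in a multidimensional space" is to give each descriptive dimension a fixed coordinate, as in the sketch below. The dimension names, the numeric scores between 0.0 and 1.0, and the helper `to_vector` are assumptions introduced here. The specification names the dimensions only in words.

```python
# Fixed order of dimensions (assumed); every piece of sub-content uses the same axes.
DIMENSIONS = ("live feeling", "audience applause", "shouts")


def to_vector(scores):
    """Place per-dimension scores at fixed coordinates; missing dimensions become 0.0."""
    return tuple(float(scores.get(d, 0.0)) for d in DIMENSIONS)


# "Recording of applause in a concert hall with rich reverberation":
# gives a live feeling, contains applause, contains no shouts.
hall_applause = to_vector({"live feeling": 1.0, "audience applause": 1.0, "shouts": 0.0})
print(hall_applause)
```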

The feature information generated by the feature information generation unit 45 is sent to the sub-content selection unit 46. From the recommendation information generated by the recommendation information generation unit 42 and the feature information generated by the feature information generation unit 45, the sub-content selection unit 46 selects as sub-content the audio data whose feature information is closest to the recommendation information.

Specifically, the sub-content selection unit 46 compares the feature information generated by the feature information generation unit 45 for each of the audio files, such as songs, recorded on the external storage medium 37 with the recommendation information generated by the recommendation information generation unit 42 on the basis of the user's situation at the time and the content of the main content. It detects the sub-content (audio file) whose feature information is closest to the recommendation information and notifies the system controller 11 of identification information for that sub-content, such as an identification code or song title. In response, the system controller 11 reads the sub-content (audio file) having that identification information from the external storage medium 37 and sends it through the interface 38 to the sub-content selection unit 46.

In selecting the sub-content, dimensions of the recommendation information that are not important are excluded from the determination of the distance (difference) between the recommendation information and the feature information.
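
The effect of excluding unimportant dimensions from the distance can be seen in a small sketch. Euclidean distance, the boolean importance mask, and the example vectors are assumptions. The specification speaks only of a "distance (difference)" without defining it.

```python
import math


def masked_distance(rec, feat, important):
    """Euclidean distance over only those dimensions flagged as important."""
    return math.sqrt(sum((r - f) ** 2 for r, f, keep in zip(rec, feat, important) if keep))


def select(rec, features, important):
    """Return the name of the candidate whose feature vector is closest to rec."""
    return min(features, key=lambda name: masked_distance(rec, features[name], important))


# Dimensions (assumed): live feeling, audience applause, shouts.
rec = (1.0, 1.0, 0.0)
features = {
    "hall_applause": (1.0, 1.0, 1.0),
    "stadium_cheer": (0.6, 0.2, 0.0),
}

all_dims = select(rec, features, (True, True, True))
masked = select(rec, features, (True, True, False))  # "shouts" treated as unimportant
print(all_dims, masked)
```

With all three dimensions compared, the mismatch in the third dimension makes the hall recording the more distant candidate. Once that dimension is masked out, the same recording matches the recommendation exactly and is selected.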

The audio data selected by the sub-content selection unit 46 is decoded by the audio decoder 47 and supplied to the audio data mixing circuit 27, where it is mixed with the audio data decoded by the audio decoder 17, and the mixed audio is output from the sound output device 28.

In the above example, main content content metadata such as "live music", "singer A is singing", "end of piece B", "baseball broadcast", "team C is on offense", "scoring scene", "bright screen", "quiet scene", "scene with spoken dialogue", "scene with sound effects", and "subtitles scrolling from right to left" is obtained in association with the presentation time of the main content.

As recommendation information, for example, when the main content content metadata is "pop singer" and "live concert" and the user situation metadata is "favorite singer", "favorite song", and "cheerful mood", the information "liven up the live concert atmosphere" and "make the user feel like a participant" is generated.
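
The mapping in this example could be realized in many ways. The sketch below uses a simple rule table, which is an assumption and not a mechanism described in the specification. The single rule encodes exactly the example given above, and the names `RULES` and `recommend` are hypothetical.

```python
# Each rule: (required main content tags, required user tags, resulting recommendation).
RULES = [
    (
        {"pop singer", "live concert"},
        {"favorite singer", "favorite song", "cheerful mood"},
        {"liven up the live concert atmosphere", "make the user feel like a participant"},
    ),
]


def recommend(main_tags, user_tags):
    """Collect the recommendation of every rule whose conditions are all met."""
    out = set()
    for need_main, need_user, rec in RULES:
        if need_main <= main_tags and need_user <= user_tags:
            out |= rec
    return out


hit = recommend(
    {"pop singer", "live concert"},
    {"favorite singer", "favorite song", "cheerful mood"},
)
miss = recommend({"pop singer", "live concert"}, {"wants to relax"})
print(sorted(hit), miss)
```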

As sub-content, for example, when the recommendation information is "liven up the live concert atmosphere", "make the user feel like a participant", and "end of the piece", an audio file "recording of applause in a concert hall with rich reverberation", which has the feature information "gives a live feeling", "applause by the audience", and "no shouts", is selected.

Therefore, with the content processing method (content reproduction method) described above, it is possible to create an optimum atmosphere and sense of presence according to not only the content of the main content but also the user's mood and the like at that time.

FIG. 3 shows an example of the reproduction processing that the content processing apparatus (content reproduction apparatus) of the example of FIG. 1 performs under the control of the system controller 11.

In this example, reproduction starts in response to the user's reproduction start operation. First, in step 51, the optical disc 2 and the optical pickup 4 are driven, and the multiplexed data of the main content is read and demodulated.

Next, in step 52, the video data, subtitle data, audio data, and main content content metadata are separated (only the video data, subtitle data, and audio data if no main content content metadata is recorded on the optical disc 2). The process then proceeds to step 53, where the necessary packets and audio channels are selected according to the user's instructions and the reproduction environment of the content processing apparatus.

Next, in step 54, the header of each packet is examined and each piece of data is sent to the corresponding decoder. In step 55, each piece of data is decoded by its decoder. If no main content content metadata is recorded on the optical disc 2, the main content content metadata is generated after step 55.

Next, in step 56, the user situation metadata is generated, and the recommendation information is then generated from the user situation metadata and the main content content metadata.

Next, in step 57, the feature information is generated. In step 58, the feature information is compared with the recommendation information, and audio data is selected as sub-content.

Next, in step 59, the video, subtitles, and audio of the main content are presented in combination with the audio of the sub-content. That is, the video and subtitles are displayed by the video display device 24, and the audio is output by the sound output device 28. Content relating to packets that were not selected in step 53 is excluded.
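
The order of steps 51 to 59, including the conditional generation of main content content metadata after step 55, can be traced with the sketch below. The function name and the step descriptions are paraphrases introduced here. The sketch lists the steps only and performs no actual processing.

```python
def playback_steps(metadata_on_disc: bool):
    """Return the sequence of steps of FIG. 3 for the given disc condition."""
    steps = [
        "51: read and demodulate multiplexed data",
        "52: demultiplex",
        "53: select packets and audio channels",
        "54: route data to decoders",
        "55: decode",
    ]
    if not metadata_on_disc:
        # Only needed when the disc carries no main content content metadata.
        steps.append("generate main content content metadata")
    steps += [
        "56: generate user situation metadata and recommendation information",
        "57: generate feature information",
        "58: select sub-content",
        "59: present main content with sub-content",
    ]
    return steps


for step in playback_steps(metadata_on_disc=False):
    print(step)
```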

The example of FIG. 1 covers the cases where main content content metadata is recorded on the optical disc 2 or is generated by the main content content metadata generation unit 43. Alternatively, the main content content metadata may be acquired from outside the content processing apparatus by connecting to a server, such as that of a content provider, through a line connection unit provided in the apparatus, or may be acquired from a storage medium, such as a memory card, separate from the optical disc 2 on which the main content is recorded.

Likewise, the example of FIG. 1 covers the case where the sub-content is recorded on the external storage medium 37, but the sub-content may instead be acquired from outside the content processing apparatus by connecting to a server, such as that of a content provider, through a line connection unit provided in the apparatus.

In that case, the system may be configured so that the external server generates the sub-content content metadata and also the feature information and transmits them to the user-side content processing apparatus, and the content processing apparatus selects, from the feature information and the recommendation information, the sub-content whose feature information is closest to the recommendation information.

Further, the system may be configured so that the external server selects, from the recommendation information and the feature information, the sub-content whose feature information is closest to the recommendation information and transmits that sub-content to the user-side content processing apparatus. For this purpose, the user-side content processing apparatus transmits to the external server either the main content content metadata and the user situation metadata, or the recommendation information. Alternatively, when the external server holds the main content content metadata as described above or generates it itself, the user-side content processing apparatus transmits the user situation metadata to the external server.

（１−２．再生の場合の第２の例：図４）
図４に、放送を受信して、メインコンテンツを取得し、再生する場合の例を示す。 (1-2. Second example of reproduction: FIG. 4)
FIG. 4 shows an example in which a broadcast is received, main content is acquired and played back.

In this example, the digital broadcast receiving unit 61 receives a digital broadcast, such as a BS digital broadcast, a CS digital broadcast, or a terrestrial digital broadcast, as the main content. The received main content data is demodulated by the demodulation circuit 63 and separated into its constituent data by the demultiplexer 14, as in the example of FIG. 1.

Furthermore, in this example, the content processing apparatus (broadcast receiving and reproducing apparatus) is configured so that it can acquire sub-content by connecting, through the line connection unit 67, to a server of a content provider such as the broadcaster. The acquired sub-content data is sent through the interface 68 to the sub-content content metadata generation unit 44 and the sub-content selection unit 46. Otherwise, the configuration is the same as in the example of FIG. 1.

A broadcast may be a live relay, which demands immediacy in addition to atmosphere and a sense of presence. In this example, when the main content broadcast is a relay of a game such as baseball or soccer, content such as the latest scene in each team's cheering section can be obtained as sub-content.

Specifically, suppose for example that the recommendation information is "baseball game between team A and team B", "team A is batting and player C is at bat", "player C's batting average is 0.33", and "the user is a fan of player C". In that case, an audio file "cheering for player C in the stands on team A's side", which has the feature information "cheering in the stands on team A's side" and "cheering for player C", is selected as the sub-content and reproduced in combination with the main content.
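The selection in this example can be sketched as follows. The source does not define how closeness between feature information and recommendation information is measured, so this minimal Python sketch assumes that both are sets of keyword tags and uses Jaccard similarity as the measure. The tag names, the candidate names, and the functions `jaccard` and `select_sub_content` are hypothetical and introduced only for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections (assumed closeness measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_sub_content(recommendation, candidates):
    """Return the name of the candidate whose feature tags are closest
    to the recommendation tags. `candidates` maps name -> feature tags."""
    if not candidates:
        raise ValueError("no sub-content candidates")
    return max(candidates, key=lambda name: jaccard(recommendation, candidates[name]))


# Hypothetical tags modelled on the baseball example in the text.
recommendation = {
    "baseball", "team_a", "team_b",
    "team_a_batting", "player_c", "fan_of_player_c",
}
candidates = {
    "team_a_stand_cheering_player_c": {"team_a", "player_c", "cheering"},
    "team_b_stand_cheering": {"team_b", "cheering"},
    "stadium_announcement": {"baseball", "announcement"},
}

selected = select_sub_content(recommendation, candidates)
print(selected)
```

With these assumed tags, the cheering file for player C shares two tags with the recommendation information while the other candidates share one each, so it is the one selected.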

In the example of FIG. 4, as in the example of FIG. 1, the sub-content is audio data, but the sub-content may be video data, or both video data and audio data. When the main content and the sub-content both include video, the content processing apparatus is configured so that the sub-content video is displayed together with the main content video in a two-screen display, displayed in place of the main content video, or displayed alternately with the main content video.

In this example too, as described above for the example of FIG. 1, the system can be configured so that the sub-content is selected at an external server.

[2. Embodiment for Recording Content: FIGS. 5 to 8]
The present invention can also be applied when recording content.

FIG. 5 shows one such example, in which an image (image data) obtained by photographing a subject with an imaging device is used as the main content, an image (image data) suited to the content of the main content and the user's situation is selected as the sub-content, and the sub-content is recorded in combination with the main content.

The content processing system of this example consists of an imaging device 70 on the user side and a server system on the service-providing side, and is used, for example, in a theme park such as the one shown in FIG. 6.

In the theme park of FIG. 6, a plurality of facilities 91 to 97, each with its own theme, are provided on a large site 90. A visitor can use any desired facility and take photographs or shoot video inside or near that facility.

The imaging device 70 in FIG. 5 is a digital video camera, a digital still camera, a camera-equipped mobile phone terminal, or the like. The server system on the service-providing side is provided either as a single system inside or outside the site 90, or as one system for each of the facilities 91 to 97, each of which is a hot spot. It includes a transmitting and receiving antenna 81, a service providing server 82, and a content database 83.

The content database 83 stores a large number of image files and audio files that serve as sub-content. Here the images are animation images, character images, and the like, and the audio is music, character voices, and the like; a plurality of items suited to each of the facilities 91 to 97 are prepared. The above-described sub-content content metadata or feature information is attached to each image file and audio file.

In the imaging device 70, the camera unit 71 photographs a subject and outputs captured image data as the main content. The captured image data is processed by the image processing unit 72 and recorded on the recording medium 74 by the write/read control unit 73. The recording medium 74 is, for example, a removable recording medium loaded in the imaging device 70, but may instead be a recording medium such as a semiconductor memory built into the imaging device 70.

The captured image data recorded on the recording medium 74 is read out by the write/read control unit 73 and processed by the image processing unit 72, and the captured image is displayed on a display unit 75 such as a liquid crystal display.

Furthermore, at the time of shooting or immediately afterward, the captured image data is sent through the image processing unit 72 to the main content content metadata generation unit 43, which generates the main content content metadata and sends it to the recommendation information generation unit 42.

Meanwhile, at the time of shooting or immediately afterward, as in the example of FIG. 1 or FIG. 4, the user situation metadata generation unit 41 generates user situation metadata characterizing the user's situation from key input information input from the key input unit 31 via the interface 32, biometric information input from the biometric sensor 33 via the interface 34, and information read from the external storage medium 35 and input via the interface 36. The user situation metadata is sent to the recommendation information generation unit 42.

In this example, the imaging device 70 also includes a GPS (Global Positioning System) antenna 78 and a GPS positioning unit 79, so that it can measure its own position by receiving radio waves from GPS satellites and processing the data. At the time of shooting or immediately afterward, the resulting position information is sent to the recommendation information generation unit 42.

The GPS positioning information in this case can serve as main content content metadata characterizing the content of the captured image that is the main content, and can therefore be used as a basis for selecting the sub-content.

For example, when the user takes a picture in front of the facility 91 in FIG. 6, the facility 91 appears in the captured image and the GPS positioning information indicates that the picture was taken near the facility 91, which makes it possible to select an image or audio suited to the facility 91 as the sub-content.
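One way to turn a GPS fix into the statement "taken near the facility 91" can be sketched as follows. The source does not describe this computation, so the sketch assumes a table of facility coordinates and a 100 m proximity threshold, and uses the haversine great-circle distance. The coordinates in `FACILITIES`, the threshold, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

# Hypothetical coordinates (latitude, longitude) for two of the facilities.
FACILITIES = {
    91: (35.6000, 139.7000),
    92: (35.6010, 139.7020),
}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def nearest_facility(lat, lon, facilities, max_distance_m=100.0):
    """Return the id of the closest facility within max_distance_m, else None."""
    if not facilities:
        return None
    best = min(facilities, key=lambda fid: haversine_m(lat, lon, *facilities[fid]))
    if haversine_m(lat, lon, *facilities[best]) <= max_distance_m:
        return best
    return None


# A fix a few meters from the assumed position of facility 91.
print(nearest_facility(35.60005, 139.70005, FACILITIES))
```

The returned facility identifier could then be treated as one more item of main content content metadata when the recommendation information is generated.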

The recommendation information generation unit 42 generates recommendation information for recommending sub-content to be combined with the captured image that is the main content, from the user situation metadata supplied by the user situation metadata generation unit 41, the main content content metadata supplied by the main content content metadata generation unit 43, and the GPS positioning information supplied by the GPS positioning unit 79.
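The combination of the three inputs can be illustrated with a very small sketch. The source does not specify the format of the recommendation information or how the inputs are merged, so this example simply assumes that each input is a list of keyword tags and that the recommendation information is their ordered, de-duplicated union. The function name `generate_recommendation` and all tag values are hypothetical.

```python
def generate_recommendation(user_situation, main_content, gps_tags=()):
    """Merge the three tag lists into one ordered list without duplicates.

    Order of precedence (an assumption): main content tags, then tags
    derived from GPS positioning, then user situation tags.
    """
    merged = []
    for tag in [*main_content, *gps_tags, *user_situation]:
        if tag not in merged:
            merged.append(tag)
    return merged


recommendation = generate_recommendation(
    user_situation=["relaxed", "with_family"],
    main_content=["facility_91", "daytime"],
    gps_tags=["facility_91"],
)
print(recommendation)
```

A real implementation would likely weight the inputs rather than treat them equally; the flat union is used here only to keep the sketch short.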

In this example, the recommendation information is then transmitted from the transmission/reception unit 76 through the antenna 77 to the server system on the service-providing side.

In the server system on the service-providing side, the service providing server 82 receives the recommendation information through the antenna 81. In this example, the image file whose feature information is closest to the received recommendation information is selected as the sub-content from the many image files stored in the content database 83, and is transmitted from the service providing server 82 through the antenna 81 to the imaging device 70.

In the imaging device 70, the provided image data is received by the transmission/reception unit 76 through the antenna 77 and sent to the image processing unit 72.

In the image processing unit 72, the provided image serving as the sub-content is combined with the captured image serving as the main content. The imaging device 70 is configured so that the user can select, by a setting, one of two combination modes: a composite mode in which, as shown in FIG. 7, the sub-content image (provided image) Ps is composited onto the main content image (captured image) Pm and the composited image is recorded on the recording medium 74 as a single image file; and a link mode in which the main content image and the sub-content image are linked by a related identifier and recorded on the recording medium 74 as separate image files.
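The difference between the two modes can be sketched as follows. This is an illustration under stated assumptions, not the recording format of the imaging device 70: images are modelled as lists of pixel rows, the recording medium as a Python list of records, and the related identifier as a random UUID string. The functions `overlay` and `record` and the record fields are hypothetical.

```python
import uuid


def overlay(main, sub, top=0, left=0):
    """Return a copy of `main` with `sub` pasted at (top, left).
    Sub-image pixels falling outside `main` are ignored."""
    out = [row[:] for row in main]
    for r, sub_row in enumerate(sub):
        for c, pixel in enumerate(sub_row):
            rr, cc = top + r, left + c
            if 0 <= rr < len(out) and 0 <= cc < len(out[rr]):
                out[rr][cc] = pixel
    return out


def record(main, sub, mode, medium, top=0, left=0):
    """Record main and sub content on `medium` in the chosen mode."""
    if mode == "composite":
        # One file holding the composited image.
        medium.append({"kind": "composite", "pixels": overlay(main, sub, top, left)})
    elif mode == "link":
        # Two separate files sharing the same related identifier.
        link_id = uuid.uuid4().hex
        medium.append({"kind": "main", "link_id": link_id, "pixels": main})
        medium.append({"kind": "sub", "link_id": link_id, "pixels": sub})
    else:
        raise ValueError(f"unknown mode: {mode!r}")


medium = []
pm = [[0, 0, 0], [0, 0, 0]]   # main content image Pm (2 x 3)
ps = [[9]]                    # sub-content image Ps (1 x 1)
record(pm, ps, "composite", medium, top=1, left=2)
record(pm, ps, "link", medium)
print(len(medium))  # one composite record plus two linked records
```

In the link mode the original captured image is left untouched, so the sub-content can later be shown, hidden, or replaced; in the composite mode the result is a single self-contained file.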

The above describes the case where the provided image is combined as sub-content with the captured image as main content, but the imaging device 70 may be configured so that the user can select, by a setting, between a mode in which sub-content is combined and a mode in which it is not.

The above describes the case where the sub-content is also an image, but as mentioned above the sub-content may be audio such as music or a character's voice. In that case, as shown in FIG. 8, the image file Fp serving as the main content and the audio file Fa serving as the sub-content are linked by a related identifier and recorded on the recording medium 74.

FIG. 1 is a diagram showing an example of the content processing apparatus of the present invention.
FIG. 2 is a diagram showing an example of an MPEG stream recorded on an optical disc.
FIG. 3 is a diagram showing an example of reproduction processing in the content processing apparatus of FIG. 1.
FIG. 4 is a diagram showing another example of the content processing apparatus of the present invention.
FIG. 5 is a diagram showing an example of a content processing system for recording content.
FIG. 6 is a diagram showing an example of a facility in which the content processing system of FIG. 5 is used.
FIG. 7 is a diagram showing a case where a main content image and a sub-content image are composited.
FIG. 8 is a diagram showing a case where a main content image file and a sub-content audio file are linked.

Explanation of symbols

All the main parts are labeled in the figures, so their explanation is omitted here.

Claims

A content processing method comprising:
a recommendation information generation step of generating, from main content content metadata characterizing the content of main content, a user profile, and user situation metadata, recommendation information for recommending sub-content to be reproduced together with the main content;
a feature information generation step of generating feature information indicating the features of sub-content from sub-content content metadata characterizing the content of the sub-content; and
a sub-content selection step of selecting, from among a plurality of sub-contents, the sub-content whose feature information is closest to the recommendation information as the sub-content to be reproduced together with the main content,
wherein the user situation metadata is data indicating a physical or mental state of the user at the time the recommendation information is generated.

The content processing method according to claim 1,
A content processing method for reproducing together the main content and the sub-content selected in the sub-content selection step as sub-content to be reproduced together with the main content.

The content processing method according to claim 1,
A content processing method in which the main content and the sub-content selected in the sub-content selection step as sub-content to be reproduced together with the main content are combined and recorded, or recorded in association with each other.

The content processing method according to claim 1,
A content processing method for analyzing the main content and generating the main content content metadata in a content processing apparatus on a user side.

The content processing method according to claim 1,
A content processing method for fetching the main content content metadata into the user-side content processing apparatus from outside the user-side content processing apparatus.

The content processing method according to claim 1,
A content processing method for analyzing the sub-content and generating the sub-content content metadata in a user-side content processing apparatus.

The content processing method according to claim 1,
A content processing method for fetching the sub-content content metadata into the user-side content processing apparatus from outside the user-side content processing apparatus.

The content processing method according to claim 1,
A content processing method for executing the recommendation information generation step, the feature information generation step, and the sub-content selection step in a user-side content processing apparatus.

The content processing method according to claim 1,
A content processing method in which the recommendation information generation step is executed in a user-side content processing apparatus, the recommendation information generated in the recommendation information generation step is transmitted to a service-providing-side device external to the user-side content processing apparatus, the feature information generation step and the sub-content selection step are executed in the service-providing-side device, and the sub-content selected in the sub-content selection step is transmitted to the user-side content processing apparatus.

The content processing method according to claim 1,
A content processing method in which a service-providing-side device external to a user-side content processing apparatus acquires, from the user-side content processing apparatus, either the main content content metadata together with the user situation metadata, or the user situation metadata, executes the recommendation information generation step, the feature information generation step, and the sub-content selection step, and transmits the sub-content selected in the sub-content selection step to the user-side content processing apparatus.

A content processing apparatus comprising:
recommendation information generation means for generating, from main content content metadata characterizing the content of main content, a user profile, and user situation metadata, recommendation information for recommending sub-content to be reproduced together with the main content;
feature information generation means for generating feature information indicating the features of sub-content from sub-content content metadata characterizing the content of the sub-content; and
sub-content selection means for selecting, from among a plurality of sub-contents, the sub-content whose feature information is closest to the recommendation information as the sub-content to be reproduced together with the main content,
wherein the user situation metadata is data indicating a physical or mental state of the user at the time the recommendation information is generated.

The content processing apparatus according to claim 11, wherein
A content processing apparatus that reproduces together the main content and the sub-content selected by the sub-content selection unit as the sub-content to be reproduced together with the main content.

The content processing apparatus according to claim 11, wherein
A content processing apparatus that synthesizes and records the main content and the sub-content selected by the sub-content selection unit as sub-content to be reproduced together with the main content, or records the sub-content in association with each other.

The content processing apparatus according to claim 11, wherein
A content processing apparatus comprising main content content metadata generating means for analyzing the main content and generating the main content content metadata.

The content processing apparatus according to claim 11, wherein
A content processing apparatus comprising main content content metadata acquisition means for fetching the main content content metadata from outside the content processing apparatus into the content processing apparatus.

The content processing apparatus according to claim 11, wherein
A content processing apparatus comprising sub-content content metadata generating means for analyzing the sub-content and generating the sub-content content metadata.

The content processing apparatus according to claim 11, wherein
A content processing apparatus comprising sub-content content metadata acquisition means for fetching the sub-content content metadata from outside the content processing apparatus into the content processing apparatus.

A content processing apparatus comprising:
recommendation information generation means for generating, from main content content metadata characterizing the content of main content, a user profile, and user situation metadata, recommendation information for recommending sub-content to be reproduced together with the main content;
transmitting means for transmitting the recommendation information generated by the recommendation information generation means to a service-providing-side device; and
receiving means for receiving sub-content that has been selected in the service-providing-side device as the content whose feature information, indicating the features of the content, is closest to the recommendation information transmitted by the transmitting means, and that has been transmitted from the service-providing-side device,
wherein the user situation metadata is data indicating a physical or mental state of the user at the time the recommendation information is generated.