JP2014150314A

JP2014150314A - Information processing device and information processing method

Info

Publication number: JP2014150314A
Application number: JP2013016549A
Authority: JP
Inventors: Masashi Kamiya; 雅志神谷; Kensuke Ueda; 健介上田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2013-01-31
Filing date: 2013-01-31
Publication date: 2014-08-21

Abstract

PROBLEM TO BE SOLVED: To prevent accompanying information from interfering with a user's viewing and listening. SOLUTION: An information processing device 100 comprises: a related information buffer 114 that holds related information of video data; and a control part 110 that determines whether a frame image of the video data is an image of an important scene. When the frame image is not an image of an important scene, the control part 110 causes the related information to be output from the related information buffer 114 so that an image of the related information is displayed together with the frame image. When the frame image is an image of an important scene, the control part 110 causes the related information not to be output from the related information buffer 114, so that the image of the related information is not displayed together with the frame image.

Description

The present invention relates to an information processing apparatus and an information processing method.

A video display device such as a television can generally display, together with the program video that is the main video, video of information other than the program video, such as subtitles, telops, keywords obtained from subtitles, and menu screens for image quality settings, simultaneously with the program video and on the same screen.

In a conventional video display device, a display priority is set for each of the program video, subtitles, menu screens, and the like. The conventional video display device compares the display priorities and adjusts the display position and display timing of each item when superimposing them on the same screen, so that information with a high display priority is not covered by video with a low display priority (see, for example, Patent Document 1).

Patent Document 1: JP 2010-124429 A

However, the conventional video display device does not control the display position and display timing of accompanying information with regard to the importance of what the program video, as the main video, is showing at each time. As a result, accompanying information may be displayed even while the program is showing important content. The accompanying information then interferes with the user's viewing of the program, important content is overlooked or not heard, and the user can no longer follow the overall story development.

Therefore, an object of the present invention is to prevent accompanying information from interfering with a user's viewing.

An information processing apparatus according to an aspect of the present invention includes: a related information holding unit that holds related information of video data; and a control unit that determines whether a frame image of the video data is an image of an important scene, that causes the related information to be output from the related information holding unit, in order to display an image of the related information together with the frame image of the video data, when the frame image is not an image of an important scene, and that does not cause the related information to be output from the related information holding unit, in order not to display the image of the related information together with the frame image of the video data, when the frame image is an image of an important scene.

An information processing method according to an aspect of the present invention includes: a related information holding process of holding related information of video data; and a control process of determining whether a frame image of the video data is an image of an important scene, of causing the related information held in the related information holding process to be output, in order to display an image of the related information together with the frame image of the video data, when the frame image is not an image of an important scene, and of not causing the related information held in the related information holding process to be output, in order not to display the image of the related information together with the frame image of the video data, when the frame image is an image of an important scene.

According to one aspect of the present invention, accompanying information no longer interferes with the user's viewing.

FIG. 1 is a block diagram schematically showing the configuration of an information processing apparatus according to Embodiment 1.
FIG. 2 is a flowchart showing the operation of the information processing apparatus according to Embodiment 1.
FIG. 3(A) and FIG. 3(B) are schematic diagrams showing the frame images used when calculating a frame difference value in Embodiment 1.
FIG. 4 is a schematic diagram showing metadata in a first modification of Embodiment 1.
FIG. 5 is a block diagram schematically showing the configuration of an information processing apparatus according to the first modification of Embodiment 1.
FIG. 6 is a block diagram schematically showing the configuration of an information processing apparatus according to a second modification of Embodiment 1.
FIG. 7 is a schematic diagram showing the video displayed in the second modification of Embodiment 1.
FIG. 8 is a block diagram schematically showing the configuration of an information processing apparatus according to Embodiment 2.
FIG. 9 is a flowchart showing the operation of the information processing apparatus according to Embodiment 2.

Embodiment 1.
FIG. 1 is a block diagram schematically showing the configuration of the information processing apparatus 100 according to Embodiment 1.
The information processing apparatus 100 includes a data processing unit 101, a video display unit 120, and a data synchronization control unit 130.

The data processing unit 101 receives video data, related information, and user input, and outputs a video signal. The data processing unit 101 includes a control unit 110, a related information buffer 114, a related information superimposing unit 115, and a video output unit 116.

The control unit 110 controls the related information buffer 114. For example, the control unit 110 determines whether the frame image of the video data is an image of an important scene. When the frame image is not an image of an important scene, the control unit 110 causes the related information buffer 114 to output the related information so that the image of the related information is displayed together with the frame image of the video data. When the frame image is an image of an important scene, the control unit 110 does not cause the related information buffer 114 to output the related information, so that the image of the related information is not displayed together with the frame image of the video data. As a result, the image of the related information is never displayed at the same time as an image of an important scene.
Here, the control unit 110 in Embodiment 1 determines that the frame image of the video data is an image of an important scene when the frame image is an image at which a scene change has occurred.
The control unit 110 includes a video analysis unit 111, a user input reception unit 112, and a related information control unit 113.

The video analysis unit 111 analyzes the video data and calculates the importance of each frame image of the video data. In the present embodiment, the importance is a frame difference value indicating the difference between consecutive frame images. The video analysis unit 111 gives the calculated importance to the related information control unit 113.
The user input reception unit 112 receives user input data entered by the user. In the present embodiment, the user input reception unit 112 receives, as the user input data, a predetermined threshold that is compared with an information output determination value to determine whether to display the related information.

The related information control unit 113 controls the related information buffer 114. For example, in the present embodiment, the related information control unit 113 treats the importance given by the video analysis unit 111 as the information output determination value and compares it with the threshold given by the user input reception unit 112. When the information output determination value is smaller than the threshold, the related information control unit 113 determines that the frame image of the video data is not an image of an important scene and controls the related information buffer 114 so that it outputs the related information it stores. When the information output determination value is equal to or greater than the threshold, the related information control unit 113 determines that the frame image of the video data is an image of an important scene and controls the related information buffer 114 so that it does not output the related information, in other words, so that it continues to hold the related information. The related information control unit 113 controls the related information buffer 114 by giving it a display control signal.
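The comparison described in this paragraph can be sketched as follows. This is a minimal illustration of the rule as stated here (a determination value at or above the threshold marks an important scene, and related information is output only for frames that are not important). The function names and the numeric values are hypothetical and do not come from the patent.

```python
def is_important_scene(determination_value: float, threshold: float) -> bool:
    """A frame counts as important when its value reaches the threshold."""
    return determination_value >= threshold


def should_output_related_info(determination_value: float, threshold: float) -> bool:
    """Related information is released only for frames that are not important."""
    return not is_important_scene(determination_value, threshold)


threshold = 1000  # hypothetical user-supplied threshold
quiet = should_output_related_info(120, threshold)    # small change between frames
cut = should_output_related_info(54000, threshold)    # large change, e.g. a scene change
```

With these illustrative values, `quiet` is `True` (the overlay may be shown) and `cut` is `False` (the overlay is held back).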

The related information buffer 114 is a related information holding unit that holds related information of the video data. The related information is information accompanying the video data and includes, for example, at least one of subtitle data, keywords obtained from subtitle data, information on the performers in the program, telops, and menu screens.
The related information superimposing unit 115 includes a video data buffer 115a, which is a video data holding unit that holds the video data. When the related information superimposing unit 115 acquires related information from the related information buffer 114, it generates output video data by superimposing the image of the related information on the frame image of the video data. When it does not acquire related information from the related information buffer 114, it generates output video data without superimposing an image of related information on the frame image of the video data. In this case, the video data can be used as the output video data as it is.

The video output unit 116 generates a video signal from the output video data given by the related information superimposing unit 115.
The video display unit 120 displays video based on the video signal given by the video output unit 116.
The data synchronization control unit 130 performs synchronization control of the various data processed by the data processing unit 101. For example, the data synchronization control unit 130 synchronizes the data by sending control signals to the components of the data processing unit 101.

Next, the operation of the information processing apparatus 100 according to Embodiment 1 will be described.
FIG. 2 is a flowchart showing the operation of the information processing apparatus 100 according to Embodiment 1.

The video analysis unit 111 receives video data (S10). Video data can be regarded as a set of consecutive frame images, but here it refers to the portion of the whole video data that corresponds to a single frame image at a given time. In the following, video data is described as image data for one frame.

The video analysis unit 111 calculates the difference between the frame image received in step S10 and the immediately preceding frame image (S11). The difference value calculated here is referred to as the frame difference value.
The frame difference value is obtained, for two temporally consecutive frame images, by calculating the absolute value of the difference between the color components or luminance components at the same coordinates for every pixel constituting the frame, and summing these absolute values.
For example, FIG. 3(A) shows the frame image IM1 immediately preceding the frame image IM2 represented by the video data received in step S10, and FIG. 3(B) shows the frame image IM2. The frame images IM1 and IM2 each consist of m × n pixels (m and n are natural numbers). If the pixel values of IM1 are denoted c1(i, j) and those of IM2 are denoted c2(i, j), the frame difference value is calculated by equation (1) below, where i is a natural number satisfying 1 ≤ i ≤ m and j is a natural number satisfying 1 ≤ j ≤ n.

$$\text{frame difference value} = \sum_{i=1}^{m} \sum_{j=1}^{n} \left| c_2(i, j) - c_1(i, j) \right| \qquad (1)$$

Therefore, if the two frame images are exactly identical, the frame difference value takes its minimum value of 0. In contrast, the frame difference value for two frame images that straddle the moment at which the video switches to footage shot at a different location, as seen at scene change points and the like, is relatively large. The video analysis unit 111 gives this frame difference value to the related information control unit 113.
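As a rough illustration of the calculation described above, the following sketch sums the absolute per-pixel differences between two frames. It assumes each frame is a list of rows holding a single component (for example, luminance) per pixel. The function name `frame_difference` and the sample pixel values are hypothetical.

```python
from typing import Sequence

Frame = Sequence[Sequence[int]]  # rows of single-component pixel values


def frame_difference(prev: Frame, curr: Frame) -> int:
    """Sum of absolute per-pixel differences between two equally sized frames."""
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same dimensions")
    return sum(
        abs(c2 - c1)
        for row1, row2 in zip(prev, curr)
        for c1, c2 in zip(row1, row2)
    )


im1 = [[10, 10], [10, 10]]
im2 = [[10, 10], [10, 10]]    # identical to im1
im3 = [[200, 0], [10, 255]]   # very different content

same = frame_difference(im1, im2)   # 0
cut = frame_difference(im1, im3)    # 190 + 10 + 0 + 245 = 445
```

Identical frames give 0, while frames with very different content give a large value, which matches the behavior described for scene change points.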

When the frame difference value is 0, the two consecutive frame images are identical, so the same scene is continuing on the screen. While the same scene continues, the scene can be regarded as having low importance for a user watching the program to follow the story development. Conversely, the larger the frame difference value, the more the two consecutive frame images differ, from which it can be inferred that the story development of the program has changed substantially. Such a scene can be regarded as having high importance for the user to follow the story development. In this way, by obtaining the frame difference value, the video analysis unit 111 can determine the importance of the frame image at each time in the video data.

Returning to FIG. 2, the user input reception unit 112 receives user input data (S12). For example, the user input reception unit 112 receives, as the user input data, a threshold for the frame difference value that is the video analysis result, and outputs it to the related information control unit 113.

The related information control unit 113 receives the frame difference value from the video analysis unit 111 and the user input data from the user input reception unit 112. The related information control unit 113 then treats the frame difference value as the information output determination value, compares it with the threshold indicated by the user input data, and determines whether the information output determination value is equal to or greater than the threshold (S13). If the information output determination value is equal to or greater than the threshold (S13: Yes), the process proceeds to step S14. If the information output determination value is smaller than the threshold (S13: No), the process proceeds to step S17.

In step S14, the related information control unit 113 outputs, to the related information buffer 114, a display control signal indicating that the related information is to be output (assumed here, for the sake of explanation, to be the data value "1"). On receiving such a display control signal, the related information buffer 114 outputs the related information it holds to the related information superimposing unit 115.

The related information superimposing unit 115 generates output video data in which the frame image of the video data received in step S10 and the image of the related information output from the related information buffer 114 are superimposed (S15). The related information superimposing unit 115 then gives the generated output video data to the video output unit 116.

The video output unit 116 receives the output video data from the related information superimposing unit 115, converts it into a video signal for screen display, and gives this video signal to the video display unit 120 (S16). The image of the related information is superimposed on this output video data.

In step S17, on the other hand, the related information control unit 113 outputs, to the related information buffer 114, a display control signal indicating that the related information is not to be output (assumed here, for the sake of explanation, to be the data value "0"). On receiving such a display control signal, the related information buffer 114 continues to hold the related information it is holding and does not output it to the related information superimposing unit 115.

The related information superimposing unit 115 receives only the video data received in step S10 and gives that video data, as it is, to the video output unit 116 as the output video data (S18). In other words, in step S18 the related information superimposing unit 115 does not superimpose an image of related information on the frame image of the video data received in step S10.

The video output unit 116 then receives the output video data from the related information superimposing unit 115, converts it into a video signal for screen display, and gives this video signal to the video display unit 120 (S19). No image of related information is superimposed on this output video data.

Note that when the related information held in the related information buffer 114 is not superimposed because of the display control signal output by the related information control unit 113 (S13: No), the related information continues to be held, and its image is superimposed once the information output determination value in a subsequent frame exceeds the threshold (S13: Yes). When multiple items of related information are input and the information output determination value stays below the threshold over consecutive frames, multiple items of related information are held. In this case, the related information buffer 114 also stores the order in which the items were input. Then, once the information output determination value becomes equal to or greater than the threshold (S13: Yes), the related information buffer 114 outputs the items to the related information superimposing unit 115 in order, starting with the oldest. Alternatively, at that point the related information buffer 114 can output all of the held related information to the related information superimposing unit 115 at once.
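The holding behavior described above can be sketched as a queue driven by the display control signal. This is only an illustrative model: the class name `RelatedInfoBuffer`, its methods, and the `flush_all` switch are hypothetical, items are represented as plain strings, and the signal values 1 (output) and 0 (hold) follow the values assumed in the description of steps S14 and S17.

```python
from collections import deque
from typing import Deque, List


class RelatedInfoBuffer:
    """Holds related-information items in the order they arrived."""

    def __init__(self, flush_all: bool = False) -> None:
        self._items: Deque[str] = deque()
        self._flush_all = flush_all  # True: release everything at once

    def push(self, item: str) -> None:
        self._items.append(item)

    def on_control_signal(self, signal: int) -> List[str]:
        """Signal 1 releases held item(s); signal 0 keeps holding them."""
        if signal != 1 or not self._items:
            return []
        if self._flush_all:
            released = list(self._items)
            self._items.clear()
            return released
        return [self._items.popleft()]  # oldest item first


buf = RelatedInfoBuffer()
buf.push("subtitle A")
buf.push("subtitle B")
held = buf.on_control_signal(0)     # nothing released
first = buf.on_control_signal(1)    # oldest item released
second = buf.on_control_signal(1)   # next item released
```

Constructing the buffer with `flush_all=True` models the alternative in which all held items are output together.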

Note that, by the time it must be decided whether related information input to the related information buffer 114 at a certain time is to be output to the related information superimposing unit 115, the control unit 110 must have analyzed the video data for the time at which that related information was input, compared the calculated information output determination value with the threshold given as user input data, and output the display control signal (S10 to S13). The data synchronization control unit 130 performs synchronization control that takes the time required for this processing into account, and thereby associates the related information at a given time with the video data analysis result for that time.

Similarly, deciding whether to superimpose related information on video data input to the related information superimposing unit 115 at a certain time requires waiting for the data output of the related information buffer 114. The data synchronization control unit 130 also performs synchronization control for this.

The information processing apparatus 100 configured as described above can analyze the importance of the content displayed at each time in the video data and can control the timing at which related information is displayed according to the analyzed importance. For example, scene change points are important for a user watching a program to follow the overall story development, and the apparatus can display related information while avoiding the times at which the main video shows important content such as a scene change point. The information processing apparatus 100 according to Embodiment 1 can therefore prevent the user from overlooking or failing to hear important content and thereby becoming unable to follow the overall story development.

In the information processing apparatus 100 according to Embodiment 1, when the display control signal output by the related information control unit 113 indicates that the related information for frame time t held in the related information buffer 114 is not to be superimposed (S13: No), the related information continues to be held. In the shortest case, however, the information output determination value may become equal to or greater than the threshold at the next frame time t+1 (S13: Yes). In such a case, the image of the related information held over from frame time t does not necessarily have to be superimposed at frame time t+1. For example, once the related information control unit 113 has decided not to superimpose (S13: No), it can perform control so that the image of the related information is not superimposed until a fixed time, which is a predetermined period, has elapsed.
On a television, for example, the time that elapses between frames is very short, and the human eye cannot perceive the difference between holding the related information for one frame and superimposing it on the next frame, and superimposing it immediately on the current frame without holding it. Consequently, even if the superimposition is performed on the next frame, the related information is in effect displayed at the time the important content is displayed, and it interferes with the user's viewing. By not superimposing for a fixed time in this way, the related information can be superimposed after waiting until the display time of the important content has clearly passed. In other words, the control unit 110 also determines that the frame image of the video data is an image of an important scene when the frame image falls within a predetermined period after a scene change. This makes the suppression of viewing interference more reliable.
For example, when the related information control unit 113 determines that the information output determination value is less than the threshold, it can control the related information buffer 114 so that the related information is not output until, after the predetermined period has elapsed, the information output determination value becomes equal to or greater than the threshold. Alternatively, when the related information control unit 113 determines that the information output determination value is less than the threshold, it may control the related information buffer 114 so that the related information is output immediately after the predetermined period has elapsed. Further, when the related information control unit 113 determines that the information output determination value is less than the threshold, and the information output determination value becomes equal to or greater than the threshold within the predetermined period, it may control the related information buffer 114 so that the related information is output immediately after the predetermined period has elapsed.
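One way to picture the predetermined period is as a hold-off window that extends the "important scene" judgment for some time after a scene change. The sketch below takes a per-frame flag saying whether each frame was judged important and returns, per frame, whether superimposition would be allowed. Measuring the period in frames, the function name `output_allowed`, and the parameter `hold_off_frames` are assumptions made for illustration.

```python
from typing import Iterable, List


def output_allowed(important: Iterable[bool], hold_off_frames: int) -> List[bool]:
    """Per frame, True when related information may be superimposed.

    A frame judged important blocks output, and so do the next
    `hold_off_frames` frames that follow it.
    """
    allowed: List[bool] = []
    remaining = 0  # frames left in the current hold-off window
    for flag in important:
        if flag:
            remaining = hold_off_frames
            allowed.append(False)
        elif remaining > 0:
            remaining -= 1
            allowed.append(False)
        else:
            allowed.append(True)
    return allowed


flags = [False, True, False, False, False, False]  # scene change at index 1
result = output_allowed(flags, hold_off_frames=2)
# result == [True, False, False, False, True, True]
```

With a hold-off of 0 frames the function reduces to blocking only the frames that are themselves judged important.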

In the information processing apparatus 100 according to Embodiment 1, when the related information buffer 114 does not output the related information it holds because of the display control signal output by the related information control unit 113 (S13: No), the related information continues to be held, and if this situation persists, the related information is not superimposed for a long time. As a countermeasure, the related information buffer 114 may output to the related information superimposing unit 115, regardless of the display control signal from the related information control unit 113, any related information that has been held for a fixed time, which is a predetermined period, or longer. For example, subtitles sometimes render the characters' dialogue as text, and if their superimposition is delayed too much, their value as subtitles drops sharply. That is, related information is often closely tied to the content of the video displayed at the time the information was input, so unconditionally superimposing and displaying related information whose holding period has exceeded the predetermined period suppresses the loss in value of displaying it.
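The time-limited holding described above might look like the following sketch, in which each held item carries the time at which it was input and items held for the maximum period or longer are released regardless of the display control signal. The function name `pop_overdue`, the use of seconds, and the sample timestamps are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple

HeldItem = Tuple[float, str]  # (time the item was input, item text)


def pop_overdue(held: Deque[HeldItem], now: float, max_hold_seconds: float) -> List[str]:
    """Remove and return items held for max_hold_seconds or longer.

    `held` is in arrival order, so the oldest item is always at the left.
    """
    overdue: List[str] = []
    while held and now - held[0][0] >= max_hold_seconds:
        overdue.append(held.popleft()[1])
    return overdue


held = deque([(0.0, "subtitle A"), (1.5, "subtitle B"), (4.0, "subtitle C")])
released = pop_overdue(held, now=5.0, max_hold_seconds=3.0)
# released == ["subtitle A", "subtitle B"]; "subtitle C" is still held
```

Because the queue preserves input order, checking only the oldest item is enough to find every overdue item.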

In the information processing apparatus 100 according to the first embodiment, when the display control signal output from the related information control unit 113 causes the related information buffer 114 not to output the related information it is holding (S13: No), the related information continues to be held, and if this situation persists, multiple pieces of related information accumulate in the buffer. In this situation, when it is determined that the amount of data held by the related information buffer 114 exceeds a predetermined amount of data, for example the capacity of the related information buffer 114, the related information buffer 114 may unconditionally output the held related information to the related information superimposing unit 115. This prevents older held related information from becoming impossible to superimpose, for example because it was overwritten due to the capacity limit.
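The two unconditional-output rules for the related information buffer 114 (a maximum holding period and a maximum data amount) can be sketched together as below. This is a minimal sketch under stated assumptions: the class `RelatedInfoBuffer`, its method names, and the parameters `max_hold` and `max_bytes` are hypothetical, and the related information is modeled simply as byte strings with an arrival time.

```python
from collections import deque
from typing import Deque, List, Tuple


class RelatedInfoBuffer:
    """Holds related information until output is allowed or a limit forces it out."""

    def __init__(self, max_hold: float, max_bytes: int) -> None:
        self.max_hold = max_hold      # hypothetical maximum holding period, in seconds
        self.max_bytes = max_bytes    # hypothetical maximum amount of held data
        self._items: Deque[Tuple[float, bytes]] = deque()

    def push(self, now: float, data: bytes) -> None:
        self._items.append((now, data))

    def _size(self) -> int:
        return sum(len(d) for _, d in self._items)

    def pop_ready(self, now: float, output_allowed: bool) -> List[bytes]:
        # Output everything if the display control signal allows it,
        # or if the held data exceeds the predetermined amount.
        if output_allowed or self._size() > self.max_bytes:
            out = [d for _, d in self._items]
            self._items.clear()
            return out
        # Otherwise output only the items held for the predetermined period or longer.
        out = []
        while self._items and now - self._items[0][0] >= self.max_hold:
            out.append(self._items.popleft()[1])
        return out
```

Calling `pop_ready` once per frame with the current time and the control unit's decision returns the related information to pass on to the superimposing stage, oldest first.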

In the information processing apparatus 100 according to the first embodiment, the video analysis unit 111 analyzes the video using frame difference values, but this is only an example and the analysis is not limited to this method. Nor are scene change points the only points that matter for understanding the program as a whole. For example, the video analysis unit 111 may perform person or face detection on the frame image to find important points. If no person or face is detected in a frame image, the frame image is considered to show only background, and it can be regarded as an image of a scene of low importance for a user following the story. In a frame image in which a person or face is detected, on the other hand, the story can be expected to develop through that person's conversation or actions. Such a scene can be regarded as highly important for a user following the story. In this way, the video analysis unit 111 can determine the importance of the frame image at each time in the video data by performing person or face detection. In such a case, for example, the video analysis unit 111 gives the related information control unit 113 an importance indicating whether or not a person or face was detected, and the related information control unit 113 controls the related information buffer 114 to output the related information when the importance indicates that a person or face was detected.

The information processing apparatus 100 according to the first embodiment is not limited to viewing program data that is being broadcast in real time; it can also be applied to playback of program data already stored in a storage medium such as a hard disk. This allows the user to watch more video content with related information superimposed while keeping missed visuals and missed audio to a minimum. The stored program data is assumed to include at least the video data and the related information.

When stored program data is played back as described above, the video analysis unit 111 may analyze the stored video data before playback starts, and the related information control unit 113 may then output, during playback, a display control signal indicating whether or not to superimpose the related information, based on the synchronization control of the data synchronization control unit 130. With this arrangement, even when the video analysis unit 111 performs complex video analysis that takes a long time, there is no need to wait for the analysis of each frame to finish, so the video display does not stall (no video delay arises from waiting for the superimposition decision). The user can therefore watch without frustration.

When stored data is played back as described above, video analysis may also be performed in advance at the time of storage, and the importance at each time may be saved together with the stored program data, for example as the metadata shown in FIG. 4. The metadata shown in FIG. 4 is data that contains, for each program, the importance at each time.
In such a case, as in the information processing apparatus 100#1 according to the first modification shown in FIG. 5, the data processing unit 101#1 can be configured to include a metadata analysis unit 117 that analyzes the stored metadata, identifies the importance corresponding to the playback time of the video data, and gives the identified importance to the related information control unit 113#1, so that display of the related information can be controlled. The metadata analysis unit 117 gives the information output determination value for the corresponding time to the related information control unit 113#1 in accordance with the synchronization control of the data synchronization control unit 130. With this configuration, there is no need to perform relatively heavy video analysis during playback or before playback starts. The video display does not stall, and there is also no wait for analysis results before playback starts, so the user can watch without frustration.
Although not shown in FIG. 5, metadata such as that shown in FIG. 4 may be generated by the video analysis unit 111 and stored in a storage medium or the like.
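The lookup performed by the metadata analysis unit 117 can be illustrated with a short sketch. The layout of the metadata in FIG. 4 is not reproduced in this text, so the representation below (a time-sorted list of `(start_time, importance)` pairs, each valid until the next entry) and the function name `importance_at` are assumptions made only for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple


def importance_at(metadata: List[Tuple[float, float]], playback_time: float) -> float:
    """Return the stored importance that applies at playback_time.

    metadata is assumed to be sorted by start time; each entry's importance
    applies from its start time until the next entry begins.
    """
    times = [t for t, _ in metadata]
    i = bisect_right(times, playback_time) - 1
    if i < 0:
        # Assumption: before the first entry, treat the frame as not important.
        return 0.0
    return metadata[i][1]
```

Because the importance values are precomputed, the per-frame cost during playback is a binary search rather than a video analysis pass.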

The description of the information processing apparatus 100 according to the first embodiment has referred to "superimposing" the related information, but the present invention is not limited to superimposition on the main video and is also effective when the related information is displayed on a separate screen. For example, as in the information processing apparatus 100#2 according to the second modification shown in FIG. 6, the apparatus may include, in addition to the video display unit 120 serving as a first video display unit, a telop display unit 140 serving as a second video display unit that displays an image of the related information given from the related information buffer 114#2. With this configuration, as shown in FIG. 7, the video display unit 120 displays the main video MIM corresponding to the video signal given from the video output unit 116, and the telop display unit 140 displays the sub video SIM of the related information. Even when the two are not superimposed on one screen, displaying the related information separately from the main video can still disturb viewing of the main content in an environment where both videos are within the user's field of view, so the same effect as described above is obtained.

Embodiment 2.
FIG. 8 is a block diagram schematically showing the configuration of the information processing apparatus 200 according to the second embodiment.
The information processing apparatus 200 includes a data processing unit 201, a video display unit 120, and a data synchronization control unit 130. The information processing apparatus 200 according to the second embodiment differs from the information processing apparatus 100 according to the first embodiment in the data processing unit 201.

The data processing unit 201 receives video data, related information, audio data, and user input, and outputs a video signal. The data processing unit 201 includes a control unit 210, a related information buffer 114, a related information superimposing unit 115, and a video output unit 116. The data processing unit 201 in the second embodiment differs from the data processing unit 101 in the first embodiment in the processing performed by the control unit 210.

The control unit 210 determines whether or not the frame image of the video data is an image of an important scene. When the frame image of the video data is not an image of an important scene, the control unit 210 causes the related information buffer 114 to output the related information so that the image of the related information is displayed together with the frame image of the video data. When the frame image of the video data is an image of an important scene, the control unit 210 does not cause the related information buffer 114 to output the related information, so that the image of the related information is not displayed together with the frame image of the video data.
Here, the control unit 210 in the second embodiment determines that the frame image of the video data is an image of an important scene in at least one of the following cases: the frame image of the video data is an image at a scene change, or the frame image of the video data is an image of a conversation scene.
The control unit 210 includes a video analysis unit 111, a user input reception unit 212, a related information control unit 213, an audio data buffer 218, and an audio analysis unit 219. The control unit 210 in the second embodiment differs from the control unit 110 in the first embodiment in the processing performed by the user input reception unit 212 and the related information control unit 213, and in that it further includes the audio data buffer 218 and the audio analysis unit 219. In the second embodiment, the importance calculated by the video analysis unit 111 is referred to as the first importance.

The audio data buffer 218 is an audio data holding unit that holds audio data. The audio data buffer 218 receives audio data as input, holds it temporarily, and outputs it to the audio analysis unit 219 as needed (details are described later).
The audio analysis unit 219 analyzes the audio data accompanying the video data and calculates a second importance of the frame image of the video data. In the present embodiment, the second importance is an utterance component value indicating the probability that the frame image of the video data is an image of a conversation scene. The utterance component value expresses, as a probability, whether or not performers or the like are in the middle of a conversation.

The user input reception unit 212 receives user input data. In the present embodiment, the user input reception unit 212 receives two items as user input data. The first is calculation information for calculating the information output determination value, which is a third importance, from the first importance obtained from the video analysis unit 111 and the second importance obtained from the audio analysis unit 219. The second is a threshold that is compared with the information output determination value to determine whether or not to display the related information. In the present embodiment, the calculation information indicates the weight values by which the first importance and the second importance are respectively multiplied.

The related information control unit 213 controls the related information buffer 114. In the present embodiment, for example, the related information control unit 213 combines the first importance given from the video analysis unit 111 with the second importance given from the audio analysis unit 219 to calculate the information output determination value, which is the third importance. For example, the related information control unit 213 multiplies the first importance and the second importance by the respective weight values indicated by the calculation information given from the user input reception unit 212, and takes the sum of the products as the information output determination value. When the related information control unit 213 determines that the calculated information output determination value is smaller than the threshold given by the user input reception unit 212, it determines that the frame image of the video data is not an image of an important scene and controls the related information buffer 114 to output the related information stored in it.
When the calculated information output determination value is equal to or greater than the threshold given by the user input reception unit 212, the related information control unit 213 determines that the frame image of the video data is an image of an important scene and controls the related information buffer 114 not to output the related information, in other words, to keep the related information stored. The related information control unit 213 controls the related information buffer 114 by giving it a display control signal.
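The weighted combination and threshold comparison performed by the related information control unit 213 can be written out as a short sketch. The function and parameter names (`information_output_value`, `should_output`, `w_video`, `w_audio`) are hypothetical; the arithmetic follows the description above, with the weights and threshold standing in for the user input data.

```python
def information_output_value(first: float, second: float,
                             w_video: float, w_audio: float) -> float:
    """Third importance: weighted sum of the first (video) and second (audio) importance."""
    return w_video * first + w_audio * second


def should_output(first: float, second: float,
                  w_video: float, w_audio: float, threshold: float) -> bool:
    """True when the frame is judged not important, so related information is output."""
    return information_output_value(first, second, w_video, w_audio) < threshold
```

Setting `w_video` to 0 makes the decision depend only on the audio analysis result, and setting `w_audio` to 0 makes it depend only on the video analysis result.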

Next, the operation of the information processing apparatus 200 according to the second embodiment will be described.
FIG. 9 is a flowchart showing the operation of the information processing apparatus 200 according to the second embodiment. Among the processes shown in FIG. 9, processes that are the same as those in FIG. 2 are given the same reference signs as in FIG. 2.

Steps S10 and S11 in FIG. 9 are the same as steps S10 and S11 in FIG. 2.

In step S20, the user input reception unit 212 receives user input. For example, the user input reception unit 212 receives, as user input data, the calculation information indicating the weight values by which the first importance and the second importance are multiplied, and the threshold for the information output determination value, which is the third importance, and gives them to the related information control unit 213. The calculation information is a setting that indicates how much weight the video analysis result and the audio analysis result are each given in controlling the display of the related information. For example, the calculation information can be set so that the superimposition timing is controlled with more emphasis on the video analysis result than on the audio analysis result, or so that the video analysis result is ignored and the timing control reflects only the audio analysis result.

In step S21, the audio data buffer 218 receives audio data. The audio data buffer 218 then temporarily holds as much audio data as is needed to calculate the utterance component value (details are described later) (S22). After that, the audio data buffer 218 gives the held audio data to the audio analysis unit 219.

Next, the audio analysis unit 219 receives audio data for a fixed time interval from the audio data buffer 218 and calculates the utterance component value for that audio data (S23). One common method of calculating the utterance component value is as follows. The audio analysis unit 219 first applies a Fourier transform to the received audio data to convert it into frequency components x(f, t). The audio analysis unit 219 also calculates a noise component (non-utterance component) λ(f) for each frequency from the beginning of the program, under an assumption such as "there is no conversation during the first few seconds of the program". The audio analysis unit 219 then obtains the ratio between these at every frequency using equation (2) below.

The audio analysis unit 219 then applies a nonlinear transformation according to equation (3) below and averages the result over all frequencies under consideration to calculate the utterance component value G(t).

Here, F is the set of frequencies under consideration, and |F| is the number of elements in the set F. The nonlinear transformation used here is derived from the likelihood ratio obtained when the observed signal x(f, t) is classified into noise and utterance and each is modeled by a Gaussian distribution with a different variance.
Using this calculation method, the audio analysis unit 219 calculates, as a numerical value, the probability that characters or the like are conversing at the time of interest, and outputs it to the related information control unit 213. In other words, the audio analysis unit 219 gives the utterance component value calculated in this way to the related information control unit 213 as the utterance component value for the frame image of the video data that corresponds to the audio data of the fixed time interval held in step S22.
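Equations (2) and (3) are not reproduced in this text. As an illustration only, one form that matches the verbal description (a per-frequency ratio of the observed component to the noise component, followed by a nonlinear transformation derived from a Gaussian likelihood ratio and an average over F) is the following. The symbol γ(f, t), the use of power rather than amplitude, and the specific nonlinear term are assumptions, and the equations in the original publication may differ.

```latex
% Assumed form of equation (2): ratio of observed power to noise power,
% treating \lambda(f) as the noise power at frequency f.
\gamma(f,t) = \frac{\lvert x(f,t) \rvert^{2}}{\lambda(f)}

% Assumed form of equation (3): log-likelihood ratio of the two Gaussian
% models with the speech-to-noise ratio replaced by its estimate
% \gamma(f,t) - 1, averaged over the frequency set F.
G(t) = \frac{1}{\lvert F \rvert} \sum_{f \in F}
       \bigl( \gamma(f,t) - \log \gamma(f,t) - 1 \bigr)
```

Under this assumed form, each term is zero when the observed power equals the noise power and grows as the two diverge, so G(t) is larger when the interval departs further from the noise estimate.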

The related information control unit 213 receives the frame difference value as the first importance from the video analysis unit 111, the utterance component value as the second importance from the audio analysis unit 219, and the user input data from the user input reception unit 212. The related information control unit 213 then calculates the information output determination value from this information. For example, based on the calculation information indicating how much weight the video analysis result and the audio analysis result are each given, the related information control unit 213 multiplies the frame difference value and the utterance component value by their respective weight values and adds the two products to calculate the information output determination value. The process then proceeds to step S13.

The processing in steps S13 to S19 in FIG. 9 is the same as the processing in steps S13 to S19 in FIG. 2.

Because audio data for a fixed time interval is needed to calculate the utterance component value, the audio data buffer 218 has to hold audio data temporarily, and as a result the analysis of the audio data may lag well behind the analysis of the video data. The data synchronization control unit 130 is responsible for performing synchronization control to absorb this time lag.

As described above, the information processing apparatus 200 according to the second embodiment can analyze the importance of the content displayed at each time on the basis of audio data as well as video data, and can control the timing at which the related information is displayed according to the analyzed importance. For example, conversation scenes are important points for a user watching a program and following the overall story, and the related information can be displayed while avoiding the times at which the main video shows important content such as a conversation scene. This prevents the user from losing track of the overall story by missing important visual or audio content.

The information processing apparatus 200 according to the second embodiment can also perform related information superimposition control that reflects only one of the video analysis result and the audio analysis result. That is, a user who wants to concentrate on the video can provide user input so that only the video analysis result is reflected, and a user who wants to concentrate on the audio can provide user input so that only the audio analysis result is reflected.

In the second embodiment as well, by including the information output determination value (third importance) of the second embodiment in the metadata shown in FIG. 4, control that takes conversation scenes into account can be performed efficiently with the configuration shown in FIG. 5.

In the second embodiment as well, an image of the related information can be displayed on the telop display unit 140 by replacing the control unit 110 shown in FIG. 6 with the control unit 210.
Furthermore, in the second embodiment as well, the related information control unit 213 may be configured so that, when it determines that the frame image of the video data is an image of an important scene, it does not cause the related information buffer 114 to output the related information until a predetermined period has elapsed.
In addition, the related information buffer 114, which serves as the related information holding unit, may output related information whose holding period has exceeded a predetermined period, and may output the related information when the amount of related information data it holds becomes equal to or greater than a predetermined amount of data.

Although not illustrated, the information processing apparatuses 100 and 200 described above may include a receiving unit that receives digital broadcasts. In that case, the video data may be the output of a decoder (decoding unit) of the receiving unit, and the subtitle data of the related information may be the output of a demultiplexer (demultiplexing unit) of the receiving unit. Also not illustrated, the information processing apparatuses 100 and 200 described above may include an information control unit that generates, based on information obtained from the receiving unit and the like, the keywords extracted from subtitle data, the program's performer information, telops, and menu screens that make up the related information. In that case, the related information input to the related information buffer 114 may be the related information generated by the information control unit. Further, although not illustrated, the information processing apparatuses 100 and 200 described above may include an input unit that accepts input from the user. In that case, the user input data may be the data entered through the input unit.

The information processing apparatuses 100 and 200 described above include the video display unit 120, but the video display unit 120 may instead be provided in a device external to the information processing apparatuses 100 and 200.
For example, the information processing apparatuses 100 and 200 can be used as a television, an STB (Set Top Box), a DVD or BD player, a smartphone, a car navigation system, or the like.

100, 100#1, 100#2, 200: information processing apparatus; 101, 101#1, 101#2, 201: data processing unit; 110, 210: control unit; 111: video analysis unit; 112, 212: user input reception unit; 113, 113#1, 213: related information control unit; 114, 114#2: related information buffer; 115, 115#2: related information superimposing unit; 115a, 115a#2: video data buffer; 116: video output unit; 117: metadata analysis unit; 218: audio data buffer; 219: audio analysis unit; 120: video display unit; 130: data synchronization control unit.

Claims

An information processing apparatus comprising:
a related information holding unit that holds related information of video data; and
a control unit that determines whether or not a frame image of the video data is an image of an important scene, causes the related information holding unit to output the related information so that an image of the related information is displayed together with the frame image of the video data when the frame image of the video data is not an image of an important scene, and does not cause the related information holding unit to output the related information, so that the image of the related information is not displayed together with the frame image of the video data, when the frame image of the video data is an image of an important scene.

The information processing apparatus according to claim 1, wherein the control unit determines that the frame image of the video data is an image of an important scene when the frame image of the video data is an image at a scene change.

The information processing apparatus according to claim 2, wherein the control unit includes:
a video analysis unit that analyzes the video data and calculates an importance of the frame image of the video data; and
a related information control unit that, when the importance calculated by the video analysis unit is smaller than a predetermined threshold, determines that the frame image of the video data is not an image of an important scene and causes the related information holding unit to output the related information, and, when the importance calculated by the video analysis unit is equal to or greater than the predetermined threshold, determines that the frame image of the video data is an image of an important scene and does not cause the related information holding unit to output the related information.

The information processing apparatus according to claim 2, wherein the control unit includes:
a metadata analysis unit that analyzes metadata indicating, for each time, an importance of the frame image of the video data, and identifies the importance for each frame image of the video data based on the time; and
a related information control unit that, when the importance identified by the metadata analysis unit is smaller than a predetermined threshold, determines that the frame image of the video data is not an image of an important scene and causes the related information holding unit to output the related information, and, when the importance identified by the metadata analysis unit is equal to or greater than the predetermined threshold, determines that the frame image of the video data is an image of an important scene and does not cause the related information holding unit to output the related information.

The information processing apparatus according to claim 3, wherein the importance is a frame difference value indicating a difference between consecutive frame images.
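The frame difference value and the threshold comparison can be sketched as follows. The claims do not fix a formula for the frame difference, so the mean absolute pixel difference used here is an assumption, as are the flat-sequence frame representation and the function names.

```python
from typing import Sequence

# Hypothetical frame representation: a flat sequence of luminance values.
Frame = Sequence[int]


def frame_difference(previous: Frame, current: Frame) -> float:
    """Mean absolute difference between two consecutive frames (assumed metric)."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of pixels")
    if not current:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(previous, current))
    return total / len(current)


def is_important_scene(importance: float, threshold: float) -> bool:
    """A frame is treated as an important scene when importance >= threshold."""
    return importance >= threshold
```

A large difference between consecutive frames then marks a likely scene change, and related information is withheld for that frame.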

The information processing apparatus according to any one of claims 3 to 5, wherein the control unit further comprises a user input reception unit that receives the predetermined threshold as user input data input by a user.

The information processing apparatus according to claim 1, wherein the control unit determines that the frame image of the video data is an image of an important scene in at least one of a case where the frame image is an image in which the scene has changed and a case where the frame image is an image of a conversation scene.

The information processing apparatus according to claim 7, wherein the control unit comprises:
a video analysis unit that analyzes the video data and calculates a first importance of the frame image of the video data;
an audio analysis unit that analyzes audio data attached to the video data and calculates a second importance of the frame image of the video data; and
a related information control unit that, when a third importance obtained by combining the first importance calculated by the video analysis unit and the second importance calculated by the audio analysis unit is smaller than a predetermined threshold, determines that the frame image of the video data is not an image of an important scene and causes the related information to be output from the related information holding unit, and that, when the third importance is greater than or equal to the predetermined threshold, determines that the frame image of the video data is an image of an important scene and does not cause the related information to be output from the related information holding unit.

The information processing apparatus according to claim 7, wherein the control unit comprises:
a metadata analysis unit that analyzes metadata indicating, for each time, a third importance obtained by combining a first importance of the frame image of the video data calculated from the video data and a second importance of the frame image of the video data calculated from audio data attached to the video data, and that specifies the third importance of each frame image of the video data based on the time; and
a related information control unit that, when the third importance specified by the metadata analysis unit is smaller than a predetermined threshold, determines that the frame image of the video data is not an image of an important scene and causes the related information to be output from the related information holding unit, and that, when the third importance specified by the metadata analysis unit is greater than or equal to the predetermined threshold, determines that the frame image of the video data is an image of an important scene and does not cause the related information to be output from the related information holding unit.

The information processing apparatus according to claim 8, wherein
the first importance is a frame difference value indicating a difference between consecutive frame images,
the second importance is an utterance component value indicating a probability that the frame image of the video data is an image of a conversation scene, and
the third importance is the sum of the frame difference value and the utterance component value.

The information processing apparatus according to any one of claims 8 to 10, wherein the control unit further comprises a user input reception unit that receives the predetermined threshold as user input data input by a user.

The information processing apparatus according to claim 11, wherein
the user input reception unit further receives, as the user input data, calculation information indicating respective weight values by which the first importance calculated by the video analysis unit and the second importance calculated by the audio analysis unit are to be multiplied, and
the related information control unit multiplies the first importance calculated by the video analysis unit and the second importance calculated by the audio analysis unit by the respective weight values indicated by the calculation information received by the user input reception unit, and then adds the results.
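The weighted combination of the first and second importance can be sketched as below. The parameter names and the default weights of 1.0 (which reduce to a plain sum) are assumptions made for illustration; the claims state only that each importance is multiplied by its weight value and the products are added.

```python
def combined_importance(
    frame_difference_value: float,
    utterance_component_value: float,
    video_weight: float = 1.0,
    audio_weight: float = 1.0,
) -> float:
    """Third importance: weighted sum of the first and second importance.

    The weights stand in for the user-supplied calculation information;
    with both weights at 1.0 this is the unweighted sum.
    """
    return (video_weight * frame_difference_value
            + audio_weight * utterance_component_value)


def is_important_scene(third_importance: float, threshold: float) -> bool:
    """A frame is treated as an important scene when importance >= threshold."""
    return third_importance >= threshold
```

Raising the audio weight makes conversation scenes more likely to suppress related information, while raising the video weight favors scene changes.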

The information processing apparatus according to any one of claims 3 to 6 and 8 to 12, wherein, when the related information control unit determines that the frame image of the video data is an image of an important scene, the related information control unit does not cause the related information to be output from the related information holding unit until a predetermined period elapses.
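One way to picture this hold-off behavior is a small gate object that remembers when output may resume. This is a sketch under stated assumptions: the class name `RelatedInfoGate`, the use of seconds as the time unit, and the choice to restart the period on every important frame are all hypothetical and are not specified by the claims.

```python
from typing import Optional


class RelatedInfoGate:
    """Decides, frame by frame, whether related information may be output."""

    def __init__(self, threshold: float, hold_seconds: float) -> None:
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._blocked_until: Optional[float] = None

    def may_output(self, importance: float, now: float) -> bool:
        # Important scene: withhold output and (re)start the hold-off period.
        if importance >= self.threshold:
            self._blocked_until = now + self.hold_seconds
            return False
        # Not important, but the hold-off period has not yet elapsed.
        if self._blocked_until is not None and now < self._blocked_until:
            return False
        return True
```

With a threshold of 0.5 and a hold-off of 2 seconds, an important frame at t = 1 keeps output suppressed until t = 3 even if the following frames are unimportant.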

The information processing apparatus according to claim 1, wherein the related information holding unit outputs related information whose holding period has exceeded a predetermined period.

The information processing apparatus according to one item, wherein the related information holding unit outputs the related information when the data amount of the held related information exceeds a predetermined data amount.
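The two forced-output conditions of the related information holding unit (holding period exceeded, data amount exceeded) can be sketched with a simple queue. The class name `RelatedInfoBuffer`, the byte-string payloads, and the policy of releasing the oldest items first until the held amount is back within the limit are assumptions; the claims state only the conditions that trigger output.

```python
from collections import deque
from typing import Deque, List, Tuple


class RelatedInfoBuffer:
    """Holds related information and releases it when it is too old or too large."""

    def __init__(self, max_age_seconds: float, max_bytes: int) -> None:
        self.max_age_seconds = max_age_seconds
        self.max_bytes = max_bytes
        self._items: Deque[Tuple[float, bytes]] = deque()  # (arrival time, payload)
        self._held_bytes = 0

    def hold(self, payload: bytes, now: float) -> None:
        self._items.append((now, payload))
        self._held_bytes += len(payload)

    def _pop_oldest(self) -> bytes:
        _, payload = self._items.popleft()
        self._held_bytes -= len(payload)
        return payload

    def forced_output(self, now: float) -> List[bytes]:
        """Return items that must be output regardless of scene importance."""
        released: List[bytes] = []
        # Condition 1: the holding period of an item has exceeded the limit.
        while self._items and now - self._items[0][0] > self.max_age_seconds:
            released.append(self._pop_oldest())
        # Condition 2: the held data amount exceeds the limit (assumed policy:
        # release oldest items until the amount is within the limit again).
        while self._items and self._held_bytes > self.max_bytes:
            released.append(self._pop_oldest())
        return released
```

Releasing items this way bounds both the delay of the related information and the memory used to hold it during long important scenes.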

The information processing apparatus according to any one of the above, further comprising a related information superimposing unit that generates output video data by superimposing the image of the related information output from the related information holding unit on the frame image of the video data.

The information processing apparatus according to claim 16, further comprising a video output unit that generates a video signal of the output video data generated by the related information superimposing unit.

The information processing apparatus according to claim 17, further comprising a video display unit that displays a video based on the video signal generated by the video output unit.

The information processing apparatus according to claim 1, further comprising:
a video output unit that generates a video signal of the video data;
a first video display unit that displays video based on the video signal generated by the video output unit; and
a second video display unit that displays an image of the related information output from the related information holding unit.

An information processing method comprising:
a related information holding process of holding related information of video data; and
a control process of determining whether a frame image of the video data is an image of an important scene, outputting the related information held in the related information holding process, so that an image of the related information is displayed together with the frame image of the video data, when the frame image is not an image of an important scene, and not outputting the related information held in the related information holding process, so that the image of the related information is not displayed together with the frame image of the video data, when the frame image is an image of an important scene.

The information processing method according to claim 20, wherein the control process determines that the frame image of the video data is an image of an important scene when the frame image is an image in which the scene has changed.

The information processing method according to claim 21, wherein the control process comprises:
a video analysis process of analyzing the video data and calculating an importance of the frame image of the video data; and
a related information control process of, when the importance calculated in the video analysis process is smaller than a predetermined threshold, determining that the frame image of the video data is not an image of an important scene and outputting the related information held in the related information holding process, and, when the importance calculated in the video analysis process is greater than or equal to the predetermined threshold, determining that the frame image of the video data is an image of an important scene and not outputting the related information held in the related information holding process.

The information processing method according to claim 21, wherein the control process comprises:
a metadata analysis process of analyzing metadata indicating, for each time, the importance of the frame image of the video data, and specifying the importance of each frame image of the video data based on the time; and
a related information control process of, when the importance specified in the metadata analysis process is smaller than a predetermined threshold, determining that the frame image of the video data is not an image of an important scene and outputting the related information held in the related information holding process, and, when the importance specified in the metadata analysis process is greater than or equal to the predetermined threshold, determining that the frame image of the video data is an image of an important scene and not outputting the related information held in the related information holding process.

The information processing method according to claim 22 or 23, wherein the importance is a frame difference value indicating a difference between consecutive frame images.

The information processing method according to any one of claims 22 to 24, wherein the control process further comprises a user input reception process of receiving the predetermined threshold as user input data input by a user.

The information processing method according to claim 20, wherein the control process determines that the frame image of the video data is an image of an important scene in at least one of a case where the frame image is an image in which the scene has changed and a case where the frame image is an image of a conversation scene.

The information processing method, wherein the control process comprises:
a video analysis process of analyzing the video data and calculating a first importance of the frame image of the video data;
an audio analysis process of analyzing audio data attached to the video data and calculating a second importance of the frame image of the video data; and
a related information control process of, when a third importance obtained by combining the first importance calculated in the video analysis process and the second importance calculated in the audio analysis process is smaller than a predetermined threshold, determining that the frame image of the video data is not an image of an important scene and outputting the related information held in the related information holding process, and, when the third importance is greater than or equal to the predetermined threshold, determining that the frame image of the video data is an image of an important scene and not outputting the related information held in the related information holding process.

The information processing method according to claim 26, wherein the control process comprises:
a metadata analysis process of analyzing metadata indicating, for each time, a third importance obtained by combining a first importance of the frame image of the video data calculated from the video data and a second importance of the frame image of the video data calculated from audio data attached to the video data, and specifying the third importance of each frame image of the video data based on the time; and
a related information control process of, when the third importance specified in the metadata analysis process is smaller than a predetermined threshold, determining that the frame image of the video data is not an image of an important scene and outputting the related information held in the related information holding process, and, when the third importance specified in the metadata analysis process is greater than or equal to the predetermined threshold, determining that the frame image of the video data is an image of an important scene and not outputting the related information held in the related information holding process.

The information processing method according to claim 27 or 28, wherein
the first importance is a frame difference value indicating a difference between consecutive frame images,
the second importance is an utterance component value indicating a probability that the frame image of the video data is an image of a conversation scene, and
the third importance is the sum of the frame difference value and the utterance component value.

The information processing method according to any one of claims 21 to 28, wherein the control process further comprises a user input reception process of receiving the predetermined threshold as user input data input by a user.

The information processing method according to claim 30, wherein
the user input reception process further receives, as the user input data, calculation information indicating respective weight values by which the first importance calculated in the video analysis process and the second importance calculated in the audio analysis process are to be multiplied, and
the related information control process multiplies the first importance calculated in the video analysis process and the second importance calculated in the audio analysis process by the respective weight values indicated by the calculation information received in the user input reception process, and then adds the results.

The information processing method according to any one of claims 22 to 25 and 27 to 31, wherein, when the related information control process determines that the frame image of the video data is an image of an important scene, the related information held in the related information holding process is not output until a predetermined period elapses.

The information processing method according to any one of claims 20 to 32, wherein the related information holding process outputs related information whose holding period has exceeded a predetermined period.

The information processing method according to one item, wherein the related information holding process outputs the related information when the data amount of the held related information exceeds a predetermined data amount.

The information processing method according to any one of claims 20 to 34, further comprising a related information superimposing process of generating output video data by superimposing an image of the related information, output after being held in the related information holding process, on the frame image of the video data.

The information processing method according to claim 35, further comprising a video output process of generating a video signal of the output video data generated in the related information superimposing process.

The information processing method according to claim 36, further comprising a video display process of displaying a video based on the video signal generated in the video output process.

The information processing method according to any one of claims 20 to 34, further comprising:
a video output process of generating a video signal of the video data;
a first video display process of displaying video based on the video signal generated in the video output process; and
a second video display process of displaying an image of the related information output after being held in the related information holding process.