JP2012222450A

JP2012222450A - Similar video output method, similar video output apparatus and similar video output program

Info

Publication number: JP2012222450A
Application number: JP2011083739A
Authority: JP
Inventors: Taiga Yoshida; 大我吉田; Takeshi Irie; 豪入江; Takashi Sato; 隆佐藤; Akira Kojima; 明小島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-04-05
Filing date: 2011-04-05
Publication date: 2012-11-12
Anticipated expiration: 2031-04-05
Also published as: JP5627002B2

Abstract

PROBLEM TO BE SOLVED: To output pieces of video data which have similar impressions.

SOLUTION: A similar video output apparatus 1 includes: a video data storage part 12 in which a plurality of pieces of video data are stored; structural feature similarity calculation means 22 for calculating, as a structural feature amount for each of the plurality of pieces of video data, the time in which a structural feature appears in each section obtained by dividing the video data, calculating structural feature similarity between the video data on the basis of the structural feature amount, and outputting structural feature similarity data; and similar video determination means 24 for extracting video data similar to reference video data from the video data storage part 12 on the basis of the structural feature similarity data.

Description

The present invention relates to a similar video output method, a similar video output device, and a similar video output program for outputting similar video data.

With the recent development of information processing apparatuses, the volume of electronic content such as video data has increased enormously. For example, on IPTV (Internet Protocol TeleVision) services and video sharing sites, a user can select video data of interest from a large amount of video data and view the selected video data at any time. For example, by specifying a keyword or genre for the desired video data, the user can view video data that matches the specified conditions. Thus, when the user has a clear idea of the desired video data, it is easy to specify search conditions and find the desired video data among the search results.

However, when the user does not have a clear idea of the desired video, the user cannot specify appropriate search conditions. It is therefore difficult for the user to find the desired video by specifying search conditions. In addition, because search conditions must be specified every time the user wants to view video data, the user may find the operation bothersome.

One approach to solving such problems is video recommendation technology. Video recommendation is a technique that makes it easier for a user to discover videos by presenting other related videos when the user views or rates a video.

Video recommendation methods are broadly divided into collaborative filtering and content-based filtering.

Collaborative filtering is a technique that uses users' viewing histories and rating histories for videos to analyze whether users, or videos, are similar to one another, and recommends on that basis. For example, there is a method of recommending items that have been rated by users in a similar way (see, for example, Non-Patent Document 1).

Collaborative filtering recommendation methods, such as the technique of Non-Patent Document 1, use the viewing history to analyze the relationship between videos. The technique described in Non-Patent Document 1 regards users who have viewed or rated the same videos as having similar preferences, and recommends to a user the videos that users with similar preferences have viewed but that the user has not yet viewed.

However, because collaborative filtering recommends on the basis of user histories, when the amount of viewing history is small it cannot accurately analyze whether the users who viewed a video are similar, and therefore cannot make effective recommendations. For example, it could not make effective recommendations to users who rarely use the service or have only just registered, or for videos that are buried or have been newly added to the service.

Whereas collaborative filtering requires a large amount of viewing history, content-based filtering can make recommendations even when the amount of history is small, based on metadata attached to the videos and on video feature information extracted from the videos. With content-based filtering, videos can be recommended even when little history is available, such as recommendations for new users who have just registered with the service or recommendations of buried videos. As a result, more users can be encouraged to use the service and to view more videos.

Research on content-based filtering includes a method that measures how frequently each attribute appears in content the user has rated highly and recommends content that has frequently appearing attributes (Patent Document 1). There is also a method that makes recommendations based on analysis of video features, without using information about the user or metadata attached to the videos (for example, Non-Patent Document 2).

The technique of Non-Patent Document 2 calculates the similarity between videos based on their visual and audio information. As visual information it uses a color histogram, the intensity of motion, and the average number of shots per second; as audio information it uses the mean and variance of the audio tempo.

Content-based filtering recommendation methods generally recommend videos that share content or genre with a given video.

Patent Document 1: JP 2009-205418 A

Non-Patent Document 1: B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, "Item-Based Collaborative Filtering Recommendation Algorithms," in Proceedings of the 10th International Conference on World Wide Web, 2001.

Non-Patent Document 2: B. Yang, T. Mei, X.-S. Hua, L. Yang, S.-Q. Yang, and M. Li, "Online Video Recommendation Based on Multimodal Fusion and Relevance Feedback," in Proc. ACM International Conference on Image and Video Retrieval (CIVR), pp. 73-80, 2007.

However, when recommending videos, it is important not only to recommend videos that share content or genre, but also to recommend videos that give viewers a similar impression. For example, a user who has viewed a video with a bright atmosphere is more likely to watch a recommended video that also has a bright atmosphere than one with a dark atmosphere.

In content-based filtering, the methods for analyzing the relationship between videos include using metadata attached to the videos and analyzing video features.

The method of Patent Document 1 determines the videos to recommend based on attributes and keywords in the metadata. Therefore, when the metadata contains no attributes or keywords relating to the impression of a video, or when no metadata is attached at all, videos with similar impressions could not be recommended.

The technique disclosed in Non-Patent Document 2, which analyzes video features in order to analyze the relationship between videos, analyzes the entire video. Consequently, relatively unimportant scenes that leave no impression on the viewer are also included in the similarity calculation, and the technique may recommend videos whose unimportant scenes have similar features but which give the viewer a different impression. In addition, because this technique uses the mean or variance of each feature over the entire video for the similarity calculation, it cannot compare information along the time axis of the video, such as a feature appearing mostly in the first half of the video. For these reasons, videos with similar impressions could not be recommended appropriately.
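The time-axis limitation described above can be illustrated with a minimal Python sketch. The per-section motion-intensity values, the function names, and the absolute-difference comparison are all assumptions made for illustration; none of them come from Non-Patent Document 2 or from this publication.

```python
from statistics import mean, pvariance

# Hypothetical per-section motion intensity for two videos (illustrative values).
# Video A is active in its first half, video B in its second half.
video_a = [0.9, 0.8, 0.1, 0.2]
video_b = [0.2, 0.1, 0.8, 0.9]


def global_stats(values):
    """Whole-video summary: (mean, population variance)."""
    return (mean(values), pvariance(values))


def per_section_distance(a, b):
    """Sum of absolute differences between corresponding sections."""
    return sum(abs(x - y) for x, y in zip(a, b))


stats_a = global_stats(video_a)
stats_b = global_stats(video_b)
section_gap = per_section_distance(video_a, video_b)
```

The two videos have the same mean and variance, so a whole-video summary cannot tell them apart, while the section-by-section comparison yields a clearly non-zero distance.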

Accordingly, an object of the present invention is to provide a similar video output method, a similar video output device, and a similar video output program for outputting video data having similar impressions.

To solve the above problem, a first feature of the present invention relates to a similar video output method for outputting similar video data. The similar video output method according to the first feature of the present invention includes: a structural feature similarity calculation step of, for each of a plurality of video data stored in a video data storage unit, calculating as a structural feature amount the time in which a structural feature appears in each section obtained by dividing the video data, calculating the structural feature similarity between video data based on the structural feature amounts, and outputting structural feature similarity data; and a similar video determination step of extracting video data similar to reference video data from the video data storage unit based on the structural feature similarity data.

Here, the structural feature is one or more of the color, motion, acoustic feature, cut division, music section, speech section, and telop section within a section.

The method may further include an important scene feature similarity calculation step of, for each of the plurality of video data, calculating the feature amount of an important scene included in the video data as an important scene feature amount, calculating the important scene feature similarity between video data based on the important scene feature amounts, and outputting important scene feature similarity data. In this case, the similar video determination step extracts video data similar to the reference video data from the video data storage unit based on the structural feature similarity data and the important scene feature similarity data.

Here, the important scene feature is one or more of the color, motion, and acoustic feature of the important scene.

A second feature of the present invention relates to a similar video output method for outputting similar video data. The similar video output method according to the second feature of the present invention includes: an important scene feature similarity calculation step of, for each of a plurality of video data stored in a video data storage unit, calculating the feature amount of an important scene included in the video data as an important scene feature amount, calculating the important scene feature similarity between video data based on the important scene feature amounts, and outputting important scene feature similarity data; and a similar video determination step of extracting video data similar to reference video data from the video data storage unit based on the important scene feature similarity data.

A third feature of the present invention relates to a similar video output device that outputs similar video data. The similar video output device according to the third feature of the present invention includes: a video data storage unit in which a plurality of video data are stored; structural feature similarity calculation means for, for each of the plurality of video data, calculating as a structural feature amount the time in which a structural feature appears in each section obtained by dividing the video data, calculating the structural feature similarity between video data based on the structural feature amounts, and outputting structural feature similarity data; and similar video determination means for extracting video data similar to reference video data from the video data storage unit based on the structural feature similarity data.

The device may further include important scene feature similarity calculation means for, for each of the plurality of video data, calculating the feature amount of an important scene included in the video data as an important scene feature amount, calculating the important scene feature similarity between video data based on the important scene feature amounts, and outputting important scene feature similarity data. In this case, the similar video determination means extracts video data similar to the reference video data from the video data storage unit based on the structural feature similarity data and the important scene feature similarity data.

A fourth feature of the present invention relates to a similar video output device that outputs similar video data. The similar video output device according to the fourth feature of the present invention includes: a video data storage unit in which a plurality of video data are stored; important scene feature similarity calculation means for, for each of the plurality of video data, calculating the feature amount of an important scene included in the video data as an important scene feature amount, calculating the important scene feature similarity between video data based on the important scene feature amounts, and outputting important scene feature similarity data; and similar video determination means for extracting video data similar to reference video data from the video data storage unit based on the important scene feature similarity data.

A fifth feature of the present invention is a similar video output program for causing a computer to execute the steps of the similar video output method according to the first feature or the second feature of the present invention.

According to the present invention, it is possible to provide a similar video output method, a similar video output device, and a similar video output program for outputting video data with similar impressions.

FIG. 1 is a flowchart for explaining a similar video output method according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of the similar video output device according to the embodiment of the present invention.
FIG. 3 is a diagram for explaining the data structure of the structural feature amount data according to the embodiment of the present invention, with example data.
FIG. 4 is a diagram for explaining the data structure of the structural feature similarity data according to the embodiment of the present invention, with example data.
FIG. 5 is a diagram for explaining the data structure of the important scene feature amount data according to the embodiment of the present invention, with example data.
FIG. 6 is a diagram for explaining the data structure of the important scene feature similarity data according to the embodiment of the present invention, with example data.
FIG. 7 is a diagram for explaining the data structure of the similarity data according to the embodiment of the present invention, with example data.
FIG. 8 is a diagram for explaining the data structure of the video management data according to the embodiment of the present invention, with example data.
FIG. 9 is a flowchart for explaining the structural feature similarity calculation processing according to the embodiment of the present invention.
FIG. 10 is a diagram for explaining the structural feature similarity calculation processing according to the embodiment of the present invention.
FIG. 11 is a flowchart for explaining the important scene feature similarity calculation processing according to the embodiment of the present invention.
FIG. 12 is a diagram for explaining the important scene feature similarity calculation processing according to the embodiment of the present invention.
FIG. 13 is a flowchart for explaining the similar video determination processing according to the embodiment of the present invention.

Next, an embodiment of the present invention will be described with reference to the drawings. In the following description of the drawings, the same or similar parts are denoted by the same or similar reference numerals.

(Embodiment)
The similar video output method according to the embodiment of the present invention outputs video data with similar impressions.

The similar video output method according to the embodiment analyzes video feature amounts and, based on whether the structural features of the videos are similar or whether the features of their important scenes are similar, outputs video data with a similar impression from among a plurality of candidate video data as video data to recommend to the user. The embodiment describes the case in which similar video data is determined based on both the structural feature similarity and the important scene feature similarity, but either one alone may be used. When the user inputs information on the reference video data, the similar video output method outputs video data similar to the reference video data as video data to recommend to the user. One or more video data may be recommended to the user. In the embodiment, the recipient of the recommendation is referred to as a user; this user may be a general user or a system that uses the similar video output method according to the embodiment.

An outline of the processing of the similar video output method according to the embodiment will be described with reference to FIG. 1. First, when reference video data is input in step S1, the method proceeds to step S2.

In step S2, a structural similarity calculation process is executed. In this process, the time in which a structural feature appears in each section obtained by dividing the video data is calculated as a structural feature amount, and the structural feature similarity between video data is calculated based on the structural feature amounts. Here, the structural features are, for example, the color, motion, acoustic feature, cut division, music section, speech section, and telop section within a section of the video data. When a video is viewed in chronological order, the point at which one shot changes to the next is called a cut position, and a shot is a single continuously filmed scene.
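As a rough illustration of step S2, the Python sketch below computes, for each section of a video, the number of seconds during which a feature (here, music) is present, and then compares two videos. The equal-length sections, the non-overlapping interval input, the choice of cosine similarity, and all names and values are assumptions for illustration; this section of the publication does not specify them.

```python
import math


def section_durations(intervals, video_length, n_sections):
    """Seconds during which a feature is present in each equal-length section.

    intervals: non-overlapping (start, end) times in seconds when the
    feature (e.g. music) appears in the video.
    """
    section_len = video_length / n_sections
    durations = []
    for i in range(n_sections):
        sec_start = i * section_len
        sec_end = sec_start + section_len
        # Add up the overlap between this section and every feature interval.
        overlap = sum(
            max(0.0, min(end, sec_end) - max(start, sec_start))
            for start, end in intervals
        )
        durations.append(overlap)
    return durations


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical music intervals for two 120-second videos, split into 4 sections.
music_a = section_durations([(0, 30), (100, 120)], video_length=120, n_sections=4)
music_b = section_durations([(0, 25), (95, 120)], video_length=120, n_sections=4)
structural_similarity = cosine_similarity(music_a, music_b)
```

Both hypothetical videos have music at the start and at the end, so their per-section duration vectors point in nearly the same direction and the similarity is close to 1.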

In step S3, a video similarity calculation process is executed. In this process, the feature amount of an important scene included in the video data is calculated as an important scene feature amount, and the important scene feature similarity between video data is calculated based on the important scene feature amounts. Here, the important scene feature is one or more of the color, motion, and acoustic feature of the important scene.
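A minimal Python sketch of step S3 follows, assuming that the important scenes have already been detected and that each one is summarized by a color histogram. The averaging over scenes, the histogram intersection measure, and the bin counts are illustrative assumptions, not the method stated in this section.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1 (left unchanged if empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


def mean_histogram(hists):
    """Bin-wise average of several histograms of equal length."""
    n = len(hists)
    return [sum(column) / n for column in zip(*hists)]


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms; 1.0 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical 3-bin color histograms of the important scenes of two videos.
scenes_a = [normalize([8, 1, 1]), normalize([6, 2, 2])]
scenes_b = [normalize([7, 2, 1])]

important_scene_similarity = histogram_intersection(
    mean_histogram(scenes_a), mean_histogram(scenes_b)
)
```

Because only important scenes contribute to the histograms, unimportant scenes cannot raise or lower the similarity, which is the point of restricting the comparison in this step.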

In step S4, similar video data is determined. Here, video data similar to the reference video data is identified based on the similarity between video data calculated from at least one of the structural feature similarity calculated in step S2 and the important scene feature similarity calculated in step S3.
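One possible way to derive a single similarity from "at least one of" the two similarities is sketched below in Python. The weighted average, the default weight of 0.5, and the function name are assumptions; this section does not state how the two values are combined.

```python
def combined_similarity(structural=None, important_scene=None, weight=0.5):
    """Combine the two similarities; either one may be omitted.

    weight is the share given to the structural feature similarity when
    both values are available (an assumed parameter, not from the source).
    """
    if structural is None and important_scene is None:
        raise ValueError("at least one similarity is required")
    if structural is None:
        return important_scene
    if important_scene is None:
        return structural
    return weight * structural + (1.0 - weight) * important_scene


both = combined_similarity(structural=0.8, important_scene=0.6)
structural_only = combined_similarity(structural=0.8)
```

With both inputs in the range 0 to 1 and a weight in the same range, the combined value also stays between 0 and 1.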

In step S5, information on the similar video data identified in step S4 is output. For example, the similar video output device 1 outputs a list of the titles, descriptions, recommendation scores, and so on of the identified similar video data.
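The output of step S5 might be assembled as in the following Python sketch. The dictionary layouts, the `top_k` parameter, and the sample identifiers, titles, and scores are hypothetical and used only to show the shape of such a list.

```python
def build_recommendation_list(reference_id, scores, catalog, top_k=3):
    """Return title, description, and score of the videos most similar
    to the reference video, highest recommendation score first.

    scores: {(video_id_1, video_id_2): recommendation score}
    catalog: {video_id: {"title": ..., "description": ...}}
    """
    candidates = []
    for (first, second), score in scores.items():
        # A pair is relevant if the reference video is on either side of it.
        if first == reference_id:
            candidates.append((second, score))
        elif second == reference_id:
            candidates.append((first, score))
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [
        {
            "title": catalog[video_id]["title"],
            "description": catalog[video_id]["description"],
            "score": score,
        }
        for video_id, score in candidates[:top_k]
    ]


# Hypothetical sample data.
catalog = {
    "MOV001": {"title": "Video 1", "description": "First sample"},
    "MOV002": {"title": "Video 2", "description": "Second sample"},
    "MOV003": {"title": "Video 3", "description": "Third sample"},
}
scores = {("MOV001", "MOV002"): 85, ("MOV001", "MOV003"): 40, ("MOV002", "MOV003"): 90}
recommended = build_recommendation_list("MOV003", scores, catalog, top_k=2)
```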

(Similar video output device)
The similar video output method according to the embodiment described with reference to FIG. 1 is realized by the similar video output device 1 shown in FIG. 2. The similar video output device 1 is a general-purpose computer including a storage device 10, a central processing control device 20, and a display device 30. The similar video output device 1 is implemented by installing, on a general-purpose computer, a similar video output program for executing predetermined processing. Alternatively, each component of the similar video output device 1 may be an API (Application Program Interface) or the like of a server device, realized by a program through which a client terminal provides video information via that API.

The storage device 10 includes a reference video data storage unit 11, a video data storage unit 12, a structural feature amount data storage unit 13, a structural feature similarity data storage unit 14, an important scene feature amount data storage unit 15, an important scene feature similarity data storage unit 16, a similarity data storage unit 17, a similar video list data storage unit 18, and a video management data storage unit 19. The storage device 10 also stores the similar video output program.

The reference video data storage unit 11 is a storage area in the storage device 10 in which reference video data 11a is stored. The similar video output device 1 outputs video data whose impression is similar to that of the reference video data 11a. The reference video data 11a may be the video data itself, or may be, for example, an identifier of video data stored in the video data storage unit 12 described later.

The video data storage unit 12 is a storage area in the storage device 10 in which a plurality of video data 12a, 12b, ... are stored.

The structural feature amount data storage unit 13 is a storage area in the storage device 10 in which structural feature amount data 13a is stored. The structural feature amount data 13a is generated and referred to by the structural feature similarity calculation means 22.

As shown in FIG. 3, the structural feature amount data 13a is data that associates a video identifier and a segment identifier within the video with a structural feature identifier and a structural feature amount for that video and segment. Here, the structural feature identifier schematically represents a structural feature of the present invention, such as the color, motion, acoustic feature, cut division, music section, speech section, or telop section within a section of the video data. In the example shown in FIG. 3, if the structural feature identifier "FEA001" denotes the color structural feature, the data indicates that, in the segment of video data identified by the video identifier "MOV001" and the segment identifier "DIV001", the color structural feature is the vector "(0.0001, 0.0002, ...)". The similar video output device 1 may store, in the storage device 10, data that associates these structural feature identifiers with the names of the structural features.
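The association described for FIG. 3 could be held in memory as in the Python sketch below. The class name, field names, and lookup helper are hypothetical. The identifiers are those quoted in the text, and only the two vector components that the text actually shows are included, since the remaining components are elided in the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class StructuralFeatureRecord:
    video_id: str             # e.g. "MOV001"
    segment_id: str           # e.g. "DIV001"
    feature_id: str           # e.g. "FEA001"
    values: Tuple[float, ...]  # structural feature amount (a vector)


# Optional mapping from feature identifier to feature name. The text only
# says "if FEA001 denotes color", so this entry is an assumption.
feature_names = {"FEA001": "color"}

records = [
    # Only the first two components are given in the source; the rest are elided.
    StructuralFeatureRecord("MOV001", "DIV001", "FEA001", (0.0001, 0.0002)),
]


def find_feature(
    rows: Sequence[StructuralFeatureRecord],
    video_id: str,
    segment_id: str,
    feature_id: str,
) -> Optional[Tuple[float, ...]]:
    """Return the feature vector for one video, segment, and feature, if present."""
    for row in rows:
        if (row.video_id, row.segment_id, row.feature_id) == (video_id, segment_id, feature_id):
            return row.values
    return None


color_vector = find_feature(records, "MOV001", "DIV001", "FEA001")
```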

The structural feature similarity data storage unit 14 is a storage area in the storage device 10 in which structural feature similarity data 14a is stored. The structural feature similarity data 14a stores the structural feature similarity between any pair of video data. The structural feature similarity data 14a is generated by the structural feature similarity calculation means 22 and referred to by the similar video determination means 24.

As shown in FIG. 4, the structural feature similarity data 14a is data that associates a first video identifier and a second video identifier with the structural feature similarity between the video data identified by those first and second video identifiers. The structural feature similarity is calculated from the structural feature amount data 13a described with reference to FIG. 3.
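A pair table of this kind could be built as sketched below in Python. The similarity function (one divided by one plus the L1 distance), the identifiers other than "MOV001", and the feature vectors are placeholders chosen for illustration, not the calculation used in the embodiment.

```python
from itertools import combinations


def l1_similarity(u, v):
    """Placeholder similarity: 1.0 for identical vectors, smaller as they differ."""
    distance = sum(abs(a - b) for a, b in zip(u, v))
    return 1.0 / (1.0 + distance)


def pairwise_similarity(vectors, similarity):
    """Map each unordered pair (first_id, second_id) to a similarity value."""
    return {
        (first, second): similarity(vectors[first], vectors[second])
        for first, second in combinations(sorted(vectors), 2)
    }


def get_similarity(table, a, b):
    """Look up a pair regardless of the order in which it was stored."""
    return table[(a, b)] if (a, b) in table else table[(b, a)]


# Hypothetical per-video structural feature vectors.
vectors = {"MOV001": [1.0, 0.0], "MOV002": [1.0, 0.0], "MOV003": [0.0, 1.0]}
similarity_table = pairwise_similarity(vectors, l1_similarity)
```

Storing each unordered pair once halves the table size, and the lookup helper hides the ordering from callers.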

The important scene feature amount data storage unit 15 is a storage area in the storage device 10 in which important scene feature amount data 15a is stored. The important scene feature amount data 15a is generated and referred to by the important scene feature similarity calculation means 23.

As shown in FIG. 5, the important scene feature amount data 15a is data that associates a video identifier and an important scene identifier within the video with an important scene feature identifier and an important scene feature amount for that video and important scene. Here, the important scene feature identifier identifies an important scene feature such as the color, motion, or acoustic feature of the important scene. For example, an L*a*b* histogram may be used as the color feature of an important scene, and optical flow as its motion feature. The similar video output device 1 may store, in the storage device 10, data that associates these important scene feature identifiers with the names of the important scene features.

The important scene feature similarity data storage unit 16 is the storage area of the storage device 10 in which the important scene feature similarity data 16a is stored. The important scene feature similarity data 16a holds the important scene feature similarity between any pair of video data. It is generated by the important scene feature similarity calculation means 23 and referred to by the similar video determination means 24.

As shown in FIG. 6, the important scene feature similarity data 16a associates a first video identifier and a second video identifier with the important scene feature similarity between the video data specified by those two identifiers. The important scene feature similarity is calculated from the important scene feature amount data 15a described with reference to FIG. 5.

The similarity data storage unit 17 is the storage area of the storage device 10 in which the similarity data 17a is stored. The similarity data 17a is generated and referred to by the similar video determination means 24.

The similarity data 17a associates a first video identifier and a second video identifier with the similarity between the video data specified by those identifiers and with the recommendation score corresponding to that similarity. The similarity between video data is calculated from the structural feature similarity in the structural feature similarity data 14a described with reference to FIG. 4 and the important scene feature similarity in the important scene feature similarity data 16a described with reference to FIG. 6. In the example shown in FIG. 7, the similarity between video data takes a value from 0 to 1. The recommendation score is calculated from the similarity and, in the example shown in FIG. 7, takes a value from 0 to 100. In this embodiment, a higher similarity and recommendation score indicate that the video data are more similar and more likely to be recommended as similar video data.

The similar video list data storage unit 18 is the storage area of the storage device 10 in which the similar video list data 18a is stored. The similar video list data 18a is output by the similar video determination means 24 and referred to by the similar video information output means 25.

The similar video list data 18a is information on the video data recommended to the user as being similar to the reference video data 11a. It associates the identifier, title, description, recommendation score, and other attributes of each recommended video data.

The video management data storage unit 19 is the storage area of the storage device 10 in which the video management data 19a is stored. As shown in FIG. 8, the video management data 19a associates a video identifier, a video name, and a description of the video data. The video management data 19a is referred to when displaying information on the video data recommended to the user.

As shown in FIG. 2, the central processing control device 20 of the similar video output device 1 includes reference video data acquisition means 21, structural feature similarity calculation means 22, important scene feature similarity calculation means 23, similar video determination means 24, and similar video information output means 25.

The reference video data acquisition means 21 acquires the reference video data 11a and stores it in the reference video data storage unit 11. The similar video output device 1 searches for video data similar to this reference video data 11a. The reference video data 11a may be the identifier of video data stored in the video data storage unit 12.

When video data not stored in the video data storage unit 12 is input as the reference video data 11a, the structural feature similarity calculation means 22, described later, calculates the structural feature amount of the input reference video data 11a and then the structural feature similarity between the reference video data 11a and each video data in the video data storage unit 12. Similarly, the important scene feature similarity calculation means 23, described later, calculates the important scene feature amount of the input reference video data 11a and then the important scene feature similarity between the reference video data 11a and each video data in the video data storage unit 12. The similar video determination means 24, also described later, then uses these structural feature similarities and important scene feature similarities to calculate the similarity and recommendation score between the reference video data 11a and each video data in the video data storage unit 12, and outputs information on video data similar to the reference video data 11a.

This embodiment describes the case where the identifier of video data stored in the video data storage unit 12 is specified as the reference video data 11a. It also describes the case where the structural feature similarity calculation means 22, the important scene feature similarity calculation means 23, and the similar video determination means 24, described later, calculate the similarities and related values between all video data stored in the video data storage unit 12.

The structural feature similarity calculation means 22 includes structural feature analysis means 221 and similarity calculation means 222.

The structural feature analysis means 221 calculates a structural feature amount for each of the plurality of video data stored in the video data storage unit 12. It divides the video data into a plurality of sections, for example sections of equal duration, and calculates, for each section, the time during which a structural feature appears as the structural feature amount. Here, the structural feature analysis means 221 calculates, for each video data and each section, the time during which each of the following structural features appears: color, motion, acoustic features, cuts, music sections, speech sections, and telop sections. The structural feature analysis means 221 inserts into the structural feature amount data 13a a record that associates the structural feature amount with a key consisting of the video identifier, the identifier of the section within the video data, and the identifier of the structural feature.

The similarity calculation means 222 calculates the structural feature similarity between video data based on the structural feature amounts in the structural feature amount data 13a, and outputs the structural feature similarity data 14a. For each video identifier, the similarity calculation means 222 builds a feature vector whose elements are the structural feature amounts associated with that video identifier, and calculates the similarity between these feature vectors as the similarity between the video data. It then inserts into the structural feature similarity data 14a a record associating the first video identifier, the second video identifier, and the structural feature similarity between the video data of the first video identifier and the video data of the second video identifier.

The structural feature similarity calculation process performed by the structural feature similarity calculation means 22 is described with reference to FIG. 9.

First, the processing of steps S101 to S103 is repeated for each video data stored in the video data storage unit 12. In step S101, the structural feature similarity calculation means 22 divides the video data into a plurality of sections of equal duration.

The processing of steps S102 and S103 is repeated for each of these sections. The structural feature similarity calculation means 22 calculates a structural feature amount for each section. Specifically, it calculates as structural feature amounts the durations, within each section, of color, motion, acoustic features, cuts, music sections, speech sections, and telop sections, and in step S103 records them in the structural feature amount data 13a. When steps S102 and S103 have been completed for all sections, the processing of steps S101 to S103 continues with the next video data.

When steps S101 to S103 have been completed for all video data, the structural feature similarity calculation means 22 calculates the structural feature similarity between each pair of video data in steps S104 to S107. First, in step S104, it acquires from the structural feature amount data 13a the structural feature amount of each section of an arbitrary first video data. In step S105, it acquires from the structural feature amount data 13a the structural feature amount of each section of a second video data.

In step S106, the structural feature similarity calculation means 22 calculates the similarity between the first video data and the second video data from the structural feature amounts of the first video data acquired in step S104 and those of the second video data acquired in step S105. In step S107, it records in the structural feature similarity data 14a the identifier of the first video data, the identifier of the second video data, and the structural feature similarity calculated in step S106 from the structural feature amounts, in association with one another.

The processing of the structural feature similarity calculation means 22 is now described in detail. In this embodiment, structural features are, for example, the cuts, music sections, speech sections, and telop sections of video data. The structural feature amount of video data is calculated based on the times at which each of these structural features appears in the video data. For example, as shown in FIG. 10(a), the structural feature amount of music sections is information on the duration and position of the music sections in the video data. This embodiment makes it possible to recommend video data whose structural features are similar in this sense.

Existing processing methods can be used to extract the structural features. For example, cut positions may be detected from video data by the method described in Japanese Patent No. 2869398, music sections may be extracted by the method described in Japanese Patent No. 4572218, speech sections by the method described in Japanese Patent No. 3105465, and telop sections by the method described in Japanese Patent No. 3479592.

As shown in FIG. 10(b), the structural feature similarity calculation means 22 divides the video into several sections, analyzes the structural feature amount in each section, and creates a feature vector. For example, it divides the video into sections of equal duration. It then expresses numerically, for each section, the ratio of the length of time during which the structural feature appears to the length of the section, and creates a feature vector with these ratios as its elements. When the rise and fall of the feature amount matters more than its absolute value, the structural feature similarity calculation means 22 may normalize the feature vector.
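
The section-wise ratio described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names `section_ratios` and `normalize`, the representation of a structural feature as a list of non-overlapping `(start, end)` intervals in seconds, and the use of L2 normalization are all assumptions made for the example.

```python
import math


def section_ratios(duration, intervals, num_sections):
    """Split [0, duration) into equal sections and return, for each section,
    the fraction of its length during which the feature appears.

    intervals: non-overlapping (start, end) pairs in seconds (assumed format).
    """
    section_len = duration / num_sections
    ratios = []
    for k in range(num_sections):
        sec_start = k * section_len
        sec_end = sec_start + section_len
        covered = 0.0
        for start, end in intervals:
            # Length of the overlap between this interval and the section.
            covered += max(0.0, min(end, sec_end) - max(start, sec_start))
        ratios.append(covered / section_len)
    return ratios


def normalize(vector):
    """Scale a vector to unit L2 norm (one possible normalization)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# A 120-second video with music during 0-30 s and 90-120 s, split into 4 sections.
music = section_ratios(120.0, [(0.0, 30.0), (90.0, 120.0)], 4)
```

Under these assumptions, `music` is a four-element feature vector in which the first and last sections are fully covered by music and the middle two contain none.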

Let v^A be the feature vector whose elements are, for each section of video data A, the ratio of the length of time during which the structural feature appears to the length of the section, and let v^B be the corresponding feature vector for video data B. The structural feature similarity calculation means 22 calculates the similarity S(v^A, v^B) between the feature vectors v^A and v^B.

Any measure may be used to calculate the similarity of the structural feature vectors. For example, when cosine similarity is used as the similarity measure for the feature vectors, S(v^A, v^B) is given by equation (1).

$$ S(v^A, v^B) = \frac{\sum_{i=1}^{n} v_i^A v_i^B}{\sqrt{\sum_{i=1}^{n} (v_i^A)^2}\,\sqrt{\sum_{i=1}^{n} (v_i^B)^2}} \quad (1) $$

Here, v_i^A denotes the value of the i-th dimension of v^A, v_i^B denotes the value of the i-th dimension of v^B, and n is the number of dimensions of the feature vectors.
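
As a minimal sketch, the cosine similarity of two feature vectors can be computed as below. The function name `cosine_similarity` and the convention of returning 0.0 when either vector is all zeros are assumptions made for this example; the source does not say how that case is handled.

```python
import math


def cosine_similarity(v_a, v_b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(v_a) != len(v_b):
        raise ValueError("feature vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(v_a, v_b))
    norm_a = math.sqrt(sum(a * a for a in v_a))
    norm_b = math.sqrt(sum(b * b for b in v_b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a video in which the feature never appears is treated
        # as having zero similarity to any other video.
        return 0.0
    return dot / (norm_a * norm_b)


# Two videos whose music ratios rise and fall in the same sections.
s = cosine_similarity([0.2, 0.0, 0.0, 0.4], [0.4, 0.0, 0.0, 0.8])
```

Because the second vector is a scaled copy of the first, `s` is 1 up to floating-point rounding, which illustrates that cosine similarity compares the pattern of the ratios across sections and not their magnitude.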

When there are N_s (> 0) structural feature extraction methods and structural feature amounts in use, the structural feature similarity calculation means 22 obtains the similarity S(v^A, v^B) between video data A and video data B for each of them. From the values of S(v^A, v^B) obtained in this way, it defines the similarity based on structural features, S^AB, as the set S^AB = {S_i^{A,B} | i = 1, ..., N_s}.

The structural feature similarity calculation means 22 generates a record associating the similarity S^AB calculated in this way with the identifier of video data A and the identifier of video data B, and inserts it into the structural feature similarity data 14a for storage.
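
The pairwise calculation and record generation can be sketched as follows. The data layout (`features` as a dictionary from video identifier to per-feature vectors), the function names, the example identifiers, and the use of cosine similarity for every feature are assumptions made for illustration; each record holds one similarity per structural feature, corresponding to the set S^AB.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def structural_similarity_records(features):
    """Build one record per pair of videos.

    features: {video_id: {feature_name: feature_vector}} (hypothetical layout).
    Returns a list of (first_id, second_id, {feature_name: similarity}).
    """
    records = []
    for id_a, id_b in combinations(sorted(features), 2):
        shared = sorted(set(features[id_a]) & set(features[id_b]))
        sims = {name: cosine(features[id_a][name], features[id_b][name])
                for name in shared}
        records.append((id_a, id_b, sims))
    return records


features = {
    "v001": {"music": [1.0, 0.0, 0.0, 1.0], "speech": [0.0, 1.0, 1.0, 0.0]},
    "v002": {"music": [1.0, 0.0, 0.0, 1.0], "speech": [1.0, 0.0, 0.0, 0.0]},
    "v003": {"music": [0.0, 1.0, 1.0, 0.0], "speech": [0.0, 1.0, 1.0, 0.0]},
}
records = structural_similarity_records(features)
```

With three videos this yields three records, one per unordered pair. Storing each pair only once is a design choice of the sketch; it relies on the similarity measure being symmetric.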

Next, the important scene feature similarity calculation means 23 is described. It includes important scene extraction means 231, video feature analysis means 232, and similarity calculation means 233.

The important scene extraction means 231 extracts important scenes from each of the plurality of video data stored in the video data storage unit 12. Existing processing methods, such as digest creation methods, can be used to extract the important scenes. In this embodiment, a plurality of important scenes may be extracted from a single video data.

For each important scene of each of the plurality of video data stored in the video data storage unit 12, the video feature analysis means 232 calculates the feature amount of the important scene contained in that video data as an important scene feature amount. Here, the video feature analysis means 232 calculates one or more of the color, motion, and acoustic features of the important scene as the important scene feature amount, for each important scene of each video data. It may calculate each important scene feature amount from the frame image at the midpoint of the important scene. For example, it calculates the color histogram of the frame image at the midpoint as the color feature amount of the important scene. The video feature analysis means 232 inserts into the important scene feature amount data 15a a record that associates the important scene feature amount with a key consisting of the video identifier, the identifier of the important scene within the video data, and the identifier of the important scene feature.

The similarity calculation means 233 calculates the important scene feature similarity between video data based on the important scene feature amounts in the important scene feature amount data 15a, and outputs the important scene feature similarity data 16a. For each video identifier, the similarity calculation means 233 builds a feature vector whose elements are the important scene feature amounts associated with that video identifier, and calculates the similarity between these feature vectors as the similarity between the video data. It then inserts into the important scene feature similarity data 16a a record associating the first video identifier, the second video identifier, and the important scene feature similarity between the video data of the first video identifier and the video data of the second video identifier.

The important scene feature similarity calculation process performed by the important scene feature similarity calculation means 23 is described with reference to FIG. 11.

First, the processing of steps S201 to S203 is repeated for each video data stored in the video data storage unit 12. In step S201, the important scene feature similarity calculation means 23 extracts important scenes from the video data. One or more important scenes may be extracted from a single video data.

The processing of steps S202 and S203 is repeated for each of these important scenes. The important scene feature similarity calculation means 23 calculates an important scene feature amount for each important scene. Specifically, it calculates as the important scene feature amount one or more of the color, motion, and acoustic features at the middle frame of each important scene, and in step S203 records it in the important scene feature amount data 15a. When steps S202 and S203 have been completed for all important scenes, the processing of steps S201 to S203 continues with the next video data.

When steps S201 to S203 have been completed for all video data, the important scene feature similarity calculation means 23 calculates the important scene feature similarity between each pair of video data in steps S204 to S207. First, in step S204, it acquires from the important scene feature amount data 15a the important scene feature amount of each important scene of an arbitrary first video data. In step S205, it acquires from the important scene feature amount data 15a the important scene feature amount of each important scene of a second video data.

In step S206, the important scene feature similarity calculation means 23 calculates the similarity between the first video data and the second video data from the important scene feature amounts of the first video data acquired in step S204 and those of the second video data acquired in step S205. In step S207, it records in the important scene feature similarity data 16a the identifier of the first video data, the identifier of the second video data, and the important scene feature similarity calculated in step S206 from the important scene feature amounts, in association with one another.

The processing of the important scene feature similarity calculation means 23 is now described in detail. The important scene extraction means 231 can use an existing processing method to extract important scenes from video data, for example the digest video creation method described in Japanese Patent No. 4358723. Because digest video creation technology produces a digest of the video data, the sections selected for the digest can be regarded as the important scenes of the video data.

For the important scenes extracted by the important scene extraction means 231 as shown in FIG. 12(a), the important scene feature similarity calculation means 23 analyzes the important scene features of the video data as shown in FIG. 12(b). Color, motion, acoustic features, and the like can be used as important scene feature amounts.

Any measure may be used to calculate the similarity of important scene feature amounts. For example, when a color histogram is used as the important scene feature amount, the important scene feature similarity calculation means 23 extracts several important scenes from the video data, as shown in FIG. 12, and extracts the frame at the midpoint of each scene. It then creates the color histogram of that frame and expresses it as a feature vector. The distance between histogram dimensions is taken to be the distance in the color space being used.
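
The midpoint-frame histogram can be sketched as follows. To keep the example self-contained, a frame is represented as a plain list of `(r, g, b)` pixels with 8-bit channels, and the histogram quantizes RGB directly; a real system would more likely decode frames with a video library and might use an L*a*b* histogram as mentioned earlier. The function names, the frame-index representation of a scene, and the bin count are all assumptions.

```python
def middle_frame_index(start_frame, end_frame):
    """Index of the frame at the midpoint of a scene given its first and last frame."""
    return (start_frame + end_frame) // 2


def color_histogram(frame, bins_per_channel=2):
    """Normalized color histogram of one frame.

    frame: list of (r, g, b) pixels with 0-255 channels (assumed format).
    Returns a vector with bins_per_channel ** 3 elements that sums to 1.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in frame:
        # Quantize each channel into bins_per_channel levels.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1.0
    total = len(frame)
    return [h / total for h in hist] if total else hist


# A scene spanning frames 100-160 is represented by frame 130.
mid = middle_frame_index(100, 160)
# A toy 4-pixel frame: two red pixels, one blue pixel, one black pixel.
frame = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 0)]
hist = color_histogram(frame)
```

Dividing by the pixel count makes histograms from frames of different resolutions comparable, which is convenient when the vectors are later compared with a distance or similarity measure.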

Let v^A be the feature amounts extracted from the important scenes of video data A and v^B be the feature amounts extracted from the important scenes of video data B. The similarity I(v^A, v^B) between v^A and v^B is calculated, for example, by equation (2).

As shown in FIG. 12(b), the important scene feature similarity calculation means 23 calculates the similarity between each of the important scenes A_1, A_2, ... contained in video data A and each of the important scenes B_1, B_2, ... contained in video data B. The more highly similar important scenes there are, the larger the value of I(v^A, v^B). Here, N_A is the number of important scenes extracted from video data A and N_B is the number of important scenes extracted from video data B; v_i^A is the i-th important scene of video data A and v_j^B is the j-th important scene of video data B. D(v_i^A, v_j^B) is the similarity between important scene v_i^A and important scene v_j^B.

Here, the Earth Mover's Distance (Y. Rubner, C. Tomasi, and L. Guibas, "The earth mover's distance as a metric for image retrieval," International Journal of Computer Vision, Vol. 40, No. 2, pp. 99-121, 2000) is used. Letting E(v_i^A, v_j^B) denote the Earth Mover's Distance between v_i^A and v_j^B, D(v_i^A, v_j^B) is calculated by equation (3).
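
To give a feel for the Earth Mover's Distance, the sketch below computes it for the simplest case: two one-dimensional histograms with equal total mass, where moving mass between adjacent bins costs 1. In that special case the distance reduces to the sum of the absolute differences of the running totals. This is a deliberate simplification: the embodiment takes the distance between bins from the color space in use, which in general requires solving a transportation problem. The function name `emd_1d` is hypothetical, and the conversion from the distance E to the similarity D in equation (3) is not reproduced in this text, so it is not implemented here.

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms of equal total mass,
    assuming unit distance between adjacent bins (simplifying assumption)."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    carried = 0.0  # mass that still has to be moved to later bins
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        carried += a - b
        total += abs(carried)
    return total


# All mass must travel two bins to the right.
far = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
# Half of the mass must travel two bins in total (one bin per step).
near = emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

Unlike a bin-by-bin comparison, the result grows with how far mass has to move, so histograms that are shifted slightly are judged closer than histograms that are shifted a lot.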

As another method, letting v^A be the feature vector of video data A and v^B be the feature vector of video data B, the important scene feature similarity calculation means 23 can calculate the similarity D(v_i^A, v_j^B) between important scene A_i and important scene B_j as the cosine similarity of v^A and v^B. This similarity is expressed by equation (4).

$$ D(v^A, v^B) = \frac{\sum_{i=1}^{n} v_i^A v_i^B}{\sqrt{\sum_{i=1}^{n} (v_i^A)^2}\,\sqrt{\sum_{i=1}^{n} (v_i^B)^2}} \quad (4) $$

Here, v_i^A is the value of the i-th dimension of the feature vector of video data A, v_i^B is the value of the i-th dimension of the feature vector of video data B, and n is the number of dimensions of the feature vectors.

When there are N_I (> 0) important scene extraction methods and important scene feature amounts in use, the important scene feature similarity calculation means 23 obtains, for each of them, the similarity I(v^A, v^B) between the important scenes of video data A and the important scenes of video data B. From the values of I(v^A, v^B) obtained in this way, it defines the similarity based on important scene features, I^AB, as the set I^AB = {I_i^{A,B} | i = 1, ..., N_I}.

The important scene feature similarity calculation means 23 generates a record associating the similarity I^AB calculated in this way with the identifier of video data A and the identifier of video data B, and inserts it into the important scene feature similarity data 16a for storage.

次に、類似映像決定手段２４を説明する。類似映像決定手段２４は、構造的特徴類似度データ１４ａおよび重要シーン特徴類似度データ１６ａに基づいて、映像データ記憶部１２から、基準映像データ１１ａに類似する映像データを抽出する。 Next, the similar video determining means 24 will be described. The similar video determination unit 24 extracts video data similar to the reference video data 11a from the video data storage unit 12 based on the structural feature similarity data 14a and the important scene feature similarity data 16a.

The similar video determination means 24 may extract video data similar to the reference video data 11a using only one of the structural features and the important scene features. For example, when extracting similar video data based only on the structural features, the similar video determination means 24 sets the structural feature similarity as the similarity between the video data, based on the structural feature similarity data 14a. When extracting similar video data based only on the important scene features, the similar video determination means 24 sets the important scene feature similarity as the similarity between the video data, based on the important scene feature similarity data 16a.

When video data similar to the reference video data 11a is extracted based on both the structural feature similarity data 14a and the important scene feature similarity data 16a, the two similarities may each be weighted and evaluated to extract similar video data. The weights may be specified in advance by a user or the like, or default weights may be specified by an administrator or the like.

Once the similarity between the video data has been calculated, the similar video determination means 24 calculates, based on this similarity, a recommendation score of each video data with respect to the reference video data 11a. The higher the similarity, the higher the recommendation score, and a higher score indicates that the video data is recommended with higher priority. The similar video determination means 24 extracts video data similar to the reference video data 11a based on this recommendation score. The video data extracted here need not be the video data itself; it may be only the identifier of the video data. The similar video determination means 24 may extract a predetermined number of video data in descending order of similarity, or may extract the video data whose similarity is equal to or higher than a predetermined value.

The similar video determination means 24 treats the video data having a high similarity as the video data to be recommended, generates similar video list data 18a including the identifiers of these video data, and stores it in the similar video list data storage unit 18. In the similar video list data 18a, the recommendation score may be associated with the identifier of each recommended video data in order to show the user the degree of recommendation.

図１３を参照して、類似映像決定手段２４による類似映像決定処理を説明する。 With reference to FIG. 13, the similar video determination process by the similar video determination means 24 will be described.

類似映像決定手段２４は、映像データ記憶部１２に記憶された任意の２つの映像データについて、ステップＳ３０１ないしステップＳ３０５の処理を繰り返す。 The similar video determination unit 24 repeats the processing from step S301 to step S305 for any two video data stored in the video data storage unit 12.

まずステップＳ３０１において類似映像決定手段２４は、構造的特徴類似度データ１４ａから第１の映像データおよび第２の映像データ間の構造的特徴類似度を取得する。同様にステップＳ３０２において類似映像決定手段２４は、重要シーン特徴類似度データ１６ａから第１の映像データおよび第２の映像データ間の重要シーン特徴類似度を取得する。 First, in step S301, the similar video determination means 24 acquires the structural feature similarity between the first video data and the second video data from the structural feature similarity data 14a. Similarly, in step S302, the similar video determination unit 24 acquires the important scene feature similarity between the first video data and the second video data from the important scene feature similarity data 16a.

ステップＳ３０３において類似映像決定手段２４は、ステップＳ３０１およびステップＳ３０２で取得した、第１の映像データおよび第２の映像データ間の構造的特徴類似度および重要シーン特徴類似度に基づいて、第１の映像データおよび第２の映像データ間の類似度を算出する。このとき類似映像決定手段２４は、構造的特徴類似度および重要シーン特徴類似度をそれぞれ重み付けして、第１の映像データおよび第２の映像データ間の類似度を算出する。 In step S303, the similar video determination unit 24 determines the first based on the structural feature similarity and the important scene feature similarity between the first video data and the second video data acquired in steps S301 and S302. The similarity between the video data and the second video data is calculated. At this time, the similar video determination means 24 calculates the similarity between the first video data and the second video data by weighting the structural feature similarity and the important scene feature similarity, respectively.

ステップＳ３０４において類似映像決定手段２４は、第１の映像データの識別子と、第２の映像データの識別子と、第１の映像データおよび第２の映像データ間の類似度と、を対応づけたレコードを、類似度データ１７ａに挿入する。 In step S304, the similar video determination unit 24 records the first video data identifier, the second video data identifier, and the similarity between the first video data and the second video data. Is inserted into the similarity data 17a.

ステップＳ３０５において類似映像決定手段２４は、類似度データ１７ａを参照して、基準映像データ１１ａとの類似度が高い映像データの識別子を、推薦する映像データの識別子として取得する。さらに類似映像決定手段２４は、取得した推薦する映像データの識別子を含む類似映像リストデータ１８ａを生成し、記憶装置１０に記憶する。 In step S305, the similar video determination unit 24 refers to the similarity data 17a, and acquires an identifier of video data having a high similarity to the reference video data 11a as an identifier of recommended video data. Further, the similar video determination unit 24 generates similar video list data 18a including the identifier of the acquired recommended video data and stores it in the storage device 10.
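The flow of steps S301 to S305 can be sketched roughly as follows. This is an illustration under stated assumptions, not the disclosed implementation: the function name, the dictionary layout keyed by sorted identifier pairs, the scalar similarities, the equal default weights, and `top_n` are all hypothetical, and the reference video is assumed to be one of the stored videos.

```python
from itertools import combinations


def determine_similar_videos(video_ids, structural_sim, scene_sim, reference_id,
                             w_structural=0.5, w_scene=0.5, top_n=3):
    """Return identifiers of the videos most similar to reference_id."""
    similarity_data = {}  # stands in for similarity data 17a
    for first, second in combinations(sorted(video_ids), 2):
        pair = (first, second)
        s = structural_sim[pair]  # S301: structural feature similarity
        i = scene_sim[pair]       # S302: important scene feature similarity
        # S303: weighted combination, S304: store the record for this pair
        similarity_data[pair] = w_structural * s + w_scene * i

    # S305: collect the videos paired with the reference video, best first
    scored = []
    for (first, second), value in similarity_data.items():
        if first == reference_id:
            scored.append((second, value))
        elif second == reference_id:
            scored.append((first, value))
    scored.sort(key=lambda item: item[1], reverse=True)
    return [video_id for video_id, _ in scored[:top_n]]


if __name__ == "__main__":
    structural = {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.5}
    scene = {("A", "B"): 0.7, ("A", "C"): 0.8, ("B", "C"): 0.5}
    print(determine_similar_videos(["A", "B", "C"], structural, scene, "A"))
```

Storing each pair once under a sorted key reflects the assumption that the similarity is symmetric, which holds for the cosine similarity used elsewhere in this description.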

ここで、類似映像決定手段２４の処理を詳述する。類似映像決定手段２４は、映像データＡと映像データＢの類似度について、構造的特徴に基づいて算出された類似度Ｓ^ＡＢと、重要シーン特徴に基づいて算出された類似度Ｉ^ＡＢとから、映像データＡを視聴したユーザに対する映像データＢの推薦スコアＲ（Ａ，Ｂ）を算出する。 Here, the processing of the similar video determination means 24 will be described in detail. The similar video determination means 24 calculates the similarity between the video data A and the video data B from the similarity S ^AB calculated based on the structural feature and the similarity I ^AB calculated based on the important scene feature. A recommendation score R (A, B) of video data B for a user who has watched video data A is calculated.

The recommendation score R(A, B) is calculated so that its value becomes larger when the structural feature similarity based on the structural features is large, or when the important scene feature similarity based on the important scenes is large.

For example, when the recommendation score is calculated as a weighted linear sum of the elements of S^AB and the elements of I^AB, R(A, B) is given by equation (5).

$$R(A, B) = \sum_{p} k_p^I \, I_p^{A,B} + \sum_{q} k_q^S \, S_q^{A,B} \qquad (5)$$

Here, k_p^I is the weight of I_p^{A,B}, and k_q^S is the weight of S_q^{A,B}. To calculate the recommendation score with emphasis on the important scene features included in the video data, the values of k_p^I are set large. To calculate the recommendation score with emphasis on the presentation of the video data, the values of k_q^S are set large.

The similarity R(A, B) between video data A and video data B may be calculated using only one of the similarity I^AB calculated based on the important scenes and the similarity S^AB calculated based on the structural features. When R(A, B) is calculated using only the similarity I^AB based on the important scenes, it is given by equation (6). When R(A, B) is calculated using only the similarity S^AB based on the structural features, it is given by equation (7).

$$R(A, B) = \sum_{p} k_p^I \, I_p^{A,B} \qquad (6)$$

$$R(A, B) = \sum_{q} k_q^S \, S_q^{A,B} \qquad (7)$$

The weights k_p^I and k_q^S may be set as appropriate according to the type of the reference video data 11a and of the video data stored in the video data storage unit 12. For example, for video data that has undergone little editing, such as home video, it is preferable to use the similarity calculated based on the important scenes. When recommending videos from which important scenes are difficult to extract, it is preferable to use the similarity calculated based on the structural features. In other cases, it is preferable to use both the similarity calculated based on the important scenes and the similarity calculated based on the structural features.
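The weighted linear sum described above, together with the two single-feature cases, could be sketched as follows. The function name and argument layout are hypothetical; passing empty lists for one kind of feature yields the score based only on the other kind.

```python
def recommendation_score(scene_sims, scene_weights, structural_sims, structural_weights):
    """Weighted linear sum of important scene and structural feature similarities.

    scene_sims / scene_weights correspond to the elements of I^AB and their weights,
    structural_sims / structural_weights to the elements of S^AB and their weights.
    """
    if len(scene_sims) != len(scene_weights):
        raise ValueError("one weight is required per important scene similarity")
    if len(structural_sims) != len(structural_weights):
        raise ValueError("one weight is required per structural feature similarity")
    scene_term = sum(k * i for k, i in zip(scene_weights, scene_sims))
    structural_term = sum(k * s for k, s in zip(structural_weights, structural_sims))
    return scene_term + structural_term


if __name__ == "__main__":
    # Both kinds of similarity, with the important scene features weighted more heavily.
    print(recommendation_score([0.8, 0.6], [0.4, 0.4], [0.5], [0.2]))
    # Important scene features only (for example, lightly edited home video).
    print(recommendation_score([0.8, 0.6], [0.5, 0.5], [], []))
```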

The similar video determination means 24 calculates the recommendation score from the similarity R(A, B). For example, the similarity R(A, B) converted to a percentage may be output as the recommendation score. To make the degree of recommendation easy for the user to understand, the recommendation score may be expressed with symbols. For example, the similar video determination means 24 may express the recommendation score as "★★★" when the similarity and therefore the recommendation score are high, and as "★" when the similarity is relatively low and the recommendation score is low.
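A conversion of the similarity into a percentage and a star rating might look like the sketch below. The thresholds (67 and 34), the assumed maximum similarity of 1.0, and the function names are hypothetical choices for illustration only; the description does not specify them.

```python
def to_percent(similarity, max_similarity=1.0):
    """Convert a similarity into an integer percentage, clipped to 0..100."""
    clipped = min(max(similarity, 0.0), max_similarity)
    return round(100.0 * clipped / max_similarity)


def to_stars(percent):
    """Map a percentage score to a one- to three-star label (assumed thresholds)."""
    if percent >= 67:
        return "★★★"
    if percent >= 34:
        return "★★"
    return "★"


if __name__ == "__main__":
    for similarity in (0.9, 0.5, 0.1):
        percent = to_percent(similarity)
        print(percent, to_stars(percent))
```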

When the user accesses video data A, the similar video determination means 24 creates the similar video list data 18a for a predetermined number of video data, taken in descending order of R(A, X) from among the video data X in the video data storage unit 12.
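The list creation described above, including the alternative of extracting by a similarity threshold, could be sketched as follows. The function name, the `scores` mapping from an identifier X to R(A, X), and the parameters `top_n` and `min_score` are assumptions for illustration.

```python
def build_similar_video_list(reference_id, scores, top_n=None, min_score=None):
    """Return (identifier, score) pairs for the videos to recommend, best first.

    scores maps each stored video identifier X to its score R(A, X) for the
    reference video A identified by reference_id.
    """
    # The reference video itself is not recommended.
    candidates = [(vid, score) for vid, score in scores.items() if vid != reference_id]
    if min_score is not None:
        # Variant: keep only videos at or above a predetermined score.
        candidates = [(vid, score) for vid, score in candidates if score >= min_score]
    candidates.sort(key=lambda item: item[1], reverse=True)
    if top_n is not None:
        # Variant: keep only a predetermined number of videos.
        candidates = candidates[:top_n]
    return candidates


if __name__ == "__main__":
    scores = {"A": 1.0, "B": 0.82, "C": 0.35, "D": 0.64}
    print(build_similar_video_list("A", scores, top_n=2))
    print(build_similar_video_list("A", scores, min_score=0.5))
```

Returning the score together with the identifier matches the option of associating the recommendation score with each recommended identifier in the list data.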

Next, the similar video information output means 25 will be described. The similar video information output means 25 reads the similar video list data 18a and the video management data 19a from the storage device 10 and, based on the identifiers of the recommended video data, acquires information such as the titles and contents of these video data. The similar video information output means 25 then associates the identifiers of the recommended video data with their titles and the like, and outputs them. The example shown in FIG. 2 describes the case of display on the display device 30 of the similar video output device 1, but the output is not limited to this. For example, the similar video information output means 25 may display the information on a display device of another computer via a communication control device (not shown) and a communication network.

ここで、構造的特徴類似度算出手段２２および重要シーン特徴類似度算出手段２３の各処理は、基準映像データ取得手段２１によって基準映像データ１１ａが入力された際に実行されても良いし、予め実行されていても良い。例えば、構造的特徴類似度算出手段２２によって構造的特徴類似度データ１４ａが、重要シーン特徴類似度算出手段２３によって重要シーン特徴類似度データ１６ａが、それぞれ予め算出されていれば、類似映像決定手段２４は、容易に類似度データ１７ａおよび類似映像リストデータ１８ａを生成することができる。さらに、類似映像決定手段２４によって、類似度データ１７ａも予め算出されていても良い。この様に予め類似度データなどを算出することにより、類似映像情報の出力に要する処理時間を短縮することができる。 Here, each process of the structural feature similarity calculation unit 22 and the important scene feature similarity calculation unit 23 may be executed when the reference video data 11a is input by the reference video data acquisition unit 21, or may be executed in advance. It may be executed. For example, if the structural feature similarity data 14a is calculated in advance by the structural feature similarity calculator 22 and the important scene feature similarity data 16a is calculated in advance by the important scene feature similarity calculator 23, similar video determination means 24 can easily generate the similarity data 17a and the similar video list data 18a. Further, the similarity data 17a may be calculated in advance by the similar video determination means 24. Thus, by calculating similarity data in advance, the processing time required for outputting similar video information can be shortened.

As described above, the similar video output device 1 according to the embodiment of the present invention can identify similar video data using the similarity based on the structural features of the video data and the similarity based on the important scene features.

The similarity based on the structural features is the similarity of features related to the presentation of the video data. It is calculated by extracting features related to the presentation of the video, such as cuts and music sections, and analyzing at which positions on the time axis, such as the first half or the second half of the video, each feature appears most often.

重要シーン特徴に基づく類似度は、視聴者の印象に残りやすいシーンに関する特徴の類似度である。視聴者の印象に残りやすいような重要シーンのみに限定して色などの特徴を分析することにより、重要シーン特徴に基づく類似度が算出される。 The similarity based on the important scene feature is a feature similarity related to a scene that tends to remain in the viewer's impression. The similarity based on the important scene feature is calculated by analyzing features such as color only for the important scene that tends to remain in the viewer's impression.

In general, videos whose presentation-related features or important scene features are similar are considered to give viewers similar impressions. Therefore, the similar video output device 1 according to the embodiment of the present invention calculates, for each video data, structural feature amounts related to the presentation of the video data and important scene feature amounts of its important scenes, and outputs video data whose feature amounts are similar to those of the reference video data 11a as recommended video data. In this way, the similar video output device 1 according to the embodiment of the present invention can recommend video data that gives viewers an impression similar to that of the reference video data 11a.

The structural feature similarity calculating means 22 extracts presentation-related structural features from the video data, analyzes at which positions on the time axis of the video data and how frequently each feature appears, and calculates the structural feature similarity. With this structural feature similarity, the similarity of video data is calculated based on information such as whether a structural feature appears mostly in the first half of the video data, mostly in the second half, or evenly throughout. In video data such as movies, presentation techniques such as increasing the number of cuts are used to create a sense of tension. Video data with similar presentation-related features are therefore considered to give viewers similar impressions, and the similar video output device 1 can use the structural feature similarity to recommend video data that gives viewers a similar impression.

Further, the structural feature similarity calculating means 22 divides the video data into several sections and calculates the similarity of feature vectors whose elements are the appearance frequencies, in each section, of structural features such as cuts and music sections. This makes it possible to analyze presentation-related information, such as increasing or decreasing the frequency of cuts to give an impression such as tension, and to calculate the similarity of video data based on the structural features.
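The per-section feature vector described above could be built along the following lines. This sketch assumes equal-length sections and that a structural feature such as a cut is given as a list of timestamps in seconds; the function name and these input conventions are hypothetical.

```python
def section_counts(event_times, duration, num_sections):
    """Count how many events (for example, cuts) fall into each equal-length section."""
    if duration <= 0 or num_sections <= 0:
        raise ValueError("duration and num_sections must be positive")
    counts = [0] * num_sections
    for t in event_times:
        if 0 <= t <= duration:
            # An event exactly at the end of the video belongs to the last section.
            index = min(int(t / duration * num_sections), num_sections - 1)
            counts[index] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical cut timestamps in a 60-second video split into 4 sections:
    # most cuts fall in the final section.
    print(section_counts([1, 2, 55, 58, 59], duration=60, num_sections=4))
```

Vectors built this way for two videos can then be compared with a measure such as the cosine similarity, so that a video with many cuts near the end is judged closer to another video with the same pattern than to one whose cuts are spread evenly.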

重要シーン特徴類似度算出手段２３は、映像データの中で、視聴者の印象に残りやすいような重要シーンの特徴に基づいて、映像データの類似度を算出する。重要シーンの特徴として利用する特徴は、色、動き、音響特徴などがある。これにより、類似映像出力装置１は、映像データが視聴者に与える印象の似ている映像データを推薦することができる。 The important scene feature similarity calculating unit 23 calculates the similarity of the video data based on the important scene features that are likely to remain in the viewer's impression in the video data. Features used as important scene features include color, motion, and acoustic features. Accordingly, the similar video output device 1 can recommend video data having a similar impression that the video data gives to the viewer.

For example, a horror movie contains both bright scenes and dark scenes, but the important scenes that tend to leave an impression in a horror movie are generally considered to be dark. Because the important scene feature similarity calculating means 23 analyzes features only for the important scenes, the similar video output device 1 can find, for example, video data that are similar in the sense that "many of the important scenes are dark". When the features of the scenes that tend to leave an impression on viewers are similar, the impressions that the video data give viewers are also considered to be similar, so the similar video output device 1 can use the important scene feature similarity to recommend videos that give viewers a similar impression.

さらに、重要シーン特徴類似度算出手段２３は、重要シーンに限定して特徴を分析し、類似度を算出する。これにより、映像全体の類似度を算出する場合に比べ、解析対象のシーンが少なくなるため、重要シーン特徴類似度算出手段２３は、類似度算出にかかる計算処理量を削減することができる。 Further, the important scene feature similarity calculating means 23 analyzes the features limited to the important scenes and calculates the similarity. As a result, since the number of scenes to be analyzed is reduced as compared with the case of calculating the similarity of the entire video, the important scene feature similarity calculating unit 23 can reduce the amount of calculation processing for calculating the similarity.

As described above, the similar video output device 1 according to the embodiment of the present invention can calculate the similarity of the impressions given by video data by analyzing the structural features of the video data, the features of the important scenes of the video data, or both. The similar video output device 1 according to the embodiment of the present invention can thereby recommend video data that gives an impression similar to that of the reference video data 11a.

(Modification)
The embodiment of the present invention has been described for the case of a single reference video data. The modification describes the case where a plurality of reference video data are designated.

In the modification, when a plurality of reference video data are input, the similar video determination means 24a of the similar video output device 1 can determine the videos to recommend by summing the similarities with respect to these reference video data.

For example, when a set of N_A videos A = {A_i | i = 1 to N_A} is given as input, R'(A, B) shown in equation (8) is used in place of R(A, B) of equations (6) and (7). The similar video determination means 24a can thereby generate the similar video list data 18b according to the modification.

$$R'(A, B) = \sum_{i=1}^{N_A} R(A_i, B) \qquad (8)$$
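Summing the similarities over several reference videos, as described for the modification, could be sketched as follows. The function name and the `pair_score` callable, which stands in for whichever single-reference score R(A_i, B) is in use, are hypothetical.

```python
def multi_reference_score(reference_ids, candidate_id, pair_score):
    """Sum the single-reference scores of a candidate over all reference videos."""
    return sum(pair_score(reference_id, candidate_id) for reference_id in reference_ids)


if __name__ == "__main__":
    # Hypothetical precomputed scores R(A_i, B) for two reference videos.
    table = {("A1", "B"): 0.7, ("A2", "B"): 0.2, ("A1", "C"): 0.4, ("A2", "C"): 0.6}

    def lookup(reference_id, candidate_id):
        return table[(reference_id, candidate_id)]

    for candidate in ("B", "C"):
        print(candidate, multi_reference_score(["A1", "A2"], candidate, lookup))
```

With a plain sum, a candidate that is moderately similar to every reference video can outrank one that is highly similar to only a single reference video.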

このように、本発明の変形例においては、複数の基準映像データについても、構造的特徴類似度および重要シーン特徴類似度に基づいて、これらの基準映像データに類似する映像データの情報を、出力することができる。 As described above, in the modified example of the present invention, information on video data similar to the reference video data is output based on the structural feature similarity and the important scene feature similarity for a plurality of reference video data. can do.

(Other embodiments)
Although the present invention has been described above by way of its best embodiment and a modification thereof, the statements and drawings that form part of this disclosure should not be understood as limiting the invention. Various alternative embodiments, examples, and operational techniques will be apparent to those skilled in the art from this disclosure.

For example, the similar video output device described in the best embodiment of the present invention may be configured on a single piece of hardware as shown in FIG. 2, or may be configured on a plurality of pieces of hardware according to its functions and the number of processes. It may also be realized on an existing information processing system.

本発明はここでは記載していない様々な実施の形態等を含むことは勿論である。従って、本発明の技術的範囲は上記の説明から妥当な特許請求の範囲に係る発明特定事項によってのみ定められるものである。 It goes without saying that the present invention includes various embodiments not described herein. Therefore, the technical scope of the present invention is defined only by the invention specifying matters according to the scope of claims reasonable from the above description.

DESCRIPTION OF SYMBOLS
1 Similar video output device
10 Storage device
11 Reference video data storage unit
12 Video data storage unit
13 Structural feature amount data storage unit
14 Structural feature similarity data storage unit
15 Important scene feature amount data storage unit
16 Important scene feature similarity data storage unit
17 Similarity data storage unit
18 Similar video list data storage unit
19 Video management data storage unit
20 Central processing control device
21 Reference video data acquisition means
22 Structural feature similarity calculating means
23 Important scene feature similarity calculating means
24 Similar video determination means
25 Similar video information output means
30 Display device
221 Structural feature analysis means
222, 233 Similarity calculation means
231 Important scene extraction means
232 Video feature analysis means

Claims

A similar video output method for outputting similar video data, comprising:
a structural feature similarity calculating step of calculating, for each of a plurality of video data stored in a video data storage unit, the time at which a structural feature appears in each section obtained by dividing the video data as a structural feature amount, calculating a structural feature similarity between the video data based on the structural feature amount, and outputting structural feature similarity data; and
a similar video determination step of extracting video data similar to reference video data from the video data storage unit based on the structural feature similarity data.

The similar video output method according to claim 1, wherein the structural feature is at least one of a color, a motion, an acoustic feature, a cut division, a music section, a speech section, and a telop section in the section.

The similar video output method according to claim 1 or 2, further comprising an important scene feature similarity calculating step of calculating, for each of the plurality of video data, a feature amount of an important scene included in the video data as an important scene feature amount, calculating an important scene feature similarity between the video data based on the important scene feature amount, and outputting important scene feature similarity data,
wherein the similar video determination step extracts video data similar to the reference video data from the video data storage unit based on the structural feature similarity data and the important scene feature similarity data.

The similar video output method according to claim 3, wherein the important scene feature is at least one of a color, a motion, and an acoustic feature of the important scene.

A similar video output method for outputting similar video data, comprising:
an important scene feature similarity calculating step of calculating, for each of a plurality of video data stored in a video data storage unit, a feature amount of an important scene included in the video data as an important scene feature amount, calculating an important scene feature similarity between the video data based on the important scene feature amount, and outputting important scene feature similarity data; and
a similar video determination step of extracting video data similar to reference video data from the video data storage unit based on the important scene feature similarity data.

A similar video output device that outputs similar video data, comprising:
a video data storage unit storing a plurality of video data;
structural feature similarity calculating means for calculating, for each of the plurality of video data, the time at which a structural feature appears in each section obtained by dividing the video data as a structural feature amount, calculating a structural feature similarity between the video data based on the structural feature amount, and outputting structural feature similarity data; and
similar video determination means for extracting video data similar to reference video data from the video data storage unit based on the structural feature similarity data.

The similar video output device according to claim 6, wherein the structural feature is at least one of a color, a motion, an acoustic feature, a cut division, a music section, a speech section, and a telop section in the section.

The similar video output device according to claim 6 or 7, further comprising important scene feature similarity calculating means for calculating, for each of the plurality of video data, a feature amount of an important scene included in the video data as an important scene feature amount, calculating an important scene feature similarity between the video data based on the important scene feature amount, and outputting important scene feature similarity data,
wherein the similar video determination means extracts video data similar to the reference video data from the video data storage unit based on the structural feature similarity data and the important scene feature similarity data.

The similar video output device according to claim 8, wherein the important scene feature is at least one of a color, a motion, and an acoustic feature of the important scene.

A similar video output device that outputs similar video data, comprising:
a video data storage unit storing a plurality of video data;
important scene feature similarity calculating means for calculating, for each of the plurality of video data, a feature amount of an important scene included in the video data as an important scene feature amount, calculating an important scene feature similarity between the video data based on the important scene feature amount, and outputting important scene feature similarity data; and
similar video determination means for extracting video data similar to reference video data from the video data storage unit based on the important scene feature similarity data.

A similar video output program for causing a computer to execute the steps according to any one of claims 1 to 5.