JP2010020630A

JP2010020630A - Video search apparatus, video search method and computer program

Info

Publication number: JP2010020630A
Application number: JP2008181887A
Authority: JP
Inventors: Akira Soga; 彰曽我
Original assignee: Xing Inc
Current assignee: Xing Inc
Priority date: 2008-07-11
Filing date: 2008-07-11
Publication date: 2010-01-28
Anticipated expiration: 2028-07-11
Also published as: JP5337420B2

Abstract

PROBLEM TO BE SOLVED: To provide a video search apparatus that accurately searches for and lists the video data a user wants.

SOLUTION: The video search apparatus receives a search keyword for finding specific video data in a storage device that stores a plurality of video data, each composed of a plurality of frame images, and searches for video data based on the received search keyword and the metadata associated with each video data. The apparatus includes: means for generating search metadata for each of the plurality of frame images included in the video data; evaluation means for evaluating the image quality, camera shake, block noise and trendiness of the video data; search means for searching for video data whose metadata includes the search keyword; and list generation means for generating a list in which the thumbnail images and titles of the retrieved video data are arranged in descending order of evaluation.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a video search method for searching video data, a video search apparatus that carries out the method, and a computer program for causing a computer to function as the video search apparatus.

Video posting sites, which store video data uploaded by users so that it can be viewed freely online, are in practical use. Such a site finds the video data a user wants by keyword search over the metadata attached to the video data, for example the title, and displays the results as a list of thumbnail images. In the list, entries are ordered, for example, by view count, with the most-viewed video data first.

Electronic album systems are also in practical use; they store video data uploaded by users as electronic albums that can be browsed online. Such a system searches for specific video data based on the title, shooting date and time data, and other information added to the video data, and plays back the video data it finds.

Meanwhile, as an apparatus for evaluating video image quality, an image quality evaluation apparatus that detects image quality degradation in video compressed with MPEG or the like has been proposed (for example, Patent Document 1).
Patent Document 1: JP 2003-87442 A

However, conventional video search methods list search results without objectively evaluating the image quality, content, and so on of the video data, so they cannot give priority in the list to the video data the user actually wants.
For example, suppose two pieces of video data showing exactly the same subject have been uploaded: video A has severe block noise and camera shake, whereas video B has less block noise than video A and no camera shake. Video B should be listed first, but if video A has been viewed more often than video B, video A is listed first. Some video posting sites accept ratings of video data from their users and list videos in descending order of rating, but these ratings are subjective, so such sites do not necessarily list the video data the user wants.

Conventional video posting sites also run keyword searches using the titles that users give to video data, or the comments posted on the video data, as clues. Because titles and comments are chosen subjectively by users, search accuracy is poor.

The present invention has been made in view of these circumstances. Its object is to provide a video search apparatus, a video search method, and a computer program that can search for and list the video data a user wants more accurately than conventional video search methods, by generating search metadata for each of a plurality of frame images included in the video data based on that frame image (for example, on character information contained in the frame image) and by evaluating the video data objectively.

Another object of the present invention is to provide a video search apparatus that generates metadata for each video scene and can thereby accurately search for and list the video data a user wants, based on metadata that objectively describes the content of each video scene.

Another object of the present invention is to provide a video search apparatus that rates video data containing more occurrences of the search keyword more highly and can thereby search for and list the video data a user wants more accurately.

Another object of the present invention is to provide a video search apparatus that can give priority in the list to highly trendy video data as the video data a user wants.

Another object of the present invention is to provide a video search apparatus that can give priority in the list to video data with good image quality as the video data a user wants.

Another object of the present invention is to provide a video search apparatus that can give priority in the list to video data with little block noise as the video data a user wants.

Another object of the present invention is to provide a video search apparatus that can give priority in the list to video data with little camera shake as the video data a user wants.

A video search apparatus according to the present invention receives a search keyword for finding specific video data in a storage device that stores a plurality of video data, each composed of a plurality of frame images, and searches for video data based on the received search keyword and the metadata associated with each video data. The apparatus comprises: metadata generation means for generating, for each of a plurality of frame images included in video data, metadata of the video data based on that frame image; evaluation means for evaluating the video data based on a frame image included in the video data or on the metadata associated with the video data; search means for searching for video data whose metadata includes the search keyword; and list generation means for generating, when the search yields a plurality of video data, a list in which the thumbnail image or the metadata of each video data is arranged based on the evaluation result of each video data.

In the video search apparatus according to the present invention, the metadata generation means comprises means for extracting a character region from a frame image and character recognition means for recognizing characters in the extracted character region, and generates the recognized characters as metadata.

The video search apparatus according to the present invention comprises means for identifying the frame image at the boundary where the video scene changes by comparing the luminance of the plurality of frame images included in the video data, and the metadata generation means generates, for each video scene, metadata of the video data based on one frame image belonging to that video scene.

In the video search apparatus according to the present invention, the evaluation means rates video data more highly the more metadata containing the search keyword it has.

In the video search apparatus according to the present invention, the evaluation means comprises means for searching an external device that stores a plurality of information resources for information resources containing the metadata associated with the video data and acquiring the number of such information resources, and rates the video data more highly the larger that number is.

The video search apparatus according to the present invention comprises means for storing place name information that associates shooting position information, which indicates the point where video data was shot, with the place name of that point. The evaluation means comprises: means for identifying, when shooting position information is attached to the video data, the place name of the shooting point based on the shooting position information and the place name information; and means for searching an external device that stores a plurality of information resources for information resources containing the identified place name and acquiring the number of such information resources. The evaluation means rates the video data more highly the larger the number of information resources containing the place name is.

In the video search apparatus according to the present invention, the evaluation means comprises means for searching an external device that stores a plurality of information resources for information resources having the feature quantity of a frame image included in the video data and acquiring the number of such information resources, and rates the video data more highly the larger that number is.

In the video search apparatus according to the present invention, the evaluation means comprises means for performing image quality correction that changes the gradation values of the pixels making up a frame image included in the video data, and for comparing the gradation values of the pixels of the corrected frame image with those of the frame image before correction, and rates the video data lower the larger the difference between the gradation values is.

In the video search apparatus according to the present invention, the evaluation means detects block noise edges in a frame image included in the video data, and rates the video data lower the larger the block noise edges are.

In the video search apparatus according to the present invention, the evaluation means comprises means for calculating the amount of vibration of a subject by comparing the subject as it appears in the edge portions of a plurality of frame images included in the video data, and rates the video data lower the larger the amount of vibration is.

A video search method according to the present invention receives a search keyword for finding specific video data in a storage device that stores a plurality of video data, each composed of a plurality of frame images, and searches for video data based on the received search keyword and the metadata associated with each video data. The method comprises the steps of: generating, for each of a plurality of frame images included in video data, metadata of the video data based on that frame image; evaluating the video data based on a frame image included in the video data or on the metadata associated with the video data; searching for video data whose metadata includes the search keyword; and generating, when the search yields a plurality of video data, a list in which the thumbnail image or the metadata of each video data is arranged based on the evaluation result of each video data.

A computer program according to the present invention causes a computer to search a storage device that stores a plurality of video data, each composed of a plurality of frame images, for video data based on a search keyword and the metadata associated with each video data. The program causes the computer to execute the steps of: generating, for each of a plurality of frame images included in video data, metadata of the video data based on that frame image; evaluating the video data based on a frame image included in the video data or on the metadata associated with the video data; searching for video data whose metadata includes the search keyword; and generating, when the search yields a plurality of video data, a list in which the thumbnail image or the metadata of each video data is arranged based on the evaluation result of each video data.

In the present invention, the metadata generation means generates metadata of the video data for each of a plurality of frame images included in the video data, based on that frame image. This metadata is the information used when searching for video data, and because it is generated from the video data itself, it is more objective than metadata written by a user.
The evaluation means evaluates the video data based on the frame images included in the video data or on the metadata. Because the evaluation is derived from the frame images or the metadata, it is more objective than simply reusing user ratings.
The search means then searches for video data whose metadata includes the search keyword, and the list generation means generates a list in which the thumbnail images or metadata of the plurality of video data found are arranged based on the evaluation result of each video data.
Consequently, the video data a user wants can be searched for and listed more accurately than with conventional video search methods.
Note that the storage device storing the video data need not be part of the video search apparatus. A configuration that searches for video data in a storage device external to the video search apparatus is also included in the present invention.
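The search and list-generation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Video` class, its field names, and the substring match used for keyword lookup are hypothetical, and the `score` field stands in for whatever objective evaluation the apparatus computes.

```python
from dataclasses import dataclass, field


@dataclass
class Video:
    title: str
    thumbnail: str
    # One metadata string per frame image or scene (hypothetical layout).
    metadata: list = field(default_factory=list)
    # Result of the objective evaluation; higher means better.
    score: float = 0.0


def search_and_list(videos, keyword):
    """Return (thumbnail, title) pairs for videos whose metadata contains
    the keyword, ordered from the highest evaluation to the lowest."""
    hits = [v for v in videos if any(keyword in m for m in v.metadata)]
    hits.sort(key=lambda v: v.score, reverse=True)
    return [(v.thumbnail, v.title) for v in hits]


videos = [
    Video("A", "a.jpg", ["cat on sofa"], 0.2),
    Video("B", "b.jpg", ["cat in garden"], 0.9),
    Video("C", "c.jpg", ["dog"], 0.5),
]
print(search_and_list(videos, "cat"))
```

Sorting on the evaluation score rather than on view count is what lets a clean video B outrank a noisy but more-viewed video A.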

In the present invention, a character region is extracted from a frame image, characters are recognized in the extracted character region, and the characters obtained by character recognition are generated as metadata. Consequently, the video data a user wants can be searched for and listed more objectively and accurately than with conventional video search methods.

In the present invention, the frame image at the boundary where the video scene changes is identified by comparing the luminance of the plurality of frame images included in the video data; in other words, the video data is divided into a plurality of video scenes. Strictly speaking, when two video scenes are adjacent in time, the boundary frame image belongs to one of them, and it may belong to either.
Then, for each video scene, metadata of the video data is generated based on one frame image, that is, at least one frame image, belonging to that video scene. Consequently, when the video data contains a plurality of video scenes, the video data can be searched based on metadata that describes the content of each video scene, and the video data a user wants can be searched for and listed more accurately.
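As a rough Python sketch of scene segmentation by luminance comparison: the text only says that the luminance of the frame images is compared, so the use of the mean luminance of each frame, the threshold value, and the function names below are assumptions made for illustration.

```python
def mean_luminance(frame):
    """frame: 2-D list of luminance values (0-255)."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def scene_boundaries(frames, threshold=40.0):
    """Indices i at which frame i is treated as the first frame of a new
    scene, because its mean luminance jumps relative to frame i - 1."""
    boundaries = []
    for i in range(1, len(frames)):
        jump = abs(mean_luminance(frames[i]) - mean_luminance(frames[i - 1]))
        if jump > threshold:
            boundaries.append(i)
    return boundaries


def representative_frames(frames, threshold=40.0):
    """One frame index per scene (here simply the first frame of each scene),
    from which metadata would then be generated."""
    if not frames:
        return []
    return [0] + scene_boundaries(frames, threshold)


dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
frames = [dark, dark, bright, bright, dark]
print(scene_boundaries(frames), representative_frames(frames))
```

In this sketch the boundary frame is assigned to the later scene; as noted above, it could equally be assigned to the earlier one.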

In the present invention, a plurality of metadata items are generated for each video data, and video data that has more metadata items containing the search keyword is rated more highly. Consequently, the video data a user wants can be searched for and listed more accurately.
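A minimal Python sketch of this keyword-based evaluation, assuming each video carries a list of metadata strings and that "contains" means a plain substring match (both are assumptions; the function name is hypothetical):

```python
def keyword_score(metadata_entries, keyword):
    """Number of metadata entries that contain the search keyword."""
    return sum(1 for entry in metadata_entries if keyword in entry)


video_a = ["Kyoto station", "Kyoto tower", "river"]
video_b = ["Kyoto station", "bus stop", "river"]
print(keyword_score(video_a, "Kyoto"), keyword_score(video_b, "Kyoto"))
```

With per-scene metadata, a video whose scenes repeatedly match the keyword scores higher than one that matches only once.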

In the present invention, an external device is searched for information resources containing the metadata associated with the video data, and the number of such information resources is acquired. The evaluation means rates the video data more highly the larger the number of information resources containing the metadata is.
Information resources are, for example, news-related photographs, images, audio, HTML documents, and XML documents. Because they reflect current trends, they can serve as a yardstick for evaluating video data in terms of how trendy its theme is. That is, when the theme of the video data is highly trendy, there tend to be many information resources related to the video data, and when the theme is not trendy, there tend to be few.
Consequently, video data can be evaluated in terms of trendiness, and highly trendy video data can be given priority in the list as the video data a user wants.
Note that searching for information resources and determining the number of information resources containing the metadata may be performed by the video search apparatus itself, or the apparatus may ask an external information search device to perform the search and count and then obtain the result.
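The trendiness evaluation might be sketched in Python as below. The in-memory list of texts stands in for the external device, and summing the counts over the distinct metadata terms is an assumption; the text only states that a larger number of matching information resources leads to a higher rating. Passing the counting step in as `count_fn` mirrors the note that the count may come either from the apparatus itself or from an external information search device.

```python
def count_matching_resources(resources, term):
    """Count information resources (here: plain strings) containing term."""
    return sum(1 for text in resources if term in text)


def trend_score(metadata_entries, count_fn):
    """Sum of resource counts over the distinct metadata terms.
    count_fn(term) -> int may be local or may wrap a remote search service."""
    return sum(count_fn(term) for term in set(metadata_entries))


resources = ["fireworks festival tonight", "fireworks photos", "weather"]
count = lambda term: count_matching_resources(resources, term)
print(trend_score(["fireworks"], count), trend_score(["knitting"], count))
```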

In the present invention, the place name of the shooting point is identified based on the shooting position information attached to the video data and on place name information that associates shooting position information with the place name it indicates. An external device is then searched for information resources containing the place name of the point where the video data was shot, and the number of such information resources is acquired. The evaluation means rates the video data more highly the larger the number of information resources containing the place name is. When there is a highly trendy subject at the point where the video was shot, there tend to be many information resources related to the video data, and when there is no such subject, there tend to be few.
Consequently, video data can be evaluated in terms of the trendiness of the subject at its shooting point, and highly trendy video data can be given priority in the list as the video data a user wants.
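A Python sketch of the location-based evaluation follows. The table layout `(latitude, longitude, place name)`, the nearest-entry lookup, and the coordinates themselves are illustrative assumptions; the text only requires that shooting position information be mapped to a place name and that resources containing that name be counted.

```python
def place_name_for(position, place_table):
    """Return the place name of the table entry nearest to position.
    position: (lat, lon); place_table: list of (lat, lon, name)."""
    lat, lon = position
    nearest = min(place_table,
                  key=lambda p: (p[0] - lat) ** 2 + (p[1] - lon) ** 2)
    return nearest[2]


def location_trend_score(position, place_table, resources):
    """Number of information resources mentioning the shooting location.
    Videos without shooting position information get a score of 0."""
    if position is None:
        return 0
    name = place_name_for(position, place_table)
    return sum(1 for text in resources if name in text)


place_table = [(35.0, 135.77, "Kyoto"), (35.68, 139.77, "Tokyo")]
resources = ["Kyoto autumn leaves", "Tokyo tower", "Kyoto bus"]
print(location_trend_score((35.01, 135.75), place_table, resources))
```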

In the present invention, an external device is searched for information resources having a feature quantity in common with the feature quantity of a frame image included in the video data, and the number of such information resources is acquired. The basic principle is the same as in the inventions described above: the evaluation means rates the video data more highly the larger the number of information resources having the feature quantity is.
Consequently, video data can be evaluated in terms of trendiness, and highly trendy video data can be given priority in the list as the video data a user wants.

In the present invention, the image quality of a frame image included in the video data is corrected, and the video data is evaluated by comparing the gradation values of the pixels of the corrected frame image with those of the frame image before correction.
Consequently, video data can be evaluated in terms of image quality, and video data with good image quality can be given priority in the list as the video data a user wants.
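The idea of scoring image quality by how much a correction changes the frame can be sketched in Python as follows. The text does not say at this point which correction is applied, so the linear contrast stretch used here is a stand-in chosen for illustration, and the function names are hypothetical.

```python
def contrast_stretch(frame):
    """Stand-in image quality correction (an assumption): linearly stretch
    gradation values so that they span the full 0-255 range."""
    flat = [v for row in frame for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Completely flat frame: nothing to stretch, return a copy.
        return [row[:] for row in frame]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in frame]


def correction_difference(frame):
    """Mean absolute change in gradation value caused by the correction.
    A larger value means the frame needed more correction, so the video
    would be rated lower."""
    corrected = contrast_stretch(frame)
    diffs = [abs(c - o)
             for crow, orow in zip(corrected, frame)
             for c, o in zip(crow, orow)]
    return sum(diffs) / len(diffs)


full_range = [[0, 255], [128, 64]]
washed_out = [[100, 150], [120, 130]]
print(correction_difference(full_range), correction_difference(washed_out))
```

A frame that already uses the full gradation range is left unchanged by this correction, while a washed-out frame is changed substantially and would therefore be rated lower.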

In the present invention, block noise edges are detected by extracting edges from a frame image included in the video data, and the video data is rated lower the larger the block noise edges are.
Consequently, video data can be evaluated in terms of block noise, and video data with little block noise can be given priority in the list as the video data a user wants.
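One simple way to approximate a block noise measure in Python is to compare the luminance steps that fall on a block grid with the steps elsewhere. The 8-pixel block size (typical of DCT-based codecs such as MPEG), the restriction to vertical block boundaries, and the function name are assumptions for this sketch; the text only says that edges are extracted and that larger block noise edges lower the rating.

```python
def block_edge_strength(frame, block=8):
    """Mean luminance step across vertical block boundaries minus the mean
    step inside blocks, clamped at 0. frame: 2-D list of luminance values."""
    boundary, interior = [], []
    for row in frame:
        for x in range(1, len(row)):
            step = abs(row[x] - row[x - 1])
            # Columns that are multiples of the block size lie on the grid.
            (boundary if x % block == 0 else interior).append(step)
    if not boundary:
        return 0.0
    mean_boundary = sum(boundary) / len(boundary)
    mean_interior = sum(interior) / len(interior) if interior else 0.0
    return max(0.0, mean_boundary - mean_interior)


blocky = [[50] * 8 + [90] * 8 for _ in range(4)]
smooth = [[x * 2 for x in range(16)] for _ in range(4)]
print(block_edge_strength(blocky), block_edge_strength(smooth))
```

Subtracting the interior step keeps a genuinely detailed image from being mistaken for a blocky one.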

In the present invention, the amount of vibration of a subject is calculated by comparing the subject as it appears in the edge portions of a plurality of frame images included in the video data, and the video data is rated lower the larger the amount of vibration is.
Consequently, video data can be evaluated in terms of camera shake, and video data with little camera shake can be given priority in the list as the video data a user wants.
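A Python sketch of a camera shake measure, under several assumptions: each frame is reduced to a one-dimensional strip of luminance values taken from its edge portion, the displacement between consecutive strips is estimated by minimising the mean absolute difference over a small range of shifts, and the shake amount is the mean absolute displacement. None of these specifics come from the text.

```python
def best_shift(prev, curr, max_shift=3):
    """Shift s (in pixels) such that curr[i + s] best matches prev[i].
    Assumes the strips are longer than max_shift."""
    def cost(s):
        pairs = [(prev[i], curr[i + s])
                 for i in range(len(prev)) if 0 <= i + s < len(curr)]
        return sum(abs(a - b) for a, b in pairs) / len(pairs)
    # Prefer the smallest displacement when costs tie.
    return min(range(-max_shift, max_shift + 1),
               key=lambda s: (cost(s), abs(s)))


def shake_amount(edge_strips, max_shift=3):
    """Mean absolute displacement between consecutive edge strips."""
    shifts = [best_shift(a, b, max_shift)
              for a, b in zip(edge_strips, edge_strips[1:])]
    return sum(abs(s) for s in shifts) / len(shifts) if shifts else 0.0


base = [0, 0, 0, 100, 200, 100, 0, 0, 0, 0]
moved = [0, 0, 0, 0, 0, 100, 200, 100, 0, 0]  # same subject, 2 px to the right
print(shake_amount([base, base, base]), shake_amount([base, moved, base]))
```

The edge portion of the frame is a natural place to look because background objects there move with the camera rather than with the main subject.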

According to the present invention, the video data a user wants can be searched for and listed more accurately than with conventional video search methods.

According to the present invention, the video data a user wants can be accurately searched for and listed based on metadata that objectively describes the content of each video scene.

According to the present invention, video data containing more occurrences of the search keyword is rated more highly, so the video data a user wants can be searched for and listed more accurately.

According to the present invention, highly trendy video data can be given priority in the list as the video data a user wants.

According to the present invention, video data with good image quality can be given priority in the list as the video data a user wants.

According to the present invention, video data with little block noise can be given priority in the list as the video data a user wants.

According to the present invention, video data with little camera shake can be given priority in the list as the video data a user wants.

The present invention is described in detail below with reference to the drawings showing its embodiment.
FIG. 1 is a schematic diagram showing the configuration of a video search system according to an embodiment of the present invention. The video search system comprises: a communication device 4 having an information resource storage unit 5; an information search device 6 that searches the communication device 4 for information resources; a video search apparatus 1, operating as a video posting site, that searches a video data storage unit 2 for video data, evaluates the video data found from various viewpoints, and lists the search results in descending order of evaluation; a mobile phone terminal 7; and a terminal device 8. The devices are connected via a communication network N and are configured to exchange various data. The video search apparatus 1 according to the present invention generates search metadata for each frame image making up the video data, evaluates the video data comprehensively based on the trendiness of its theme, its image quality, the presence of block noise, the degree of camera shake, user ratings, and so on, and lists the search results in descending order of evaluation. This makes it possible to search for and list the video data a user wants accurately.

FIG. 2 is a block diagram schematically showing the configuration of the video search apparatus 1. The video search apparatus 1 is a computer equipped with a CPU 10 that controls the entire apparatus. A ROM 11, a RAM 12, an external storage device 13, an internal storage device 15, an input device 14 such as a mouse and keyboard, an output device 16 such as a liquid crystal display or CRT, a communication unit 17, and an I/F unit 18 are connected to the CPU 10 via a bus 19.

The ROM 11 is a non-volatile memory, such as a mask ROM or an EEPROM, that stores a control program necessary for the operation of the computer.
The RAM 12 is a volatile memory, such as a DRAM or an SRAM, that temporarily stores various data generated when the CPU 10 executes arithmetic processing.
The external storage device 13 is an optical disc drive that reads data from a recording medium 9, such as a CD-ROM or a DVD-ROM, on which the computer program 9a according to the embodiment of the present invention is recorded in a computer-readable manner.
The internal storage device 15 is, for example, a hard disk, and stores the computer program 9a that the external storage device 13 has read from the recording medium 9.
The communication unit 17 is an interface for transmitting and receiving information resource search requests, moving image data, and other various data to and from the information search device 6, the communication device 4, the mobile phone terminal 7, and the terminal device 8 via the communication network N.
A moving image data storage unit 2 that stores moving image data and a map data storage unit 3 that stores map data are connected to the I/F unit 18, and the CPU 10 is configured to read moving image data and map data and to write moving image data via the I/F unit 18. The map data includes information indicating the place name of the point indicated by GPS information.
The CPU 10 carries out the moving image search method according to the present invention by reading the computer program 9a stored in the internal storage device 15 into the RAM 12 and executing it.

FIG. 3 is a block diagram schematically showing the configuration of the information search device 6 and the communication device 4. The information search device 6 is a computer including a CPU 60 that controls the entire device. A ROM 61, a RAM 62, an external storage device 63, an internal storage device 65, an input device 64 such as a mouse and a keyboard, an output device 66 such as a liquid crystal display or a CRT, and a communication unit 67 are connected to the CPU 60 via a bus 68. The ROM 61 and the internal storage device 65 store a control program necessary for the operation of the information search device 6. The CPU 60 reads the control program from the ROM 61 and the internal storage device 65 into the RAM 62 and executes it, thereby searching the information resources stored in the information resource storage unit 5 of the communication device 4.
The communication device 4 is a computer that functions as a web server and includes a CPU 40 that controls the entire device. A ROM 41, a RAM 42, an external storage device 43, an internal storage device 45, an input device 44 such as a mouse and a keyboard, an output device 46 such as a liquid crystal display or a CRT, a communication unit 47, and an I/F unit 48 are connected to the CPU 40 via a bus 49. The information resource storage unit 5, which stores a plurality of information resources, is connected to the I/F unit 48. The information resources are data such as news-related HTML documents, XML documents, photographs, images, and karaoke audio. The types of information resources are not limited to these. The ROM 41 and the internal storage device 45 store a control program necessary for the operation of the communication device 4. The CPU 40 reads the control program from the ROM 41 and the internal storage device 45 into the RAM 42 and executes it, thereby realizing a web server function that provides information resources. The web server function accepts information resource transmission requests from the information search device 6, the mobile phone terminal 7, the terminal device 8, and the like over the Hypertext Transfer Protocol, and transmits the requested information resources to the request source.

The mobile phone terminal 7 shown in FIG. 1 is a camera-equipped mobile phone having a so-called GPS function. The mobile phone terminal 7 includes a CPU (not shown) that controls the entire device. Connected to the CPU via a bus are a ROM, a RAM, a storage unit, a communication unit that transmits and receives various data to and from another mobile phone terminal 7 or the moving image search device 1, an operation unit including number buttons, a cross key, an enter button, and a menu button, a microphone that converts speech uttered by the user of the mobile phone terminal 7 into an electrical signal for transmission, a display unit, an audio output unit, an imaging unit, a GPS processing unit, and a clock unit. The ROM stores a program necessary for the operation of the CPU, and the CPU reads the program stored in the ROM into the RAM and executes it, thereby transmitting and receiving moving image data, issuing search requests for moving image data, and so on. The imaging unit includes a lens, an image sensor such as a CCD or CMOS sensor that converts the image formed by the lens into an electrical signal, and an image processing unit that AD-converts the electrical signal from the image sensor into digital moving image data and applies various kinds of image processing to the converted data. The GPS processing unit includes a GPS receiver, which receives radio waves from GPS satellites and measures the current position of the mobile phone terminal 7. When the imaging unit captures a moving image, the mobile phone terminal 7 adds, as metadata, GPS information indicating the latitude and longitude obtained by the GPS processing unit, shooting date and time data obtained from the clock unit, and the like to the moving image data obtained by the imaging unit, and stores the result in the storage unit.

The terminal device 8 is a personal computer that has a web client function and includes a CPU (not shown) that controls the entire device.

FIG. 4 is a flowchart showing the processing procedure of the mobile phone terminal 7 and the moving image search device 1 for uploading moving image data. First, the mobile phone terminal 7 reads moving image data from its storage unit (not shown) (step S11) and accepts editing of the title of the read moving image data (step S12). When the title has been edited, the mobile phone terminal 7 transmits the moving image data, with the title added, to the moving image search device 1 (step S13).

FIG. 5 is an explanatory diagram conceptually showing the data structure of moving image data. For example, the moving image data includes a plurality of frame images in time-series order, the title of the moving image data, metadata generated based on each of the frame images, GPS information indicating the shooting location of the moving image, and a moving image evaluation point indicating the evaluation of the moving image data. At upload time, the metadata and the moving image evaluation point have not yet been attached. Note that the title is also a kind of meta information.
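The data structure of FIG. 5 could be modeled roughly as follows. This is only an illustrative sketch: the class name `MovingImageData`, its field names, and the sample values are assumptions made here, not identifiers from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MovingImageData:
    """Hypothetical container mirroring the fields listed for FIG. 5."""
    title: str                                          # title edited on the terminal (step S12)
    frames: List[bytes]                                 # frame images in time-series order
    gps: Optional[Tuple[float, float]] = None           # (latitude, longitude) of the shooting point
    metadata: List[str] = field(default_factory=list)   # empty at upload time
    evaluation_point: Optional[float] = None            # not yet attached at upload time


# Example of a freshly uploaded item: no metadata and no evaluation point yet.
uploaded = MovingImageData(
    title="Cherry blossoms in Kyoto",
    frames=[b"frame0", b"frame1"],
    gps=(35.0116, 135.7681),
)
```

Keeping `metadata` and `evaluation_point` optional reflects the statement that neither is attached when the data is uploaded.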

The CPU 10 of the moving image search device 1 receives the moving image data transmitted from the mobile phone terminal 7 via the communication unit 17 (step S14). The CPU 10 then posts the received moving image data on the moving image evaluation site (step S15).

Next, the CPU 10 calls a subroutine for metadata generation and generates metadata based on the plurality of frame images included in the moving image data (step S16). The CPU 10 then associates the generated metadata with the moving image data (step S17), stores the moving image data with its associated metadata in the internal storage device 15 (step S18), and ends the process.
The association in step S17 is made, for example, by including the metadata in the header portion of the moving image data. Any other method, such as one using a table, may be adopted as long as the moving image data and the metadata can be associated with each other.

In the flowchart described above, editing of the title is accepted unconditionally. Alternatively, the moving image search device 1 may be configured to compare the character strings of the metadata generated in step S16 with the title and to issue a warning to the mobile phone terminal 7 when the relevance between the character strings and the title is low. The moving image search device 1 may also be configured to transmit the metadata generated in step S16 to the mobile phone terminal 7 and prompt the user to enter an appropriate title.

FIG. 6 is a flowchart showing the processing procedure of the moving image search device 1 for generating metadata. When the metadata generation subroutine is called, the CPU 10 extracts a character region, that is, a so-called telop (caption) region, from one frame image included in the moving image data (step S31) and performs character recognition on the character string contained in the extracted character region (step S32). The CPU 10 then generates metadata based on the character string obtained by character recognition (step S33). For example, the character string is segmented, words are extracted, and those words are used as metadata. If no character region is extracted, no metadata is generated for that frame image.

Next, the CPU 10 determines whether metadata has been generated for all the frame images constituting the moving image data (step S34). If it determines that metadata has not yet been generated for all frame images (step S34: NO), the CPU 10 returns the process to step S31. If it determines that metadata has been generated for all frame images (step S34: YES), the CPU 10 ends the process.
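The loop of steps S31 to S34 can be sketched as below. The specification does not name a character recognition method, so the recognizer is passed in as a callable; `generate_metadata`, `recognize_text`, and `fake_recognizer` are hypothetical names, and splitting on whitespace is a stand-in for the word segmentation step (Japanese text would need a morphological analyzer instead).

```python
from typing import Callable, Iterable, List, Optional


def generate_metadata(frames: Iterable[bytes],
                      recognize_text: Callable[[bytes], Optional[str]]) -> List[List[str]]:
    """Return one list of words per frame (empty when no character region is found)."""
    per_frame: List[List[str]] = []
    for frame in frames:                      # repeats until every frame is handled (step S34)
        text = recognize_text(frame)          # steps S31 and S32: extract region, recognize characters
        words = text.split() if text else []  # step S33: segment the string into words
        per_frame.append(words)
    return per_frame


# Stand-in recognizer for illustration only: treats the frame bytes as the caption text.
def fake_recognizer(frame: bytes) -> Optional[str]:
    return frame.decode("utf-8") or None


result = generate_metadata([b"Kyoto spring", b"", b"cherry blossom"], fake_recognizer)
```

The second frame yields an empty list, matching the rule that no metadata is generated when no character region is found.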

The moving image evaluation site is a service that the moving image search device 1 provides to external devices such as the mobile phone terminal 7 and the terminal device 8. When the moving image search device 1 receives a transmission request for moving image data from the mobile phone terminal 7 or the terminal device 8, it searches the moving image data storage unit 2 for moving image data based on search keywords such as the shooting date and time or the title, and transmits the retrieved moving image data to the request source. Together with the moving image data, the moving image search device 1 also transmits to the request source evaluation reception moving image data for accepting the evaluation of the provided moving image data by users of the moving image evaluation site. When information evaluating specific moving image data is transmitted from the mobile phone terminal 7 or the terminal device 8, the moving image search device 1 stores that information in the internal storage device 15.

The video search site also periodically evaluates the posted moving image data. The method of evaluating moving image data is described below.
FIG. 7 is a flowchart showing the processing procedure of the moving image search device 1 for evaluating moving image data. The CPU 10 of the moving image search device 1 comprehensively evaluates the moving image data from various viewpoints, such as trendiness, image quality, presence or absence of block noise, presence or absence of camera shake, and user evaluation.

When the subroutine is called, the CPU 10 determines whether the moving image data to be evaluated is being evaluated for the first time (step S51).

If it determines that the moving image data is being evaluated for the first time (step S51: YES), the CPU 10 evaluates the image quality of the moving image data (step S52). Specifically, it applies a predetermined image quality correction process, such as changing gradation values, to one or more frame images included in the moving image data, compares the gradation value of each pixel after correction with the gradation value of the same pixel before correction, and rates the image quality higher as the change in gradation value is smaller. The gradation value of each pixel is, for example, the gradation value of each RGB component or the luminance value of the pixel. In the following description, each pixel is assumed to have 256 gradation levels. If the change in the gradation value of every pixel before and after correction is 0, the moving image data receives full marks.

The predetermined image quality correction process is, for example, a process that corrects color balance, contrast, and the like; examples include so-called smart correction, level correction, contrast correction, and color correction. Specifically, the gradation value of the darkest pixel in a frame image of the moving image is mapped to black (gradation value 0), and the gradation value of the brightest pixel in the frame image is mapped to white (gradation value 255). The gradation values of pixels with other brightness levels are mapped to gradation values 1 to 254 so that the frequency distribution of gradation values is not skewed. For example, if the darkest pixel in the frame image has gradation value 30 and the brightest pixel has gradation value 200, gradation value 30 is converted to 0 and gradation value 200 is converted to 255. The other gradation values 31 to 199 in the frame image are mapped into the range 1 to 254. Correcting the image in this way corrects its color balance and contrast.
Smart correction, level correction, contrast correction, and color correction cannot be sharply distinguished from one another, but each can be outlined as follows. Smart correction corrects the color balance of the entire frame image so that the shadow portion (gradation value 0) and the highlight portion (gradation value 255) become clear. Contrast correction corrects only the overall brightness and darkness of the frame image without changing its hue. Level correction corrects the contrast of the entire frame image and its color cast. Color correction identifies the shadow, midtone, and highlight portions of the entire frame image and corrects contrast and color balance.
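The image quality evaluation of step S52 can be illustrated with a small sketch. It assumes a linear stretch as the correction and a score that falls linearly with the mean absolute change; the specification fixes neither choice, and `level_correct`, `image_quality_score`, and the 100-point scale are hypothetical. A frame is represented as a flat list of 8-bit gradation values.

```python
from typing import List


def level_correct(pixels: List[int]) -> List[int]:
    """Linearly stretch values so the darkest maps to 0 and the brightest to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:                       # flat frame: nothing to stretch
        return list(pixels)
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]


def image_quality_score(pixels: List[int], full_marks: float = 100.0) -> float:
    """Smaller change between original and corrected values gives a higher score."""
    corrected = level_correct(pixels)
    mean_change = sum(abs(a - b) for a, b in zip(pixels, corrected)) / len(pixels)
    return full_marks * (1.0 - mean_change / 255.0)
```

With the example from the text, `level_correct([30, 200])` maps 30 to 0 and 200 to 255, and a frame that already spans 0 to 255 is left unchanged and therefore receives full marks.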

Next, the CPU 10 evaluates the moving image data based on the magnitude of block noise (step S53). Specifically, it detects block noise edges in the frame images included in the moving image data and rates the moving image data lower as the block noise edges are larger.
Block noise is noise caused by quantization errors arising from the quantization and inverse quantization of moving image data. Because quantization and inverse quantization are performed on blocks of, for example, 8 × 8 pixels, image discontinuities occur at the block boundaries. As a result, a tile-like mosaic pattern appears in moving images with large block noise.
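One simple way to quantify block noise edges is to compare luminance steps that fall on 8-pixel block boundaries with steps elsewhere. The sketch below is an assumption about how such a measure might look, not the method of the specification; `blockiness` is a hypothetical name, and only vertical block boundaries (steps along each row) are examined to keep the example short.

```python
from typing import List

Frame = List[List[int]]  # rows of luminance values


def blockiness(frame: Frame, block: int = 8) -> float:
    """Mean luminance step across block boundaries minus the mean step elsewhere."""
    boundary: List[int] = []
    interior: List[int] = []
    for row in frame:
        for x in range(1, len(row)):
            step = abs(row[x] - row[x - 1])
            # Columns whose index is a multiple of `block` start a new block.
            (boundary if x % block == 0 else interior).append(step)
    if not boundary or not interior:
        return 0.0
    excess = sum(boundary) / len(boundary) - sum(interior) / len(interior)
    return max(0.0, excess)
```

A smooth gradient gives a value of 0, while two flat 8-pixel blocks of different brightness give a large value, which would translate into a lower evaluation.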

Next, the CPU 10 evaluates the moving image data based on the magnitude of camera shake (step S54). Specifically, the CPU 10 calculates the amount of vibration of the border portion by comparing the border portions of a plurality of frame images in time-series order. For example, it stores the luminance value of each pixel in one horizontal line contained in the border portion of one frame image. It then compares the sequence of luminance values of each horizontal line in the border portion of another frame image with the stored sequence and identifies the horizontal line that has a certain correlation with it. When the luminance sequences have a certain correlation, the horizontal lines are presumed to depict the same subject. Highly correlated horizontal lines are identified in the other frame images in the same way. Whether camera shake has occurred, and how large it is, can then be calculated from the vertical positional relationship of these horizontal lines. That is, if the frequency at which the horizontal line oscillates in the vertical direction is equal to or higher than a predetermined frequency, it can be determined that camera shake has occurred. If the frequency is lower than the predetermined frequency, the user is considered to have intentionally moved the camera up and down while shooting. The magnitude of camera shake can be calculated from the amplitude of the horizontal line's oscillation.
Although the case of horizontal lines has been described, the magnitude of camera shake in the left-right direction can be calculated by performing the same processing on vertical lines.
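The line tracking and oscillation test can be sketched as follows. Two substitutions are made for brevity and should be read as assumptions: the sum of absolute differences stands in for the correlation measure, and the oscillation frequency is approximated by counting direction reversals per second. The names `track_line`, `shake_stats`, and `is_camera_shake` are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of luminance values


def track_line(reference: List[int], frame: Frame) -> int:
    """Index of the row in `frame` closest to `reference` (smallest sum of absolute differences)."""
    return min(range(len(frame)),
               key=lambda y: sum(abs(a - b) for a, b in zip(reference, frame[y])))


def shake_stats(positions: List[int], fps: float) -> Tuple[float, float]:
    """Return (direction reversals per second, amplitude in rows) for the tracked line."""
    deltas = [b - a for a, b in zip(positions, positions[1:]) if b != a]
    reversals = sum(1 for d1, d2 in zip(deltas, deltas[1:]) if d1 * d2 < 0)
    duration = (len(positions) - 1) / fps if len(positions) > 1 else 0.0
    frequency = reversals / duration if duration else 0.0
    amplitude = (max(positions) - min(positions)) / 2 if positions else 0.0
    return frequency, amplitude


def is_camera_shake(positions: List[int], fps: float, min_frequency: float) -> bool:
    """Fast oscillation counts as shake; slow movement is treated as an intentional pan."""
    return shake_stats(positions, fps)[0] >= min_frequency
```

Here `positions` is the list of row indices returned by `track_line` for successive frames. A sequence that alternates rapidly between two rows is classified as shake, whereas a steadily increasing sequence has no reversals and is treated as deliberate camera movement.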

After completing step S54, or if it determines that the moving image data is not being evaluated for the first time (step S51: NO), the CPU 10 calls a subroutine for evaluating the trendiness of the moving image data and evaluates its trendiness (step S55). The CPU 10 then reads the evaluations by moving image evaluation site users, which were collected from the mobile phone terminal 7 or the terminal device 8 through the moving image evaluation site and stored in the internal storage device 15, and performs the user evaluation by calculating the average of those evaluations (step S56).
Taking user evaluations into account to an appropriate degree makes it possible to cover evaluation factors that cannot be assessed from metadata alone. However, if a more objective evaluation of the moving image data is desired, the device may be configured not to use the user evaluation.

Next, the CPU 10 calculates the moving image evaluation point, that is, the total score of the points calculated in steps S52 to S56 (step S57). The moving image evaluation point is higher as the trendiness, image quality, and user evaluation are higher, and points are deducted when block noise or camera shake is large. In the second and subsequent evaluations, steps S52 to S54 are not executed; instead, the moving image evaluation point is calculated using the points obtained in steps S52 to S54 at the first evaluation. This is because the points calculated in steps S52 to S54 do not change over time.
The CPU 10 then associates the calculated moving image evaluation point with the moving image data (step S58) and ends the process. The association is made, for example, by including the moving image evaluation point in the header of the moving image data, as shown in FIG. 5.
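The total score of step S57 might be combined as in the sketch below. The specification only states which factors raise the score and which lower it, so the equal weighting, the `Scores` class, and `total_point` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scores:
    image_quality: float                 # step S52, higher is better
    block_noise: float                   # step S53, treated as a penalty
    camera_shake: float                  # step S54, treated as a penalty
    trend: float                         # step S55, higher is better
    user_rating: Optional[float] = None  # step S56; None when user evaluation is not used


def total_point(s: Scores) -> float:
    """Add the positive factors and subtract the penalties (equal weights assumed)."""
    total = s.image_quality + s.trend - s.block_noise - s.camera_shake
    if s.user_rating is not None:
        total += s.user_rating
    return total
```

Because the first three fields do not change over time, a re-evaluation would only need to refresh `trend` and `user_rating` before calling `total_point` again.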

FIG. 8 is a flowchart showing the processing procedure of the subroutine for evaluating the trendiness of moving image data. The CPU 10 of the moving image search device 1 extracts the metadata added to the moving image data to be evaluated (step S71). The CPU 10 then transmits, to the information search device 6, request data requesting a search for information resources that contain the character string of the extracted metadata (step S72). More than one piece of metadata may be used.

The CPU 60 of the information search device 6 receives the request data transmitted from the moving image search device 1 (step S73) and, on receiving it, searches the communication device 4 for information resources that contain the character string of the extracted metadata (step S74). The CPU 60 then transmits information resource count data, which indicates the number of information resources found, to the moving image search device 1 (step S75).

The CPU 10 of the moving image search device 1 receives the information resource count data transmitted from the information search device 6 (step S76) and evaluates the moving image data based on the received data (step S77). Specifically, the larger the number of information resources indicated by the information resource count data, the higher the moving image data is rated; the smaller the number, the lower it is rated.
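The trendiness evaluation only requires that the score rise with the number of matching information resources. The sketch below assumes a logarithmic scale with a saturation point and sums the counts over several metadata words; both choices, and the names `trend_score` and `count_resources`, are assumptions. The callable `count_resources` stands in for the round trip to the information search device 6 (steps S72 to S76).

```python
import math
from typing import Callable, Iterable


def trend_score(metadata: Iterable[str],
                count_resources: Callable[[str], int],
                max_points: float = 100.0,
                saturation: int = 1_000_000) -> float:
    """More matching information resources give a higher score, capped at max_points."""
    hits = sum(count_resources(word) for word in metadata)
    if hits <= 0:
        return 0.0
    ratio = math.log10(1 + hits) / math.log10(1 + saturation)
    return max_points * min(1.0, ratio)


# Stand-in for the information search device: a fixed table of hit counts.
counts = {"sakura": 999, "obscure": 0}
popular = trend_score(["sakura"], lambda w: counts.get(w, 0))
rare = trend_score(["obscure"], lambda w: counts.get(w, 0))
```

A logarithmic scale is used here so that a handful of extremely common words does not dominate the total score; a linear scale would satisfy the description equally well.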

Next, the method of searching for moving image data and displaying a list is described.
FIG. 9 is a flowchart showing the processing procedure of the moving image search device 1 for a moving image search. The mobile phone terminal 7 transmits a search keyword and moving image search request data to the moving image search device 1 (step S91).

The CPU 10 of the moving image search device 1 receives the search keyword and the moving image search request data transmitted from the mobile phone terminal 7 via the communication unit 17 (step S92) and, using the received search keyword, searches for moving image data whose title and meta information contain that keyword (step S93). For example, if the search keywords are "cherry blossom", "spring", and "Kyoto", it searches for moving image data whose metadata contains the words "cherry blossom", "spring", and "Kyoto".
Because moving image data contains multiple pieces of metadata, the device may be configured to search for moving image data that contains at least a predetermined number of metadata items containing the search keyword. The device may also be configured so that the title need not contain the search keyword, and so that moving image data containing at least a predetermined number of metadata items containing the search keyword is also retrieved. Furthermore, the device may be configured to search for moving image data that contains at least one metadata item containing the search keyword.
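The variant that requires a minimum number of matching metadata items, without requiring a match in the title, can be sketched as follows. The dictionary layout, the function name `search`, and the parameter `min_hits` are assumptions for illustration.

```python
from typing import Dict, List


def search(videos: List[Dict], keywords: List[str], min_hits: int = 1) -> List[Dict]:
    """Keep videos whose metadata has at least `min_hits` items containing any keyword."""
    found = []
    for video in videos:
        hits = sum(1 for word in video["metadata"]
                   if any(k in word for k in keywords))
        if hits >= min_hits:
            found.append(video)
    return found


videos = [
    {"title": "Kyoto in spring", "metadata": ["Kyoto", "spring", "cherry blossom"]},
    {"title": "Night drive", "metadata": ["highway", "Tokyo"]},
]
hits = search(videos, ["Kyoto", "spring"], min_hits=2)
```

Setting `min_hits=1` gives the loosest variant described in the text, in which a single matching metadata item is enough.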

Next, the CPU 10 extracts the moving image evaluation point from each piece of moving image data found by the search (step S94). The CPU 10 then generates a list in which the thumbnail images and titles of the retrieved moving image data are arranged in descending order of moving image evaluation point, that is, with the most highly rated data first (step S95), and transmits list data for the generated list to the mobile phone terminal 7 via the communication unit 17 (step S96).
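Steps S94 and S95 amount to a sort on the evaluation point. The sketch below places the most highly rated data first, as the abstract describes; the dictionary keys and the function name `build_list` are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def build_list(found: List[Dict]) -> List[Tuple[str, str, float]]:
    """Return (thumbnail, title, evaluation point) tuples, highest evaluation first."""
    ranked = sorted(found, key=lambda v: v["evaluation_point"], reverse=True)
    return [(v["thumbnail"], v["title"], v["evaluation_point"]) for v in ranked]


found = [
    {"thumbnail": "V2", "title": "blurry clip", "evaluation_point": 40.0},
    {"thumbnail": "V1", "title": "sharp clip", "evaluation_point": 90.0},
]
listing = build_list(found)
```

Python's `sorted` is stable, so items with equal evaluation points keep the order in which the search returned them.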

The CPU of the mobile phone terminal 7 receives the list data transmitted from the moving image search device 1 (step S97), displays a list of moving image search results based on the received list data (step S98), and ends the process.

FIG. 10 is a schematic diagram showing the search result list displayed on the mobile phone terminal 7. As shown in FIG. 10, the search result list consists of the thumbnail images V1, V2, ... and titles of the retrieved moving image data arranged vertically. The thumbnail images V1, V2, ... are arranged from top to bottom in order of their moving image evaluation points. Beside the thumbnail image V1, the title of the corresponding moving image data is displayed, together with its posting date, poster, number of views, and moving image evaluation point.

The moving image data corresponding to the thumbnail images V1, V2, ... all show the same subject shot in the same way, but differ in image quality, camera shake, and block noise. The moving image data for thumbnail image V1 has good image quality and no camera shake or block noise, so its moving image evaluation points are higher than those of the other moving image data and it is displayed at the top of the search result list. The moving image data for thumbnail image V2 has poor image quality, so its moving image evaluation points are low and it is displayed below thumbnail image V1. The portion of the subject drawn with a broken line indicates that the contour is blurred. Similarly, the moving image data for thumbnail image V3 has camera shake, the moving image data for thumbnail image V4 has block noise, and the moving image data for thumbnail image V5 has both camera shake and block noise.
Thus, according to the present invention, even among moving image data of the same subject, the list is displayed in order starting from the moving image data with good image quality and no camera shake or block noise.
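The ordering described above can be sketched as a simple sort. This is a minimal illustration, not the patented implementation: the `VideoEntry` class, its field names, and the point values are hypothetical and chosen only to mirror the V1, V3, V5 example.

```python
from dataclasses import dataclass


@dataclass
class VideoEntry:
    title: str
    evaluation_points: float  # hypothetical name for the moving image evaluation points


def build_result_list(entries):
    """Return the entries ordered from highest to lowest evaluation points."""
    # sorted() is stable, so entries with equal points keep their search order.
    return sorted(entries, key=lambda e: e.evaluation_points, reverse=True)


# Invented point values for illustration only.
results = [
    VideoEntry("V3 (camera shake)", 60.0),
    VideoEntry("V1 (clean)", 95.0),
    VideoEntry("V5 (shake and block noise)", 30.0),
]
ranked = build_result_list(results)
```

With these made-up values, the clean recording is listed first and the recording with both defects last.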

When the user finds the desired moving image data in the search result list, the user can acquire that moving image data from the moving image search device 1 and play it back by selecting the corresponding thumbnail image.

In the moving image search device 1, the moving image search method, and the computer program 9a according to the embodiment configured as described above, search metadata is generated for each frame image constituting the moving image data, based on that frame image. The moving image data desired by the user can therefore be searched objectively and accurately compared with conventional moving image search methods.

In addition, moving image data that is highly trendy can be rated highly as the moving image data desired by the user and listed preferentially.

Further, moving image data with good image quality can be rated highly as the moving image data desired by the user and listed preferentially.

Furthermore, moving image data with little block noise can be rated highly as the moving image data desired by the user and listed preferentially.

Furthermore, moving image data with little camera shake can be rated highly as the moving image data desired by the user and listed preferentially.

(Modification 1)
The moving image search device 1, the moving image search method, and the computer program 9a according to Modification 1 differ only in the method of generating meta information, so only that method is described below.
FIG. 11 is a flowchart showing the processing procedure of the moving image search device 1 for generating metadata in Modification 1. The CPU 10 calculates the average luminance of each of the plurality of frame images (step S111). The CPU 10 then calculates the difference in average luminance between frame images that are adjacent in time-series order, and identifies any frame image for which this difference is equal to or greater than a predetermined value as a point where the video scene switches (step S112).
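Steps S111 and S112 can be sketched as follows. This is a hedged illustration under several assumptions: frames are represented as 2-D lists of luminance values, the difference is taken as an absolute value, the later frame of each pair is the one marked as the boundary, and the `threshold` parameter stands in for the unspecified predetermined value.

```python
def average_luminance(frame):
    """Mean of a frame given as a 2-D list of luminance values (0-255)."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def find_scene_boundaries(frames, threshold):
    """Return indices of frames whose average luminance differs from the
    previous frame by at least `threshold` (assumed predetermined value)."""
    means = [average_luminance(f) for f in frames]  # step S111
    boundaries = []
    for i in range(1, len(means)):  # step S112
        # Absolute difference is an assumption; the text only says "difference".
        if abs(means[i] - means[i - 1]) >= threshold:
            boundaries.append(i)
    return boundaries


# Tiny 2x2 frames with invented values.
dark = [[10, 12], [11, 9]]
bright = [[200, 210], [205, 195]]
frames = [dark, dark, bright, bright, dark]
cuts = find_scene_boundaries(frames, threshold=50)
```

Here the frames at indices 2 and 4 are reported as scene boundaries, because the average luminance jumps between dark and bright frames at those positions.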

Next, the CPU 10 extracts a character region from a frame image included in the moving image data of one video scene (step S113), and performs character recognition on the character string contained in the extracted character region (step S114). The CPU 10 then uses pattern recognition with template images to recognize a specific shape pattern in the frame image, for example the tortoiseshell (hexagonal) pattern of a soccer ball (step S115).

When the process of step S115 is finished, the CPU 10 generates metadata based on the character recognition result or the pattern recognition result (step S116). The moving image search device 1 stores template images in association with search character strings in advance in the internal storage device 15 and the recording medium 9. It reads from the internal storage device 15 the character string corresponding to a template image whose pattern was successfully detected, and uses that character string as metadata. The device may be configured to generate only one piece of metadata per video scene, or to generate a plurality of pieces of metadata per video scene.
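The association between template images and search character strings in step S116 can be pictured as a lookup table. The sketch below is illustrative only: the table contents, the template identifiers, and the function name are hypothetical, and combining character recognition results with pattern recognition results in one list is an assumption (the text says "or").

```python
# Hypothetical table: template image identifier -> search character string.
TEMPLATE_KEYWORDS = {
    "soccer_ball_hexagon": "soccer",
    "baseball_stitch": "baseball",
}


def generate_scene_metadata(recognized_text, detected_templates, single=False):
    """Build metadata for one video scene (step S116).

    recognized_text: strings produced by character recognition (step S114).
    detected_templates: ids of templates whose pattern was detected (step S115).
    single: if True, keep only one piece of metadata for the scene.
    """
    metadata = [t for t in recognized_text if t]
    for template_id in detected_templates:
        keyword = TEMPLATE_KEYWORDS.get(template_id)
        if keyword is not None and keyword not in metadata:
            metadata.append(keyword)
    return metadata[:1] if single else metadata


# A scene with no on-screen text but a detected soccer ball pattern.
scene_metadata = generate_scene_metadata([], ["soccer_ball_hexagon"])
```

The `single` flag mirrors the choice between one piece of metadata per scene and several; which piece is kept when `single` is true is an arbitrary choice in this sketch.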

Next, the CPU 10 determines whether metadata has been generated for all video scenes (step S117). If it determines that there is a video scene for which metadata has not been generated (step S117: NO), the CPU 10 returns the process to step S113. If it determines that metadata has been generated for all video scenes (step S117: YES), the CPU 10 ends the processing.

In the moving image search method, the moving image search device 1, and the computer program according to Modification 1, meta information can be generated even when the moving image contains no character information, which improves the search accuracy for moving image data.

In addition, the moving image data desired by the user can be accurately searched for and listed based on metadata that objectively indicates the content of each video scene.

Further, when the device is configured to generate only one piece of metadata per video scene, search metadata for the moving image data can be generated efficiently.

Although an example was described in which a switch of video scene is detected from a change in the average luminance of the frame images, the switch may instead be detected by comparing the shapes of the luminance spectra.

(Modification 2)
The moving image search device 1, the moving image search method, and the computer program 9a according to Modification 2 differ only in the method of searching for moving image data and generating the list, so only that method is described below.
FIG. 12 is a flowchart showing the processing procedure of the moving image search device 1 for the moving image search in Modification 2. The mobile phone terminal 7 transmits a search keyword and moving image search request data to the moving image search device 1 (step S211).

The CPU 10 of the moving image search device 1 receives the search keyword and the moving image search request data transmitted from the mobile phone terminal 7 at the communication unit 17 (step S212), and uses the received search keyword to detect moving image data whose meta information includes that search keyword (step S213).

Next, the CPU 10 extracts, from each moving image data found by the search, the moving image evaluation points of that moving image data (step S214). Then, for each moving image data, the CPU 10 re-evaluates the moving image data based on the number of pieces of meta information that include the search keyword (step S215). Specifically, it calculates the total number of pieces of meta information included in one moving image data and the number of those pieces that include the search keyword, and from these calculates the proportion of meta information that includes the search keyword. The higher this proportion, the closer the content of the moving image data is taken to be to the search keyword, and the more highly the moving image data is rated.
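The proportion used in step S215 can be written out directly. The following is a minimal sketch: the function names are hypothetical, and the way the proportion is folded into the existing evaluation points (an additive bonus scaled by `weight`) is an assumption, since the text only states that a higher proportion gives a higher rating.

```python
def keyword_ratio(metadata_items, keyword):
    """Fraction of a video's meta information entries that contain `keyword`."""
    if not metadata_items:
        return 0.0
    hits = sum(1 for item in metadata_items if keyword in item)
    return hits / len(metadata_items)


def reevaluate(base_points, metadata_items, keyword, weight=100.0):
    """Add a bonus proportional to the ratio (the additive form is assumed)."""
    return base_points + weight * keyword_ratio(metadata_items, keyword)


# Invented meta information for two videos that both match "soccer".
video_a = ["soccer", "soccer final", "stadium", "soccer goal"]
video_b = ["soccer", "cooking", "travel", "music"]
ratio_a = keyword_ratio(video_a, "soccer")  # 3 of 4 entries contain the keyword
ratio_b = keyword_ratio(video_b, "soccer")  # 1 of 4 entries contains the keyword
```

Both videos would be found by the keyword search, but the first one, whose meta information is mostly about the keyword, receives the larger bonus.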

When the process of step S215 is finished, the CPU 10 executes, in steps S216 to S219, the same processes as in steps S95 to S98.

In the moving image search device 1, the moving image search method, and the computer program 9a according to Modification 2, the moving image data desired by the user can be searched for and listed with higher accuracy.

(Modification 3)
The moving image search device 1, the moving image search method, and the computer program 9a according to Modification 3 differ only in the method of evaluating the trendiness of moving image data, so only that method is described below.
FIG. 13 is a flowchart showing the processing procedure of the subroutine for evaluating the trendiness of moving image data in Modification 3. The CPU 10 of the moving image search device 1 extracts the metadata added to the moving image data to be evaluated (step S311). The CPU 10 then transmits, to the information search device 6, request data requesting the number of times information resources have been searched for using the character string of the extracted metadata (step S312).

The CPU 60 of the information search device 6 receives the request data transmitted from the moving image search device 1 (step S313). On receiving the request data, it reads from the internal storage device 65 the number of times information resources have been searched for using the character string of the extracted metadata (step S314). The CPU 60 then transmits count data indicating the number of times read out to the moving image search device 1 (step S315).

The CPU 10 of the moving image search device 1 receives the count data transmitted from the information search device 6 (step S316), and evaluates the moving image data based on the received count data (step S317). That is, the larger the count indicated by the count data, the more highly the moving image data is rated.
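The exchange in steps S311 to S317 can be sketched with an in-memory table standing in for the information search device 6. Everything here is hypothetical: the table, its counts, the function names, and in particular the choice to sum the counts over all metadata strings of a video, which the text does not specify.

```python
# Stand-in for the counts that the information search device 6 keeps in its
# internal storage device 65 (values are invented for illustration).
SEARCH_COUNTS = {"soccer": 12000, "world cup": 8000, "knitting": 300}


def lookup_search_count(metadata_string):
    """Steps S313 to S315: return how often this string was used as a query."""
    return SEARCH_COUNTS.get(metadata_string, 0)


def trend_score(metadata_strings):
    """Steps S316 and S317: a larger count gives a higher evaluation.

    Summing over all metadata strings is an assumption of this sketch.
    """
    return sum(lookup_search_count(s) for s in metadata_strings)


score_sports = trend_score(["soccer", "world cup"])
score_craft = trend_score(["knitting", "scarf"])
```

A video whose metadata matches frequently searched strings ends up with the higher trend score in this sketch.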

In Modification 3, as in the embodiment described above, the moving image data desired by the user can be searched objectively and accurately compared with conventional moving image search methods.

(Modification 4)
The moving image search device 1, the moving image search method, and the computer program 9a according to Modification 4 differ only in the method of evaluating the trendiness of moving image data, so only that method is described below.
FIG. 14 is a flowchart showing the processing procedure of the subroutine for evaluating the trendiness of moving image data in Modification 4. The CPU 10 of the moving image search device 1 transmits, to the information search device 6, request data that includes the moving image data to be evaluated and requests a search for information resources (step S411).

The CPU 60 of the information search device 6 receives the request data transmitted from the moving image search device 1 (step S412). On receiving the request data, it extracts a feature amount from the moving image data (step S413) and searches for information resources related to the moving image data based on the extracted feature amount (step S414). The CPU 60 then transmits information resource count data indicating the number of information resources found to the moving image search device 1 (step S415).

As the feature amount, for example, coefficients obtained by extracting the outline shape of an island from the moving image and expanding the point sequence of the edge pixels of that outline in a Fourier series, as disclosed in Japanese Patent Laid-Open No. 11-96368, may be adopted. The CPU 60 of the information search device 6 compares the feature amount extracted from the moving image data included in the request data with the feature amounts of the moving image data stored in the information resource storage unit 5 to calculate the similarity, that is, the deviation of each coefficient. When the similarity is equal to or greater than a predetermined value, the CPU 60 judges the stored data to be an information resource related to the moving image data being evaluated.
A characteristic color appearing in the moving image may also be adopted as the feature amount. For example, the CPU 60 of the information search device 6 extracts, as the feature amount, a predetermined highly saturated color that occupies at least a predetermined area of the moving image. When this feature amount is used, moving image data of forests with a similar green tone is retrieved, for example, as moving images related to a photograph of a forest.
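To make the idea of Fourier coefficients of an outline concrete, the sketch below computes a generic set of Fourier descriptors for a closed contour. It is not the method of Japanese Patent Laid-Open No. 11-96368, whose details are not given here: the choice of coefficient magnitudes, the number of coefficients, the sum-of-absolute-differences deviation, and the reading that a smaller deviation means more similar shapes are all assumptions, and the contours are synthetic.

```python
import cmath
import math


def fourier_coefficients(points, n_coeffs=4):
    """Magnitudes of low-order discrete Fourier coefficients of a closed
    contour given as a list of (x, y) edge points in traversal order."""
    n = len(points)
    z = [complex(x, y) for x, y in points]
    coeffs = []
    for k in range(-n_coeffs, n_coeffs + 1):
        if k == 0:
            continue  # k = 0 only encodes the contour's position
        c = sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        coeffs.append(abs(c))
    return coeffs


def coefficient_deviation(a, b):
    """Sum of absolute differences between two coefficient lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ellipse(a, b, n=32, cx=0.0, cy=0.0):
    """Synthetic closed contour: ellipse with semi-axes a and b."""
    return [(cx + a * math.cos(2 * math.pi * t / n),
             cy + b * math.sin(2 * math.pi * t / n)) for t in range(n)]


circle = fourier_coefficients(ellipse(2, 2))
shifted_circle = fourier_coefficients(ellipse(2, 2, cx=5.0, cy=-3.0))
flat_ellipse = fourier_coefficients(ellipse(3, 1))

dev_same_shape = coefficient_deviation(circle, shifted_circle)
dev_other_shape = coefficient_deviation(circle, flat_ellipse)
```

Because the k = 0 term is skipped, moving the circle does not change its descriptors, while the flattened ellipse differs clearly from the circle. A threshold on such a deviation could then decide whether two outlines count as related.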

The CPU 10 of the moving image search device 1 receives the information resource count data transmitted from the information search device 6 (step S416), and evaluates the moving image data based on the received information resource count data (step S417).

In Modification 4, as in the embodiment described above, the moving image data desired by the user can be searched objectively and accurately compared with conventional moving image search methods.

Although the embodiment has been described using a video posting site as an example, the present invention may be applied to various other services. For example, it may be applied to an electronic album system that lists and provides moving image data, or to a service for posting photographs to a so-called community.

In the embodiment and the modifications, a single moving image search device 1 is configured to execute the moving image search processing, but the processing may of course be distributed across a plurality of computers.

Furthermore, the methods of evaluating moving image data shown in the embodiment and the modifications are examples, and other evaluation methods may be adopted as long as they allow objective evaluation. For example, the moving image search device may be provided with means for performing speech recognition on audio data accompanying the moving image data and means for generating metadata based on the character information obtained by the speech recognition, and may be configured to evaluate the moving image data based on that metadata. When moving image data is evaluated using metadata derived from audio data, the device may also be configured to search the communication device for information resources related to karaoke, for example information resources including song titles, lyrics, and rankings, and to evaluate the moving image data accordingly.

Further, although the information search device is configured to execute the search for information resources, the moving image search device may instead be configured to perform that search processing.

Furthermore, the embodiments disclosed herein are illustrative in all respects and are not restrictive. The scope of the present invention is defined by the claims, and includes all modifications within the meaning and scope equivalent to the claims.

A schematic diagram showing the configuration of the moving image search system according to the embodiment of the present invention.
A block diagram schematically showing the configuration of the moving image search device.
A block diagram schematically showing the configuration of the information search device and the communication device.
A flowchart showing the processing procedure of the mobile phone terminal and the moving image search device for uploading moving image data.
An explanatory diagram conceptually showing the data structure of moving image data.
A flowchart showing the processing procedure of the moving image search device for generating metadata.
A flowchart showing the processing procedure of the moving image search device for evaluating moving image data.
A flowchart showing the processing procedure of the subroutine for evaluating the trendiness of moving image data.
A flowchart showing the processing procedure of the moving image search device for a moving image search.
A schematic diagram showing the search result list displayed on the mobile phone terminal.
A flowchart showing the processing procedure of the moving image search device for generating metadata in Modification 1.
A flowchart showing the processing procedure of the moving image search device for the moving image search in Modification 2.
A flowchart showing the processing procedure of the subroutine for evaluating the trendiness of moving image data in Modification 3.
A flowchart showing the processing procedure of the subroutine for evaluating the trendiness of moving image data in Modification 4.

Explanation of symbols

1 Moving image search device
2 Moving image data storage unit
3 Map data storage unit
4 Communication device
5 Information resource storage unit
6 Information search device
7 Mobile phone terminal
8 Terminal device
9 Recording medium
9a Computer program
10 CPU
17 Communication unit
60 CPU
67 Communication unit
N Communication network

Claims

A moving image search device that receives a search keyword for searching a storage device, which stores a plurality of moving image data each composed of a plurality of frame images, for specific moving image data, and that searches for moving image data based on the received search keyword and metadata associated with each moving image data, the device comprising:
metadata generating means for generating, for each of a plurality of frame images included in the moving image data, metadata of the moving image data based on the frame image;
evaluation means for evaluating the moving image data based on the frame images included in the moving image data or the metadata associated with the moving image data;
search means for searching for moving image data whose metadata includes the search keyword; and
list generating means for generating, when a plurality of moving image data are obtained by the search, a list in which thumbnail images of the respective moving image data or metadata of the respective moving image data are arranged based on the evaluation result of each moving image data.

The moving image search device according to claim 1, wherein the metadata generating means includes:
means for extracting a character region from the frame image; and
character recognition means for recognizing characters in the extracted character region,
and generates the characters obtained by the character recognition as metadata.

The moving image search device according to claim 1 or 2, further comprising means for identifying a frame image at a boundary where the video scene switches by comparing the luminance of each of a plurality of frame images included in the moving image data,
wherein the metadata generating means
generates, for each video scene, metadata of the moving image data based on one frame image constituting that video scene.

The moving image search device according to any one of claims 1 to 3, wherein the evaluation means
rates more highly moving image data that has more metadata including the search keyword.

The moving image search device according to any one of claims 1 to 4, wherein the evaluation means
includes means for searching an external device storing a plurality of information resources for information resources that include the metadata associated with the moving image data, and for acquiring the number of information resources that include the metadata, and rates the moving image data more highly as that number increases.

The moving image search device according to any one of claims 1 to 5, further comprising means for storing place name information in which shooting position information indicating a shooting point of moving image data is associated with the place name of the shooting point indicated by the shooting position information,
wherein the evaluation means includes:
means for specifying, when shooting position information is added to the moving image data, the place name of the shooting point based on the shooting position information and the place name information; and
means for searching an external device storing a plurality of information resources for information resources that include the specified place name, and for acquiring the number of information resources that include the place name,
and rates the moving image data more highly as the number of information resources including the place name related to the shooting position information of the moving image data increases.

The moving image search device according to one of the preceding claims, wherein the evaluation means includes
means for searching an external device storing a plurality of information resources for information resources having a feature amount of a frame image included in the moving image data, and for acquiring the number of information resources having the feature amount,
and rates the moving image data more highly as the number of information resources having the feature amount of the frame image included in the moving image data increases.

The moving image search device according to any one of claims 1 to 7, wherein the evaluation means includes
means for performing image quality correction that changes the gradation values of the pixels constituting a frame image included in the moving image data, and means for comparing the gradation values of the pixels constituting the corrected frame image with the gradation values of the pixels constituting the frame image before correction, and rates the moving image data lower as the difference between the gradation values is larger.

The moving image search device according to one of the preceding claims, wherein the evaluation means
detects block noise edges from a frame image included in the moving image data, and rates the moving image data lower as the block noise edges are larger.

The moving image search device according to claim 1, wherein the evaluation means includes
means for calculating the amount of vibration of a subject by comparing the subject included at the end portions of a plurality of frame images included in the moving image data, and rates the moving image data lower as the amount of vibration is larger.

A moving image search method of receiving a search keyword for searching a storage device, which stores a plurality of moving image data each composed of a plurality of frame images, for specific moving image data, and searching for moving image data based on the received search keyword and metadata associated with each moving image data, the method comprising the steps of:
generating, for each of a plurality of frame images included in the moving image data, metadata of the moving image data based on the frame image;
evaluating the moving image data based on the frame images included in the moving image data or the metadata associated with the moving image data;
searching for moving image data whose metadata includes the search keyword; and
generating, when a plurality of moving image data are obtained by the search, a list in which thumbnail images of the respective moving image data or metadata of the respective moving image data are arranged based on the evaluation result of each moving image data.

A computer program for causing a computer to search a storage device, which stores a plurality of moving image data each composed of a plurality of frame images, for moving image data based on a search keyword and metadata associated with each moving image data, the computer program causing the computer to execute the steps of:
generating, for each of a plurality of frame images included in the moving image data, metadata of the moving image data based on the frame image;
evaluating the moving image data based on the frame images included in the moving image data or the metadata associated with the moving image data;
searching for moving image data whose metadata includes the search keyword; and
generating, when a plurality of moving image data are obtained by the search, a list in which thumbnail images of the respective moving image data or metadata of the respective moving image data are arranged based on the evaluation result of each moving image data.