JP5226241B2

JP5226241B2 - How to add tags

Info

Publication number: JP5226241B2
Application number: JP2007106740A
Authority: JP
Inventors: 健萩原
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2007-04-16
Filing date: 2007-04-16
Publication date: 2013-07-03
Anticipated expiration: 2027-04-16
Also published as: JP2008268985A

Description

The present invention relates to a method, a server, and a program for adding tags to moving image data.

Conventionally, keywords called tags are often associated with content on the Internet to make that content easier to search for. Unlike an index, which is assigned automatically by a management server and is never seen by users, a tag is a keyword that users themselves register to characterize the content. Tags thus make the substance of the content easier to grasp. Furthermore, because tags are referenced at search time, content can be expected to be provided more effectively.

Content on the Internet includes various types of data, and in recent years there are also Web sites that manage moving image data. Against this background, techniques for creating a search index for moving image data have been proposed. For example, Patent Document 1 discloses an apparatus that extracts keywords from the audio of a video (moving image data) and automatically assigns an index to the video.
Patent Document 1: JP 2002-171481 A

The technique described above creates an index for searching for scenes within the moving image data, based on the audio included in that data. However, because it merely extracts keywords contained in the moving image data, those keywords do not necessarily characterize the moving image. The technique is therefore unsuitable for a search that selects a target item from among a plurality of moving image data including other moving images.

Accordingly, an object of the present invention is to provide a method that automatically assigns tags characterizing a given piece of moving image data among a plurality of other moving image data, so that a search engine can effectively retrieve the moving image data or a Web page containing it.

To achieve the above object, the following are specifically provided.

(1) A method of adding a tag to moving image data, the method comprising:
converting audio included in the moving image data into text data;
extracting a keyword from the text data;
calculating an importance of the keyword in a predetermined search system; and
adding the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.

With this configuration, a server that executes the method converts the audio included in the moving image data into text data, extracts a keyword from the text data, calculates the importance of the keyword in a predetermined search system, and assigns the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.

As a result, the server can assign as tags those keywords in the audio of the moving image data whose importance in the search system satisfies the predetermined condition, that is, keywords that characterize the moving image data. Effective tags can thus be assigned automatically, based on information from the search system that also takes other content into account.

(2) The method according to (1), wherein the importance is based on the number of search results returned for the keyword in the predetermined search system.

With this configuration, the server that executes the method uses the number of search results for the keyword (the hit count) as the importance in the search system, so keywords that appear on more Web pages are selected as tags for the moving image data.

Keywords familiar to users are therefore tagged, which makes it more likely that the moving image data, or a Web page containing it, will be found effectively by a search.

(3) The method according to (1), wherein the importance is based on the number of times a search using the keyword has been executed in the predetermined search system.

With this configuration, the server that executes the method uses the number of times a search using the keyword has been executed (the search keyword ranking) as the importance in the search system, so keywords that more users have searched for are selected as tags of the moving image data.

Keywords that users enter frequently are therefore tagged preferentially, which makes it more likely that the moving image data, or a Web page containing it, will be found effectively by a search.

(4) The method according to (2) or (3), wherein the importance is further based on TF·IDF, which is the product of TF (Term Frequency) and IDF (Inverse Document Frequency), indices relating to the frequency of appearance of the keyword.

With this configuration, the server that executes the method uses, as the importance in the search system, TF·IDF, an index based on the frequency of appearance of the keyword, in addition to the hit count or the search keyword ranking.

Thus, for example, even when the hit count or the search keyword ranking does not satisfy the predetermined condition, a keyword can still be selected as a tag if TF·IDF identifies it as a keyword that characterizes the moving image.

As a result, the possibility of overlooking an important keyword because only a single index is used can be reduced, and such a keyword can be extracted as a tag that characterizes the moving image.

(5) The method according to any one of (1) to (4), wherein the keyword is a combination of a plurality of words.

With this configuration, the server that executes the method determines the importance of combinations of a plurality of words and tags accordingly. A tag made of several words can express the content more precisely than a single word, so selecting such tags based on the hit count, the search keyword ranking, or the like in the search system may yield more effective tags.

(6) The method according to any one of (1) to (5), further comprising assigning category data stored in advance as a tag of the moving image data when the keyword has at least a predetermined degree of similarity to that category data.

With this configuration, the server that executes the method can assign category data prepared in advance as a tag, by determining its similarity to the keyword. Thus, for example, when the audio of the moving image data contains the category data, the moving image data can be classified into a category automatically.

(7) The method according to any one of (1) to (6), further comprising:
determining a field for the keyword based on a predetermined rule; and
assigning data indicating the determined field as a tag of the moving image data.

With this configuration, the server that executes the method determines a field for the keyword based on a predetermined rule and assigns data indicating the determined field as a tag of the moving image data.

The server can thus classify the moving image data according to a predetermined rule (a field estimation technique) using an ontology prepared in advance (dictionary data or the like used for classification). As a result, the field can be estimated automatically from the audio information of the moving image data, and a corresponding tag can be assigned.

(8) A server for adding a tag to moving image data, the server comprising:
means for converting audio included in the moving image data into text data;
means for extracting a keyword from the text data;
means for calculating an importance of the keyword in a predetermined search system; and
means for adding the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.

With this configuration, operating the server can be expected to provide the same effects as (1).

(9) A program for adding a tag to moving image data, the program causing a server to execute:
converting audio included in the moving image data into text data;
extracting a keyword from the text data;
calculating an importance of the keyword in a predetermined search system; and
adding the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.

With this configuration, executing the program on the server can be expected to provide the same effects as (1).

According to the present invention, tags that characterize a given piece of moving image data among a plurality of other moving image data are assigned automatically, so that a search engine can effectively retrieve the moving image data or a Web page containing it.

An example of a preferred embodiment of the present invention is described below with reference to the drawings.

[Overall system configuration]
FIG. 1 is a conceptual diagram showing the overall configuration of a content management system according to an example of a preferred embodiment of the present invention.

The tag determination server 10, the search server 20, the content server 30, and the terminal device 40 are connected via a communication line. The content server 30 manages content transmitted (posted) by users of the terminal device 40 and stores it in a content DB.

The content posted by users includes moving image data. The content server 30 requests tag determination for the moving image data by transmitting that data to the tag determination server 10.

The tag determination server 10 requests search information from the search server 20 in order to acquire information to refer to when determining tags. The search server 20 either executes a search using the keyword received from the tag determination server 10 or extracts data collected in advance, and provides the result to the tag determination server 10. The tag determination server 10 then determines tags for the moving image data by the processing described later and transmits them to the content server 30.

A tag determined in this way is stored in the content DB in association with the moving image data, so it can be displayed in response to a request from a user. In addition, because the tag is referenced when searching for moving image data or content containing moving image data, searches based on the substance of the moving image data become possible.

Although the content management system of this embodiment is implemented with a plurality of servers, the configuration is not limited to this; the tag determination server 10 may also incorporate the functions of the search server 20 and the content server 30. Conversely, the tag determination server 10 may be divided into a plurality of servers according to function.

[Functional configuration]
FIG. 2 is a block diagram showing the functions of the tag determination server 10 according to an example of a preferred embodiment of the present invention.

The tag determination server 10 includes a speech recognition unit 110, a morphological analysis unit 120, a tag candidate extraction unit 130, a search information collection unit 140, and a tag determination unit 150, and outputs tag data in response to input of moving image data.

The speech recognition unit 110 extracts the audio data included in the received moving image data and converts it into text data using an existing speech recognition technique. The information contained in the moving image data is thereby extracted as character information.

The morphological analysis unit 120 performs morphological analysis on the text data generated by the speech recognition unit 110 and divides it into words. An existing technique can be used for the morphological analysis.

The tag candidate extraction unit 130 extracts, from the words produced by the morphological analysis unit 120, those that can be assigned as tags of the moving image data (keywords). Specifically, it is desirable to define extraction rules in advance, such as limiting the parts of speech to nouns and verbs.

The search information collection unit 140 queries the search server 20 and collects search information, which serves as reference information for deciding which of the tag candidates extracted by the tag candidate extraction unit 130 are actually assigned as tags (details are described later).

The tag determination unit 150 refers to the search information acquired by the search information collection unit 140, extracts from the tag candidates those that are highly valuable as search keywords, and determines them as tags for the moving image data. Because collecting information from the search server 20 makes it possible to judge a keyword's value across the Web, more effective search keywords can be tagged than when tagging with information obtained from the moving image data alone.

Tags determined in this way are stored in the content server 30 in association with the moving image data. Specifically, it is desirable to store a plurality of tag data for each piece of moving image data, as in the tag data table shown in FIG. 4. Although tag data is described here as being assigned to a moving image, it may instead be assigned to a Web page containing the moving image.

[Processing flow]
FIG. 3 is a diagram showing the flow of tag determination processing in the tag determination server 10 according to an example of a preferred embodiment of the present invention.

In step S101, the tag determination server 10 receives the moving image data to be tagged from the content server 30 and passes it to the speech recognition unit 110.

In step S102, the speech recognition unit 110 extracts audio data from the moving image data received in step S101. Moving image data often includes audio information, and extracting it yields information indicating the substance of the moving image data.

In step S103, the speech recognition unit 110 converts the audio data extracted in step S102 into text data. This can be done with existing techniques, and converting the audio into character information (text data) makes the subsequent language processing possible.

In step S104, the morphological analysis unit 120 performs morphological analysis on the text data generated in step S103 and breaks it into a plurality of words. A plurality of words is thereby obtained as data indicating the substance of the moving image data.

In step S105, the tag candidate extraction unit 130 extracts tag candidates (keywords) from the plurality of words obtained in step S104. Specifically, the tag candidates can be narrowed down by, for example, extracting only a limited set of parts of speech such as nouns and verbs.

A keyword may also be a combination of a plurality of words. For example, extracting a combination of a noun and a verb specifies the content more narrowly and may therefore be better suited as a tag. In that case, it is desirable to perform the extraction in a way that takes the relationships between words into account, for example by using a dependency parsing technique.
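The part-of-speech filter of step S105 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: it assumes the morphological analyzer has already produced (surface, part-of-speech) pairs, and the names `Token`, `ALLOWED_POS`, and `extract_tag_candidates`, as well as the sample tokens, are hypothetical.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    surface: str  # the word as it appears in the recognized text
    pos: str      # part-of-speech label assigned by the analyzer (assumed labels)


# Assumed extraction rule: keep only nouns and verbs.
ALLOWED_POS = {"noun", "verb"}


def extract_tag_candidates(tokens: List[Token]) -> List[str]:
    """Return unique candidate keywords in order of first appearance."""
    seen = set()
    candidates = []
    for tok in tokens:
        if tok.pos in ALLOWED_POS and tok.surface not in seen:
            seen.add(tok.surface)
            candidates.append(tok.surface)
    return candidates


# Hypothetical analyzer output for a short utterance.
tokens = [
    Token("XX restaurant", "noun"),
    Token("is", "auxiliary"),
    Token("near", "adverb"),
    Token("ZZZ hotel", "noun"),
    Token("XX restaurant", "noun"),
]
candidates = extract_tag_candidates(tokens)
```

Multi-word keywords based on dependency parsing would need a parser's output as input; that step is left out of this sketch.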

In step S106, the search information collection unit 140 collects search information from the search server 20 based on the tag candidate keywords extracted in step S105. Specifically, for example, the search server 20 executes a Web search based on a keyword received from the tag determination server 10. As a result, the search count table shown in FIG. 5, for example, is obtained.

The search count table stores the number of search results (hit count) for each keyword. For example, the entry for "XX restaurant" indicates that the search returned 1234 hits.

The search server 20 can also manage keyword input frequency as statistical information, as in the keyword-specific input count table shown in FIG. 6, and may provide the data of this table to the tag determination server 10.

The keyword-specific input count table stores, for searches executed on the search server 20, the number of inputs of each keyword together with the year and month. The search server 20 can therefore obtain the ranking of keywords by input frequency within a predetermined period, and may provide this ranking to the tag determination server 10.

The search server 20 may also calculate and store this ranking at a predetermined interval or timing. For example, the keyword ranking table of FIG. 7 stores the rank of each keyword by number of inputs for each year and month.

The ranking by input frequency may instead be performed by the tag determination server 10, in which case the search information collection unit 140 receives the data of the keyword-specific input count table and calculates the ranks.
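Deriving a monthly ranking from the keyword-specific input count table might look like the sketch below. The table layout (a mapping from keyword and month to a count), the function name `rank_keywords`, and the alphabetical tie-break are assumptions made for illustration; only the count 12345 for "XX restaurant" in April 2007 comes from the text, and the other rows are invented sample values.

```python
from typing import Dict, Tuple

# Hypothetical in-memory form of the keyword-specific input count table:
# (keyword, "YYYY-MM") -> number of times the keyword was entered.
InputCounts = Dict[Tuple[str, str], int]


def rank_keywords(counts: InputCounts, month: str) -> Dict[str, int]:
    """Rank keywords for one month by input count, 1 being the most frequent."""
    monthly = {kw: n for (kw, m), n in counts.items() if m == month}
    # Sort by descending count; ties are broken alphabetically (an assumption).
    ordered = sorted(monthly, key=lambda kw: (-monthly[kw], kw))
    return {kw: position + 1 for position, kw in enumerate(ordered)}


counts = {
    ("XX restaurant", "2007-04"): 12345,  # value given in the text
    ("ZZZ hotel", "2007-04"): 6789,       # illustrative value
    ("XX restaurant", "2007-03"): 11000,  # illustrative value
}
ranks = rank_keywords(counts, "2007-04")
```

Summing counts over several months before ranking would cover the case where the aggregation period is longer than one month.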

In step S107, the tag determination unit 150 determines the importance of each keyword based on the search information acquired in step S106. Specifically, determination conditions are set in advance, for example: a hit count of 1000 or more in the search count table (FIG. 5), 10000 or more inputs in the previous month in the keyword-specific input count table (FIG. 6), or a previous-month rank within the top 10000 in the keyword ranking table (FIG. 7).

Under such conditions, when the search count table of FIG. 5 is used, for example, "XX restaurant" has 1234 hits, which is 1000 or more, so it is selected as a tag.

When the keyword-specific input count table of FIG. 6 is used, "XX restaurant" has 12345 inputs for April 2007, which is 10000 or more, so it is selected as a tag. The aggregation period may be a single month as in this example, or the total over a preset period such as the past six months.

When the keyword ranking table of FIG. 7 is used, "XX restaurant" is ranked 4567th and "ZZZ hotel" is ranked 8888th, both within the top 10000, so they are selected as tags.
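The threshold checks of step S107 can be illustrated with the example values from FIGS. 5 to 7. In this sketch the `SearchInfo` record and the `is_important` function are hypothetical names, and treating the three conditions as alternatives (any one of them suffices) is an assumption; the text lists them as example conditions without fixing how they are combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchInfo:
    """Search information for one keyword; None means the value was not collected."""
    hit_count: Optional[int] = None    # search count table (FIG. 5)
    input_count: Optional[int] = None  # keyword-specific input count table (FIG. 6)
    rank: Optional[int] = None         # keyword ranking table (FIG. 7)


# Example thresholds quoted in the description.
MIN_HITS = 1000
MIN_INPUTS = 10000
MAX_RANK = 10000


def is_important(info: SearchInfo) -> bool:
    """Return True if any available indicator satisfies its condition (assumed OR)."""
    if info.hit_count is not None and info.hit_count >= MIN_HITS:
        return True
    if info.input_count is not None and info.input_count >= MIN_INPUTS:
        return True
    if info.rank is not None and info.rank <= MAX_RANK:
        return True
    return False


xx_restaurant = SearchInfo(hit_count=1234, input_count=12345, rank=4567)
zzz_hotel = SearchInfo(rank=8888)
```

With these inputs both keywords pass, matching the examples in the text; a keyword with only 999 hits and no other data would not.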

If the hit count or ranking does not satisfy these conditions, the importance of the keyword may be determined by, for example, TF·IDF. In that case, if the keyword's TF·IDF value is equal to or greater than a threshold stored in advance, the keyword can be judged to characterize the moving image data and to be highly important.

The TF value, which relates to the frequency of appearance within the document (the moving image data in question), may be calculated each time, whereas the IDF value, which relates to the frequency of appearance across a large set of documents, can be calculated in advance and held in the search server 20.
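The split between a per-document TF and a precomputed IDF can be sketched as below. The text does not give a formula, so this uses one common formulation (relative term frequency multiplied by the natural logarithm of the inverse document frequency) as an assumption; the function names, the `idf_table` lookup, and the sample numbers are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def term_frequency(keyword: str, words: List[str]) -> float:
    """Relative frequency of the keyword within one document's word list."""
    if not words:
        return 0.0
    return Counter(words)[keyword] / len(words)


def inverse_document_frequency(doc_freq: int, num_docs: int) -> float:
    """log(N / df); doc_freq must be positive. Intended to be precomputed."""
    return math.log(num_docs / doc_freq)


def tf_idf(keyword: str, words: List[str], idf_table: Dict[str, float]) -> float:
    """TF is computed on demand; IDF is looked up (0.0 if the keyword is unknown)."""
    return term_frequency(keyword, words) * idf_table.get(keyword, 0.0)


# Illustrative corpus statistics: 1000 documents in total.
idf_table = {
    "restaurant": inverse_document_frequency(10, 1000),
    "the": inverse_document_frequency(1000, 1000),
}
words = ["restaurant", "menu", "restaurant", "the"]
score = tf_idf("restaurant", words, idf_table)
```

A word that appears in every document gets an IDF of zero and therefore never passes a positive threshold, which is the behavior that makes the measure useful for picking characteristic keywords.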

In step S108, which is reached when the importance of the keyword has been determined to be at or above the predetermined level, the tag determination unit 150 assigns the keyword to the moving image data as a tag. Specifically, the keyword is transmitted to the content server 30 and stored in the tag data table (FIG. 4).

In step S109, which is reached when the importance of the keyword is below the predetermined level, the tag determination unit 150 judges that the keyword is unsuitable as a tag and determines a category as processing for assigning a different tag.

Specifically, for example, as shown in FIG. 8, a category that matches or resembles the keyword is determined by similarity determination (matching) against category data stored in advance. Category determination is not limited to such similarity determination. For example, as shown in FIG. 9, field data can be determined using a field estimation technique that refers to a predetermined ontology (dictionary data or the like).
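The similarity-based category matching might be sketched as follows. The text does not say which similarity measure is used, so this example substitutes the character-level ratio from Python's standard `difflib.SequenceMatcher`; the function name `match_category`, the threshold of 0.6, and the sample categories are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def match_category(
    keyword: str, categories: Iterable[str], threshold: float = 0.6
) -> Optional[str]:
    """Return the stored category most similar to the keyword, or None if
    no category reaches the similarity threshold."""
    best_category = None
    best_score = 0.0
    for category in categories:
        # ratio() returns a similarity in [0, 1]; 1.0 means identical strings.
        score = SequenceMatcher(None, keyword.lower(), category.lower()).ratio()
        if score > best_score:
            best_category, best_score = category, score
    return best_category if best_score >= threshold else None


categories = ["restaurant", "hotel", "travel"]  # hypothetical category data
matched = match_category("restaurants", categories)
```

Field estimation against an ontology would replace the string comparison with a dictionary lookup or classifier, which this sketch does not attempt.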

In step S110, the tag determination unit 150 assigns the category data or field data determined in step S109 to the moving image data as a tag. Specifically, as in step S108, the category data or field data is transmitted to the content server 30 and stored in the tag data table (FIG. 4).

In the flow above, the category or field is determined depending on the result of the keyword importance determination in step S107, but steps S109 to S110 may instead always be performed regardless of that result. In that case, both the keywords extracted from the moving image data and the data indicating the category or field are assigned as tags.
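The branching of steps S107 to S110, including the variant in which categorization always runs, can be summarized in a short sketch. The function `determine_tags`, its callback parameters, and the `always_categorize` flag are hypothetical names; the importance check and the category decision are passed in as callables so that the sketch stays independent of how they are implemented, and the sample data is invented.

```python
from typing import Callable, Iterable, List, Optional


def determine_tags(
    keywords: Iterable[str],
    is_important: Callable[[str], bool],
    decide_category: Callable[[str], Optional[str]],
    always_categorize: bool = False,
) -> List[str]:
    """Collect tags for one piece of moving image data."""
    tags: List[str] = []
    for keyword in keywords:
        important = is_important(keyword)           # S107: importance check
        if important and keyword not in tags:
            tags.append(keyword)                    # S108: keyword becomes a tag
        if always_categorize or not important:
            category = decide_category(keyword)     # S109: decide category or field
            if category is not None and category not in tags:
                tags.append(category)               # S110: category becomes a tag
    return tags


# Illustrative stand-ins for the importance check and the category decision.
important_keywords = {"XX restaurant"}
category_of = {"XX restaurant": "gourmet", "sushi": "gourmet"}

default_tags = determine_tags(
    ["XX restaurant"], lambda kw: kw in important_keywords, category_of.get
)
combined_tags = determine_tags(
    ["XX restaurant"],
    lambda kw: kw in important_keywords,
    category_of.get,
    always_categorize=True,
)
```

In the default mode an important keyword is tagged as-is and no category is added for it; with `always_categorize=True` the keyword and its category are both assigned.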

[Server hardware configuration]
FIG. 10 is a diagram showing an example of the hardware configuration of the tag determination server 10 described with reference to FIG. 1. The tag determination server 10 includes a CPU (Central Processing Unit) 1010 constituting the control unit 101 (in a multiprocessor configuration, additional CPUs such as a CPU 1012 may be added), a bus line 1005, a communication I/F 1040, a main memory 1050, a BIOS (Basic Input Output System) 1060, a USB port 1090, an I/O controller 1070, input means such as a keyboard and mouse 1100, and a display device 1022.

Storage means such as a tape drive 1072, a hard disk 1074, an optical disk drive 1076, and a semiconductor memory 1078 can be connected to the I/O controller 1070.

The BIOS 1060 stores a boot program that the CPU 1010 executes when the tag determination server 10 starts up, programs that depend on the hardware of the tag determination server 10, and the like.

The hard disk 1074, which constitutes the storage unit 107, stores various programs that allow the tag determination server 10 to function as a server as well as programs that execute the functions of the present invention, and can further hold various databases as needed.

As the optical disk drive 1076, for example, a DVD-ROM drive, a CD-ROM drive, a DVD-RAM drive, or a CD-RAM drive can be used. In this case, an optical disk 1077 compatible with the drive is used. A program or data can also be read from the optical disk 1077 by the optical disk drive 1076 and provided to the main memory 1050 or the hard disk 1074 via the I/O controller 1070.

Programs provided to the tag determination server 10 are supplied stored on a recording medium such as the hard disk 1074, the optical disk 1077, or a memory card. A program may be installed on the tag determination server 10 and executed by being read from the recording medium via the I/O controller 1070 or downloaded via the communication I/F 1040.

The programs described above may be stored on an internal or external storage medium. In addition to the hard disk 1074, the optical disk 1077, or a memory card, a magneto-optical recording medium such as an MD or a tape medium can be used as the storage medium constituting the storage unit 107. Alternatively, a storage device such as a hard disk 1074 or an optical disk library provided in a server system connected to a dedicated communication line or the Internet may be used as the recording medium, and the program may be provided to the tag determination server 10 via the communication line.

The display device 1022 displays screens for accepting data input from the user and screens showing the results of processing by the tag determination server 10, and includes display devices such as a cathode ray tube (CRT) display or a liquid crystal display (LCD).

The input means accepts input from the user and may consist of a keyboard and mouse 1100 or the like.

The communication I/F 1040 is a network adapter that allows the tag determination server 10 to be connected to terminals via a dedicated network or a public network. The communication I/F 1040 may include a modem, a cable modem, and an Ethernet (registered trademark) adapter.

The above examples mainly describe the tag determination server 10, but the functions described above can also be realized by installing a program on a computer and operating that computer as a server device. Accordingly, the functions realized by the server described as an embodiment of the present invention can also be realized by having the computer execute the method described above, or by installing the program described above on the computer and executing it.

The search server 20 and the content server 30 also have the same configuration as the tag determination server 10.

[Terminal hardware configuration]
The terminal device 40 also has the same configuration as the tag determination server 10 described above. Although the above example describes an implementation on a so-called computer, the terminal may also be realized as any of various devices, such as a mobile phone, a PDA (Personal Data Assistant), or a game machine, as long as the principle of the present invention is applicable.

Although embodiments of the present invention have been described above, the present invention is not limited to those embodiments. The effects described in the embodiments merely list the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiments.

A conceptual diagram showing the overall configuration of the content management system according to an example of a preferred embodiment of the present invention.
A block diagram showing the functions of the tag determination server 10 according to an example of a preferred embodiment of the present invention.
A diagram showing the flow of the tag determination process in the tag determination server 10 according to an example of a preferred embodiment of the present invention.
A diagram showing the tag data table according to an example of a preferred embodiment of the present invention.
A diagram showing the search result count table according to an example of a preferred embodiment of the present invention.
A diagram showing the per-keyword input count table according to an example of a preferred embodiment of the present invention.
A diagram showing the keyword ranking table according to an example of a preferred embodiment of the present invention.
A diagram showing an outline of the similarity determination according to an example of a preferred embodiment of the present invention.
A diagram showing an outline of the field estimation according to an example of a preferred embodiment of the present invention.
A diagram showing an example of the hardware configuration of the tag determination server 10 according to an example of a preferred embodiment of the present invention.

Explanation of symbols

10 Tag determination server
20 Search server
30 Content server
40 Terminal device
110 Speech recognition unit
120 Morphological analysis unit
130 Tag candidate extraction unit
140 Search information collection unit
150 Tag determination unit

Claims

A method in which a server adds a tag to moving image data, the method comprising:
converting audio included in the moving image data into text data;
extracting a keyword from the text data;
acquiring, for the extracted keyword, in response to a query to a predetermined Web search system, search information including statistical information indicating at least the number of search results for the keyword in the Web search system or the number of times a search by the keyword has been executed;
calculating the importance of the keyword based on the acquired search information; and
adding the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.
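
The steps of this claim can be sketched in Python as follows. This is only an illustrative sketch under several assumptions: the transcript is taken as already produced by speech recognition, a whitespace split stands in for keyword extraction, `FAKE_SEARCH_STATS` is a hypothetical stand-in for the statistics returned by the Web search system, and the scoring formula and threshold in `importance` and `decide_tags` are invented for illustration and are not taken from the claims.

```python
import math
from dataclasses import dataclass


@dataclass
class SearchInfo:
    hit_count: int    # number of search results for the keyword
    query_count: int  # number of times the keyword has been searched


# Hypothetical statistics standing in for a query to the Web search system.
FAKE_SEARCH_STATS = {
    "soccer": SearchInfo(hit_count=5_000_000, query_count=90_000),
    "goal": SearchInfo(hit_count=80_000_000, query_count=20_000),
    "the": SearchInfo(hit_count=900_000_000, query_count=10),
}


def extract_keywords(text):
    """Return unique lowercase words in order of first appearance.

    Assumption: a plain whitespace split replaces real keyword extraction.
    """
    keywords = []
    for raw in text.lower().split():
        word = raw.strip(".,!?")
        if word and word not in keywords:
            keywords.append(word)
    return keywords


def fetch_search_info(keyword):
    """Look up search statistics; unknown keywords get zero counts."""
    return FAKE_SEARCH_STATS.get(keyword, SearchInfo(0, 0))


def importance(info):
    """Assumed score: frequently searched keywords rank higher,
    keywords with a very large number of results rank lower."""
    return math.log1p(info.query_count) / (1.0 + math.log10(1 + info.hit_count))


def decide_tags(transcript, threshold=1.0):
    """Keep the keywords whose importance meets the (assumed) threshold."""
    return [
        kw
        for kw in extract_keywords(transcript)
        if importance(fetch_search_info(kw)) >= threshold
    ]
```

In a real system, `fetch_search_info` would be replaced by an actual query to the search system, and the condition on the importance would follow whatever the embodiment defines.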

The method according to claim 1, wherein the search information includes TF·IDF, which is the product of the indices TF and IDF relating to the appearance frequency of the keyword.
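
As a rough illustration of the TF·IDF index named in this claim, the sketch below uses one common textbook form: TF is the keyword's share of the tokens in the transcript, and IDF is the logarithm of the total number of documents divided by the number of documents containing the keyword. The function name, the parameters `doc_freq` and `total_docs`, and the `1 +` smoothing term are assumptions for this sketch; the claim itself does not fix a particular formula.

```python
import math
from collections import Counter


def tf_idf(keyword, tokens, doc_freq, total_docs):
    """TF·IDF of `keyword` for one transcript.

    tokens     -- words of the transcript (assumed already extracted)
    doc_freq   -- number of documents containing the keyword
                  (for example, a search result count; an assumption)
    total_docs -- total number of documents in the collection (assumed known)
    """
    if not tokens:
        return 0.0
    tf = Counter(tokens)[keyword] / len(tokens)
    idf = math.log(total_docs / (1 + doc_freq))  # 1 + avoids division by zero
    return tf * idf
```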

The method according to claim 1 or claim 2, wherein the keyword is a combination of a noun and a verb that are in a dependency relationship with each other.

The method according to any one of claims 1 to 3, further comprising a step of adding, when the keyword has at least a predetermined degree of similarity to previously stored category data, the category data as a tag of the moving image data.
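
The category matching in this claim might look like the following sketch. It assumes a simple character-level similarity from Python's standard `difflib.SequenceMatcher`; the category list, the function name, and the 0.8 threshold are hypothetical, and the similarity measure actually used in the embodiment may differ.

```python
from difflib import SequenceMatcher

# Hypothetical previously stored category data.
CATEGORIES = ["sports", "music", "cooking"]


def matching_categories(keyword, categories=CATEGORIES, min_similarity=0.8):
    """Return the categories whose similarity to the keyword is at least
    `min_similarity` (an assumed stand-in for the predetermined degree)."""
    return [
        category
        for category in categories
        if SequenceMatcher(None, keyword.lower(), category.lower()).ratio()
        >= min_similarity
    ]
```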

The method according to any one of claims 1 to 4, further comprising:
determining, when the importance is less than a predetermined value, a field for the keyword based on a predetermined rule; and
adding data indicating the determined field as a tag of the moving image data.
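
A minimal sketch of this fallback, assuming the predetermined rule is a plain lookup from keywords to fields: the table `FIELD_RULES`, the function name, and the default threshold are all hypothetical and serve only to show the control flow.

```python
# Hypothetical rule table mapping each field to keywords that indicate it.
FIELD_RULES = {
    "sports": {"goal", "soccer"},
    "cooking": {"recipe", "oven"},
}


def tag_for_keyword(keyword, importance, min_importance=1.0):
    """Return the keyword itself when it is important enough; otherwise
    return the field determined by the rule table, or None if no rule applies."""
    if importance >= min_importance:
        return keyword
    for field, vocabulary in FIELD_RULES.items():
        if keyword in vocabulary:
            return field
    return None
```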

A server for adding a tag to moving image data, the server comprising:
means for converting audio included in the moving image data into text data;
means for extracting a keyword from the text data;
means for acquiring, for the extracted keyword, in response to a query to a predetermined Web search system, search information including statistical information indicating at least the number of search results for the keyword in the Web search system or the number of times a search by the keyword has been executed;
means for calculating the importance of the keyword based on the acquired search information; and
means for adding the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.

A program for adding a tag to moving image data, the program causing a server to execute the steps of:
converting audio included in the moving image data into text data;
extracting a keyword from the text data;
acquiring, for the extracted keyword, in response to a query to a predetermined Web search system, search information including statistical information indicating at least the number of search results for the keyword in the Web search system or the number of times a search by the keyword has been executed;
calculating the importance of the keyword based on the acquired search information; and
adding the keyword as a tag of the moving image data when the calculated importance satisfies a predetermined condition.