JP2012164018A - Tag recommendation device

Info

Publication number: JP2012164018A
Application number: JP2011021881A
Authority: JP
Inventors: Yuya Noda; 雄也野田; Kenji Mori; 健治森
Original assignee: Nifty Corp
Current assignee: Nifty Corp
Priority date: 2011-02-03
Filing date: 2011-02-03
Publication date: 2012-08-30
Anticipated expiration: 2031-02-03
Also published as: JP5639490B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique for recommending a tag for a document. SOLUTION: A tag recommendation device includes: collection means for collecting documents that include a tag composed of a specific symbol and a character string; extraction means for extracting, from each collected document, the words and tags it contains and the combinations of a tag and a word included in the same document; calculation means for calculating, for each combination of a word and a tag, a tag-word co-occurrence scale indicating the degree to which the tag and the word co-occur in the same document, based on the extracted words, tags, and tag-word combinations and on the number of documents; and recommendation means for receiving a document, extracting the words included in the received document, and calculating a recommendation score of each tag for the received document based on the tag-word co-occurrence scales associated with all the extracted words.

Description

The present invention relates to a tag recommendation device that presents tags.

In recent years, services such as blogs and microblogs have become widespread. A microblog is a service that lets a user write and send a short text of, for example, about 100 characters to unspecified recipients, and lets anyone read the texts that have been sent. These services are characterized by the ease with which individuals can publish documents they have written. As a result, the amount of information circulating on networks is increasing rapidly.

These services have a function that groups documents when a poster (the sender of a document) adds to the document, at posting time, a tag composed of a specific symbol (for example, the # symbol) and a character string. For example, searching for “#abcde” extracts the group of documents to which “#abcde” has been added. This function is intended to make it easier for readers to find the documents they want, and it also fosters loose communication among authors who use the same tag.

FIG. 1 is a diagram illustrating an example of a tagged document. In the example of FIG. 1, the tag “#weather” is added to the document “The weather is nice today.” The poster posts the document including this tag. Users of a service such as a microblog can use the tag “#weather” to extract documents related to this document.

JP 2010-282613 A

In services such as blogs and microblogs, posters can freely write the tags they add to documents. Consequently, a poster who does not know an appropriate existing tag may create a new one, and multiple different tags with similar meanings can arise. In addition, there are so many kinds of tags that it is difficult for a poster to know all of them. As a result, information that should be linked by a common tag becomes scattered, and the burden on readers searching for information increases.

An object of the present invention is to provide a technique for recommending a tag for a document.

To solve the above problem, aspects of the present invention adopt the following configuration.

One aspect of the present invention is a tag recommendation device comprising:
collection means for collecting documents that include a tag composed of a specific symbol and a character string;
extraction means for extracting, from each collected document, the words included in the document, the tags, and the combinations of a tag and a word included in the same document;
calculation means for calculating, for each combination of a word and a tag, a tag word co-occurrence scale indicating the degree of co-occurrence of the tag and the word in the same document, based on the words, the tags, and the combinations of a tag and a word included in the same document that were extracted by the extraction means, and on the number of documents; and
recommendation means for receiving a document, extracting the words included in the received document, and calculating a recommendation score for each tag for the received document based on the tag word co-occurrence scales for all the extracted words.

Other aspects of the present invention may be a method or a program that realizes any of the above configurations, or a computer-readable recording medium on which the program is recorded.

According to the aspects of the present invention, a technique for recommending a tag for a document can be provided.

FIG. 1 is a diagram illustrating an example of a tagged document.
FIG. 2 is a diagram illustrating an example of an information processing system.
FIG. 3 is a diagram illustrating an example of a word appearance frequency table.
FIG. 4 is a diagram illustrating an example of a tag appearance frequency table.
FIG. 5 is a diagram illustrating an example of a tag word co-occurrence frequency table.
FIG. 6 is a diagram illustrating an example of a tag usage history table.
FIG. 7 is a diagram illustrating an example of a tag usage scale table.
FIG. 8 is a diagram illustrating an example of a tag word co-occurrence scale table.
FIG. 9 is a diagram illustrating a hardware configuration example of an information processing apparatus.
FIG. 10 is a diagram illustrating an example (1) of the operation sequence of the information processing system.
FIG. 11 is a diagram illustrating an example (2) of the operation sequence of the information processing system.
FIG. 12 is a diagram illustrating an example of the operation flow of the collection unit.
FIG. 13 is a diagram illustrating an example of the operation flow for calculating the tag usage scale by the calculation unit.
FIG. 14 is a diagram illustrating an example of the operation flow for calculating the tag word co-occurrence scale by the calculation unit.
FIG. 15 is a diagram illustrating an example (1) of the operation flow of the recommendation unit.
FIG. 16 is a diagram illustrating an example (2) of the operation flow of the recommendation unit.

Embodiments are described below with reference to the drawings. The configurations of the embodiments are examples, and the present invention is not limited to the configurations of the disclosed embodiments.

Embodiment
(Configuration example)
FIG. 2 is a diagram illustrating an example of the information processing system of the present embodiment. The information processing system 10 in FIG. 2 includes a server device 100, a storage device 200, and a user terminal 300. The server device 100 is connected to the storage device 200 and to the user terminal 300, each via a network or the like. A plurality of user terminals 300 can be connected to the server device 100. The server device 100 may include the storage device 200. The network may be a public network such as the Internet, or an internal network such as a LAN (Local Area Network) or a WAN (Wide Area Network).

The server device 100 includes a service unit 110, a collection unit 120, a calculation unit 130, and a recommendation unit 140. Any of the service unit 110, the collection unit 120, the calculation unit 130, and the recommendation unit 140 may be included in a separate server device. For example, a server device including the service unit 110, a server device including the collection unit 120, a server device including the calculation unit 130, and a server device including the recommendation unit 140 may be connected via a network or the like and operate together as the server device 100. Realizing the service unit 110, the collection unit 120, the calculation unit 130, and the recommendation unit 140 with a plurality of server devices distributes the load of each processing unit.

The service unit 110 provides a service such as a microblog to the user terminal 300 and the like. For that service, the service unit 110 stores the documents posted from the user terminal 300 and the like, together with the date and time at which each document was posted. The service unit 110 provides these documents in response to a request from the collection unit 120. Each provided document includes information on the date and time at which it was posted.

The collection unit 120 requests the documents posted to the service unit 110 and receives the documents (a document group) from the service unit 110. From the document group provided by the service unit 110, the collection unit 120 extracts the tagged documents. A tagged document is a document that includes a tag composed of a specific symbol (for example, the # symbol) and a character string. The collection unit 120 requests the documents posted to the service unit 110 at predetermined intervals.
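The extraction of tagged documents described above can be sketched as follows. This is a minimal illustration, not the implementation of the collection unit 120. The regular expression, the function names extract_tags and filter_tagged, and the dictionary layout of a document are assumptions made for this example; in particular, the rule that a tag is a # symbol (half-width or full-width) followed by a run of word characters is only one plausible reading of "a specific symbol and a character string".

```python
import re

# Assumption: a tag is '#' (or full-width '＃') followed by word characters.
TAG_PATTERN = re.compile(r"[#＃](\w+)")


def extract_tags(text):
    """Return the tags found in a document text, normalized to a half-width '#'."""
    return ["#" + name for name in TAG_PATTERN.findall(text)]


def filter_tagged(documents):
    """Keep only the documents whose text contains at least one tag."""
    return [doc for doc in documents if TAG_PATTERN.search(doc["text"])]


# Hypothetical document group as it might be received from the service unit.
docs = [
    {"text": "今日は天気がいいです。 #weather", "posted_at": "2011-02-03T09:00"},
    {"text": "no tag here", "posted_at": "2011-02-03T09:05"},
]
tagged = filter_tagged(docs)
```

Here tagged contains only the first document, and extract_tags applied to its text yields the single tag "#weather".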

The collection unit 120 performs morphological analysis on all the extracted documents and, as the result, obtains the word information included in each document. From the word information, the collection unit 120 counts the number of appearances of each word (word appearance frequency). The collection unit 120 also obtains the tag information included in each document and, from it, counts the number of appearances of each tag (tag appearance frequency). Furthermore, for each combination of a tag and a word, the collection unit 120 counts the number of documents in which that tag and that word appear together (tag word co-occurrence frequency). The collection unit 120 stores the word appearance frequency data, the tag appearance frequency data, and the tag word co-occurrence frequency data in the frequency DB 210 as the word appearance frequency table 211, the tag appearance frequency table 212, and the tag word co-occurrence frequency table 213, respectively.
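The three counts described above can be sketched as follows. This is an illustration under stated assumptions, not the collection unit's actual code. The function tokenize is a hypothetical stand-in for morphological analysis: it simply splits on whitespace and treats tokens beginning with # as tags, which would not work for unsegmented Japanese text. The names build_frequency_tables, word_freq, tag_freq, and cooc_freq are likewise invented for this example.

```python
from collections import Counter


def tokenize(text):
    """Stand-in for morphological analysis (assumption): split on whitespace."""
    tokens = text.split()
    words = [t for t in tokens if not t.startswith("#")]
    tags = [t for t in tokens if t.startswith("#")]
    return words, tags


def build_frequency_tables(documents):
    """Build word, tag, and tag-word co-occurrence counts from tagged documents."""
    word_freq = Counter()   # word -> number of appearances
    tag_freq = Counter()    # tag -> number of appearances
    cooc_freq = Counter()   # (tag, word) -> number of documents containing both
    for text in documents:
        words, tags = tokenize(text)
        word_freq.update(words)
        tag_freq.update(tags)
        # Each (tag, word) pair is counted once per document.
        for tag in set(tags):
            for word in set(words):
                cooc_freq[(tag, word)] += 1
    return word_freq, tag_freq, cooc_freq


sample = ["fine weather today #weather", "rainy weather #weather #rain"]
word_freq, tag_freq, cooc_freq = build_frequency_tables(sample)
```

Each Counter corresponds to one of the tables 211 to 213: a key and its count together play the role of one record.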

The collection unit 120 stores tag usage history data, each item of which is a combination of a tag and the date and time at which a document including that tag was posted, in the frequency DB 210 as the tag usage history table 214.

The calculation unit 130 acquires the tag usage history data from the tag usage history table 214 stored in the frequency DB 210 and calculates a tag usage scale for each tag. The tag usage scale is a measure of the change in how frequently a tag is used. The calculation of the tag usage scale is described later. The calculation unit 130 stores tag usage scale data, in which each tag is associated with its calculated tag usage scale, in the tag usage scale DB 220 as the tag usage scale table 221. The calculation unit 130 may instead acquire the tag usage history data from the collection unit 120.
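This passage does not give the formula for the tag usage scale, so the following is only a hypothetical illustration of "a measure of the change in how frequently a tag is used". It assumes the scale is the ratio of a tag's usage count in the most recent time window to its count in the preceding window, with add-one smoothing. The function name tag_usage_scale, the window length, and the smoothing are all assumptions and are not taken from the source.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def tag_usage_scale(history, now, window=timedelta(days=1)):
    """Hypothetical scale: (recent count + 1) / (previous count + 1) per tag.

    history is an iterable of (tag, posted_at) pairs, as in a usage history table.
    """
    recent = defaultdict(int)
    previous = defaultdict(int)
    for tag, posted_at in history:
        age = now - posted_at
        if timedelta(0) <= age < window:
            recent[tag] += 1
        elif window <= age < 2 * window:
            previous[tag] += 1
    tags = set(recent) | set(previous)
    return {tag: (recent[tag] + 1) / (previous[tag] + 1) for tag in tags}


now = datetime(2011, 2, 3, 12, 0)
history = [
    ("#weather", now - timedelta(hours=1)),
    ("#weather", now - timedelta(hours=2)),
    ("#weather", now - timedelta(hours=30)),
    ("#old", now - timedelta(hours=30)),
]
scale = tag_usage_scale(history, now)
```

Under this assumed definition, a value above 1 indicates a tag whose use is increasing and a value below 1 indicates a tag whose use is declining.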

The calculation unit 130 also acquires the word appearance frequency data from the word appearance frequency table 211, the tag appearance frequency data from the tag appearance frequency table 212, and the tag word co-occurrence frequency data from the tag word co-occurrence frequency table 213, all stored in the frequency DB 210. Based on the acquired data, the calculation unit 130 calculates a tag word co-occurrence scale for each tag-word combination. The tag word co-occurrence scale is a measure of the degree to which a tag and a word co-occur. The calculation of the tag word co-occurrence scale is described later. The calculation unit 130 stores tag word co-occurrence scale data, in which each tag-word combination is associated with its calculated tag word co-occurrence scale, in the tag word co-occurrence scale DB 230 as the tag word co-occurrence scale table 231.
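The formula for the tag word co-occurrence scale is not given in this passage. As a stand-in, the sketch below uses pointwise mutual information (PMI), a common co-occurrence measure that can be computed from exactly the inputs named above (word counts, tag counts, tag-word co-occurrence counts, and the number of documents). Using PMI here is an assumption for illustration, not the method of the calculation unit 130, and the function name and argument layout are hypothetical. The sketch also treats each count as a document count.

```python
import math


def tag_word_cooccurrence_scale(word_freq, tag_freq, cooc_freq, num_docs):
    """Stand-in measure (assumption): PMI(tag, word) = log(P(t, w) / (P(t) * P(w)))."""
    scale = {}
    for (tag, word), n_tw in cooc_freq.items():
        p_tw = n_tw / num_docs
        p_t = tag_freq[tag] / num_docs
        p_w = word_freq[word] / num_docs
        scale[(tag, word)] = math.log(p_tw / (p_t * p_w))
    return scale


# Hypothetical counts over four collected documents.
word_freq = {"weather": 2, "lunch": 2}
tag_freq = {"#weather": 2}
cooc_freq = {("#weather", "weather"): 2, ("#weather", "lunch"): 1}
pmi = tag_word_cooccurrence_scale(word_freq, tag_freq, cooc_freq, num_docs=4)
```

With these counts, the pair ("#weather", "weather") scores log 2, because the two always appear together, while ("#weather", "lunch") scores 0, which is the value PMI gives to a pair that co-occurs exactly as often as chance would predict.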

The recommendation unit 140 receives, from the user terminal 300, a document that is to be posted to a service such as a microblog. The recommendation unit 140 performs morphological analysis on the received document and obtains the word information included in it. Based on the obtained word information, the recommendation unit 140 extracts the tag word co-occurrence scale data containing each word from the tag word co-occurrence scale table 231 stored in the tag word co-occurrence scale DB 230. For the tags included in the extracted tag word co-occurrence scale data, the recommendation unit 140 extracts the tag usage scale data containing those tags from the tag usage scale table 221 of the tag usage scale DB 220. For each tag, the recommendation unit 140 calculates a recommendation score from the tag word co-occurrence scales and the tag usage scale. The recommendation score of a tag indicates how strongly the tag is recommended for addition to the document to be posted. The recommendation unit 140 transmits the N tags with the highest recommendation scores (N is a predetermined value), together with their recommendation scores, to the user terminal 300.
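The lookup, scoring, and top-N selection described above can be sketched as follows. How the two scales are combined into a recommendation score is not specified in this passage. The sketch assumes that a tag's score is the sum of its co-occurrence scales over the words of the received document, multiplied by its tag usage scale, with a default of 1.0 when no usage scale is stored. That rule, the function name recommend_tags, and the sample values are assumptions for illustration only.

```python
import heapq
from collections import defaultdict


def recommend_tags(words, cooc_scale, usage_scale, top_n=3):
    """Return the top_n (tag, score) pairs for the words of a received document."""
    word_set = set(words)
    totals = defaultdict(float)
    # Collect the co-occurrence scales of every (tag, word) pair whose word
    # appears in the received document.
    for (tag, word), value in cooc_scale.items():
        if word in word_set:
            totals[tag] += value
    # Assumed combination rule: summed co-occurrence scale times usage scale.
    scores = {tag: total * usage_scale.get(tag, 1.0) for tag, total in totals.items()}
    return heapq.nlargest(top_n, scores.items(), key=lambda item: item[1])


# Hypothetical contents of the co-occurrence scale and usage scale tables.
cooc_scale = {
    ("#weather", "weather"): 2.0,
    ("#weather", "fine"): 1.0,
    ("#lunch", "fine"): 0.5,
    ("#rain", "rainy"): 3.0,
}
usage_scale = {"#weather": 1.5, "#lunch": 2.0}
result = recommend_tags(["fine", "weather", "today"], cooc_scale, usage_scale)
```

In this example "#rain" is never scored because none of its associated words appears in the received document, so result lists "#weather" first and "#lunch" second.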

The storage device 200 includes a frequency DB 210 (DB: database), a tag usage scale DB 220, and a tag word co-occurrence scale DB 230. The frequency DB 210, the tag usage scale DB 220, and the tag word co-occurrence scale DB 230 may each be included in separate storage devices.

The frequency DB 210 includes a word appearance frequency table 211, a tag appearance frequency table 212, a tag word co-occurrence frequency table 213, and a tag usage history table 214.

FIG. 3 is a diagram illustrating an example of the word appearance frequency table. The word appearance frequency table 211 stores word appearance frequency data in which each word that appeared in a document is associated with the number of appearances of that word. In a table, the combination of one item of information with another (for example, a word and its number of appearances) is also called a record.

FIG. 4 is a diagram illustrating an example of the tag appearance frequency table. The tag appearance frequency table 212 stores tag appearance frequency data in which each tag that appeared in a document is associated with the number of appearances of that tag.

FIG. 5 is a diagram illustrating an example of the tag word co-occurrence frequency table. The tag word co-occurrence frequency table 213 stores tag word co-occurrence frequency data in which each combination of a tag and a word that appeared in the same document is associated with the number of appearances of that combination.

FIG. 6 is a diagram illustrating an example of the tag usage history table. The tag usage history table 214 stores tag usage history data, each item of which is a combination of a tag and the date and time at which a document including that tag was posted.

The tag usage scale DB 220 includes a tag usage scale table 221.

FIG. 7 is a diagram illustrating an example of the tag usage scale table. The tag usage scale table 221 stores tag usage scale data in which each tag is associated with the tag usage scale calculated by the calculation unit 130.

The tag word co-occurrence scale DB 230 includes a tag word co-occurrence scale table 231.

FIG. 8 is a diagram illustrating an example of the tag word co-occurrence scale table. The tag word co-occurrence scale table 231 stores tag word co-occurrence scale data in which each combination of a tag and a word is associated with the tag word co-occurrence scale calculated by the calculation unit 130.

The user terminal 300 transmits to the recommendation unit 140 a document that the user has entered and intends to post to a microblog or the like. The user terminal 300 receives from the recommendation unit 140 the tags recommended for the transmitted document and their recommendation scores. The user terminal 300 presents the received tags and recommendation scores to the user and has the user select a tag to add to the document to be posted. When the user selects a tag to add to the document, the user terminal 300 transmits (posts) the document with that tag added to the service unit 110.

The server device 100 can be realized using a general-purpose computer such as a personal computer (PC) or a dedicated computer such as a server machine.

The user terminal 300 can be realized using a dedicated or general-purpose computer such as a PC or a PDA (Personal Digital Assistant), or an electronic device equipped with a computer. The user terminal 300 can also be realized using a dedicated or general-purpose computer such as a smartphone, a mobile phone, or a car navigation device, or an electronic device equipped with a computer.

FIG. 9 is a diagram illustrating a hardware configuration example of an information processing apparatus. The server device 100 and the user terminal 300 are each realized by, for example, an information processing apparatus 1000 as illustrated in FIG. 9.

The computer, that is, the information processing apparatus 1000, includes a CPU (Central Processing Unit) 1002, a memory 1004, a storage unit 1006, an input unit 1008, an output unit 1010, and a communication unit 1012.

In the information processing apparatus 1000, the CPU 1002 loads a program stored in the storage unit 1006 into the work area of the memory 1004 and executes it, and peripheral devices are controlled through the execution of the program. In this way, functions that meet a predetermined purpose can be realized.

The CPU 1002 performs processing according to the program stored in the storage unit 1006.

The memory 1004 is where the CPU 1002 caches programs and data and expands its work area. The memory 1004 includes, for example, a RAM (Random Access Memory) and a ROM (Read Only Memory).

The storage unit 1006 stores various programs and various data on a recording medium in a readable and writable manner. The storage unit 1006 is, for example, an EPROM (Erasable Programmable ROM), a solid state drive device, or a hard disk drive (HDD) device. Other examples of the storage unit 1006 include a CD (Compact Disc) drive device, a DVD (Digital Versatile Disk) drive device, a +R/+RW drive device, an HD DVD (High-Definition Digital Versatile Disk) drive device, and a BD (Blu-ray Disk) drive device. Examples of the recording medium include a silicon disk containing nonvolatile semiconductor memory (flash memory), a hard disk, a CD, a DVD, a +R/+RW disc, an HD DVD, and a BD. CDs include CD-R (Recordable), CD-RW (Rewritable), and CD-ROM. DVDs include DVD-R and DVD-RAM (Random Access Memory). BDs include BD-R, BD-RE (Rewritable), and BD-ROM. The storage unit 1006 can also include removable media, that is, portable recording media. A removable medium is, for example, a USB (Universal Serial Bus) memory or a disc recording medium such as a CD or a DVD.

The memory 1004 and the storage unit 1006 are computer-readable recording media.

The input unit 1008 receives operation instructions and the like from a user or the like. The input unit 1008 is an input device such as a keyboard, a pointing device, a wireless remote controller, a microphone, or a camera. Information input from the input unit 1008 is passed to the CPU 1002.

The output unit 1010 outputs data processed by the CPU 1002 and data stored in the memory 1004. The output unit 1010 is an output device such as a CRT (Cathode Ray Tube) display, an LCD (Liquid Crystal Display), a PDP (Plasma Display Panel), an EL (Electroluminescence) panel, a printer, or a speaker.

The communication unit 1012 transmits data to and receives data from external devices. The communication unit 1012 is connected to an external device via, for example, a signal line. The external device is, for example, another information processing apparatus or a storage device. The communication unit 1012 is, for example, a LAN (Local Area Network) interface board or a wireless communication circuit for wireless communication.

The information processing apparatus 1000 stores an operating system, various programs, various tables, and the like in the storage unit 1006.

The operating system is software that mediates between software and hardware and manages memory space, files, processes, tasks, and the like. The operating system includes a communication interface. The communication interface is a program that exchanges data with other external devices connected via the communication unit 1012.

In the information processing apparatus 1000 that realizes the server device 100, the CPU 1002 loads the program stored in the storage unit 1006 into the memory 1004 and executes it, thereby realizing the functions of the service unit 110, the collection unit 120, the calculation unit 130, and the recommendation unit 140.

Examples of the storage device 200 include a solid state drive device, a hard disk drive device, a CD drive device, a DVD drive device, a +R/+RW drive device, an HD DVD drive device, and a BD drive device. The storage device 200 can also include removable media, that is, portable recording media.

(Operation example)
<Overall>
An operation example of the information processing system 10 of the present embodiment is described below.

FIG. 10 and FIG. 11 are diagrams illustrating an example of the operation sequence of the information processing system. The connectors “A”, “B”, “C”, “D”, “E”, “F”, “G”, and “H” in FIG. 10 connect to “A”, “B”, “C”, “D”, “E”, “F”, “G”, and “H” in FIG. 11, respectively.

At a predetermined cycle, the collection unit 120 requests the documents posted to the service unit 110 and collects from the service unit 110 the documents (a document group) accumulated there (SQ1002). The collection unit 120 may also collect accumulated documents from other server devices.

From the document group provided by the service unit 110, the collection unit 120 extracts the tagged documents. The collection unit 120 extracts the words and tags included in each document. From the extracted words, tags, and the like, the collection unit 120 generates word appearance frequency data, tag appearance frequency data, tag word co-occurrence frequency data, and tag usage history data (SQ1004).

The collection unit 120 stores the generated word appearance frequency data, tag appearance frequency data, tag word co-occurrence frequency data, and tag usage history data in the frequency DB 210 as the word appearance frequency table 211, the tag appearance frequency table 212, the tag word co-occurrence frequency table 213, and the tag usage history table 214, respectively (SQ1006).

算出部１３０は、頻度ＤＢ２１０に格納されるタグ利用履歴テーブル２１４から、タグ利用履歴データを取得する（ＳＱ１００８）。算出部１３０は、収集部１２０から、タグ利用履歴データを取得してもよい。算出部１３０は、タグ利用履歴データから、タグ毎にタグ利用尺度を算出する（ＳＱ１０１０）。算出部１３０は、タグと算出したタグ利用尺度とを対応づけたタグ利用尺度データを、タグ利用尺度ＤＢ２２０に、タグ利用尺度テーブル２２１として、格納する（ＳＱ１０１２）。 The calculation unit 130 acquires tag usage history data from the tag usage history table 214 stored in the frequency DB 210 (SQ1008). The calculation unit 130 may acquire tag usage history data from the collection unit 120. The calculation unit 130 calculates a tag usage scale for each tag from the tag usage history data (SQ1010). The calculation unit 130 stores the tag usage scale data in which the tag is associated with the calculated tag usage scale in the tag usage scale DB 220 as the tag usage scale table 221 (SQ1012).

算出部１３０は、頻度ＤＢ２１０から、単語出現頻度データ、タグ出現頻度データ、タグ単語共起頻度データを取得する（ＳＱ１０１４）。算出部１３０は、取得したこれらのデータから、タグと単語との組み合わせ毎にタグ単語共起尺度を算出する（ＳＱ１０１６）。算出部１３０は、タグ−単語の組み合わせと算出したタグ単語共起尺度とを対応付けたタグ単語共起尺度データを、タグ単語共起尺度ＤＢ２３０に、タグ単語共起尺度テーブル２３１として、格納する（ＳＱ１０１８）。 The calculation unit 130 acquires word appearance frequency data, tag appearance frequency data, and tag word co-occurrence frequency data from the frequency DB 210 (SQ1014). The calculation unit 130 calculates a tag word co-occurrence scale for each combination of a tag and a word from these acquired data (SQ1016). The calculation unit 130 stores the tag word co-occurrence scale data in which the tag-word combination and the calculated tag word co-occurrence scale are associated with each other in the tag word co-occurrence scale DB 230 as the tag word co-occurrence scale table 231. (SQ1018).

Through the operations so far, the data needed to recommend tags to the user terminal 300 has been generated.

The user terminal 300 transmits to the recommendation unit 140 a document that the user has entered and intends to post to a microblog or the like (SQ1020). On receiving this document from the user terminal 300, the recommendation unit 140 performs morphological analysis on it and acquires the word information included in the document. Based on the acquired word information, the recommendation unit 140 extracts the tag word co-occurrence scale data that include each word from the tag word co-occurrence scale table 231 stored in the tag word co-occurrence scale DB 230 (SQ1022). For the tags included in the extracted tag word co-occurrence scale data, the recommendation unit 140 extracts the tag usage scale data that include those tags from the tag usage scale table 221 of the tag usage scale DB 220 (SQ1024). For each tag, the recommendation unit 140 calculates a recommendation score from the tag word co-occurrence scale and the tag usage scale (SQ1026). The recommendation unit 140 transmits the N tags with the highest recommendation scores (N is a predetermined number), together with their scores, to the user terminal 300 (SQ1028).

The user terminal 300 receives from the recommendation unit 140 the tags recommended for the transmitted document and their recommendation scores. The user terminal 300 presents the received tags and scores to the user and has the user select the tag to add to the document to be posted (SQ1030). When the user selects a tag, the user terminal 300 transmits (posts) the document with that tag added to the service unit 110 (SQ1032). In this way, the user can add an appropriate tag to the document to be posted. Posted documents are accumulated in the service unit 110 and collected by the collection unit 120.

<Collection unit>
FIG. 12 is a diagram illustrating an example of the operation flow of the collection unit. The operation flow in FIG. 12 runs, for example, at predetermined intervals.

The collection unit 120 requests the documents posted to the service unit 110 and receives a document (document group) from the service unit 110 (S101). The collection unit 120 may also request posted documents from other server devices and collect them. The collected documents are, for example, documents posted through a blog service or a microblog service. Each collected document includes information on the date and time at which it was posted.

The collection unit 120 extracts the tagged documents from the collected document group (S102). A tagged document is a document that includes a tag composed of a specific symbol (for example, the # symbol) and a character string. The collection unit 120 may instead collect only tagged documents from the service unit 110.
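Step S102 can be sketched roughly as follows. This is a minimal illustration, not the implementation described in the publication: the regular expression `TAG_PATTERN` and the function names `extract_tags` and `filter_tagged` are assumptions introduced here, and the exact character set allowed in a tag is not specified in the text.

```python
import re

# Assumption: a tag is "#" followed by one or more characters that are
# neither whitespace nor another "#". The source only says "a specific
# symbol (for example, #) and a character string".
TAG_PATTERN = re.compile(r"#([^\s#]+)")


def extract_tags(text):
    """Return the set of tag strings (without the leading '#') found in text."""
    return set(TAG_PATTERN.findall(text))


def filter_tagged(documents):
    """Keep only the documents that contain at least one tag (step S102)."""
    return [doc for doc in documents if extract_tags(doc)]


docs = ["rainy commute again #weather #tokyo", "no tags in this post"]
tagged = filter_tagged(docs)
```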

The collection unit 120 performs morphological analysis on all of the extracted documents and, as a result, acquires the word information and tag information included in each document. From the word information, the collection unit 120 counts the number of appearances of each word (word appearance frequency). This count may be taken per document, meaning that a word is counted once for a document even if it appears in that document several times. From the tag information, the collection unit 120 counts the number of appearances of each tag (tag appearance frequency). In addition, for each combination of a tag and a word, the collection unit 120 counts the number of documents in which that tag and that word appear together (tag word co-occurrence frequency). The collection unit 120 stores the word appearance frequency data, tag appearance frequency data, and tag word co-occurrence frequency data in the frequency DB 210 as the word appearance frequency table 211, the tag appearance frequency table 212, and the tag word co-occurrence frequency table 213, respectively. The collection unit 120 also stores tag usage history data, each entry of which combines a tag with the date and time at which a document including that tag was posted, in the frequency DB 210 as the tag usage history table 214 (S103).
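The counting in step S103 can be sketched as below. This is an illustrative sketch under stated assumptions: documents are assumed to be already tokenized (the morphological analysis itself is not shown), the in-memory `Counter` objects stand in for tables 211 to 214, and the function name `build_frequency_tables` is hypothetical. Counts are taken per document, which the text permits as one option.

```python
from collections import Counter
from datetime import datetime


def build_frequency_tables(documents):
    """documents: iterable of (words, tags, posted_at) tuples, already tokenized."""
    word_freq = Counter()   # word -> number of documents containing it (cf. table 211)
    tag_freq = Counter()    # tag -> number of documents containing it (cf. table 212)
    cooc_freq = Counter()   # (tag, word) -> documents containing both (cf. table 213)
    tag_history = []        # (tag, posted_at) pairs (cf. table 214)
    for words, tags, posted_at in documents:
        # Per-document counting: duplicates inside one document count once.
        unique_words, unique_tags = set(words), set(tags)
        word_freq.update(unique_words)
        tag_freq.update(unique_tags)
        cooc_freq.update((t, w) for t in unique_tags for w in unique_words)
        tag_history.extend((t, posted_at) for t in unique_tags)
    return word_freq, tag_freq, cooc_freq, tag_history


docs = [
    (["rain", "train", "rain"], ["weather"], datetime(2011, 2, 1, 8, 0)),
    (["rain", "umbrella"], ["weather", "tokyo"], datetime(2011, 2, 2, 9, 30)),
]
word_freq, tag_freq, cooc_freq, tag_history = build_frequency_tables(docs)
```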

<Calculation unit>
[Tag usage scale]
FIG. 13 is a diagram illustrating an example of the operation flow in which the calculation unit calculates the tag usage scale. The operation flow in FIG. 13 runs, for example, at predetermined intervals.

The calculation unit 130 acquires the most recent tag usage history data (for example, from A days ago to the present, where A is a predetermined value) from the tag usage history table 214 stored in the frequency DB 210 (S201). For each tag, the calculation unit 130 calculates the number of appearances of the tag in each predetermined time interval (S202). From these per-interval counts, the calculation unit 130 calculates a tag usage scale for each tag (S203).

Next, specific examples of calculating the tag usage scale are shown.

<< Example of tag usage scale calculation (1) >>
Let time Xi be the representative time of a predetermined time interval, and let Yi be the appearance frequency (number of appearances) of the tag in the predetermined time interval that includes time Xi. Assuming that Yi can be approximated by a linear expression of Xi, the slope a of that linear expression is obtained by the least squares method as follows.
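The equation itself does not survive in this text. Under the definitions given here (Xi, Yi, and the number of intervals n), the standard least-squares slope of a line fitted to the points (Xi, Yi) would read as follows; this is a reconstruction from the textbook formula, not a copy of the original equation.

```latex
a = \frac{n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}
         {n \sum_{i=1}^{n} X_i^{2} - \left( \sum_{i=1}^{n} X_i \right)^{2}}
```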

Here, n is the number of predetermined time intervals. That is, if data from A days ago to the present have been acquired, n is the value obtained by dividing A days by the predetermined time interval.

Using this slope a, the tag usage scale k can be obtained as follows.
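The equation is not reproduced in this text. Based on the verbal description given for this example (k is 1 for a positive slope and cos(tan⁻¹(a)) for a negative slope), it can plausibly be written as below. The case a = 0 is not stated explicitly; it is assigned to the first branch here as an assumption, which is harmless because both branches give 1 at a = 0.

```latex
k =
\begin{cases}
1 & (a \ge 0) \\
\cos\left( \tan^{-1}(a) \right) = \dfrac{1}{\sqrt{1 + a^{2}}} & (a < 0)
\end{cases}
```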

That is, when the slope a is positive, the tag usage scale k is 1; when the slope a is negative, k takes a value that depends on a. Therefore, when use of the tag is increasing over time, k takes its maximum value of 1. When use of the tag is decreasing over time, k is cos(tan⁻¹(a)), which becomes smaller as the decline becomes steeper.
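Example (1) can be sketched in code as follows. The function names `least_squares_slope` and `tag_usage_scale_trend` are hypothetical, the slope uses the standard least-squares formula, and treating a = 0 as "not decreasing" is an assumption (both branches give 1 there anyway).

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("need at least two distinct representative times")
    return (n * sum_xy - sum_x * sum_y) / denom


def tag_usage_scale_trend(xs, ys):
    """Tag usage scale k of example (1): 1 if usage is not falling, else cos(atan(a))."""
    a = least_squares_slope(xs, ys)
    return 1.0 if a >= 0 else math.cos(math.atan(a))


rising = tag_usage_scale_trend([0, 1, 2, 3], [1, 2, 4, 5])   # slope 1.4
falling = tag_usage_scale_trend([0, 1, 2, 3], [6, 4, 2, 0])  # slope -2
```

With these inputs the rising tag receives the maximum scale, while the falling tag receives cos(tan⁻¹(-2)), which equals 1/√5.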

<< Example of tag usage scale calculation (2) >>
Let time Xi be the representative time of a predetermined time interval, and let Yi be the appearance frequency of the tag in the predetermined time interval that includes time Xi. Let the current time be time p. The tag usage scale k can then be obtained as follows.
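The equation is not reproduced in this text. From the description of this example (each Yi is divided by the difference between the current time and Xi, and the results are summed), it can be reconstructed as below; the summation range over the n intervals is an assumption carried over from example (1).

```latex
k = \sum_{i=1}^{n} \frac{Y_i}{p - X_i}
```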

The appearance frequency Yi at time Xi is divided by the difference between the current time and time Xi, and the results are summed. The smaller the difference between the current time and time Xi, the larger the contribution. The value also grows with the usage frequency, so a tag that is used more often has a larger tag usage scale k.

<< Example of tag usage scale calculation (3) >>
Here, the tag usage scale k is obtained without using the number of appearances of the tag in each predetermined time interval. The tag usage scale is obtained as follows.
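The equation is not reproduced in this text. Since this example describes k as the recent appearance ratio of the tag, in terms of f(tag) and N, it can be reconstructed as:

```latex
k = \frac{f(\mathrm{tag})}{N}
```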

Here, f(tag) is the number of recent appearances of the tag "tag" (for example, from A days ago to the present), and N is the number of tagged documents that the collection unit 120 collected over the same recent period. This tag usage scale k corresponds to the recent appearance ratio of the tag "tag".
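Examples (2) and (3) can be sketched as follows. The function names are hypothetical, times are plain numbers in an arbitrary unit, and the handling of edge cases (a representative time that is not before the current time, or an empty collection) is an assumption that the source does not address.

```python
def tag_usage_scale_recency(xs, ys, now):
    """Example (2): sum of Yi / (p - Xi), where p is the current time `now`."""
    total = 0.0
    for x, y in zip(xs, ys):
        gap = now - x
        if gap <= 0:
            # Assumption: every representative time lies strictly in the past.
            raise ValueError("each representative time must precede the current time")
        total += y / gap
    return total


def tag_usage_scale_ratio(tag_count, tagged_doc_count):
    """Example (3): f(tag) / N, the recent appearance ratio of the tag."""
    if tagged_doc_count == 0:
        return 0.0  # assumption: no recent tagged documents means no usage
    return tag_count / tagged_doc_count


# The same six uses weigh more when they are recent than when they are old.
recent = tag_usage_scale_recency([1, 2, 3], [0, 0, 6], now=4)
stale = tag_usage_scale_recency([1, 2, 3], [6, 0, 0], now=4)
share = tag_usage_scale_ratio(25, 100)
```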

[Tag word co-occurrence scale]
FIG. 14 is a diagram illustrating an example of the operation flow in which the calculation unit calculates the tag word co-occurrence scale. The operation flow in FIG. 14 runs, for example, at predetermined intervals. The tag word co-occurrence scale is a measure of the degree to which a tag and a word co-occur.

The calculation unit 130 acquires tag word co-occurrence frequency data one record at a time from the tag word co-occurrence frequency table 213 stored in the frequency DB 210 (S301). The calculation unit 130 acquires, from the tag appearance frequency table 212 stored in the frequency DB 210, the tag appearance frequency data for the tag acquired in step S301. Likewise, it acquires, from the word appearance frequency table 211 stored in the frequency DB 210, the word appearance frequency data for the word acquired in step S301. Based on these data, the calculation unit 130 calculates the tag word co-occurrence scale for the combination of the tag and the word. The calculation unit 130 stores tag word co-occurrence scale data, which associates the tag-word combination with the calculated scale, in the tag word co-occurrence scale DB 230 as the tag word co-occurrence scale table 231 (S303). The calculation unit 130 checks whether all of the tag word co-occurrence frequency data stored in the frequency DB 210 have been acquired (S304). If some data have not yet been acquired (S304; NO), the calculation unit 130 returns the process to step S301. If all of the data have been acquired (S304; YES), the calculation unit 130 ends the process.

Specific examples of calculating the tag word co-occurrence scale are described below. The tag word co-occurrence scale may be normalized so that it lies between 0 and 1 inclusive.

<< Example of tag word co-occurrence scale calculation (1) >>
The co-occurrence frequency f(term, tag) can itself be used as the tag word co-occurrence scale m. Here, f(term, tag) is the number of documents in which the word "term" and the tag "tag" both appear. f(term, tag) is the observed value of the co-occurrence.

<< Example of tag word co-occurrence scale calculation (2) >>
The ratio of the observed value to the expected value can be used as the tag word co-occurrence scale m; that is, m can be expressed as follows. The larger this ratio, the more readily the tag and the word co-occur.
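The equation is not reproduced in this text. If the expected value is taken to be the number of co-occurring documents under independence of the word and the tag, namely f(term) f(tag) / N, the ratio would be as shown below. That choice of expected value is an assumption; it is the usual one and is consistent with the quantities f(term), f(tag), and N defined for this example.

```latex
m = \frac{f(\mathrm{term}, \mathrm{tag})}{f(\mathrm{term}) \, f(\mathrm{tag}) / N}
  = \frac{N \, f(\mathrm{term}, \mathrm{tag})}{f(\mathrm{term}) \, f(\mathrm{tag})}
```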

Here, f(term) is the number of times the word "term" appears (the number of documents), and f(tag) is the number of times the tag "tag" appears (the number of documents). N is the number of tagged documents most recently collected by the collection unit 120.
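The observed-to-expected ratio of example (2) can be sketched as follows. The function name `cooccurrence_ratio` is hypothetical, and the expected value is assumed to be the count predicted under independence, f(term) f(tag) / N.

```python
def cooccurrence_ratio(f_term_tag, f_term, f_tag, n_docs):
    """Observed co-occurrence count divided by the count expected under independence."""
    expected = f_term * f_tag / n_docs
    if expected == 0:
        return 0.0  # assumption: a word or tag that never appears has no association
    return f_term_tag / expected


# 1000 tagged documents; the word appears in 50, the tag in 40, both in 30.
strong = cooccurrence_ratio(30, 50, 40, 1000)
# Same marginals, but only the 2 co-occurrences expected by chance.
chance = cooccurrence_ratio(2, 50, 40, 1000)
```

A ratio near 1 indicates chance-level co-occurrence, and larger values indicate a stronger association between the word and the tag.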

<< Example of tag word co-occurrence scale calculation (3) >>
By applying the t-test as a test of independence, the tag word co-occurrence scale m can be obtained as follows.

<< Example of tag word co-occurrence scale calculation (4) >>
Taking as the expected value the case in which the word and the tag co-occur at random, the tag word co-occurrence scale m can be obtained as follows.

<< Example of tag word co-occurrence scale calculation (5) >>
Using the log-likelihood ratio (LLR), the tag word co-occurrence scale m can be obtained as follows.

Here, the terms of the expression are defined by a supplementary equation that does not survive in this text. The base of the logarithm is, in principle, e.

<< Example of tag word co-occurrence scale calculation (6) >>
Using PMI (pointwise mutual information), the tag word co-occurrence scale m can be obtained as follows.

This tag word co-occurrence scale m takes a very large value when a document in which word A appears is highly likely to carry tag T and a document carrying tag T is highly likely to contain word A.
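The exact expression used in example (6) is not reproduced in this text. As a hedged illustration only, the sketch below uses the standard definition of pointwise mutual information estimated from document counts, log(P(term, tag) / (P(term) P(tag))). The function name `pmi`, the natural logarithm, and the treatment of a zero co-occurrence count are assumptions.

```python
import math


def pmi(f_term_tag, f_term, f_tag, n_docs):
    """Standard PMI from document counts: log(N * f(term, tag) / (f(term) * f(tag)))."""
    if f_term_tag == 0:
        return float("-inf")  # assumption: never co-occurring pairs get the lowest score
    return math.log((f_term_tag * n_docs) / (f_term * f_tag))


independent = pmi(2, 50, 40, 1000)    # co-occurrence at chance level
associated = pmi(30, 50, 40, 1000)    # 15 times more often than chance
```

Under this definition the score is 0 at chance level and grows as the word and tag appear together more often than their individual frequencies would predict.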

<Recommendation unit>
FIGS. 15 and 16 are diagrams illustrating an example of the operation flow of the recommendation unit. "A" and "B" in FIG. 15 connect to "A" and "B" in FIG. 16, respectively. The operation flow of FIGS. 15 and 16 starts, for example, when a document is received from the user terminal 300.

The recommendation unit 140 receives from the user terminal 300 a document that is to be posted to a service such as a microblog (S401). The recommendation unit 140 performs morphological analysis on the received document, splits the document into words, and acquires the word information included in the document (S402). The recommendation unit 140 may acquire the word information by a method other than morphological analysis. The recommendation unit 140 determines whether the number of words included in the received document is greater than or equal to a threshold Wth (S403).

If the number of words included in the received document is greater than or equal to the threshold Wth (S403; YES), the recommendation unit 140 extracts, from the tag word co-occurrence scale DB 230, the tag word co-occurrence scale data for each word included in the document (S404). The tag word co-occurrence scale data for a word are the records that contain that word. Several records may be extracted for a single word. The recommendation unit 140 groups the extracted tag word co-occurrence scale data by tag. When several records have been extracted for one tag, the recommendation unit 140 integrates the tag word co-occurrence scales of those records and uses the result as the basic recommendation scale of the tag. Here, integration means, for example, multiplying the tag word co-occurrence scales together. Instead of multiplication, the sum of the tag word co-occurrence scales may be taken. Integration is not limited to multiplication or summation. When only one record has been extracted for a tag, the recommendation unit 140 uses the tag word co-occurrence scale of that record as the basic recommendation scale of the tag. In this way, the recommendation unit 140 calculates a basic recommendation scale for each tag (S405).
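Steps S404 and S405 can be sketched as follows. This is an illustration under stated assumptions: the dictionary `cooc_scale` stands in for the tag word co-occurrence scale table 231, and the function name `basic_recommendation_scales` and its `combine` parameter are hypothetical. Only the two integration methods named in the text (product and sum) are shown.

```python
import math
from collections import defaultdict


def basic_recommendation_scales(words, cooc_scale, combine="product"):
    """cooc_scale: dict mapping (tag, word) -> tag word co-occurrence scale."""
    wanted = set(words)
    per_tag = defaultdict(list)
    # S404: pick the records whose word appears in the document, grouped by tag.
    for (tag, word), m in cooc_scale.items():
        if word in wanted:
            per_tag[tag].append(m)
    # S405: integrate the scales of each tag into one basic recommendation scale.
    if combine == "product":
        return {tag: math.prod(ms) for tag, ms in per_tag.items()}
    if combine == "sum":
        return {tag: sum(ms) for tag, ms in per_tag.items()}
    raise ValueError("combine must be 'product' or 'sum'")


cooc_scale = {
    ("weather", "rain"): 0.5,
    ("weather", "umbrella"): 0.5,
    ("tokyo", "train"): 0.25,
    ("food", "ramen"): 0.75,
}
doc_words = ["rain", "umbrella", "train"]
by_product = basic_recommendation_scales(doc_words, cooc_scale)
by_sum = basic_recommendation_scales(doc_words, cooc_scale, combine="sum")
```

Note the design trade-off between the two options: with scales normalized to between 0 and 1, multiplication lowers the score of a tag matched by many words, whereas summation raises it.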

For the tags included in the extracted tag word co-occurrence scale data, the recommendation unit 140 extracts the tag usage scale data that include those tags from the tag usage scale table 221 of the tag usage scale DB 220. The recommendation unit 140 integrates the tag usage scale of each tag into the basic recommendation scale of that tag and uses the result as the recommendation score (S406). A tag with a higher recommendation score is a tag better suited to being added to the received document. Here, integration means, for example, multiplying the basic recommendation scale by the tag usage scale. Instead of multiplication, the basic recommendation scale and the tag usage scale may be added together, or the basic recommendation scale may be multiplied by a predetermined coefficient and then added to the tag usage scale. The recommendation unit 140 sorts the tags in descending order of the recommendation score obtained in step S406. The recommendation unit 140 extracts the top N tags, transmits those tags and their recommendation scores to the user terminal 300, and ends the process (S407). The basic recommendation scale may also be used directly as the recommendation score, without using the tag usage scale.
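Steps S406 and S407 can be sketched as follows. The function name `recommend_tags`, its parameters, and the default of 0 for a tag with no stored usage scale are assumptions introduced for illustration; the two integration options correspond to the product and the weighted sum mentioned in the text.

```python
def recommend_tags(basic_scale, usage_scale, top_n, combine="product", weight=1.0):
    """Return the top_n (tag, recommendation score) pairs in descending score order."""
    scores = {}
    for tag, base in basic_scale.items():
        # Assumption: a tag without a stored usage scale is treated as unused (0).
        k = usage_scale.get(tag, 0.0)
        if combine == "product":
            scores[tag] = base * k
        elif combine == "sum":
            # Weighted variant: coefficient on the basic recommendation scale.
            scores[tag] = weight * base + k
        else:
            raise ValueError("combine must be 'product' or 'sum'")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


basic = {"weather": 0.5, "tokyo": 0.25, "rainyseason": 0.5}
usage = {"weather": 1.0, "tokyo": 1.0, "rainyseason": 0.25}
top2 = recommend_tags(basic, usage, top_n=2)
```

In this example "rainyseason" matches the document as well as "weather" does, but its low usage scale pushes it below "tokyo".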

If the number of words included in the received document is less than the threshold Wth (S403; NO), the recommendation unit 140 judges that it lacks sufficient information to recommend an appropriate tag, transmits an error to the user terminal 300 (S408), and ends the process.

(Effects of the embodiment)
The collection unit 120 of the server device 100 collects, from the service unit 110, the documents posted to a service such as a microblog, the dates and times at which they were posted, and so on. The collection unit 120 extracts word information and tag information from the collected documents. From this information, the collection unit 120 obtains the number of appearances of each word, the number of appearances of each tag, and the tag word co-occurrence frequency. The collection unit 120 also generates a tag usage history from the tag information and the posting dates and times. The calculation unit 130 obtains the tag word co-occurrence scale from the number of appearances of each word, the number of appearances of each tag, and the tag word co-occurrence frequency. The calculation unit 130 also obtains the tag usage scale from the tag usage history. The recommendation unit 140 receives a document to be posted from the user terminal 300 and extracts the words included in it. Based on those words, the tag word co-occurrence scale, and the tag usage scale, the recommendation unit 140 calculates a recommendation score for each tag and thereby extracts the tags to recommend for the document to be posted. The recommendation unit 140 transmits the recommended tags to the user terminal 300. The server device 100 can thus extract, based on documents posted in the past, tags judged appropriate for a document that is about to be posted. Because the tags to add are presented, the user of the user terminal 300 can select an appropriate tag without knowing the full range of tags in use.

In addition, by using the tag usage scale, the server device 100 more readily extracts actively used tags as recommended tags, while tags that were used heavily in the past but have recently fallen out of use become less likely to be recommended. Tags that the server device 100 recommends and that users adopt are attached to documents posted to the microblog service or the like. By feeding the documents bearing those tags back into the usage scale and the co-occurrence scale, the server device 100 can recommend tags of higher quality. Even when several similar tags are in use, this feedback structure makes it easier for usage to converge on a single tag.

According to the server device 100, transmitting the recommended tags for a document to be posted to the user terminal 300 allows the user to easily select a tag appropriate for that document.

(Modification)
In the example described above, the tag word co-occurrence scale is obtained from the co-occurrence frequency of tags and words, and the recommendation score of each tag is calculated from it. In addition, a co-occurrence scale between tags and the additional information (context) attached to a posted document (a tag additional information co-occurrence scale) may be obtained and used to calculate the recommendation score. Examples of additional information (context) include the weather (temperature, atmospheric pressure, humidity, wind speed, precipitation, weather conditions, and so on), the time of day (morning, noon, night, hourly slots, and so on), the location (latitude, longitude, facility, road, railway line, and so on), and the type of user terminal.

When posting a document that includes a tag to a service such as a microblog, the user terminal 300 transmits the document together with additional information. The user terminal 300 acquires the additional information through, for example, functions built into the terminal. The user terminal 300 may also acquire the additional information by having the user enter it. On receiving the additional information together with a document from the user terminal 300, the service unit 110 accumulates the additional information along with the posted document, its posting date and time, and so on. The collection unit 120 collects the additional information from the service unit 110 together with the posted documents and their posting dates and times. The collection unit 120 generates additional information frequency data in the same way as the word frequency data and the tag frequency data. The collection unit 120 also obtains the co-occurrence frequency of the additional information (context) and the tags relating to the same document. The calculation unit 130 obtains the tag additional information co-occurrence scale in the same way as the tag word co-occurrence scale.

The user terminal 300 transmits the additional information to the recommendation unit 140 together with the document to be posted. Just as it extracts the tag word co-occurrence scale data for each word included in the document, the recommendation unit 140 extracts the tag additional information co-occurrence scale data for the additional information. The recommendation unit 140 groups the extracted tag word co-occurrence scale data and tag additional information co-occurrence scale data by tag. The recommendation unit 140 integrates the tag word co-occurrence scales and the tag additional information co-occurrence scales of these data and uses the result as the basic recommendation scale of the tag. Here, integration means, for example, multiplying together the tag word co-occurrence scales and the tag additional information co-occurrence scales. Instead of multiplication, their sum may be taken, and predetermined weights may be applied to each tag word co-occurrence scale and each tag additional information co-occurrence scale when summing. Integration is not limited to these methods.
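The weighted-sum integration mentioned for the modification can be sketched as follows. The function name `basic_scale_with_context`, the weight parameters, and the sample tags and values are all hypothetical; the inputs are assumed to be the scales already matched against the posted document and its additional information and grouped by tag.

```python
def basic_scale_with_context(word_scales, context_scales,
                             word_weight=1.0, context_weight=1.0):
    """Weighted sum of word-based and additional-information-based scales per tag.

    word_scales / context_scales: dict mapping tag -> list of matched scales.
    """
    tags = set(word_scales) | set(context_scales)
    return {
        tag: word_weight * sum(word_scales.get(tag, []))
        + context_weight * sum(context_scales.get(tag, []))
        for tag in tags
    }


# Scales matched from the document's words, and from context such as "rain" or "morning".
word_scales = {"weather": [0.5, 0.25], "tokyo": [0.25]}
context_scales = {"weather": [0.5], "umbrella": [0.75]}
combined = basic_scale_with_context(
    word_scales, context_scales, word_weight=1.0, context_weight=0.5
)
```

A tag such as "umbrella" in this example can be scored from context alone, even though no word in the document matched it.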

According to the server device 100, tags to add to a document can be recommended with the additional information taken into account. By using the additional information as well as the word information when calculating the recommendation score, the server device 100 can recommend more appropriate tags.

[Computer-readable recording medium]
A program that causes a computer or other machine or device (hereinafter, "computer or the like") to implement any of the functions described above can be recorded on a recording medium readable by the computer or the like. The function can then be provided by having the computer or the like read and execute the program on this recording medium.

Here, a recording medium readable by a computer or the like is a recording medium that accumulates information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action and from which the computer or the like can read that information. Such a medium may itself be provided with elements that constitute a computer, such as a CPU and memory, and the program may be executed by that CPU.

Among such recording media, those removable from the computer or the like include, for example, flexible disks, magneto-optical disks, CD-ROMs, CD-R/Ws, DVDs, DATs, 8 mm tapes, and memory cards.

Recording media fixed to the computer or the like include hard disk drives and ROMs.

10 Information processing system
100 Server device
110 Service unit
120 Collection unit
130 Calculation unit
140 Recommendation unit
200 Storage device
210 Frequency DB
211 Word appearance frequency table
212 Tag appearance frequency table
213 Tag word co-occurrence frequency table
214 Tag usage history table
220 Tag usage scale DB
221 Tag usage scale table
230 Tag word co-occurrence scale DB
231 Tag word co-occurrence scale table
300 User terminal
1000 Information processing device
1002 CPU
1004 Memory
1006 Storage unit
1008 Input unit
1010 Output unit
1012 Communication unit

Claims

A tag recommendation device comprising:
collection means for collecting documents each including a tag consisting of a specific symbol and a character string;
extraction means for extracting, from each collected document, the words and tags included in that document and the combinations of a tag and a word included in the same document;
calculation means for calculating, for each combination of a word and a tag, a tag word co-occurrence scale indicating the degree to which the tag and the word co-occur in the same document, based on the words and tags extracted by the extraction means, the combinations of a tag and a word included in the same document, and the number of documents; and
recommendation means for receiving a document, extracting the words included in the received document, and calculating a recommendation score of each tag for the received document based on the tag word co-occurrence scales for all the extracted words.
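
The collection, extraction, calculation, and recommendation steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The claim does not fix a formula for the tag word co-occurrence scale, so the sketch assumes pointwise mutual information over document counts, assumes `#` as the specific symbol, assumes whitespace tokenization, and assumes the recommendation score is the sum of the scales over the extracted words. The function names `extract`, `build_cooccurrence_scale`, and `recommend` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def extract(doc, tag_symbol="#"):
    """Split a document into its tags (tokens starting with tag_symbol) and words."""
    tokens = doc.split()
    tags = {t for t in tokens if t.startswith(tag_symbol) and len(t) > len(tag_symbol)}
    words = {t for t in tokens if t not in tags}
    return tags, words


def build_cooccurrence_scale(docs):
    """Return {(tag, word): scale}; the scale is PMI over document counts (an assumption)."""
    n_docs = len(docs)
    word_df, tag_df, pair_df = Counter(), Counter(), Counter()
    for doc in docs:
        tags, words = extract(doc)
        word_df.update(words)                 # documents containing each word
        tag_df.update(tags)                   # documents containing each tag
        pair_df.update(product(tags, words))  # documents containing both
    return {
        (tag, word): math.log(n * n_docs / (tag_df[tag] * word_df[word]))
        for (tag, word), n in pair_df.items()
    }


def recommend(doc, scale):
    """Score each tag by summing its scales over the words of doc (assumed aggregation)."""
    _, words = extract(doc)
    scores = defaultdict(float)
    for (tag, word), value in scale.items():
        if word in words:
            scores[tag] += value
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


corpus = [
    "ramen for lunch #food",
    "sushi for dinner #food",
    "train delay again #commute",
    "train was late #commute",
]
scale = build_cooccurrence_scale(corpus)
ranking = recommend("train delay today", scale)
```

In this sketch a tag that never co-occurred with any word of the received document gets no score at all, so it is simply absent from the ranking.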

The tag recommendation device according to claim 1, wherein
the collection means collects documents each including a tag consisting of a specific symbol and a character string, together with the date and time at which each document was posted,
the calculation means calculates, for each tag, a tag usage scale indicating the degree to which the tag is used, based on the tag and the date and time at which each document including the tag was posted, and
the recommendation means calculates the recommendation score of each tag for the received document based on the tag word co-occurrence scale and the tag usage scale.
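
As a rough illustration of a tag usage scale derived from posting dates and times, the following Python sketch weights each use of a tag by its recency. The exponential decay, the 24-hour half-life, and the multiplication of the two scales are all assumptions made for the example, since the claim does not specify them. The names `tag_usage_scale` and `combine` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def tag_usage_scale(postings, now, half_life=timedelta(hours=24)):
    """Sum a recency weight per tag.

    postings is an iterable of (tag, posted_at) pairs. Each use contributes
    0.5 ** (age / half_life), so recent uses count more (assumed decay model).
    """
    usage = defaultdict(float)
    for tag, posted_at in postings:
        age = max(now - posted_at, timedelta(0))  # clamp future timestamps to age 0
        usage[tag] += 0.5 ** (age / half_life)
    return dict(usage)


def combine(cooccurrence_scores, usage):
    """Multiply each tag's co-occurrence-based score by its usage scale (assumed rule)."""
    return {tag: score * usage.get(tag, 0.0) for tag, score in cooccurrence_scores.items()}


now = datetime(2011, 2, 3, 12, 0)
postings = [
    ("#a", now),
    ("#a", now - timedelta(hours=24)),
    ("#b", now - timedelta(hours=48)),
]
usage = tag_usage_scale(postings, now)
final_scores = combine({"#a": 2.0, "#b": 2.0, "#c": 1.0}, usage)
```

With equal co-occurrence-based scores, the recently and frequently used `#a` ends up ranked above the older `#b`, and a tag with no recorded use drops to zero under this particular combination rule.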

The tag recommendation device according to claim 1 or 2, wherein
the collection means collects documents each including a tag consisting of a specific symbol and a character string, together with additional information related to each document,
the extraction means extracts, from each collected document and its additional information, the tags included in the document, the additional information, and the combinations of a tag and additional information relating to the same document,
the calculation means calculates, for each combination of a tag and additional information, a tag additional information co-occurrence scale indicating the degree to which the tag and the additional information co-occur for the same document, based on the tags included in each collected document, the additional information, the combinations of a tag and additional information relating to the same document, and the number of documents, and
the recommendation means receives a document and additional information, and calculates the recommendation score of each tag for the received document based on the tag additional information co-occurrence scale for the received additional information.