JP5890385B2

JP5890385B2 - Data processing apparatus and data processing method

Info

Publication number: JP5890385B2
Application number: JP2013264058A
Authority: JP
Inventors: 慎一郎岡本
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2013-12-20
Filing date: 2013-12-20
Publication date: 2016-03-22
Anticipated expiration: 2033-12-20
Also published as: JP2015121858A

Description

The present invention relates to a data processing apparatus and a data processing method that quantify an object based on emotion parameters.

Conventionally, content distribution apparatuses that distribute content to users are known (see, for example, Patent Document 1).
In the apparatus described in Patent Document 1, search conditions such as an author name, a title, and a category are entered at a user terminal and transmitted to the content distribution apparatus. The content distribution apparatus has a database that associates each piece of content with its author name, title, category, and the like. It extracts content matching the entered author name, title, or category and transmits it to the user terminal for display.

Patent Document 1: JP 2007-65841 A

A content distribution apparatus such as that of Patent Document 1 can search for content by author name, title, category, and the like, but the target content cannot be found unless the author name, title, or category is known. For example, when a user wants to find content that evokes feelings close to those evoked by a given piece of content, such as content with a similar atmosphere, a database of this kind cannot perform the search.
There is therefore a demand for an apparatus capable of data processing centered on human emotion, for example when searching for content that evokes similar feelings.

An object of the present invention is to provide a data processing apparatus and a data processing method capable of performing data processing centered on human emotion.

The data processing apparatus of the present invention comprises dictionary acquisition means for acquiring an emotion classification dictionary in which a plurality of phrases are classified by emotion, and content quantification means for generating, using the emotion classification dictionary, quantified data that quantifies human emotion toward an object. The dictionary acquisition means includes dictionary generation means for generating the emotion classification dictionary based on text-based evaluation data for a plurality of objects. The dictionary generation means extracts from the evaluation data, by morphological analysis, a main element of the object and a plurality of first phrases that appear frequently with respect to the main element and lie within a predetermined range centered on the position at which the main element is described in the evaluation data. From the plurality of first phrases, it extracts as second phrases those whose degree of co-occurrence with the main element is equal to or greater than a predetermined value, and it classifies the second phrases by emotion.

In the present invention, quantified data for an object is generated using the emotion classification dictionary described above. Using quantified data centered on human emotion in this way makes it possible to perform various kinds of data processing along the axis of emotion. For example, even when an author name or content name is unknown, it is possible to perform search processing that finds content with the same mood as a given piece of content, or display processing that presents, as a review, how people feel about a given piece of content.

FIG. 1 is a block diagram showing a schematic configuration of the data processing system according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing a schematic configuration of the server device, which is the data processing apparatus of the first embodiment.
FIG. 3 is a flowchart showing the dictionary generation process for the emotion classification dictionary of the first embodiment.
FIG. 4 is a diagram showing an example of a review article containing evaluation data.
FIG. 5 is a diagram showing an example of a co-occurrence network.
FIG. 6 is a flowchart showing the content quantification process of the first embodiment.
FIG. 7 is a flowchart showing the content search process in the first embodiment.
FIG. 8 is a diagram comparing the emotion values of content retrieved by the search process with those of the query content.
FIG. 9 is a flowchart showing the content quantified data output process in the first embodiment.

[First embodiment]
Hereinafter, a first embodiment according to the present invention will be described with reference to the drawings.
[Overall configuration]
FIG. 1 is a block diagram showing a schematic configuration of the data processing system according to the first embodiment.
As shown in FIG. 1, the data processing system 1 of this embodiment includes a user terminal 10 and a server device 20 that functions as the data processing apparatus of the present invention. The user terminal 10 and the server device 20 are communicably connected via a network (for example, a WAN (Wide Area Network) such as the Internet).
In this data processing system 1, the server device 20 searches, based on a search query received from the user terminal 10, for content whose atmosphere is similar to that of a given piece of content (in this embodiment, content such as books is used as an example of the object), and returns the search result to the user terminal 10. In addition, when the server device 20 receives from the user terminal 10 a request for an introduction to a given piece of content, it returns review data that quantifies the emotions a plurality of users felt toward that content, and causes the user terminal 10 to output (display) it.
A specific configuration and method for providing these services are described below.

[Configuration of user terminal]
The user terminal 10 is a computer and, as shown in FIG. 1, includes a terminal communication unit 11, an input operation unit 12, a terminal storage unit 13, a terminal control unit 14, and a display 15.

The terminal communication unit 11 is connected to the network via, for example, a LAN, and communicates with other devices on the network.
The input operation unit 12 outputs operation signals generated by user operations to the terminal control unit 14. Examples of the input operation unit 12 include a touch panel provided integrally with the display 15 and input devices such as a keyboard and a mouse.

The terminal storage unit 13 is a data recording device such as a memory or a hard disk. It stores various programs and the like for controlling the user terminal 10.
The terminal control unit 14 includes an arithmetic circuit such as a CPU (Central Processing Unit) and a storage circuit such as a RAM (Random Access Memory), and controls each unit of the user terminal 10. The terminal control unit 14 loads a program (software), such as a given application, from among the programs stored in the terminal storage unit 13 or elsewhere into the RAM, and executes various processes in cooperation with the loaded program. This enables the terminal control unit 14 to communicate with the server device 20 via the network and, for example, to use the various services provided by the server device 20 and to view various data.
In response to the user's operation of the input operation unit 12, the terminal control unit 14 also generates a search request based on a search query, or a content introduction request, and transmits it to the server device 20. These search queries and introduction requests contain content specifying data that identifies the content, such as its title.
Furthermore, the terminal control unit 14 displays (outputs) the various data transmitted from the server device 20 on the display 15.

[Configuration of server device]
FIG. 2 is a block diagram showing the server device 20 of this embodiment.
The server device 20 of this embodiment is a computer and includes a communication unit 21, a storage unit 22 (storage means), a control unit 23, and the like.
The communication unit 21 is connected to the network via, for example, a LAN, and communicates with other devices on the network.

The storage unit 22 is a data recording device such as a memory or a hard disk, and constitutes the data storage means of the present invention.
The storage unit 22 stores various programs and various data for controlling the search device. It also records the emotion classification dictionary and the quantified data obtained by quantifying content based on the emotion classification dictionary.
Although this embodiment shows an example in which the storage unit 22 of the server device 20 functions as the data storage means, the data storage means may instead be provided in another device on the network, with the quantified data stored there. The same applies to the emotion classification dictionary, which may be recorded in a given device on the network.
The storage unit 22 also records a character dictionary, which associates each piece of content with the characters that appear in it.

[Emotion classification dictionary stored in the storage unit]
The emotion classification dictionary is a dictionary generated based on evaluation data for content. Table 1 shows an example of the emotion classification dictionary of this embodiment.

As shown in Table 1, the emotion classification dictionary associates emotion topics, which indicate human emotions, with phrases related to those topics. The phrases are preferably emotion expression phrases that express human emotion, such as adjectives and adjectival nouns, as well as nouns and verbs that indicate emotion.
The detailed method for generating the emotion classification dictionary is described later.
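As a rough illustration of the structure described for Table 1, the dictionary can be thought of as a mapping from emotion topics to phrases. The sketch below is an assumption about layout only. The "arousal" entry renders in English the topic 覚醒（ドキドキ） and the phrases ドキドキ and 迫力 used in the worked example later in the text; the "sadness" entry and the helper name `topics_for_phrase` are hypothetical.

```python
# Emotion classification dictionary: emotion topic -> related phrases.
EMOTION_DICTIONARY = {
    # English rendering of the topic and phrases from the text's worked example.
    "arousal (heart-pounding)": ["heart-pounding", "powerful"],
    # Hypothetical entry, included only to show a second topic.
    "sadness": ["sad", "tearful"],
}


def topics_for_phrase(phrase, dictionary=EMOTION_DICTIONARY):
    """Return every emotion topic whose phrase list contains `phrase`."""
    return [topic for topic, phrases in dictionary.items() if phrase in phrases]


print(topics_for_phrase("powerful"))
```

A phrase that belongs to no topic simply yields an empty list, which is convenient when scanning arbitrary review text.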

[Quantified data]
The quantified data is data in which the emotions people feel toward each piece of content are quantified using the emotion classification dictionary. The quantified data for each piece of content is recorded in a quantification database such as that shown in Table 2.

In Table 2, the content ID is data for identifying and specifying a piece of content. Although a content ID is used in this embodiment, other data that specifies the content, such as its title, may be used instead.
An emotion value is set for each emotion topic and is a value that quantifies the emotion of that topic. The detailed method for generating the quantified data is described later.
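One possible in-memory shape for such a quantification database is sketched below, purely as an assumption: one record per content ID, each holding one emotion value per emotion topic. The class names, field names, and the sample ID "C001" are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class QuantifiedData:
    """One record: a content ID plus an emotion value per emotion topic."""
    content_id: str
    emotion_values: dict = field(default_factory=dict)


class QuantificationDatabase:
    """Minimal store keyed by content ID (hypothetical layout)."""

    def __init__(self):
        self._records = {}

    def store(self, record):
        # A later record for the same content ID replaces the earlier one.
        self._records[record.content_id] = record

    def lookup(self, content_id):
        # Returns None when no quantified data exists for the content.
        return self._records.get(content_id)


db = QuantificationDatabase()
db.store(QuantifiedData("C001", {"arousal": 3, "sadness": 0}))
```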

The control unit 23 includes an arithmetic circuit such as a CPU and a storage circuit such as a RAM. It loads a program (software) stored in the storage unit 22 or elsewhere into the RAM and executes various processes in cooperation with the loaded program. By executing these processes, the control unit 23 functions, as shown in FIG. 2, as evaluation data acquisition means 231, data analysis means 232, dictionary acquisition means 233, quantification means 234, request acquisition means 235, search means 236, quantified data output means 237, and the like.

The evaluation data acquisition means 231 acquires evaluation data. Specifically, it acquires the data via the network from other devices, such as a device that provides an SNS (Social Networking Service), a device that publishes a content review site or blog site, or a device such as Twitter (registered trademark) that publishes users' posts on the Internet in real time. In other words, it acquires text-based data viewable on the network, such as personal blogs, review articles, and user posts (tweets).
The data analysis means 232 performs morphological analysis on the acquired evaluation data and extracts the phrases (first phrases) contained in it.

The dictionary acquisition means 233 generates an emotion classification dictionary such as that shown in Table 1 using the phrases (first phrases) extracted from the evaluation data.
Specifically, the dictionary acquisition means 233 includes co-occurrence determination means 233A and classification means 233B, which function as the dictionary generation means of the present invention. The co-occurrence determination means 233A determines the degree of co-occurrence between the extracted phrases (first phrases) and extracts co-occurring phrases (second phrases) based on that degree. The classification means 233B classifies (clusters) the second phrases by emotion topic.

The quantification means 234 functions as the content quantification means of the present invention. It acquires (reads) the emotion classification dictionary from the storage unit 22, quantifies the emotions and atmosphere people perceive in the content, and generates quantified data.
The request acquisition means 235 functions as query acquisition means and acquires the various requests transmitted from the user terminal 10. This embodiment uses as examples a search request containing a search query and an output request that asks for the quantified data of a given piece of content. Data that specifies a given piece of content, such as a content name, is specified as the search query.

On receiving a search request, the search means 236 searches for content whose quantified data shows a tendency similar to that of the content specified in the search query.
On receiving an output request, the quantified data output means 237 reads the quantified data of the specified content and transmits it in a form that the user terminal 10 can display.
The detailed processing of each functional component is described later.

[Data processing method]
Next, the data processing method in the data processing system 1 described above will be described with reference to the drawings.

(Dictionary generation process)
FIG. 3 is a flowchart showing the process by which the server device 20 generates the emotion classification dictionary.
To generate quantified data that quantifies human emotion toward content, the server device 20 first generates the emotion classification dictionary.

To do so, the evaluation data acquisition means 231 of the server device 20 acquires evaluation data for a plurality of pieces of content from the network (step S11).
In step S11, it is preferable to acquire evaluation data for a plurality of pieces of content with differing characteristics. For example, when the content is movies, the targets are a plurality of movies that differ in release date, director, and category (for example, romance, historical, science fiction, and action movies). The target content may be set as appropriate by the administrator of the server device 20. Alternatively, a content introduction site or the like may be consulted, and each time data on new content is published, the name of that content may be acquired and added as a target for evaluation data acquisition.

To acquire the evaluation data, the evaluation data acquisition means 231 uses the content name of each target piece of content as a query and searches the Internet for Web data containing evaluation data, such as review articles, blogs, and user posts. FIG. 4 is a diagram showing an example of a review article containing evaluation data.
As shown in FIG. 4, the evaluation data acquisition means 231 analyzes the retrieved Web data 40, such as a review article, and acquires the evaluation data 41 written as text. A known analysis method can be used to obtain the evaluation data 41 from the Web data 40. For example, a markup language such as HTML (HyperText Markup Language) is parsed and the text portions are extracted.
The evaluation data acquisition means 231 preferably acquires a plurality of evaluation data for each piece of content.
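The text leaves the HTML analysis method open ("a known analysis method"). As one possible illustration, the sketch below uses Python's standard `html.parser` module to collect the visible text of a page while skipping script and style contents. The class name `TextExtractor`, the function `extract_text`, and the sample markup are assumptions made for this example, not part of the patent.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text portions of an HTML document, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = ("<html><head><style>p{}</style></head><body>"
        "<p>Great film.</p><script>x=1</script><p>Loved it.</p></body></html>")
print(extract_text(page))
```

A production crawler would also need to separate the review body from navigation and advertising text, which this sketch does not attempt.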

Next, the data analysis means 232 performs morphological analysis on the evaluation data acquired in step S11 and extracts phrases (step S12).
The co-occurrence determination means 233A of the dictionary acquisition means 233 then reads the character dictionary from the storage unit 22 and, from the first phrases extracted in step S12, extracts emotion expression phrases that appear frequently in connection with the characters in the content (step S13). These characters are the main elements in the present invention. Emotion expression phrases that appear frequently with respect to a main element indicate strong feelings that the author of the evaluation data has toward the character.
Specifically, in step S13, the co-occurrence determination means 233A reads the characters corresponding to the content from the character dictionary stored in the storage unit 22. It then identifies the positions in the evaluation data at which a character is mentioned and extracts the emotion expression phrases that appear near those positions. For example, it identifies a sentence in which a character is mentioned and extracts emotion expression phrases from that sentence and from the sentences before and after it. Examples of emotion expression phrases include adjectives, adjectival nouns, and nouns and verbs that indicate human emotion. Adjectives and adjectival nouns are particularly preferable because they are easy to identify.
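The sentence-window extraction of step S13 can be sketched roughly as follows. This is a minimal sketch that assumes morphological analysis has already turned each sentence into a list of (surface form, part-of-speech) pairs. The part-of-speech labels, the function name `phrases_near_characters`, and the English sample tokens are all hypothetical.

```python
def phrases_near_characters(sentences, characters, window=1,
                            emotion_pos=("adjective", "adjectival_noun")):
    """Pair each mentioned character with nearby emotion expression phrases.

    sentences  : list of sentences, each a list of (surface, pos) tuples
    characters : set of character names from the character dictionary
    window     : how many sentences before and after the mention to scan
    """
    pairs = []
    for i, sentence in enumerate(sentences):
        # Characters mentioned in this sentence (sorted for stable output).
        mentioned = [c for c in sorted(characters)
                     if any(surface == c for surface, _ in sentence)]
        if not mentioned:
            continue
        start = max(0, i - window)
        end = min(len(sentences), i + window + 1)
        for j in range(start, end):
            for surface, pos in sentences[j]:
                if pos in emotion_pos:
                    pairs.extend((c, surface) for c in mentioned)
    return pairs


review = [
    [("the", "det"), ("plot", "noun"), ("is", "verb"), ("slow", "adjective")],
    [("Alice", "noun"), ("is", "verb"), ("brave", "adjective")],
    [("a", "det"), ("thrilling", "adjective"), ("ending", "noun")],
    [("nothing", "noun"), ("else", "adverb")],
    [("dull", "adjective"), ("credits", "noun")],
]
print(phrases_near_characters(review, {"Alice"}))
```

With a window of one sentence, "dull" is ignored because it lies two sentences after the last sentence in the window around the mention of the character.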

The co-occurrence determination means 233A then calculates the degree of co-occurrence between each extracted character and the emotion expression phrases for that character (step S14).
In step S14, the co-occurrence determination means 233A uses, for example, the number of times a phrase co-occurs with the character (the number of appearances) as the degree of co-occurrence of that phrase with the character. Alternatively, the Jaccard coefficient of the character and the phrase may be used as the degree of co-occurrence.
In step S14, the co-occurrence determination means 233A may also construct a co-occurrence network 50 such as that shown in FIG. 5. FIG. 5 is a diagram showing an example of the co-occurrence network 50.
In FIG. 5, reference numeral 51 denotes a character and 52 denotes an emotion expression phrase (first phrase) extracted for the character. The lines connecting them indicate co-occurrence relationships, and the thicker the line, the higher the degree of co-occurrence.
In this embodiment, determining the degree of co-occurrence from a plurality of evaluation data yields a more accurate degree of co-occurrence.

The co-occurrence determination means 233A then extracts, as second phrases, those emotion expression phrases (first phrases) for the character whose degree of co-occurrence is equal to or greater than a predetermined value (step S15).
When the co-occurrence network 50 shown in FIG. 5 is constructed, phrases whose line width is equal to or greater than a predetermined value are extracted as second phrases.
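Steps S14 and S15 can be sketched as follows. This is a minimal sketch that assumes each "context" is the set of terms found in one evaluation data item or one sentence window, a unit the text does not fix. The function names and sample data are hypothetical. Both measures named in the text are shown: the raw co-occurrence count and the Jaccard coefficient.

```python
def cooccurrence_count(contexts, character, phrase):
    """Number of contexts in which the character and the phrase both appear."""
    return sum(1 for ctx in contexts if character in ctx and phrase in ctx)


def jaccard_coefficient(contexts, character, phrase):
    """|A & B| / |A | B|, where A and B are the sets of contexts that
    contain the character and the phrase respectively."""
    with_character = {i for i, ctx in enumerate(contexts) if character in ctx}
    with_phrase = {i for i, ctx in enumerate(contexts) if phrase in ctx}
    union = with_character | with_phrase
    if not union:
        return 0.0
    return len(with_character & with_phrase) / len(union)


def select_second_phrases(contexts, character, first_phrases, threshold,
                          degree=cooccurrence_count):
    """Keep the first phrases whose co-occurrence degree meets the threshold."""
    return [p for p in first_phrases
            if degree(contexts, character, p) >= threshold]


contexts = [
    {"Alice", "brave"},
    {"Alice", "brave", "slow"},
    {"Alice"},
    {"brave"},
]
print(select_second_phrases(contexts, "Alice", ["brave", "slow"], threshold=2))
```

The threshold has to match the chosen measure: a count threshold is an integer number of co-occurrences, whereas a Jaccard threshold lies between 0 and 1.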

Next, the classification means 233B of the dictionary acquisition means 233 classifies (clusters) the second phrases extracted in step S15 by emotion topic (cluster) and creates an emotion classification dictionary that associates each emotion topic with its corresponding phrases (step S16).
The classification means 233B uses LDA (Latent Dirichlet Allocation) to classify the extracted second phrases. Based on the extracted second phrases, an optimal number of emotion topics is calculated, along with the similarity between each emotion topic and each second phrase (the probability that the second phrase is related to the emotion topic). An emotion classification dictionary such as that shown in Table 1 can therefore be created by associating each emotion topic with the second phrases whose similarity is equal to or greater than a predetermined value.
Although this embodiment shows an example in which LDA automatically generates the emotion topics and their number, the invention is not limited to this. The emotion topics, or the number of emotion topics, may be set in advance.
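The final part of step S16, associating each emotion topic with the phrases whose similarity meets a predetermined value, can be sketched as below. The sketch does not implement LDA itself. It assumes the topic-to-phrase probabilities have already been produced by some LDA implementation, and the function name, the topic labels, and the probability values are all hypothetical.

```python
def build_emotion_dictionary(topic_phrase_prob, min_similarity):
    """Associate each emotion topic with the phrases at or above the threshold.

    topic_phrase_prob : {topic: {phrase: probability}}, assumed to come from
                        a previously fitted LDA model
    min_similarity    : predetermined similarity (probability) threshold
    """
    dictionary = {}
    for topic, probabilities in topic_phrase_prob.items():
        selected = sorted(phrase for phrase, prob in probabilities.items()
                          if prob >= min_similarity)
        if selected:  # topics with no qualifying phrase are left out
            dictionary[topic] = selected
    return dictionary


lda_output = {
    "topic_0": {"thrilling": 0.60, "powerful": 0.30, "sad": 0.10},
    "topic_1": {"sad": 0.70, "tearful": 0.25, "thrilling": 0.05},
}
print(build_emotion_dictionary(lda_output, min_similarity=0.25))
```

LDA yields unlabeled topics, so human-readable topic names would have to be assigned separately; the text does not say how.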

The dictionary generation process described above may be run at a time designated by the administrator of the server device 20, or it may be run automatically and periodically, for example once a month, so that the emotion classification dictionary is updated as needed.
Alternatively, given Web data on the Internet (for example, a content review site) may be monitored, and the emotion classification dictionary may be regenerated and updated each time data on new content is published.

(Content quantification process)
Next, the content quantification process will be described with reference to the drawings.
FIG. 6 is a flowchart showing the content quantification process in the server device 20.
The server device 20 uses the emotion classification dictionary generated as described above to generate quantified data that quantifies human emotion toward content.

To do so, the quantification means 234 of the server device 20 specifies the content for which quantified data is to be generated (step S21).
In step S21, the content may be specified, for example, by monitoring given Web data on the Internet (for example, a content review site) and acquiring the content name each time data on new content is published. Alternatively, the update status of the Web data may be checked periodically, and the content name may be acquired each time an update publishes data on new content. The content may also be specified by the server administrator entering data that specifies it, such as a content name, at a time designated by the administrator of the server device 20.

The evaluation data acquisition means 231 then acquires evaluation data for the content specified in step S21 from the network (step S22).
In step S22, the evaluation data acquisition means 231 uses the content name of the specified content as a query and searches the Internet for Web data containing evaluation data (for example, review articles, blogs, and user posts).
The evaluation data acquisition means 231 then analyzes these review articles, blogs, user posts, and the like, and acquires the evaluation data as text data. As in step S12, the evaluation data acquisition means 231 preferably acquires a plurality of evaluation data for the content.

Next, the data analysis means 232 performs morphological analysis on the evaluation data acquired in step S22 and extracts the words contained in it (step S23).
Thereafter, the quantification means 234 reads the emotion classification dictionary that was generated by the dictionary generation process and stored in the storage unit 22, and acquires an emotion value for each emotion topic of the content (step S24).
Specifically, the quantification means 234 takes, as the emotion value, the number of words extracted in step S23 that belong to each emotion topic of the emotion classification dictionary. That is, the emotion value is the total count of words belonging to the emotion topic, with repeated occurrences counted separately. For example, if the words extracted in step S23 are "pounding", "pounding", and "powerful", and the emotion classification dictionary of Table 1 described above is used, the emotion topic "awakening (pounding)" contains two occurrences of "pounding" and one occurrence of "powerful", so the quantification means 234 sets the emotion value for the emotion topic "awakening (pounding)" to 3.
When the emotion value for each emotion topic has been set in step S24, the quantification means 234 generates quantification data that associates these emotion values with content specifying data identifying the content (for example, a content ID or content name), and stores it in the storage unit 22 (step S25).
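The counting in steps S24 and S25 can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary contents (including the "sadness" topic), the function names, and the record layout are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical stand-in for the emotion classification dictionary of Table 1:
# each emotion topic maps to the set of words classified under it.
EMOTION_DICTIONARY = {
    "awakening (pounding)": {"pounding", "powerful"},
    "sadness": {"tearful", "lonely"},  # assumed topic, for illustration only
}


def emotion_values(extracted_words, dictionary):
    """Step S24: per topic, count the extracted words that belong to it.

    Repeated occurrences of the same word are counted separately.
    """
    word_counts = Counter(extracted_words)
    return {
        topic: sum(word_counts[word] for word in words)
        for topic, words in dictionary.items()
    }


def build_quantification_data(content_id, extracted_words, dictionary):
    """Step S25: associate the emotion values with content specifying data."""
    return {
        "content_id": content_id,
        "emotion_values": emotion_values(extracted_words, dictionary),
    }


record = build_quantification_data(
    "C001", ["pounding", "pounding", "powerful"], EMOTION_DICTIONARY
)
print(record)
```

With the three example words from the text, the topic "awakening (pounding)" receives the value 3 and the assumed "sadness" topic receives 0.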

(Content search process)
Next, a content search process will be described with reference to the drawings as an example of a service that uses the quantification data described above.
FIG. 7 is a flowchart showing the content search process in the present embodiment.
When the user operates the input operation unit 12 of the user terminal 10 and inputs a search query, the terminal control unit 14 of the user terminal 10 generates a search request containing the search query (step S31). Content specifying data identifying a content (for example, a content name or content ID) is specified as the search query. The user terminal 10 then transmits the search request together with a user ID identifying the user terminal 10 (step S32).

When the request acquisition means 235 of the server device 20 receives the search request transmitted from the user terminal 10 (step S41), the server device 20 causes the search means 236 to carry out the content search process.
In the content search process, the search means 236 first determines whether quantification data exists for the content specified by the search query in the search request (hereinafter sometimes referred to as the query content) (step S42).
If the determination in step S42 is "No", the search means 236 outputs to the user terminal 10 a non-corresponding output instruction that causes the display 15 to indicate that the query content was not found (step S43).

If the determination in step S42 is "Yes", the search means 236 acquires the quantification data of the query content (step S44).
The search means 236 then searches the quantification database for quantification data whose tendency is similar to that of the quantification data acquired in step S44, and acquires the content name and quantification data of each matching content (step S45).
Thereafter, the search means 236 returns the content names and quantification data acquired in step S45 to the user terminal 10 (step S46).
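Steps S42 to S46 can be sketched as follows. The description does not state how "similar tendency" is measured, so cosine similarity over the emotion values is an assumption chosen here because it compares the shape of the profile rather than its magnitude. The database layout and function names are likewise hypothetical.

```python
import math


def cosine_similarity(a, b, topics):
    """Cosine similarity of two emotion-value mappings over a shared topic list."""
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in topics)
    norm_a = math.sqrt(sum(a.get(t, 0) ** 2 for t in topics))
    norm_b = math.sqrt(sum(b.get(t, 0) ** 2 for t in topics))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_similar(query_id, database, top_k=3):
    """Return up to top_k (content_id, emotion_values) pairs most similar to the query.

    database maps a content ID to its emotion values (assumed layout).
    Returns None when the query content has no quantification data,
    which corresponds to the "No" branch of step S42.
    """
    if query_id not in database:
        return None
    query = database[query_id]  # step S44
    topics = sorted({t for values in database.values() for t in values})
    scored = [
        (cosine_similarity(query, values, topics), content_id)
        for content_id, values in database.items()
        if content_id != query_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # step S45
    return [(content_id, database[content_id]) for _, content_id in scored[:top_k]]


database = {
    "A": {"scary": 8, "funny": 1},
    "B": {"scary": 7, "funny": 2},
    "C": {"scary": 1, "funny": 9},
}
print(search_similar("A", database, top_k=1))
```

Other measures, such as Euclidean distance on normalized values, would fit the same structure; only `cosine_similarity` would need to change.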

The terminal control unit 14 of the user terminal 10 determines whether it has received a non-corresponding output instruction or a search result from the server device 20 (step S33). If a non-corresponding output instruction was received in step S33, the terminal control unit 14 causes the display 15 to indicate that no corresponding content was found (step S34).
If a search result was received in step S33, the retrieved content name and the quantification data of that content are displayed on the display 15 as the search result (step S35).
FIG. 8 shows an example comparing the quantification data of the query content with that of a retrieved content. In FIG. 8, the solid line represents the retrieved content and the broken line represents the query content. As FIG. 8 shows, the search process described above retrieves content whose emotion values follow a tendency similar to those of the query content.

(Content quantification data output process)
Next, a content quantification data output process will be described with reference to the drawings as another example of a service that uses the quantification data described above.
FIG. 9 is a flowchart showing the content quantification data output process in the present embodiment.
When the user operates the input operation unit 12 of the user terminal 10 and inputs an output request for the quantification data of a given content, the terminal control unit 14 transmits the output request and the user ID to the server device 20 (step S51).

When the request acquisition means 235 of the server device 20 receives the output request transmitted from the user terminal 10 (step S61), the server device 20 causes the quantification data output means 237 to carry out the content quantification data output process.
In the content quantification data output process, the quantification data output means 237 first determines whether quantification data exists for the content specified in the output request (step S62).
If the determination in step S62 is "No", a non-corresponding output instruction is output to the user terminal 10, as in step S43.
If the determination in step S62 is "Yes", the quantification data output means 237 acquires the quantification data of the specified content (step S63) and returns it to the user terminal 10 (step S64).

The terminal control unit 14 of the user terminal 10 determines whether it has received a non-corresponding output instruction or quantification data from the server device 20 (step S52). If a non-corresponding output instruction was received in step S52, the terminal control unit 14 causes the display 15 to indicate that no corresponding content was found, as in step S34.
If quantification data for the specified content was received in step S52, the quantification data is displayed on the display 15 (step S53).

[Operational effects of the first embodiment]
The quantification means 234 of the server device 20 according to the present embodiment acquires from the storage unit 22 an emotion classification dictionary in which a plurality of words are classified by emotion topic, and uses this dictionary to generate quantification data that quantifies people's emotions toward a content.
That is, because the emotion classification dictionary associates emotion topics with words that express human emotions, using such a dictionary makes it possible to analyze and quantify what impression people have of a content. Such quantification data enables various kinds of emotion-centered information processing, such as the search process and the content quantification data output process described above.

In the present embodiment, the dictionary acquisition means 233 generates the emotion classification dictionary based on evaluation data for contents that a plurality of users (evaluators) have published on the network (the Internet). That is, the dictionary acquisition means 233 generates the emotion classification dictionary based on the emotions that individual users felt toward the contents. Generating the dictionary around human emotions in this way yields quantification data that reflects human emotions more closely than, for example, classifying words mechanically, and improves the accuracy of processes such as the search process.

In the present embodiment, the data analysis means 232 extracts a plurality of first words from the evaluation data by morphological analysis, and the co-occurrence determination means 233A of the dictionary acquisition means 233 extracts, as second words, those first words whose degree of co-occurrence is equal to or greater than a predetermined value.
In evaluation data, words with a high degree of co-occurrence are often words that express a strong emotion of the evaluator toward the content and that are linked by the same emotion. Extracting words with a high degree of co-occurrence therefore yields a highly accurate emotion classification dictionary in which appropriate words are associated with each emotion topic.
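The threshold-based selection of second words can be sketched as follows. This is a simplified illustration under stated assumptions: the degree of co-occurrence is taken to be the number of evaluation texts in which a word appears together with the main element, and the function name, parameter names, and sample data are hypothetical.

```python
from collections import Counter


def extract_second_words(documents, main_element, min_cooccurrence=2):
    """Select the first words that co-occur with the main element often enough.

    documents: one list of first words per evaluation text (the output of
    morphological analysis). A word's degree of co-occurrence is assumed to
    be the number of texts that contain both the word and the main element.
    """
    cooccurrence = Counter()
    for words in documents:
        unique_words = set(words)
        if main_element in unique_words:
            cooccurrence.update(unique_words - {main_element})
    return {word for word, n in cooccurrence.items() if n >= min_cooccurrence}


documents = [
    ["hero", "brave", "sad"],
    ["hero", "brave"],
    ["villain", "sad"],
]
print(extract_second_words(documents, "hero"))
```

In this sample, "brave" appears with "hero" in two texts and is kept, while "sad" appears with "hero" only once and is dropped at the assumed threshold of 2.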

In the present embodiment, the co-occurrence determination means 233A extracts characters, which are main elements in the evaluation data, together with the words used about those characters. Such words express emotions that the evaluator who wrote the evaluation data holds strongly toward a character, and are likely to strongly reflect the evaluator's image of the content. Creating the emotion classification dictionary from these words therefore yields a highly accurate dictionary in which appropriate words are associated with each emotion topic.

In doing so, the co-occurrence determination means 233A identifies the sentences in the evaluation data that mention a character, and extracts emotion expression words from those sentences and from the sentences immediately before and after them. In other words, it extracts words that appear within a predetermined range centered on the character. This allows words expressing emotions toward the character to be extracted more accurately.
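The range-limited extraction can be sketched as follows. The sketch assumes the evaluation text has already been split into tokenized sentences and that a set of candidate emotion expression words is available; the window of one sentence on each side, the function name, and the sample data are illustrative assumptions.

```python
def words_near_character(sentences, character, emotion_words, window=1):
    """Collect emotion expression words from sentences near a character mention.

    sentences: list of token lists, one per sentence.
    window: number of sentences before and after each mention to include.
    """
    selected = set()
    for index, tokens in enumerate(sentences):
        if character in tokens:
            start = max(0, index - window)
            stop = min(len(sentences), index + window + 1)
            selected.update(range(start, stop))
    # Visit each selected sentence once, so overlapping windows do not double count.
    return [
        token
        for index in sorted(selected)
        for token in sentences[index]
        if token in emotion_words
    ]


sentences = [
    ["the", "town", "is", "gloomy"],
    ["alice", "is", "brave"],
    ["her", "fight", "is", "thrilling"],
    ["the", "ending", "is", "dull"],
]
emotion_words = {"gloomy", "brave", "thrilling", "dull"}
print(words_near_character(sentences, "alice", emotion_words))
```

Here "dull" is excluded because its sentence lies outside the one-sentence window around the mention of the hypothetical character "alice".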

In the present embodiment, emotion expression words such as adjectives and adjectival nouns (na-adjectives) are extracted from the evaluation data.
This allows the most suitable words to be associated with each emotion topic in the emotion classification dictionary.

In the present embodiment, the classification means 233B uses LDA (latent Dirichlet allocation) to classify the extracted words by emotion topic. With LDA, the second words extracted by the co-occurrence determination means 233A can be classified automatically into suitable emotion topics with a suitable number of clusters.
This simplifies and speeds up the dictionary generation process and also improves the accuracy of the emotion classification dictionary.

In the present embodiment, the quantification means 234 quantifies a content based on the evaluation data for that content acquired by the evaluation data acquisition means 231 and on the emotion classification dictionary.
That is, the quantification data for a content is generated from the evaluation data of a plurality of users (evaluators), and therefore from the emotions that individual users felt toward the content. Compared with generating quantification data from the substance of the content itself (for example, a synopsis), this yields appropriate, emotion-centered quantification data grounded in people's emotional evaluations.

In the present embodiment, the quantification data generated by the quantification means 234 is accumulated in the storage unit 22. Various processes can therefore be carried out easily by reading out this quantification data.

In the present embodiment, when the request acquisition means 235 receives a search request containing a search query that specifies a content (the query content), the search means 236 searches the quantification database for contents whose quantification data is similar to that of the query content and returns them to the user terminal 10.
Services are already known that, when a user looks for content with an atmosphere similar to that of a query content (content that leaves people with the same impressions and emotions), search for and display content by the same author, or search for and display what other content has been bought by people who purchased the query content. With such search services, however, the retrieved content does not necessarily have the same atmosphere as the query content. In the present embodiment, by contrast, the search uses quantification data that quantifies people's emotions toward each content based on the emotion classification dictionary, so content with an atmosphere similar to that of the query content (content that evokes the same emotions as the query content) can be retrieved effectively.

In the present embodiment, when the request acquisition means 235 acquires an output request for a content, the quantification data output means 237 acquires the quantification data for that content and returns it to the user terminal 10. The user terminal 10 then displays the quantification data of the emotions toward the content. Displaying the emotions that a plurality of evaluators felt toward the content in this way, for example as a radar chart like the one shown in FIG. 8, lets the user readily understand how the content has been evaluated.

[Second Embodiment]
In the first embodiment described above, the quantification means 234 generates, for one content, quantification data based on evaluation data from a plurality of evaluators. Because different people respond to a content differently, a group of evaluators who find a content "scary" (an emotion group) may coexist with an emotion group of evaluators who find it "interesting". When a plurality of emotion groups coexist in this way, quantifying the emotion topics may obscure the characteristics of the content, or may produce quantification data showing characteristics that differ from those of each emotion group.

In the second embodiment, to solve this problem, the quantification means 234 generates, for each content, quantification data in which the co-occurrence relations between emotion topics are associated with the emotion values. Specifically, the quantification means 234 obtains the co-occurrence relations between emotion topics using a technique such as cross tabulation, correlation analysis, or multidimensional analysis.

By using quantification data that records the co-occurrence relations between emotion topics in addition to the emotion value of each emotion topic, the characteristics of a content become easier to see even when the content has a plurality of emotion groups (that is, even when different people respond to the content differently), and the characteristics of each emotion group also become easier to grasp.
For example, when the emotion topic "scary" and the emotion topic "interesting" are in a co-occurrence relation for a content, looking at the quantification data of that content reveals that there are people who find it "scary" and people who find it "interesting". This case can be distinguished from the case in which people feel both "scary" and "interesting" about the content at the same time, so quantification data can be provided that makes the characteristics of each content easier to understand.
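One way to picture the cross tabulation mentioned for the second embodiment is sketched below. It is only an assumption about how the co-occurrence relations might be tabulated: each evaluator's text is reduced to the set of emotion topics detected in it, and the sketch counts how many evaluators express each pair of topics together. The function name and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def topic_cooccurrence(per_evaluator_topics):
    """Count, for each pair of emotion topics, the evaluators expressing both.

    per_evaluator_topics: one set of emotion topics per evaluator's text.
    Pairs are stored in alphabetical order so each pair has a single key.
    """
    pair_counts = Counter()
    for topics in per_evaluator_topics:
        for first, second in combinations(sorted(topics), 2):
            pair_counts[(first, second)] += 1
    return pair_counts


reviews = [
    {"scary"},
    {"scary"},
    {"interesting"},
    {"scary", "interesting"},
]
pairs = topic_cooccurrence(reviews)
print(pairs[("interesting", "scary")])
```

In this sample both topics have nonzero totals, yet only one of four evaluators expresses both, which suggests largely separate emotion groups rather than a single group feeling both emotions at once.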

[Modification]
The present invention is not limited to the embodiments described above, and includes the modifications shown below within a range in which the object of the invention can be achieved.
[Modification 1]
In the above embodiment, an example was shown in which the server device 20 returns a non-corresponding output instruction to the user terminal 10 when the determination in step S42 is "No", but the invention is not limited to this. For example, the content quantification process described above may be carried out for the content specified by the search query. In this case, the content specified by the search query is treated as the content identified in step S21, evaluation data is acquired from the network, and the content is quantified based on that evaluation data.

[Modification 2]
In the above embodiment, an example was shown in which the quantification data of a content is displayed in response to an output request from the user terminal 10, but the invention is not limited to this. The invention can be applied to various services that use quantification data; for example, a site that introduces contents may acquire the quantification data for a content from the server device 20 and publish it.

[Modification 3]
In the above embodiment, contents such as books and movies were given as examples of the object of the present invention, but the invention is not limited to these. The object may be anything that a user (evaluator) can evaluate. For example, when the invention is applied to establishments such as restaurants, the emotion classification dictionary and the quantification data of each establishment can be generated from gourmet review articles that describe the atmosphere of the establishment, the character of its food, and so on. In this case, entering the name of a given establishment as a search query can retrieve establishments with a similar atmosphere. Use can also be broadened in other ways; for example, a restaurant review site can use this quantification data to let visitors grasp the character of each establishment intuitively.

[Modification 4]
In the above embodiment, the dictionary acquisition means 233 extracted evaluation data from the Web data on the network acquired by the evaluation data acquisition means 231 and generated the emotion classification dictionary from that evaluation data, but the invention is not limited to this. For example, evaluation data collected by a questionnaire or the like may be input to the server device 20, and the emotion classification dictionary generated from the input evaluation data. The same applies to the quantification means 234: the source of the evaluation data is not limited to evaluation data published on the network.

[Modification 5]
The co-occurrence determination means 233A of the dictionary acquisition means 233 extracted the second words using the number of co-occurrences as the degree of co-occurrence, but the second words may instead be extracted based on, for example, the Jaccard coefficient, as described above.
Alternatively, at some cost in accuracy, the words for the emotion classification dictionary may be extracted as second words without relying on the degree of co-occurrence.
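The Jaccard coefficient of two words is the number of texts containing both words divided by the number of texts containing at least one of them. The sketch below computes it over tokenized evaluation texts; the function name and sample data are assumptions for illustration.

```python
def jaccard_coefficient(documents, word_a, word_b):
    """Jaccard coefficient of two words over a list of tokenized texts.

    Computed as |A and B| / |A or B|, where A and B are the sets of texts
    (by index) that contain word_a and word_b respectively.
    """
    docs_a = {i for i, words in enumerate(documents) if word_a in words}
    docs_b = {i for i, words in enumerate(documents) if word_b in words}
    union = docs_a | docs_b
    if not union:
        return 0.0
    return len(docs_a & docs_b) / len(union)


documents = [
    ["hero", "brave"],
    ["hero", "brave", "sad"],
    ["hero"],
    ["brave"],
]
print(jaccard_coefficient(documents, "hero", "brave"))
```

Unlike a raw co-occurrence count, this measure is normalized to the range 0 to 1, so a single threshold can be applied regardless of how frequent each word is.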

[Modification 6]
In the above embodiment, an example was shown in which the dictionary acquisition means 233 extracts words that appear frequently in connection with characters, but the invention is not limited to this. For example, to extract people's emotions toward a work as a whole, the content name, the content creator, and the like may be treated as main elements, and the words used about these main elements extracted.
Emotion expression words may also be extracted without being limited to main elements. For example, when the object is a store or a product and evaluation data written by diners or product users is analyzed, emotion expression words may be extracted with the evaluator themselves (for example, a first-person subject such as "I") as the main element.

[Modification 7]
The classification means 233B classified the extracted second words using LDA, but the invention is not limited to this. For example, a classification table that associates words with emotion topics may be prepared in advance, and classification carried out based on that table.

[Modification 8]
In the above embodiment, an example was shown in which the evaluation data acquisition means 231, the data analysis means 232, the dictionary acquisition means 233, and the quantification means 234 are provided in the server device 20, but the invention is not limited to this. For example, the terminal control unit 14 of the user terminal 10 may read and execute a program stored in the terminal storage unit 13 and thereby function as the evaluation data acquisition means 231, the data analysis means 232, the dictionary acquisition means 233, and the quantification means 234. In this case, quantification data for the applications installed on the user terminal 10 can also be displayed. User preference data can also be determined by aggregating the quantification data of the installed applications. Transmitting such preference data to an application providing device or an advertisement distribution device makes it possible to deliver advertisements and application recommendations that are useful to the user.

[Modification 9]
In the above embodiment, the quantification means 234 used the number of words classified into each emotion topic directly as the emotion value, but the invention is not limited to this, and other methods of setting the emotion value based on the classification result may be used.
For example, the emotion value may be the ratio of the number of words classified into each emotion topic to the total number of words extracted in step S23 (the share of the whole occupied by each emotion topic), or a measure of their distribution.
Alternatively, the number of words classified into each emotion topic may be measured at predetermined intervals, and the change over time (rate of change) in that number calculated and used as the emotion value.
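The two alternative emotion values described in this modification can be sketched as follows. The function names are hypothetical, and returning `None` when the previous count is zero is a design choice made for this example rather than something the description specifies.

```python
def occupancy_ratios(topic_counts, total_words):
    """Share of all extracted words that fall under each emotion topic.

    topic_counts: words classified into each topic.
    total_words: total number of words extracted in step S23.
    """
    if total_words == 0:
        return {topic: 0.0 for topic in topic_counts}
    return {topic: count / total_words for topic, count in topic_counts.items()}


def change_rates(previous_counts, current_counts):
    """Relative change in each topic's word count between two periods.

    Returns None for a topic with no words in the previous period, because
    a relative change from zero is undefined (assumed handling).
    """
    rates = {}
    for topic, current in current_counts.items():
        previous = previous_counts.get(topic, 0)
        rates[topic] = (current - previous) / previous if previous else None
    return rates


print(occupancy_ratios({"scary": 3, "funny": 1}, total_words=8))
print(change_rates({"scary": 2}, {"scary": 3, "funny": 1}))
```

Ratios make contents with very different numbers of reviews comparable, while rates of change highlight how the reception of a content shifts over time.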

[Modification 10]
An example was shown in which the dictionary acquisition means 233 includes the co-occurrence determination means 233A and the classification means 233B and functions as the dictionary acquisition means, but the invention is not limited to this. For example, the dictionary acquisition means 233 may acquire the emotion classification dictionary from another device on the network.

In addition, the specific structures and procedures used in carrying out the present invention may be changed as appropriate to other structures and the like, within a range in which the object of the invention can be achieved.

DESCRIPTION OF SYMBOLS
1 ... data processing system, 10 ... user terminal, 20 ... server device (data processing device), 21 ... communication unit, 22 ... storage unit (storage means), 23 ... control unit, 41 ... evaluation data, 50 ... co-occurrence network, 231 ... evaluation data acquisition means, 232 ... data analysis means, 233 ... dictionary acquisition means, 233A ... co-occurrence determination means, 233B ... classification means, 234 ... quantification means, 235 ... request acquisition means, 236 ... search means, 237 ... quantification data output means.

Claims

A dictionary acquisition means for acquiring an emotion classification dictionary in which a plurality of words are classified by emotion unit;
Using the emotion classification dictionary, content quantification means for generating quantification data quantifying human emotions on the object,
The dictionary acquisition means includes dictionary generation means for generating the emotion classification dictionary based on text-based evaluation data for a plurality of objects,
The dictionary generation means includes, from the evaluation data, a plurality of elements that are within a predetermined range centered on a main element in the target object, a frequency of appearance of the main element, and a description position of the main element in the evaluation data. And the first word/phrase by morphological analysis, and the first word/phrase having a co-occurrence degree with respect to the main element equal to or greater than the predetermined value is extracted as the second word/phrase among the plurality of first words/phrases. A data processing device that classifies the second phrase for each emotion.

The data processing apparatus according to claim 1,
The data processing apparatus, wherein the first word/phrase extracted by the morphological analysis is an emotion expression word/phrase indicating a human emotion.

In the data processing device according to claim 1 or 2,
The dictionary generation means classifies the extracted second word/phrase for each emotion using latent Dirichlet allocation.

The data processing apparatus according to any one of claims 1 to 3,
further comprising evaluation data acquisition means for acquiring text-based evaluation data for the object,
wherein the content quantification means generates the quantification data based on the evaluation data for the object and the emotion classification dictionary.

The data processing apparatus according to claim 4,
wherein the content quantification means uses the emotion classification dictionary to classify, by emotion, the words/phrases extracted by morphological analysis of the evaluation data for the object, and generates the quantification data by associating emotion values based on the classification result with the object.
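
The quantification described in this claim can be sketched in Python as follows. This is an illustration only. The function name `quantify`, the emotion labels, and the toy dictionary are hypothetical, and normalizing the per-emotion counts to proportions is an assumption, since the claim only requires emotion values based on the classification result.

```python
from collections import Counter

def quantify(tokens, emotion_dict, emotions):
    """Return {emotion: value} for one object's evaluation data.

    tokens: words extracted from the evaluation data (assumed to come from
    a morphological analyzer). Words missing from the dictionary are ignored.
    The value is the share of classified words per emotion (an assumption).
    """
    counts = Counter(emotion_dict[t] for t in tokens if t in emotion_dict)
    total = sum(counts.values())
    return {e: (counts[e] / total if total else 0.0) for e in emotions}

# Hypothetical emotion classification dictionary and evaluation tokens.
emotion_dict = {"moving": "joy", "fun": "joy", "sad": "sadness", "scary": "fear"}
tokens = ["moving", "sad", "fun", "plot", "moving"]
vector = quantify(tokens, emotion_dict, ["joy", "sadness", "fear"])
# vector == {"joy": 0.75, "sadness": 0.25, "fear": 0.0}
```

Here "plot" is not in the dictionary and is skipped, so three of the four classified words fall under "joy".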

The data processing apparatus according to claim 5,
wherein the content quantification means generates quantification data in which a co-occurrence relationship between the emotions is associated with the emotion values.
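
One possible reading of a co-occurrence relationship between emotions is a count of how often two emotions are expressed in the same piece of evaluation data. The Python sketch below follows that reading. It is an assumption rather than the claimed definition, and the function name `emotion_cooccurrence` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def emotion_cooccurrence(reviews, emotion_dict):
    """Count, per pair of emotions, the reviews in which both appear."""
    pairs = Counter()
    for tokens in reviews:
        # Distinct emotions expressed in this review, in a stable order.
        present = sorted({emotion_dict[t] for t in tokens if t in emotion_dict})
        pairs.update(combinations(present, 2))
    return pairs

emotion_dict = {"moving": "joy", "fun": "joy", "sad": "sadness", "scary": "fear"}
reviews = [["moving", "sad"], ["fun", "scary", "sad"], ["fun"]]
pairs = emotion_cooccurrence(reviews, emotion_dict)
# pairs[("joy", "sadness")] == 2
```

The resulting pair counts could be stored alongside the per-emotion values for the same object.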

The data processing apparatus according to any one of claims 1 to 6,
further comprising data storage means for storing the quantification data generated by the content quantification means.

The data processing apparatus according to claim 7, further comprising:
query acquisition means for acquiring a predetermined object as a search query; and
search means for retrieving, from the data storage means, quantification data similar to the quantification data for the object specified as the search query, and returning the object corresponding to the retrieved quantification data as a search result.
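
The search recited in this claim can be sketched in Python as a nearest-neighbor lookup over stored emotion vectors. This is a hedged illustration. The claim does not specify a similarity measure, so cosine similarity is an assumption here, and the function names, the in-memory `store`, and the object identifiers are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two {emotion: value} vectors."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search_similar(store, query_id, top_n=2):
    """Return the ids of the objects most similar to the query object."""
    query = store[query_id]
    scored = [(cosine(query, vec), oid)
              for oid, vec in store.items() if oid != query_id]
    scored.sort(reverse=True)  # highest similarity first
    return [oid for _, oid in scored[:top_n]]

# Hypothetical stored quantification data, keyed by object id.
store = {
    "movie_a": {"joy": 0.7, "sadness": 0.2, "fear": 0.1},
    "movie_b": {"joy": 0.6, "sadness": 0.3, "fear": 0.1},
    "movie_c": {"joy": 0.1, "sadness": 0.1, "fear": 0.8},
}
results = search_similar(store, "movie_a", top_n=1)
# results == ["movie_b"]
```

A user who specifies "movie_a" as the query thus gets back "movie_b", whose emotion profile is closest, without supplying an author, title, or category.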

The data processing apparatus according to claim 7 or 8, further comprising:
request acquisition means for acquiring an output request for the quantification data for a predetermined object; and
quantified data output means for acquiring, from the data storage means, the quantification data for the object specified in the output request, and outputting the acquired quantification data.

A data processing method for generating quantification data that quantifies the emotions a person feels toward an object, the method comprising, by a computer:
generating, based on text-based evaluation data for a plurality of objects, an emotion classification dictionary in which a plurality of words/phrases are classified by emotion unit, and storing the emotion classification dictionary in storage means;
acquiring the emotion classification dictionary from the storage means; and
generating the quantification data for the object using the emotion classification dictionary,
wherein the step of generating the emotion classification dictionary and storing it in the storage means includes: extracting from the evaluation data, by morphological analysis, a main element of the object, an appearance frequency of the main element, and a plurality of first words/phrases located within a predetermined range centered on the description position of the main element in the evaluation data; extracting, as second words/phrases, those first words/phrases whose degree of co-occurrence with the main element is equal to or greater than a predetermined value; and classifying the second words/phrases by emotion.