JP2016057989A

JP2016057989A - Information provision device, and method and program for providing information

Info

Publication number: JP2016057989A
Application number: JP2014185605A
Authority: JP
Inventors: Kyoko Komatsu; Hiromi Ishisaki; Hajime Hattori; Yasuhiro Takishima
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2014-09-11
Filing date: 2014-09-11
Publication date: 2016-04-21
Anticipated expiration: 2034-09-11
Also published as: JP6392042B2

Abstract

PROBLEM TO BE SOLVED: To provide an information provision device, and a method and a program for providing information, that enable a provider of a specific product or service to grasp the opinions of users and put them to concrete use in marketing, such as product development, service improvement, and system modification.
SOLUTION: An information provision device 100, which provides information on products or services, includes: an analysis part 120 that extracts, from text data collected as relating to a specific product or service, a subjective view included in the text data and estimates a factor that gives rise to the subjective view; a database creation part 130 that creates a database using the extracted subjective view, the estimated factor, and items for categorizing the subjective view and the factor; and an information output part 141 that outputs information on the specific product or service on the basis of the created database.
SELECTED DRAWING: Figure 1

Description

The present invention relates to an information providing apparatus that provides information on products or services, and to a method and a program for providing such information.

Conventionally, factors have been extracted from Web documents in order to grasp the causal relationships between events related to a given event. For example, Non-Patent Document 1 proposes a method that searches Web documents for factors, using expressions that serve as clues to causal relationships between events, extracts them, and builds a causal relationship network. The network places causes and their results as nodes and expresses the relationships between them as a directed graph.

Non-Patent Document 2 discloses a service that analyzes users' requests, questions, and troublesome topics from text such as product reviews and blogs, and visualizes the results of complaint analysis and surveys of favorable opinions. The service color-codes favorable and negative opinions so that the overall picture can be grasped easily, and displays opinions as rankings. It also classifies opinions and shows how the number of posts in each class changes over time.

Non-Patent Document 1: Satoshi Aono, Manabu Ota, "Building causal network and acquiring causal knowledge by factor search", [online], DEIM Forum 2010 B9-1, [retrieved September 8, 2014], Internet <URL: http://db-event.jpn.org/deim2010/proceedings/files/B9-1.pdf>
Non-Patent Document 2: Hitachi Systems, Ltd., "Text Mining Tool Visualization Engine", [online], [retrieved September 8, 2014], Internet <URL: http://www.hitachi-systems.com/solution/s107/mieruka/>

However, when searching the Web for factors related to a search term, the technique described in Non-Patent Document 1 merely extracts words that have a causal relationship, without considering time information, and so cannot put customers' opinions and reputation to use in product development or service improvement.

The service disclosed in Non-Patent Document 2 classifies and visualizes customers' opinions about a target keyword by positive/negative polarity and by user profile, but it does not address the causes of the collected opinions. It is therefore difficult to grasp users' opinions and put them to concrete use in marketing, such as product development, service improvement, and system modification.

The present invention has been made in view of these circumstances, and its object is to provide an information providing apparatus, and a method and a program for providing information, that enable a provider of a specific product or service to grasp users' opinions and put them to concrete use in marketing, such as product development, service improvement, and system modification.

(1) To achieve the above object, an information providing apparatus according to the present invention provides information on products or services and comprises: an analysis unit that extracts, from text data collected as relating to a specific product or service, a subjective view contained in the text data and estimates the cause that gives rise to the subjective view; a database creation unit that creates a database using the extracted subjective view, the estimated cause, and items for classifying them; and an information output unit that outputs information on the specific product or service based on the created database.

This makes it possible to provide information derived from text data about a specific product or service. As a result, a provider of the product or service can, for example, grasp users' opinions and put them to concrete use in marketing, such as product development, service improvement, and system modification.

(2) In the information providing apparatus of the present invention, the analysis unit extracts keywords relating to the subjective view from the text data from which the subjective view was extracted, and the database creation unit uses those keywords as items for classifying the subjective view or the cause. Because keywords allow information to be provided systematically, a provider of a specific product or service can easily grasp the state of its business.

(3) In the information providing apparatus of the present invention, the analysis unit identifies categories for classifying the subjective view or the cause, and the database creation unit uses those categories as items for classifying the subjective view or the cause. Because categories allow information to be provided systematically, a provider of a specific product or service can easily grasp the state of its business.

(4) In the information providing apparatus of the present invention, the database creation unit uses attributes of the information source of the text data from which the subjective view was extracted as items for classifying the subjective view or the cause. This makes it possible, for example, to provide information organized by the age group, region, or the like of the consumers who received the product or service.

(5) In the information providing apparatus of the present invention, the database creation unit calculates, for text data created during a specific period, the appearance rate of each item for classifying the subjective view or the cause. The appearance rates of the items that appeared during the period can thus be provided as information, and a provider of products or services can easily grasp, for example, the priority of problems currently occurring.

(6) In the information providing apparatus of the present invention, the information output unit uses the database to output information in a tree format in which items for classifying the subjective view are upper nodes and items for classifying the cause are lower nodes. Information can thus be provided in a form that makes the cause of each subjective view easy to understand.

(7) In the information providing apparatus of the present invention, the information output unit builds the tree while varying the appearance of each node according to the importance of the item corresponding to that node. This makes it possible, for example, to enlarge nodes representing frequently occurring items or to darken nodes representing older items.

(8) In the information providing apparatus of the present invention, the analysis unit analyzes text data collected in real time, and the information output unit outputs information on the specific product or service according to the analysis result. Because the output information is updated in real time according to the analysis result, a provider of a specific product or service can grasp the situation as it unfolds.

(9) In the information providing apparatus of the present invention, the information output unit outputs an alarm when the number of items for classifying the subjective view or the cause accumulated in the database satisfies a predetermined condition within a certain time. For example, a rapid increase in the number of items suggests a flood of complaints or requests about a product, and in such a case an alarm can be output to the provider of the specific product or service.

(10) In the information providing apparatus of the present invention, the information output unit refers to the degree of influence of the items for classifying the subjective view or the cause accumulated in the database, and outputs alarms preferentially for items with a high degree of influence. Thus, for example, when a resolvable cause is mentioned in many pieces of text data, the alarm tells the provider that the situation calls for prompt action.

(11) A method according to the present invention is a method for providing information on products or services, comprising the steps of: extracting, from text data collected as relating to a specific product or service, the subjective view of the creator of the text data and estimating the cause that gives rise to the subjective view; creating a database using the extracted subjective view, the estimated cause, and items for classifying them; and outputting information on the specific product or service based on the created database. This makes it possible to provide information derived from text data about a specific product or service.

(12) A program according to the present invention is a program for providing information on products or services, which causes a computer to execute: a process of extracting, from text data collected as relating to a specific product or service, the subjective view of the creator of the text data and estimating the cause that gives rise to the subjective view; a process of creating a database using the extracted subjective view, the estimated cause, and items for classifying them; and a process of outputting information on the specific product or service based on the created database. This makes it possible to provide information derived from text data about a specific product or service.

According to the present invention, a provider of a specific product or service can grasp users' opinions and put them to concrete use in marketing, such as product development, service improvement, and system modification.

A block diagram showing the configuration of the information providing apparatus of the present invention.
A flowchart showing an example of the co-occurrence word extraction process.
A flowchart showing an example of the cause word candidate extraction process.
A flowchart showing an example of the determination process for multiple causes.
A flowchart showing an example of the determination process for a single cause.
A diagram showing a display example of information on a product or service.
A diagram showing a display example of a tree structure.
A diagram showing a display example of a time axis.
A diagram showing a display example of a tree structure.

Embodiments of the present invention are described below with reference to the drawings.

[Configuration of Information Providing Apparatus]
FIG. 1 is a block diagram showing the configuration of the information providing apparatus 100. The information providing apparatus 100 is a server or a terminal such as a PC; it analyzes collected data and provides information on products or services. The information providing apparatus 100 includes a data collection unit 110, an analysis unit 120, a database creation unit 130, and a control unit 140.

(Data collection unit)
The data collection unit 110 collects various kinds of text data, such as reviews, SNS posts, and comments on those posts. The unit of collection can be chosen as appropriate, for example per entry, per sentence, or per punctuation-delimited segment. Communication text, such as blog and other Internet posts or a series of e-mail messages, can also be used as collected data. Table 1 shows an example of collected text divided into clauses.

(Analysis unit)
The analysis unit 120 extracts, from text data collected as relating to a specific product or service, the subjective view contained in the text data and estimates the cause that gives rise to it. For example, it is preferable to identify a category for classifying the subjective view or the cause and to extract keywords relating to the subjective view from the text data from which it was extracted. The analysis unit 120 can also analyze text data collected in real time.

Specifically, the analysis unit 120 includes a subjective view extraction unit 121 and a cause extraction unit 122. The subjective view extraction unit 121 determines whether text expresses a subjective view, namely a request or dissatisfaction, and extracts it. Request determination classifies documents using a text search technique or a vector space model, extracts each request D, and labels it (see Table 2).

For example, using Related Technologies 1 and 2, a request feature space and a teacher vector are created from the request data included in the training data. A feature vector of the input text is then created based on that request feature space, and a request label can be assigned based on its similarity to the teacher vector. Alternatively, using Related Technology 3, a request classifier can be created from the request data included in the training data and then applied.
[Related Technology 1] Bag-of-words model: http://en.wikipedia.org/wiki/Bag-of-words_model
[Related Technology 2] Mecab: http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html
[Related Technology 3] SVMLIGHT: http://svmlight.joachims.org/
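The vector-space approach to request labeling can be illustrated with a minimal sketch. It assumes whitespace tokenization in place of a morphological analyzer, a teacher vector formed by summing the training vectors, cosine similarity as the similarity measure, and an arbitrary threshold of 0.3; none of these choices, nor the function names, are specified in the source.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace tokenization stands in for a morphological analyzer.
    return text.lower().split()

def build_feature_space(request_texts):
    # The feature space is the sorted vocabulary of the request training data.
    return sorted({tok for text in request_texts for tok in tokenize(text)})

def vectorize(text, vocab):
    # Bag-of-words count vector over the feature space.
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def label_request(text, vocab, teacher, threshold=0.3):
    # Assign the request label "D" when the input is similar enough to the teacher vector.
    return "D" if cosine(vectorize(text, vocab), teacher) >= threshold else None

# Hypothetical training data for illustration only.
training_requests = ["i want a cool breeze", "i want a discount", "please add a dark mode"]
vocab = build_feature_space(training_requests)
# Assumption: the teacher vector is the element-wise sum of the training vectors.
teacher = [sum(col) for col in zip(*(vectorize(t, vocab) for t in training_requests))]
```

With this data, `label_request("i want a breeze", vocab, teacher)` yields the label `"D"`, while text sharing no words with the training data is left unlabeled.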

Dissatisfaction determination extracts each dissatisfaction F by analyzing emotional expressions and labels it. As in the prior art, the analysis of emotional expressions is carried out using a curated dictionary, positive/negative determination, or the like (see Table 3).

The cause extraction unit 122 extracts causes by performing co-occurrence word extraction, cause word candidate extraction, and cause determination. Co-occurrence word extraction associates the sentences labeled as requests by the request determination process (request sentences) with the sentences labeled as dissatisfaction by the dissatisfaction determination process (dissatisfaction sentences), and extracts the combinations of words that co-occur with them. Details of the co-occurrence word extraction process are described later.

The association can be made, for example, by starting from an extracted dissatisfaction sentence and pairing it with the nearest request sentence. In Table 3, for example, F "hot" at t5 is associated with D "I want wind" at tn, and words that co-occur with these two, such as room temperature, humidity, air conditioner, and patience, are extracted.
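The pairing step can be sketched as follows. The sketch assumes that clauses are indexed by position, that labels are given as "F", "D", or None, and that "co-occurring" means appearing within a fixed window of clauses around either member of the pair. The window definition and all names are illustrative assumptions, since the detailed process is described elsewhere.

```python
def pair_with_nearest_request(labels):
    # For each dissatisfaction clause (F), find the request clause (D) closest in position.
    request_positions = [i for i, label in enumerate(labels) if label == "D"]
    pairs = []
    for f_pos, label in enumerate(labels):
        if label == "F" and request_positions:
            d_pos = min(request_positions, key=lambda d: abs(d - f_pos))
            pairs.append((f_pos, d_pos))
    return pairs

def co_occurring_words(clauses, pair, window=1):
    # Assumption: a word co-occurs with the pair if its clause lies within
    # `window` positions of the F clause or the D clause (the pair itself is excluded).
    f_pos, d_pos = pair
    words = set()
    for pos, clause in enumerate(clauses):
        if pos in pair:
            continue
        if abs(pos - f_pos) <= window or abs(pos - d_pos) <= window:
            words.update(clause)
    return sorted(words)

# Hypothetical tokenized clauses and labels for illustration.
clauses = [["room", "temperature"], ["hot"], ["air", "conditioner"], ["want", "wind"]]
labels = [None, "F", None, "D"]
```

Here the dissatisfaction clause at position 1 is paired with the request clause at position 3, and the words of the surrounding clauses become the co-occurrence words.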

Cause candidate extraction refers to the probability table created in the table creation process, based on the co-occurrence words extracted in the co-occurrence word extraction process, and extracts all matching cause candidates. For example, given the co-occurrence words above (room temperature, humidity, air conditioner, patience), "the air conditioner is not on" in the third row and "the air conditioner is broken" in the fourth row of Table 4, among others, are extracted. Details of the cause candidate extraction process are described later.
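As a rough sketch of the table lookup, the following assumes that each row of the probability table carries a set of associated words and that a row matches when it shares at least one word with the co-occurrence words. The row layout, the matching rule, and the probability values are hypothetical, because the contents of Table 4 are not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TableRow:
    request: str          # D
    dissatisfaction: str  # F
    cause: str            # C
    words: frozenset      # assumed: words associated with this row
    probability: float    # assumed value, for illustration only

def extract_cause_candidates(table, co_words):
    # Return every row that shares at least one word with the co-occurrence words.
    co_words = set(co_words)
    return [row for row in table if row.words & co_words]

# Hypothetical table DFC.
TABLE_DFC = [
    TableRow("want wind", "hot", "window is closed", frozenset({"window"}), 0.2),
    TableRow("want wind", "hot", "air conditioner is not on",
             frozenset({"air conditioner", "room temperature"}), 0.4),
    TableRow("want wind", "hot", "air conditioner is broken",
             frozenset({"air conditioner"}), 0.4),
]

candidates = extract_cause_candidates(
    TABLE_DFC, ["room temperature", "humidity", "air conditioner", "patience"]
)
```

All matching rows are returned rather than only the best one, which mirrors the statement that every applicable cause candidate is extracted at this stage.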

The table to be referenced can be changed as appropriate according to the result from the subjective view extraction unit 121 that supplies the input. For example, when both a request and dissatisfaction were obtained by the subjective view extraction unit 121, the table DFC containing the three elements request, dissatisfaction, and cause (D, F, C), as shown in Table 4, is referenced first, and the other tables can be referenced as second and third candidates. In this case, the second candidate may be either the table DC, containing the elements request and cause (D, C), or the table FC, containing the elements dissatisfaction and cause (F, C). When only dissatisfaction was extracted, or when attention is focused only on dissatisfaction, the table FC containing the two elements dissatisfaction and cause (F, C), as shown in Table 6, can be referenced first, and the three-element table DFC shown in Table 4 can be referenced as the second candidate while ignoring the request (see Table 7 for details).
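The reference order can be sketched as a small selection function. The order for the request-only case and the rule of falling back to the next table only when the current one yields no candidates are assumptions made for illustration; the source states only which table is referenced first and which serve as later candidates.

```python
def table_priority(has_request, has_dissatisfaction):
    # Order in which the probability tables are consulted.
    if has_request and has_dissatisfaction:
        return ["DFC", "DC", "FC"]   # DC and FC may be swapped as second candidate
    if has_dissatisfaction:
        return ["FC", "DFC"]
    if has_request:
        return ["DC", "DFC"]         # assumption: mirrors the dissatisfaction-only case
    return []

def lookup(tables, order, co_words):
    # Assumption: move to the next table only when the current one has no match.
    co_words = set(co_words)
    for name in order:
        hits = [cause for cause, words in tables.get(name, []) if words & co_words]
        if hits:
            return name, hits
    return None, []

# Hypothetical tables: name -> list of (cause, associated words).
tables = {
    "DFC": [],
    "FC": [("air conditioner is broken", {"air conditioner"})],
}
result = lookup(tables, table_priority(True, True), ["air conditioner"])
```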

Cause determination decides the cause from the cause word candidates extracted in the cause word candidate extraction process. Details of cause determination are described later. The decision is made by referring to the table of cause word candidates and using the probability values. For example, the several candidates with the highest probability values may be displayed, or those exceeding a preset threshold may be displayed. For example, when cause word candidates are extracted as shown in Table 8, C1, C2, and C3 are selected preferentially based on the probability values.
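Both selection rules mentioned above (the top several candidates, or those above a threshold) can be sketched in one function. The probability values below are invented for illustration, since Table 8 is not reproduced here, and the function name is hypothetical.

```python
def select_causes(candidates, top_k=None, threshold=None):
    # candidates: mapping of cause label -> probability value.
    # Sort by descending probability; Python's sort is stable, so ties keep input order.
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(cause, p) for cause, p in ranked if p > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [cause for cause, _ in ranked]

# Hypothetical probability values.
candidates = {"C1": 0.3, "C2": 0.3, "C3": 0.3, "C4": 0.1}
```

With these values, both `select_causes(candidates, top_k=3)` and `select_causes(candidates, threshold=0.2)` return C1, C2, and C3.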

When candidates have the same probability, priority in cause determination can be given to the cause C with the smallest clause distance to the request D extracted by the request extraction process and the dissatisfaction F extracted by the dissatisfaction extraction process in the input text S. For example, C1, C2, and C3 have the same probability in Table 8, but according to Table 9, C1 is located at t3, C2 at t7, and C3 at t50; with D at t100 and F at t5, C1 and C2, which are a small clause distance from one of these, are selected preferentially.

When candidates are at the same distance, a C that precedes D and F can be given priority, because text is commonly written in order from cause to result. For example, given the positions of C1 and C2 in Table 9, C1 is selected preferentially. An example of narrowing the cause down to a single one by this kind of processing is described later. A cause label can also be attached to sentences containing a cause word that appeared as a cause word candidate.
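The two tie-breaking rules can be combined into a single sort key, as in the following sketch. It takes the clause distance of a cause to be its distance to the nearer of D and F, which matches the worked example (C1 at t3, C2 at t7, C3 at t50, D at t100, F at t5); the function name is hypothetical.

```python
def rank_tied_causes(cause_positions, d_pos, f_pos):
    # cause_positions: mapping of cause label -> clause position.
    def sort_key(item):
        _, pos = item
        # Rule 1: smaller clause distance to either D or F comes first.
        distance = min(abs(pos - d_pos), abs(pos - f_pos))
        # Rule 2: at equal distance, a cause that precedes both D and F comes first.
        precedes_both = pos < min(d_pos, f_pos)
        return (distance, not precedes_both)
    return [cause for cause, _ in sorted(cause_positions.items(), key=sort_key)]

ranking = rank_tied_causes({"C1": 3, "C2": 7, "C3": 50}, d_pos=100, f_pos=5)
```

C1 and C2 are both two clauses away from F, so they outrank C3, and C1 comes first because it is the only one that precedes both D and F.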

(Database creation unit (DB creation unit))
The database creation unit 130 creates a database using the extracted subjective views, the estimated causes, and items for classifying them. As items for classifying the subjective view or the cause, it is preferable to use keywords extracted from the text data from which the subjective view was extracted. Because keywords allow information to be provided systematically, a provider of a specific product or service can easily grasp the state of its business.

Categories may also be used as items for classifying the subjective view or the cause. Because categories allow information to be provided systematically, a provider of a specific product or service can easily grasp the state of its business.

Attributes of the information source of the text data from which the subjective view was extracted may also be used as items for classifying the subjective view or the cause. This makes it possible, for example, to provide information organized by the age group, region, or the like of the consumers who received the product or service.

Specifically, the appearance rate of each item for classifying the subjective view or the cause may be calculated for text data created during a specific period. The appearance rates of the items that appeared during the period can thus be provided as information, and a provider of products or services can easily grasp, for example, the priority of problems currently occurring.

Specifically, the database creation unit 130 includes an information acquisition unit 131, a keyword extraction unit 132, a category classification unit 133, and a probability calculation unit 134. The database is created from the data that was collected by the data collection unit 110 and labeled by the analysis unit 120. Table 10 shows the structure of the database. The database stores a keyword list (KWL), dissatisfaction sentence (F), dissatisfaction category (Fc), request sentence (D), request category (Dc), cause sentence (C), cause category (Cc), posting time (Time), and appearance probability (AR).
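One possible way to hold these columns is a single relational table, sketched here with Python's built-in `sqlite3` module. The table name, column names, column types, and sample values are assumptions for illustration; only the list of stored items comes from the description above.

```python
import sqlite3

def create_database(conn):
    # One row per labeled opinion; columns follow the items listed for Table 10.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS opinions (
               kwl TEXT,        -- keyword list (KWL), stored as comma-separated text
               f TEXT,          -- dissatisfaction sentence (F)
               fc TEXT,         -- dissatisfaction category (Fc)
               d TEXT,          -- request sentence (D)
               dc TEXT,         -- request category (Dc)
               c TEXT,          -- cause sentence (C)
               cc TEXT,         -- cause category (Cc)
               posted_at TEXT,  -- posting time (Time)
               ar REAL          -- appearance probability (AR)
           )"""
    )

def add_record(conn, record):
    conn.execute(
        "INSERT INTO opinions VALUES "
        "(:kwl, :f, :fc, :d, :dc, :c, :cc, :posted_at, :ar)",
        record,
    )

conn = sqlite3.connect(":memory:")
create_database(conn)
# Hypothetical record for illustration.
add_record(conn, {
    "kwl": "air conditioner", "f": "hot", "fc": "temperature",
    "d": "I want wind", "dc": "comfort",
    "c": "the air conditioner is broken", "cc": "equipment",
    "posted_at": "2014-09-08T12:00:00", "ar": 0.5,
})
rows = conn.execute("SELECT fc, cc, ar FROM opinions").fetchall()
```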

The information acquisition unit 131 stores in the database, from the target data, each sentence given a dissatisfaction, request, or cause label by the analysis unit (columns F, D, and C) together with its posting time (column Time). User attribute information can also be included in the database at this point. Possible attributes include age, gender, region, and occupation, and they can be obtained by profile estimation. Including this information in the database allows it to be used, for example, to analyze opinion trends by attribute. It is preferable to create the database from data obtained over a network, such as SNS data, but other data may also be used.

Examples of databases that use data other than SNS data are shown below.
(Example 1) Using CS telephone response data (users: operators, etc.)
A screen is provided for entering information obtained from a customer during a telephone call. The information obtained may include, for example, gender as judged from the voice, opinions such as complaints and requests learned during the call, and the customer's attribute information. The operator enters the information in real time, and the database of Table 10 can thereby be created and updated. Displaying cases similar to the entered information in real time could also be used, for example, to quickly find a solution to an opinion received over the telephone.

(Example 2) Referring to a customer database held by a company (users: branch staff, etc.)
Reference is made to the database of Table 10 obtainable from SNS and the like, and to a database created and updated in the same manner as in Example 1 from the company's existing customer database and the service history at its shops. For example, a branch manager could analyze the data by region and take early countermeasures in regional areas against a phenomenon that first appeared in the city center.

The keyword extraction unit 132 extracts topics as keywords from the sentences with dissatisfaction, request, and cause labels stored by the information acquisition unit 131, and from the original text before labeling, and stores them in the keyword list of the database (column KWL). A topic here may be, for example, a product name or a service name. Topic classification can be implemented using a dictionary in which related words are registered in advance for each service, although the implementation is not limited to a dictionary.

The category classification unit 133 classifies the labeled dissatisfaction, requests, and causes into categories prepared in advance and stores them in columns Fc, Dc, and Cc of Table 10. As with keyword extraction, category classification can be implemented using a dictionary in which related words are registered for each category, although the implementation is not limited to a dictionary.
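Dictionary-based keyword extraction and category classification can share one lookup routine, as in this minimal sketch. Substring matching on lowercased text and the dictionary contents are assumptions for illustration; Japanese text would in practice call for morphological analysis before matching.

```python
def classify(text, dictionary):
    # dictionary: mapping of label (topic or category) -> list of related words.
    # A label applies when any of its related words occurs in the text.
    lowered = text.lower()
    return [
        label
        for label, related_words in dictionary.items()
        if any(word in lowered for word in related_words)
    ]

# Hypothetical dictionaries.
TOPICS = {"Service A": ["service a", "plan a"], "Phone X": ["phone x"]}
CATEGORIES = {"price": ["expensive", "fee"], "quality": ["broken", "slow"]}

text = "Phone X is slow and the fee is expensive"
keywords = classify(text, TOPICS)        # candidates for the KWL column
categories = classify(text, CATEGORIES)  # candidates for the Fc, Dc, or Cc columns
```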

The probability calculation unit 134 calculates and stores the appearance probabilities of the dissatisfaction category Fc, the request category Dc, and the cause category Cc. For example, when dissatisfaction category F1 has two requests, D1 and D2, and five causes, C1 to C3, C11, and C12, the calculation can take the combined appearance rate of all five as 1 and compute each probability, compute the number of posts per fixed period (number of posts / period), or use the rate of increase or decrease in the number of posts. If this probability value is regarded as the degree of influence of an event, a function can also be added that, for example, sends an alarm notification to the system user when an event whose ratio is at or above a threshold occurs.
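The three calculation options named above can be sketched as small functions. This is one possible reading of the text, not the patent's definition: the function names and the sample data are assumptions, and the normalization is interpreted here as dividing each category's count by the total so that the probabilities sum to 1.

```python
from collections import Counter


def appearance_probabilities(categories: list) -> dict:
    """Normalize category counts so that all probabilities sum to 1.

    Assumes a non-empty list of observed categories.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def posts_per_period(post_count: int, period_days: float) -> float:
    """Number of posts divided by the length of the period."""
    return post_count / period_days


def change_rate(previous_count: int, current_count: int) -> float:
    """Relative increase or decrease in posts; assumes previous_count > 0."""
    return (current_count - previous_count) / previous_count


# Made-up cause categories observed for dissatisfaction category F1.
observed = ["C1", "C2", "C2", "C3", "C11", "C12", "C12", "C12"]
probabilities = appearance_probabilities(observed)
```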

(Control unit)
The control unit 140 processes the information in the database based on the input and outputs it. The input (search term) is, for example, a search query such as a product name or a service name. The period to be displayed can also be set. In addition, the data reflected in the graph can be selected, for example by providing a pull-down menu for attribute-based display. The control unit 140 includes the information output unit 141. The output may be sent to a display unit, such as a display connected to the information providing apparatus 100, or distributed to a terminal via a network.

The information output unit 141 outputs information on a specific product or service based on the created database. This makes it possible to provide information derived from text data about the specific product or service. As a result, the provider of the specific product or service, for example, can grasp users' opinions and put them to concrete use in marketing, such as product development, service improvement, and system modification.

The information output unit 141 preferably uses the database to output information in a tree format in which the items classifying the subjectivity are upper nodes and the items classifying the cause are lower nodes. This presents the information in a form that makes it easy to see what caused the subjectivity.

In that case, the tree is preferably configured by changing the form of each node according to the importance of the item corresponding to that node. For example, a node representing a frequently occurring item can be enlarged, and a node representing an older item can be darkened.

The information output unit 141 also preferably outputs information on the specific product or service in real time according to the analysis result. Because the output information is updated in real time according to the analysis result, the provider of the specific product or service can grasp the situation as it unfolds.

The information output unit 141 may output an alarm when the number of items classifying the subjectivity or the cause accumulated in the database satisfies a predetermined condition within a certain time. For example, a rapid increase in the number of items suggests that dissatisfaction and requests concerning a product or the like are flooding in; in such a case, an alarm can be output to the provider of the specific product or service.

In that case, it is preferable to refer to the degree of influence of the items classifying the subjectivity or the cause accumulated in the database, and to output alarms preferentially for items with a higher degree of influence. As a result, for example, when a cause is mentioned in many text data and can be resolved, the alarm lets the provider know that the situation calls for prompt action.
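The alarm condition and its prioritization might look like the sketch below. The "predetermined condition" is assumed here to be a minimum post count inside a time window, and the function name `items_to_alarm`, the influence values, and the sample posts are all hypothetical.

```python
from datetime import datetime, timedelta


def items_to_alarm(posts, now, window, min_count, influence):
    """Return categories whose post count inside the window reaches min_count,
    ordered so that categories with a higher degree of influence come first."""
    start = now - window
    counts = {}
    for timestamp, category in posts:
        if start <= timestamp <= now:
            counts[category] = counts.get(category, 0) + 1
    triggered = [c for c, n in counts.items() if n >= min_count]
    # Higher influence first; categories without a stored influence sort last.
    return sorted(triggered, key=lambda c: influence.get(c, 0.0), reverse=True)


# Made-up sample: (minutes before "now", cause category).
now = datetime(2014, 2, 1, 12, 0)
posts = [
    (now - timedelta(minutes=m), c)
    for m, c in [(5, "C11"), (10, "C11"), (20, "C12"), (30, "C12"),
                 (40, "C12"), (500, "C1"), (10, "C1")]
]
influence = {"C11": 0.2, "C12": 0.5, "C1": 0.9}
alarms = items_to_alarm(posts, now, timedelta(hours=1), 2, influence)
```

In this sample, C1 has the highest influence but only one post inside the one-hour window, so it does not trigger an alarm.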

Graph construction can be performed by extracting, from the database created by the database creation unit, the rows in which the input word matches the body text or the keyword list, and arranging the dissatisfaction, request, and cause as nodes on a tree. The order of dissatisfaction, request, and cause does not matter.
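A minimal sketch of this row selection and tree construction, assuming each database row is a dictionary with `text`, `KWL`, `Fc`, `Dc`, and `Cc` fields. The field layout, the function name `build_tree`, the nesting order, and the sample rows are illustrative assumptions.

```python
def build_tree(rows, query):
    """Nest matching rows as dissatisfaction -> request -> set of causes."""
    tree = {}
    for row in rows:
        # A row matches if the input word occurs in the body text
        # or is one of the entries in the keyword list.
        if query in row["text"] or query in row["KWL"]:
            causes = tree.setdefault(row["Fc"], {}).setdefault(row["Dc"], set())
            causes.add(row["Cc"])
    return tree


# Made-up rows; the pairing of requests and causes is not taken from the patent.
rows = [
    {"text": "U11: I don't know how to set up the filter",
     "KWL": ["electronic money"],
     "Fc": "spam mail", "Dc": "support", "Cc": "filter"},
    {"text": "Registering for electronic money brought spam",
     "KWL": ["electronic money"],
     "Fc": "spam mail", "Dc": "improvement", "Cc": "registration"},
    {"text": "The map app is slow",
     "KWL": ["map"],
     "Fc": "speed", "Dc": "improvement", "Cc": "network"},
]
tree = build_tree(rows, "electronic money")
```

Since the text states that the order of dissatisfaction, request, and cause does not matter, the nesting order used here is just one option.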

[Operation of the information providing apparatus]
The operation of the information providing apparatus 100 configured as described above will now be described. The information providing apparatus 100 collects data, analyzes the collected data to extract subjectivity and causes, creates a database from the results, and uses the created database to output information on the product or service the user asks about. Of these, the co-occurrence word extraction process, the cause word candidate extraction process, and the determination process for multiple causes or a single cause are described in detail below.

FIG. 2 is a flowchart showing an example of the co-occurrence word extraction process. First, a group of labeled sentences is acquired (step S101), and the presence or absence of a request label or a dissatisfaction label is determined (step S102). If only one of the request label and the dissatisfaction label exists in step S102, the process proceeds to step S104; if both exist, the request and the dissatisfaction are associated with each other (step S103). Next, the co-occurrence word extraction process is performed (step S104), and the co-occurrence words are obtained (step S105).

FIG. 3 is a flowchart showing an example of the cause word candidate extraction process. First, the group of labeled sentences and the co-occurrence words are input (step S201). Next, the presence or absence of the request label and the dissatisfaction label is determined (step S202). If both the request label and the dissatisfaction label are present in step S202, table DFC is referred to (step S203), and it is determined whether there is a second candidate (step S204). If there is no second candidate in step S204, the process proceeds to step S208; if there is a second candidate, table DC is referred to (step S205), and it is determined whether there is a third candidate (step S206). If there is no third candidate in step S206, the process proceeds to step S208; if there is a third candidate, table FC is referred to (step S207), and the cause word candidates are extracted (step S208).

If only the dissatisfaction label is present in step S202, table FC is referred to (step S209), and it is determined whether there is a second candidate (step S210). If there is no second candidate, the process proceeds to step S208; if there is a second candidate, table DFC is referred to (step S211), and the process proceeds to step S208.

If only the request label is present in step S202, table DC is referred to (step S212), and it is determined whether there is a second candidate (step S213). If there is no second candidate, the process proceeds to step S208; if there is a second candidate, table DFC is referred to (step S214), and the process proceeds to step S208.
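The branching of FIG. 3 can be summarized as a function that returns which tables are consulted, in order, before the cause word candidates are extracted in step S208. This sketch only encodes the order of the steps; how the presence of a second or third candidate is decided is not modeled, and the function name `table_sequence` and its boolean parameters are assumptions.

```python
def table_sequence(has_request, has_dissatisfaction, has_second, has_third=False):
    """Return the tables consulted, in order, before extraction in step S208."""
    if has_request and has_dissatisfaction:
        tables = ["DFC"]                 # S203
        if has_second:                   # S204
            tables.append("DC")          # S205
            if has_third:                # S206
                tables.append("FC")      # S207
    elif has_dissatisfaction:
        tables = ["FC"]                  # S209
        if has_second:                   # S210
            tables.append("DFC")         # S211
    elif has_request:
        tables = ["DC"]                  # S212
        if has_second:                   # S213
            tables.append("DFC")         # S214
    else:
        tables = []                      # no label: not covered by the flowchart
    return tables


both_labels = table_sequence(True, True, has_second=True, has_third=True)
```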

FIG. 4 is a flowchart showing an example of the multiple-cause determination process. This describes the case where a plurality of causes are identified. First, the target probability table is acquired (step S301), and it is then determined whether a probability value in the probability table is at or above a threshold (step S302). If the probability value is below the threshold, the process ends; if it is at or above the threshold, the cause having that probability value is identified (step S303), and the process ends.
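As a sketch of the FIG. 4 flow, assuming the probability table is a mapping from cause to probability value and that the check in step S302 is applied to each entry. The function name, the table layout, and the sample values are assumptions.

```python
def causes_at_or_above(probability_table, threshold):
    """Return every cause whose probability is at or above the threshold (S302, S303)."""
    return [cause for cause, p in probability_table.items() if p >= threshold]


# Made-up probability table for one dissatisfaction category.
table = {"C1": 0.1, "C2": 0.4, "C11": 0.4, "C12": 0.1}
multiple_causes = causes_at_or_above(table, 0.3)
```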

FIG. 5 is a flowchart showing an example of the single-cause determination process. This shows an example of narrowing the causes down to one when identical probabilities exist. First, the target probability table is acquired (step S401), and it is then determined whether a probability value in the probability table is at or above a threshold (step S402). If the probability value is below the threshold, the process ends; if it is at or above the threshold, it is determined whether identical probabilities exist (step S403). If no identical probabilities exist in step S403, the probability values are compared (step S405), the larger one is selected to identify the cause (step S406), and the process ends. If identical probabilities exist in step S403, the sentence intervals are compared (step S404), the cause with the smallest sentence interval is identified (step S406), and the process ends. The above processing is performed by causing a computer to execute a program.
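The FIG. 5 flow might be sketched as below. Two points are assumptions of this sketch rather than statements from the text: "identical probabilities" is interpreted as a tie for the highest value among the causes at or above the threshold, and the sentence interval is treated as a precomputed number per cause (for example, how many sentences separate the cause from the subjectivity). The function name and sample data are hypothetical.

```python
def single_cause(probability_table, sentence_gap, threshold):
    """Pick one cause: highest probability, ties broken by smallest sentence gap."""
    # S402: keep only the causes at or above the threshold.
    candidates = {c: p for c, p in probability_table.items() if p >= threshold}
    if not candidates:
        return None  # nothing reaches the threshold, so the process ends
    best = max(candidates.values())
    tied = [c for c, p in candidates.items() if p == best]  # S403
    if len(tied) == 1:
        return tied[0]  # S405, S406: the larger probability wins
    # S404, S406: identical probabilities, so compare sentence gaps.
    return min(tied, key=lambda c: sentence_gap[c])


# Made-up data: C2 and C11 tie at 0.4, and C11 is closer in the text.
table = {"C1": 0.1, "C2": 0.4, "C11": 0.4, "C12": 0.1}
gaps = {"C1": 2, "C2": 3, "C11": 1, "C12": 4}
chosen = single_cause(table, gaps, 0.3)
```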

[Display example]
Next, a display example is described as one example of information output. FIG. 6 shows a display example of information on a product or service. In the example of FIG. 6, for a certain product or service over the period from late January to mid-February of a certain year, the relationships are displayed as a tree in the order of dissatisfaction F, request D, and cause C. In the example of FIG. 6, the original sentence is displayed by clicking node C12. Specific examples of each display are described next.

FIG. 7 shows a display example of the tree structure. In the example of FIG. 7, with the search term "electronic money", the dissatisfaction "spam mail" and the corresponding requests "improvement" and "support" have arisen, and "registration" and "filter" are listed as causes. Clicking the filter node displays the original posted sentence, "U11: I don't know how to set up the filter".

FIG. 8 shows a display example of the time axis. As shown in FIG. 8, the display period of the tree can be controlled with a time axis. The time axis shows an axis representing change over time, such as calendar dates or the phases of events for the searched product (press release, release date, modification date, and so on), and the period to be displayed can be selected. Sliding the bar on the time axis displays the graph for the period at which the bar is positioned. For example, in FIG. 8 the time axis indicates late January and the phase axis indicates after the modification. Clicking a node also displays the original posted sentence.

FIG. 9 shows a display example of the tree structure. The color and size of each node can also indicate the posting time and its scale (how loud the voices are, the range of influence). For example, a recent posting time can be shown in a light color (e.g., node C1) and an old posting time in a dark color (e.g., node C3), while a small scale can be shown as a small node and a large scale as a large node (e.g., node C2).
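One way to map posting time and scale to a node's appearance is sketched below. The HSL color scheme, the pixel radii, the default limits, and the function name `node_style` are arbitrary choices for illustration; the text only says that newer posts are lighter, older posts darker, and larger scales produce larger nodes.

```python
def node_style(age_days, post_count, max_age_days=30, max_count=100):
    """Newer posts get a lighter color; more posts get a larger node."""
    # Clamp both ratios to the range 0..1.
    age_ratio = min(max(age_days / max_age_days, 0.0), 1.0)
    size_ratio = min(max(post_count / max_count, 0.0), 1.0)
    lightness = round(90 - 50 * age_ratio)  # 90% (new) down to 40% (old)
    radius = round(10 + 30 * size_ratio)    # 10 px (few posts) up to 40 px (many)
    return {"color": f"hsl(210, 60%, {lightness}%)", "radius": radius}


new_small = node_style(age_days=0, post_count=10)    # light and small
old_large = node_style(age_days=30, post_count=100)  # dark and large
```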

DESCRIPTION OF SYMBOLS
100 Information providing apparatus
110 Data collection unit
120 Analysis unit
121 Subjectivity extraction unit
122 Cause extraction unit
130 Database creation unit
131 Information acquisition unit
132 Keyword extraction unit
133 Category classification unit
134 Probability calculation unit
140 Control unit
141 Information output unit

Claims

An information providing apparatus for providing information on goods or services, comprising:
an analysis unit that extracts subjectivity included in text data collected as relating to a specific product or service, and estimates a cause of the subjectivity;
a database creation unit that creates a database using the extracted subjectivity, the estimated cause, and items for classifying them; and
an information output unit that outputs information on the specific product or service based on the created database.

The information providing apparatus according to claim 1, wherein the analysis unit extracts keywords related to the subjectivity from the text data from which the subjectivity was extracted, and
the database creation unit uses the keywords extracted from that text data as items for classifying the subjectivity or the cause.

The information providing apparatus according to claim 1, wherein the analysis unit identifies a category for classifying the subjectivity or the cause, and
the database creation unit uses the identified category as an item for classifying the subjectivity or the cause.

The information providing apparatus according to any one of claims 1 to 3, wherein the database creation unit uses an attribute of the information source of the text data from which the subjectivity was extracted as an item for classifying the subjectivity or the cause.

The information providing apparatus according to any one of claims 1 to 4, wherein the database creation unit calculates, for text data created in a specific period, the appearance rate of each item that classifies the subjectivity or the cause.

The information providing apparatus according to claim 5, wherein the information output unit uses the database to output information in a tree format in which items classifying the subjectivity are upper nodes and items classifying the cause are lower nodes.

The information providing apparatus according to claim 6, wherein the information output unit configures a tree by changing the form of each node according to the importance of an item corresponding to each node.

The information providing apparatus according to any one of claims 1 to 7, wherein the analysis unit analyzes text data collected in real time, and
the information output unit outputs information on the specific product or service according to the analysis result.

The information providing apparatus according to any one of claims 1 to 8, wherein the information output unit outputs an alarm when the number of items classifying the subjectivity or the cause accumulated in the database satisfies a predetermined condition within a certain time.

The information providing apparatus according to claim 9, wherein the information output unit refers to the degree of influence of the items classifying the subjectivity or the cause accumulated in the database, and outputs an alarm preferentially for items with a higher degree of influence.

A method of providing information on goods or services, comprising the steps of:
extracting, from text data collected as relating to a specific product or service, subjectivity of the creator of the text data, and estimating a cause of the subjectivity;
creating a database using the extracted subjectivity, the estimated cause, and items for classifying them; and
outputting information on the specific product or service based on the created database.

A program for providing information on goods or services, the program causing a computer to execute:
a process of extracting, from text data collected as relating to a specific product or service, subjectivity of the creator of the text data, and estimating a cause of the subjectivity;
a process of creating a database using the extracted subjectivity, the estimated cause, and items for classifying them; and
a process of outputting information on the specific product or service based on the created database.