JP2004227343A - Opinion analyzing method, opinion analyzing device and opinion analyzing program

Info

Publication number: JP2004227343A
Application number: JP2003015277A
Authority: JP
Inventors: Takashi Yanase (柳瀬 隆史); Akira Ochitani (落谷 亮)
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2003-01-23
Filing date: 2003-01-23
Publication date: 2004-08-12
Anticipated expiration: 2023-01-23
Also published as: JP4269698B2

Abstract

PROBLEM TO BE SOLVED: To present to a user an opinion analysis result that focuses on more accurately extracted points at issue.

SOLUTION: An opinion analysis program causes a computer to input free-description sentences given as questionnaire answers; to extract an opinion description part and a grounds description part from each sentence based on analysis rules stored in an analysis rule database; to store these parts in a storage device, together with the input free-description sentence, as the analysis result; to extract, based on the analysis result, topics described in many free-description sentences as points at issue; to create an opinion chart showing the distribution of the opinion description parts for each point at issue; and to output the created opinion chart to an opinion tallier client. Because only the descriptions of opinions or requests are extracted from the free-description answers, the opinion analysis result presented to the user focuses on more accurate points at issue.

COPYRIGHT: (C)2004,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、アンケート回答分析に関し、特に自由記述回答を含む複数のアンケート回答文から、回答者の持つ意見や要望などの全体的な傾向の分析結果を提示する技術に関する。
【０００２】
【従来の技術】
インターネットを利用した電子メールや、Ｗｅｂページなどを介して収集される自然言語による自由記述を含むアンケート回答文を分析して、回答者の持つ意見や要望の全体的な特徴や傾向を得るためには、従来は人手によってアンケートの自由記述回答文を分析し、その結果を提示するようにしたものが一般的であった。ところが、膨大なアンケートの自由記述回答文を人手によって解析することは多大な労力やコストを必要とするため、例えば、アンケートの自由記述回答文に対して、テキスト分類エンジンを利用して多数派の意見をルール形式で自動で提示するという技術（例えば、特許文献１参照。）や、アンケートの自由記述回答文を解析してキーワードおよびその係り受け関係を抽出することによって、各回答文がある評価対象について肯定的評価を与えているか否定的評価を与えているかを判断し、その結果を評価対象ごとに集計してグラフを作成して提示する技術（例えば、特許文献２参照。）など、コンピュータを利用したアンケートの自由記述回答文の分析作業の自動化に関する技術が提案されている。
【０００３】
また、関連する技術として、文書を分類する際に得られる特徴表現の間の関係やクラスタ間の関係を活用させて、効率的な分類結果の提示や操作手段を提供する技術について開示されている（例えば、特許文献３参照）。
【０００４】
【特許文献１】
特開２００１−２６６０６０号公報（第２−３頁）
【０００５】
【特許文献２】
特開２００２−１４０４６５号公報（第２−３頁）
【０００６】
【特許文献３】
特開２０００−２５９６５８号公報（第２−３頁）
【０００７】
【発明が解決しようとする課題】
しかしながら、前述のテキスト分類による方法では、テキスト分類はキーワードベースで行われるため、自由記述回答が比較的長い文となると、キーワードとして抽出される語が多くなってしまい、高い精度での分類を行うことは困難であった。
【０００８】
一方、回答文ごとに肯定意見か否定意見かを判断する方法では、与えられたアンケート回答文が全体として肯定意見か否定意見かを判定することしかできない。例えば道路行政に関する要望を求めるアンケートにおいて「○○付近の渋滞が激しいので、バイパス道路を設置するべきだ思う。」という回答文が与えられた場合、「バイパス道路の設置」に肯定的という情報は得られても、この意見はどの程度肯定的なのか、あるいはこの意見が具体的な根拠に基づいているのか、などの情報までを得るのは不可能であった。
【０００９】
また、収集された回答文から、例えば「少数派でも具体的な根拠に基づく強硬な意見」のような条件に合った回答文だけを取り出したい場合がある。しかしながら、テキスト分類による方法においては多数派の意見は容易に取り出すことはできるが、少数派の意見を取り出すことは困難である。一方、回答文ごとに肯定意見か否定意見かを判断する方法においても、このような条件に合うものを取り出すことは困難であるため、全ての回答文を実際に読んで人手で選別せざるを得ず、多大な労力やコストが必要であった。
【００１０】
本発明は、上記のような事情に鑑みて提案されたものであり、テキスト分類による論点抽出処理前に各アンケート回答文を文書解析して「意見や要望の記述」を抽出して、それらの記述だけを論点抽出処理に用いることによって、利用者に対してより高精度な論点に着目した意見分析結果の提示を可能とすることを第１の課題としている。
【００１１】
本発明の第２の課題は、個々のアンケート回答文に対して、ある論点に対する関連度および根拠の具体性などに基づく意見の強さを数値化することにより、単に肯定か否定かを判断するだけでなく、その度合いを与えた意見分析結果の提示を可能とすることである。
本発明の第３の課題は、収集されたアンケート回答文の中から、例えば「少数派でも強硬な意見」のように従来の技術では取り出しにくい条件に合致した回答文を少ないコストで機械的に取り出し、利用者への提示を可能とすることである。
【００１２】
【課題を解決するための手段】
図１は、本発明の実施の形態１の全体構成図を示すものである。アンケート回答者クライアント５，６，７からインターネットなどのネットワーク４を介して、アンケートの回答として自由記述文が電子メール、またはＷｅｂフォームへの書き込みなどによって、意見分析装置１に送信されると、意見分析プログラム１０の自由記述文入力手段１１は送信された自由記述文を受信し、意見分析装置１に入力する。意見解析手段１２は、解析規則データベース２に格納されている解析規則に基づき、入力した自由記述文からアンケート回答者の意見として認識した意見記述部分を抽出し、解析結果として入力した自由記述文と共に記憶装置に記憶させ、論点抽出手段１３は、前記意見記述部分解から出現頻度の多い順に単語を論点として抽出し、意見チャート作成手段１４は、前記論点ごとに前記意見記述部分の分布図表を作成し、意見チャート出力手段１５は、前記分布図表を意見集計者クライアント８に出力することにより、アンケート自由記述回答文に対して、意見や要望の記述のみを抽出することが可能となるため、利用者に対してより高精度な論点に着目した分析結果の提示が可能となる。
【００１３】
また、意見分析プログラム１０に、論点抽出手段１３が抽出した論点に対する関連度を計算する論点関連度計算手段１６、前記意見記述部分における特定の語の出現頻度を数えることにより計算される意見記述の断定性を示す数値と、意見解析手段１２はさらに前記意見の根拠と認識した根拠記述部分を抽出し、前記根拠記述部分における特定の語の出現頻度を数えることにより計算される根拠記述の具体性を示す数値とを用いて前記自由記述文における意見の強さの度合いを示す意見強度を計算する意見強度計算手段１７を備えることにより、単なる肯定か否定かだけでなく、その度合いをいくつかの観点から評価することや、例えば「少数派でも強硬な意見」のような従来の技術では取り出しにくい条件に合う意見文を提示することも可能となる。
【００１４】
【発明の実施の形態】
図１は、本発明の実施の形態１の全体構成図を示すものである。本発明の意見分析装置１では、図示しないが通常と同じくＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ハードディスクドライブ（ＨＤＤ：ＨａｒｄＤｉｓｋＤｒｉｖｅ）、グラフィック処理装置、入力インタフェースなどがバスを介して接続された構成からなるコンピュータ上で意見分析プログラム１０が動くことによって各手段として機能する。
【００１５】
意見分析プログラム１０は、アンケート回答者クライアント５，６，７から送信された自由記述文を受信し、意見分析装置１に入力する自由記述文入力手段１１、解析規則データベース２に格納されている解析規則に基づき、入力した自由記述文から意見記述部分と根拠記述部分を抽出し、解析結果として入力した自由記述文と共に記憶装置に記憶させる意見解析手段１２、前記意見記述部分から出現頻度の多い順に単語を論点として抽出する論点抽出手段１３、前記論点ごとに前記意見記述部分の分布図表を作成する意見チャート作成手段１４、作成された分布図表を意見集計者クライアント８に出力する意見チャート出力手段１５、論点抽出手段１３が抽出した論点に対する関連度を計算する論点関連度計算手段１６、前記意見記述部分における特定の語の出現頻度を数えることにより計算される意見記述の断定性を示す数値と、前記根拠記述部分における特定の語の出現頻度を数えることにより計算される根拠記述の具体性を示す数値とを用いて前記自由記述文における意見の強さの度合いを示す意見強度を計算する意見強度計算手段１７から構成されている。
【００１６】
解析規則データベース２は、意見文解析装置１に入力されたアンケート自由記述回答文を解析するための規則が記憶されている。意見文データベース３は、個々のアンケート自由記述回答文に対する意見解析手段１２による解析結果、論点関連度計算手段１６により求められた論点と関連度、および意見強度計算手段１７により計算された意見強度が、意見文データとして記憶されている。
【００１７】
図２は、本発明に係る実施の形態１における意見文解析処理概要の流れを示すフローチャートである。アンケート回答者クライアントからインターネットなどのネットワークを介して、自由記述回答文が電子メール、またはＷｅｂフォームへの書き込みなどによって、意見分析装置に送信されると、意見分析プログラムは、アンケート回答者のアンケート回答を受信・受け付けを行い（Ｓ２０１）、解析規則データベース２に記憶させてある解析規則に基づいて解析・整理を行い、その結果を意見データベース３に記憶させる（Ｓ２０２）。本処理については、後述の図５に基づいて詳細に説明する。次に、他に処理が終了していない自由記述回答文があるかどうかを判定し、アンケート回答者からのアンケート回答すべてに対して行う（Ｓ２０３）。
【００１８】
図３、および図４は、入力されるアンケート自由記述回答文の例である。このアンケート回答文は「道路行政についてご意見があればお書き下さい」という質問に対して書かれたものである。ここで、回答文にはＩＤ番号が付与されているが、これはネットワークを通じて回答文が入力されてきた際に、各回答文に対して一意に付与されるものとする。
【００１９】
図５は自由記述回答文を解析して、意見文データとして意見文データベースに記憶させる意見文データの記憶処理の流れを示すフローチャートである。本処理が開始されると、まずＩＤ番号と自由記述回答文の全文が意見文データベース３に格納する（Ｓ５０１）。次に、自由記述回答文が複数の文で構成されている場合は、回答文中の句点、および疑問符を区切れ目にして、文単位に分割する（Ｓ５０２）。次に、分割された文の最初の１文を読み込み（Ｓ５０３）、解析規則データベース２（図６に解析規則データ例が示してある。）から適用順序に従って１つの解析規則を読み込んで（Ｓ５０４）照合を行う（Ｓ５０５）。
【００２０】
照合はパターンマッチングによって行れ、マッチすれば次の処理に進み、マッチしない場合は、適用順序に従って次の解析規則があるかどうかを判断し（Ｓ５０７）、他に規則があればそれを読み込み（Ｓ５０４）、再度照合処理（Ｓ５０５）が行われる。照合処理（Ｓ５０５）においてマッチした場合は、解析規則中の談話要素名とそれぞれ切り出された部分を意見データベース３に格納する（Ｓ５０６）。１文に対する処理が終了すれば、他に未処理の文があるかどうかを判定し（Ｓ５０８）、あれば次の文の読み込み処理に戻る（Ｓ５０３）。なければ処理を終了する。
【００２１】
図６は解析規則データベース２に記憶させてある解析規則データの例を示してある。個々のデータは適用順序と解析規則によって構成されており、適用順序は個々の解析規則を適用する順序を、解析規則は自由記述回答文との照合を行うためのパターンを示している。解析規則の表記のうち、例えば＜根拠記述＞は、任意の表現にマッチして、その箇所の談話要素を「根拠記述」とするという意味である。従って、例えば適用順序１の「＜根拠記述＞ので、＜賛成意見記述＞すべきだと思う。」という規則は、「〜ので、…すべきだと思う。」という表現の任意の文とマッチして、マッチした場合は「ので、」から前の部分を「根拠記述」の談話要素として切り出し、「ので、」から後ろかつ「すべきだと思う。」から前の部分を「賛成意見記述」の談話要素として切り出す、ということを示している。なお、解析規則データベース２に記憶させてある解析規則データは、必要に応じて変更・追加が可能である。
【００２２】
図７は、図３および図４に示した自由記述回答文が、図６に示した解析規則データに基づいた意見文解析処理によって、意見文データベース３に格納する意見文データ例が示してある。意見文データベース３中の各意見文データは「ＩＤ番号」「談話要素」「記述内容」の項目で構成されている。図３に示したＩＤが０００１の自由記述回答文は、適用順序１の解析規則にマッチし、「根拠記述」の談話要素に対応する記述内容として「○○市△△付近の渋滞が激しいので、」が、「賛成意見記述」の談話要素に対応する記述内容として「バイパス道路を絶対に整備すべきだと思う。」がそれぞれ切り出されて意見文データベース３に格納される。
【００２３】
また、図５に示したＩＤが０００２の自由記述回答文は、２文で構成されているが、１文目が適用順序２の解析規則にマッチし、「根拠記述」の談話要素に対応する記述内容として「△△付近の商店街離れが深刻な現状です。」が切り出されている。また、２文目は適用順序３の解析規則にマッチし、「反対意見記述」の談話要素に対応する記述内容として「さらなるバイパス道路は必要ないと思います。」が切り出されて意見文データベース３に格納される。
【００２４】
図８は、論点抽出・関連度計算処理の流れを示すフローチャートである。本発明の意見分析装置において分析の対象となる全てのアンケート回答文に対して、意見文解析処理が完了し、意見文データベース３に意見文データとして格納されている状態にあるときに、例えば意見集計者の指示により本処理が開始されるものとする。
【００２５】
本処理が開始されると、まず意見文データベース３から、各意見文データのうち談話要素が「賛成意見記述」または「反対意見記述」に対応する記述内容を取り出す（Ｓ８０１）。次に、取り出された記述内容を用いて論点抽出および関連度計算処理を行う（Ｓ８０２）。この処理は、従来の文書クラスタリングに関する技術を用いることにより実現が可能である。例えば、特開２０００−２５９６５８号公報には、文書クラスタリングの結果を効率よく提示するための技術が記載されている。該公報に記載されている技術を用いることにより、各クラスタに対しては、それらのクラスタを特徴づける表現を付与することができる。さらに、各文書に対しては、属するクラスタの持つ特徴に対する類似度を与えることができる。
【００２６】
本実施例では、１つの意見文データに対して先のステップにより取り出された記述内容を１つの文書データとして、該公報に記載されている技術を用いてクラスタリング処理を行う。そして各クラスタを特徴づける表現を論点、各文書データの属するクラスタの持つ特徴に対する類似度を論点関連度として求め、その結果を意見文データベース３に書き込む（Ｓ８０３）。
【００２７】
図９は、意見文データベース３から前述のステップで取り出された記述内容を用いて文書クラスタリング処理を行った結果の例が示してある。本図では、抽出されたクラスタのうち２つが示されており、各クラスタを特徴づける表現として「バイパス道路、整備」と「駐車場、増やす」が示してある。これらのクラスタを特徴づける表現が論点となる。さらに、各クラスタの中には複数の意見文データがあり、それぞれの意見文データにはクラスタの特徴に対する類似度が与えられている。この類似度が、各意見文データの論点関連度になる。
【００２８】
図１０は、論点抽出・関連度計算処理が終了した時点での意見文データベース３の記憶内容の例を示してある。本図では、ＩＤ番号０００１の意見文データの論点が「バイパス道路、整備」、論点関連度が０．９、ＩＤ番号０００２の意見文データの論点が「バイパス道路、整備」、論点関連度が０．８と求められた場合の例を示してある。各意見文データの論点および論点関連度は、談話要素が「全文」となっている行に書き込むものとする。
【００２９】
図１１は、意見強度計算処理の流れを示すフローチャートである。本発明の意見分析装置において分析の対象となる全てのアンケート回答文に対して、意見文解析処理が完了し、意見文データベース３に意見文データとして格納されている状態にあるときに、例えば意見集計者の指示により本処理が開始されるものとする。
【００３０】
本処理が開始されると、まず図１２で示すような意見強度計算ダイアログが表示し、意見強度の計算条件に関する意見集計者の指示を受け付ける（Ｓ１１０１）。次に、意見文データベース３から、各意見文データにおいて談話要素が「根拠記述」に対応する記述内容を取り出す（Ｓ１１０２）。次に、意見文データごとに、取り出された記述内容を形態素解析し、形態素解析処理の結果品詞が「数値」あるいは「固有名詞」に判断された単語の出現回数を数える（Ｓ１１０３）。実際には、意見強度計算ダイアログで根拠記述具体性として考慮する語としてチェックされている品詞についてのみ出現頻度を数えるものとする。
【００３１】
次に、意見文データベース３から、各意見文データにおいて談話要素が「賛成意見記述」あるいは「反対意見記述」に対応する記述内容を取り出す（Ｓ１１０４）。次に、意見文データごとに、取り出された記述内容から、意見記述断定性として考慮する語の出現回数を数える（Ｓ１１０５）。出現回数を数える語は、意見強度ダイアログで「システムによる定義語」がチェックされている場合は予め登録されている語であるが、「語を指定」がチェックされている場合は、後続のテキストボックスに入力されている語も加えるものとする。
【００３２】
次に、前述のステップで数えられた各品詞や語の出現頻度に基づき、与えられた式に従って意見強度を計算し（Ｓ１１０６）、計算結果を意見文データベース３に書き込む（Ｓ１１０７）。なお、意見強度の計算式は、意見強度計算ダイアログに設定された根拠記述具体性のウエイトをａ、意見記述断定性のウエイトをｂとすると例えば以下のように表すことができる。
【００３３】
【数１】

【００３４】
意見強度の具体的な計算方法について、図１０に示した意見文データを例に説明する。なお、計算条件の設定は図１２に示したダイアログの通りとする。ＩＤ番号が０００１の意見文データについては、談話要素が「根拠記述」に対応する記述内容には、根拠記述具体性として考慮する語として固有名詞が２回（「○○市」と「△△」、ともに地名とする）出現する。従って根拠記述具体性は２となる。
【００３５】
意見記述断定性については、図１３で示すようにシステムによる定義語として予め「べきだ」という語が登録されていたとすると、談話要素が「賛成意見記述」に対応する記述内容には１回出現する。さらに、意見分析者による指定語として「絶対」という語が入力されており、これも１回出現している。さらに重みが１．５と指定されているので、意見記述断定性は（２）式より１＋１×１．５＝２．５となる。また、根拠記述具体性と意見記述断定性のウエイトが図１２で示す意見強度計算ダイアログで２：３と設定されているため、意見強度は（３）式に、ａに２を、ｂに３を、根拠記述具体性に２を、意見記述断定性に２．５を代入し、２．３と計算される。
【００３６】
また、ＩＤ番号が０００２の意見文データについては、談話要素が「根拠記述」に対応する記述内容には、根拠記述具体性として考慮する語として固有名詞が１回（「△△」）出現する。従って根拠記述具体性は１となる。意見記述断定性については、システムによる定義語の中に、「反対意見記述」に対応する記述内容中に出現するものがないとすると、意見記述断定性は０となる。従って、意見強度は（３）式に、ａに２を、ｂに３を、根拠記述具体性に１を、意見記述断定性に０を代入し、０．４と計算される。
【００３７】
図１２は、意見強度計算ダイアログの例が示してある。本図において、意見強度計算条件は、根拠記述具体性の計算条件設定、意見記述断定性の計算条件設定、および根拠記述具体性と意見記述断定性のウエイト設定の３つで構成されている。根拠記述具体性の計算条件設定では、根拠記述具体性の度合いを計算するために考慮する語の品詞を選択する。本図のダイアログでは数値と固有名詞が選択できるようになっており、両方を選択することも可能である。
【００３８】
意見記述断定性の計算条件設定では、意見記述断定性の度合いを計算するために考慮する語を選択する。本図のダイアログでは、あらかじめ用意された語を使用する（図１３で示すような「システムによる定義語」をチェック）か、意見集計者が指定するか（「語を指定」をチェック）の何れか、または両方を選択することができる。意見集計者による指定を選択した場合は、考慮する語をテキストボックスに入力する。この場合、コンマで区切ることにより、複数の語を指定することも可能である。また、その下のテキストボックスに、指定した語に対する考慮の度合いを重みとして入力することができる。重みは、システムによる定義語を１としたときの値として任意の数値を入力した後、「意見強度計算開始」と書かれたボタンを押すことにより意見分析装置に入力される。
【００３９】
図１４は、計算結果が意見文データベースに書き込まれた状態の例が示してある。本図において、各意見文データに対して、談話要素が「全文」の行には意見強度、「根拠記述」の行には根拠記述具体性、「賛成意見記述」あるいは「反対意見記述」の行には意見記述断定性の数値がそれぞれ書き込まれている。
図１５は、意見チャート作成処理の流れを示すフローチャートである。本処理が開始されると、まず意見文データベース３から各意見文データに対する論点の項目を読み込み、論点ごとの意見文データの頻度を数える（Ｓ１５０１）。次に、図１６で示すような意見チャート作成ダイアログを表示して、意見集計者から意見チャートを作成する論点の指定を受ける（Ｓ１５０２）。
【００４０】
次に、指定された論点のうちの１つについてグラフを作成するために、グラフの横軸に意見記述強度、縦軸に論点関連度をとる（Ｓ１５０３）。次に、意見文データベース３から意見文データを１つ読み込み（Ｓ１５０４）、読み込んだ意見文データの論点が現在グラフを作成中のものかどうかを判定し（Ｓ１５０５）、そうであれば意見記述度と論点関連度を読み込み、グラフ中の適切な位置にプロットする（Ｓ１５０６）。
【００４１】
プロットが終了すると、意見文データベース３中の全ての意見文データについて処理が終了したかどうかを判定し（Ｓ１５０７）、終了していなければ次の意見文データの読み込み処理に戻る（Ｓ１５０４）。終了していれば、他にグラフを作成する論点があるかどうかを判定し（Ｓ１５０８）、他に論点があれば次のグラフの作成処理に移る（Ｓ１５０３）。なければ処理を終了し、作成した意見チャートとしてのグラフデータを、意見集計者クライアントに送信する。
【００４２】
図１６は意見チャート作成ダイアログの例が示してある。この例では、前出のステップにおいて数えられた論点別の頻度が「バイパス道路、整備」が２２、「駐車場、増やす」が１５、「高速道路、無料」が８であった場合を示している。
各論点の前にはチェックボックスが設けてあり、ここをクリックすることにより意見チャートを作成する論点を指定することが出来る。複数の論点を指定することも可能である。
【００４３】
図１７は意見データがプロットされた意見チャートの例が示してある。このように賛成意見か反対意見かでプロットする点の種類を区別することも可能である。この場合、意見文データの談話要素の項目にあるのが「賛成意見記述」であるか「反対意見記述」であるかにより判断される。この例では、賛成意見が「●」、反対意見が「×」でプロットされている。また、図１８に示してあるように、プロットされた点の横に各意見の根拠を示すキーワードを付与することも可能である。このキーワードは例えば、意見文データにおいて根拠記述に対応する記述内容を形態素解析時に最初に出現するものを取り出す、という方法で付与することが可能である。
【００４４】
図１９は、図１８の意見チャートに２本の目安線を加えたものである。目安線うちの１本は、例えば論点関連度が最大のものから数えて上位半分になるところで、意見強度の軸に平行になるように引く。もう１本は、例えば意見強度が最大のものから数えて上位半分になるところで、論点関連度の軸に平行になるように引く。このように意見チャートを構成することにより、例えばチャートの右上の領域は「比較的多数派の意見」の集合、左上の領域は「多数派ではあるが単なる希望や要望」の集合、右下の領域は「ユニークな意見ではあるが強硬な意見」の集合のように、プロットされた点が存在する領域で個々の意見の特徴がつかめるようになる。
【００４５】
また、プロットされた点と意見文データベース中の個々の意見文データとのリンクを張ることによって、例えば図２０に示すようにプロットされた点をディスプレイ上で指示（クリック）することにより対応する意見文データの談話要素が「全文」に相当する記述内容を表示することができる。このようにすることによって、例えば「『バイパスの整備』に関する少数派ではあるが強硬な意見」という意見文データを容易に取り出して見ることができるようになる。
【００４６】
図２１は、意見チャート作成ダイアログにおいて、複数の論点が指定された場合において、意見チャートを作成した例が示してある。これは、１つの論点について作成されるグラフを、円グラフのようにして組み合わせたものである。各論点におけるグラフにおいて、論点関連度の軸と意見強度の軸のなす角は、例えば次式によって計算することが出来る。
【００４７】
【数２】

【００４８】
このように意見チャートを作成することにより、複数の論点に関する意見の分布を効率よく見ることが可能であると同時に、座標軸のなす角が論点の属する意見文データの数に比例するため、どの論点に関する意見が多いかということについても容易に把握することが可能になる。
（付記１）自由記述文を所定の解析規則に基づいて、意見と認識した意見記述部分抽出するステップと、
前記意見記述部分から出現頻度の多い順に単語を抽出するステップとを有することを特徴とする意見分析方法。
【００４９】
（付記２）自由記述文を所定の解析規則に基づいて、意見と認識した意見記述部分抽出する意見解析手段と、
前記意見記述部分から出現頻度の多い順に単語を抽出する論点抽出手段とを有することを特徴とする意見分析装置。
（付記３）自由記述文を所定の解析規則に基づいて、意見と認識した意見記述部分抽出する意見解析手段と、
前記意見記述部分から出現頻度の多い順に単語を抽出する論点抽出手段としてコンピュータを機能させる意見分析プログラム。
【００５０】
（付記４）前記論点抽出手段が抽出した単語に対する関連度を計算する論点関連度計算手段をさらに備えることを特徴とする請求項３記載の意見分析プログラム。
（付記５）前記意見記述部分における特定の語の出現頻度を数えることにより計算される意見記述の断定性を示す数値と、前記意見解析手段はさらに前記意見の根拠と認識した根拠記述部分を抽出し、前記根拠記述部分における特定の語の出現頻度を数えることにより計算される根拠記述の具体性を示す数値とを用いて前記自由記述文における意見の強さの度合いを示す意見強度を計算する意見強度計算手段をさらに備えることを特徴とする請求項３、および請求項４記載の意見分析プログラム。
【００５１】
（付記６）前記意見チャート作成手段は、前記自由記述文に対して前記論点関連度計算手段により計算された論点関連度と、前記意見強度計算手段により計算された意見強度とを２次元の座標系上にプロットすることにより意見の分布を示す図を作成することを特徴とする付記３記載の意見分析プログラム。
（付記７）前記意見チャート作成手段は、前記２次元の座標系にプロットされた点の存在する領域により、意見の特徴が判別できるように前記２次元の座標系を複数の領域に分割することを特徴とする付記３記載の意見分析プログラム。
【００５２】
（付記８）前記意見チャート作成手段は、各自由記述文に対して前記抽出された論点以外に記述内容を代表する語を抽出して前記作成された意見の分布を示す図追加することを特徴とする付記３記載の意見分析プログラム。
【００５３】
【発明の効果】
以上のように、本発明によれば、アンケート自由記述回答文に対して文書解析により意見や要望の記述だけを取り出すことができるので、それらの記述だけを文書クラスタリング処理に用いることによって、利用者に対してより高精度な論点に着目した意見分析結果の提示が可能となる。
【００５４】
また、個々の意見文データに対して論点関連度および根拠の具体性などに基づく意見強度を計算することによって、単なる肯定か否定かだけでなく、その度合いをいくつかの観点から評価することが可能になる。
さらに、論点関連度と意見強度をグラフにプロットして意見チャートとして提示することにより、例えば「少数派でも強硬な意見」のような従来の技術では取り出しにくい条件に合う意見文を取り出すことも、意見チャート上の点を指示するという簡単な操作で行うことが可能になる。
【図面の簡単な説明】
【図１】実施の形態１の全体構成図
【図２】本発明に係る意見文解析処理概要の流れを示すフローチャート
【図３】アンケート自由記述回答文の例１
【図４】アンケート自由記述回答文の例２
【図５】本発明に係る意見文データの記憶処理の流れを示すフローチャート
【図６】解析規則データベースに格納されている解析規則データの例
【図７】意見文データベースに格納されている意見文データの例
【図８】本発明に係る論点抽出・関連度計算処理の流れを示すフローチャート
【図９】クラスタリング結果の例
【図１０】意見文データベースに論点と論点関連度を書き込んだ例
【図１１】本発明に係る意見強度計算処理の流れを示すフローチャート
【図１２】意見強度計算ダイアログの例
【図１３】意見記述断定性に用いるシステムによる定義語の例
【図１４】意見文データベースに意見強度を書き込んだ例
【図１５】本発明に係る意見チャート作成処理の流れを示すフローチャート
【図１６】意見チャート作成ダイアログの例
【図１７】意見チャートの例１
【図１８】意見チャートの例２
【図１９】意見チャートの例３
【図２０】意見チャートの例４
【図２１】意見チャートの例５
【符号の説明】
１意見分析装置
２解析規則データベース
３意見文データベース
４通信ネットワーク
５アンケート回答者クライアント
６アンケート回答者クライアント
７アンケート回答者クライアント
８意見集計者クライアント
１０意見分析プログラム
１１自由記述文入力手段
１２意見解析手段
１３論点抽出手段
１４意見チャート作成手段
１５意見チャート出力手段
１６論点関連度計算手段
１７意見強度計算手段

[0001]
[Technical field of the invention]
The present invention relates to the analysis of questionnaire responses, and more particularly to a technique for presenting an analysis of the overall tendencies in respondents' opinions and requests, drawn from a plurality of questionnaire answer sentences that include free-description answers.
[0002]
[Prior art]
To analyze questionnaire answer sentences that contain natural-language free descriptions, collected via e-mail over the Internet, Web pages, and the like, and to obtain the overall characteristics and tendencies of the respondents' opinions and requests, it has conventionally been common to analyze the free-description answers by hand and present the result. However, manually analyzing a huge number of free-description answers requires a great deal of labor and cost. Techniques for automating this analysis with a computer have therefore been proposed. One technique applies a text classification engine to the free-description answers and automatically presents the majority opinions in rule form (see, for example, Patent Document 1). Another analyzes the free-description answers to extract keywords and their dependency relations, determines whether each answer gives a positive or a negative evaluation of a given evaluation target, tallies the results for each evaluation target, and creates and presents a graph (see, for example, Patent Document 2).
[0003]
As a related technique, a technique has been disclosed that exploits the relationships between feature expressions and between clusters obtained when documents are classified, in order to present classification results efficiently and to provide means for operating on them (see, for example, Patent Document 3).
[0004]
[Patent Document 1]
JP 2001-266060 A (pages 2-3)
[0005]
[Patent Document 2]
JP 2002-140465 A (pages 2-3)
[0006]
[Patent Document 3]
JP 2000-259658 A (pages 2-3)
[0007]
[Problems to be solved by the invention]
However, in the text classification method described above, classification is keyword-based. When a free-description answer is a relatively long sentence, many words are extracted as keywords, and accurate classification is difficult.
[0008]
The method of judging whether each answer sentence is a positive or a negative opinion, on the other hand, can only determine whether a given questionnaire answer sentence as a whole is positive or negative. For example, in a questionnaire asking for requests concerning road administration, given the answer "Congestion around ○○ is severe, so I think a bypass road should be built.", the method can tell that the respondent is positive about "building a bypass road". It cannot tell how strongly positive the opinion is, or whether the opinion rests on concrete grounds.
[0009]
There are also cases where one wants to pick out, from the collected answer sentences, only those that meet a condition such as "an opinion that is in the minority but is strongly held and based on concrete grounds". With the text classification method, majority opinions are easy to extract, but minority opinions are difficult to extract. The method of judging each answer sentence as positive or negative likewise has difficulty extracting answers that meet such a condition. As a result, all the answer sentences have to be read and sorted by hand, which requires a great deal of labor and cost.
[0010]
The present invention has been proposed in view of the above circumstances. Its first object is to enable an opinion analysis result that focuses on more accurate issues to be presented to the user. To this end, each questionnaire answer sentence is analyzed before the issue extraction process based on text classification, the "descriptions of opinions and requests" are extracted, and only those descriptions are used in the issue extraction process.
[0011]
A second object of the present invention is to enable presentation of an opinion analysis result that not only judges whether each questionnaire answer sentence is positive or negative but also indicates to what degree, by quantifying the strength of the opinion based on, for example, its relevance to a given issue and the specificity of its grounds.
A third object of the present invention is to enable answer sentences that meet conditions difficult to retrieve with conventional techniques, such as "a minority but strongly held opinion", to be retrieved mechanically at low cost from the collected questionnaire answer sentences and presented to the user.
[0012]
[Means for Solving the Problems]
FIG. 1 shows the overall configuration of Embodiment 1 of the present invention. When a free-description sentence is sent as a questionnaire answer from the questionnaire respondent clients 5, 6, and 7 to the opinion analysis device 1 via a network 4 such as the Internet, by e-mail or by writing into a Web form, the free-description sentence input means 11 of the opinion analysis program 10 receives the sentence and inputs it to the opinion analysis device 1. Based on the analysis rules stored in the analysis rule database 2, the opinion analysis means 12 extracts from the input sentence the opinion description part recognized as the respondent's opinion, and stores it in a storage device, together with the input sentence, as the analysis result. The issue extraction means 13 extracts words from the opinion description parts as issues in descending order of appearance frequency. The opinion chart creation means 14 creates a distribution chart of the opinion description parts for each issue, and the opinion chart output means 15 outputs the distribution chart to the opinion tallier client 8. Because only the descriptions of opinions and requests are extracted from the free-description answers, an analysis result that focuses on more accurate issues can be presented to the user.
[0013]
The opinion analysis program 10 further includes issue relevance calculation means 16, which calculates the relevance to the issues extracted by the issue extraction means 13, and opinion strength calculation means 17. The opinion analysis means 12 additionally extracts a grounds description part recognized as the grounds for the opinion. The opinion strength calculation means 17 calculates an opinion strength, indicating how strong the opinion in the free-description sentence is, from two values: a numerical value for the assertiveness of the opinion description, calculated by counting occurrences of specific words in the opinion description part, and a numerical value for the specificity of the grounds description, calculated by counting occurrences of specific words in the grounds description part. This makes it possible not merely to judge positive or negative but to evaluate the degree from several viewpoints, and to present opinion sentences that meet conditions difficult to retrieve with conventional techniques, such as "a minority but strongly held opinion".
[0014]
[Embodiments of the invention]
FIG. 1 shows the overall configuration of Embodiment 1 of the present invention. In the opinion analysis device 1 of the present invention, the opinion analysis program 10 runs on a computer of ordinary configuration (not shown), in which a CPU (Central Processing Unit), RAM (Random Access Memory), a hard disk drive (HDD), a graphics processing device, an input interface, and the like are connected via a bus. Running on this computer, the program functions as each of the means described below.
[0015]
The opinion analysis program 10 comprises: free-description sentence input means 11, which receives the free-description sentences sent from the questionnaire respondent clients 5, 6, and 7 and inputs them to the opinion analysis device 1; opinion analysis means 12, which extracts an opinion description part and a grounds description part from each input sentence based on the analysis rules stored in the analysis rule database 2 and stores them in a storage device, together with the input sentence, as the analysis result; issue extraction means 13, which extracts words from the opinion description parts as issues in descending order of appearance frequency; opinion chart creation means 14, which creates a distribution chart of the opinion description parts for each issue; opinion chart output means 15, which outputs the created distribution chart to the opinion tallier client 8; issue relevance calculation means 16, which calculates the relevance to the issues extracted by the issue extraction means 13; and opinion strength calculation means 17, which calculates an opinion strength indicating how strong the opinion in the free-description sentence is, using a numerical value for the assertiveness of the opinion description, calculated by counting occurrences of specific words in the opinion description part, and a numerical value for the specificity of the grounds description, calculated by counting occurrences of specific words in the grounds description part.
[0016]
The analysis rule database 2 stores the rules for analyzing the free-description questionnaire answers input to the opinion analysis device 1. The opinion sentence database 3 stores, as opinion sentence data, the result of analysis by the opinion analysis means 12 for each free-description answer, the issue and relevance obtained by the issue relevance calculation means 16, and the opinion strength calculated by the opinion strength calculation means 17.
[0017]
FIG. 2 is a flowchart outlining the opinion sentence analysis process in Embodiment 1 of the present invention. When a free-description answer is sent from a questionnaire respondent client to the opinion analysis device via a network such as the Internet, by e-mail or by writing into a Web form, the opinion analysis program receives and accepts the respondent's answer (S201), analyzes and organizes it based on the analysis rules stored in the analysis rule database 2, and stores the result in the opinion sentence database 3 (S202). This process is described in detail later with reference to FIG. 5. The program then determines whether any free-description answers remain unprocessed, and repeats the process until all answers from the respondents have been handled (S203).
[0018]
FIGS. 3 and 4 show examples of input free-description questionnaire answers. These answers were written in response to the question "Please write any opinions you have about road administration." Each answer carries an ID number, which is assigned uniquely to the answer when it is input through the network.
[0019]
FIG. 5 is a flowchart of the storage process that analyzes a free-description answer and stores it in the opinion sentence database as opinion sentence data. When the process starts, the ID number and the full text of the free-description answer are first stored in the opinion sentence database 3 (S501). Next, if the answer consists of more than one sentence, it is split into sentences at the full stops and question marks in the answer (S502). Then the first of the resulting sentences is read (S503), and one analysis rule is read from the analysis rule database 2 (FIG. 6 shows example analysis rule data) according to the application order (S504) and matched against the sentence (S505).
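The sentence-splitting step (S502) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_sentences` is hypothetical, and treating the ASCII `?` as a delimiter alongside the Japanese full stop `。` and full-width `？` is an assumption.

```python
import re

def split_sentences(answer_text):
    """Split an answer into sentences at 。, ？ or ?.

    Each delimiter stays attached to the sentence it closes, so a
    sentence-final pattern such as "...と思う。" can still be matched later.
    """
    # One or more non-delimiter characters, optionally followed by a delimiter.
    pieces = re.findall(r"[^。？?]+[。？?]?", answer_text)
    return [p.strip() for p in pieces if p.strip()]
```

A trailing fragment without a closing delimiter is kept as its own sentence rather than dropped.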
[0020]
Matching is done by pattern matching. If the rule matches, processing moves on to the next step. If it does not match, the program determines whether a next analysis rule exists in the application order (S507); if so, it reads that rule (S504) and performs the matching again (S505). When a rule matches in the matching step (S505), the discourse element names in the analysis rule and the parts cut out for each of them are stored in the opinion sentence database 3 (S506). When processing of one sentence is complete, the program determines whether any unprocessed sentences remain (S508). If so, it returns to reading the next sentence (S503); if not, the process ends.
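The control flow of steps S503 to S508 might look like the sketch below. It assumes the answer has already been split into sentences, that each rule is a pair of an application order and a hypothetical `matcher` callable returning a dictionary of discourse elements (or `None` when the rule does not match), and that a plain list of tuples stands in for the opinion sentence database.

```python
def match_sentences(id_number, sentences, rules, database):
    """Apply rules to each sentence in application order; first match wins.

    rules:    list of (application_order, matcher) pairs, where
              matcher(sentence) returns {discourse_element: content} or None.
    database: list standing in for the opinion sentence database.
    """
    for sentence in sentences:                                      # S503 / S508
        for _, matcher in sorted(rules, key=lambda rule: rule[0]):  # S504 / S507
            elements = matcher(sentence)                            # S505
            if elements is not None:
                for name, content in elements.items():              # S506
                    database.append((id_number, name, content))
                break  # stop at the first matching rule for this sentence
    return database
```

A sentence that matches no rule contributes no rows, which corresponds to falling through S507 with no rules left.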
[0021]
FIG. 6 shows an example of the analysis rule data stored in the analysis rule database 2. Each entry consists of an application order and an analysis rule. The application order gives the order in which the rules are applied, and the analysis rule gives a pattern to be matched against free-description answers. In the rule notation, <grounds description>, for example, means that the slot matches any expression and that the discourse element of the matched span is "grounds description". Thus the rule with application order 1, "<grounds description>ので、<supporting opinion description>すべきだと思う。" (roughly, "Because <grounds description>, I think <supporting opinion description> should be done."), matches any sentence of the form "〜ので、…すべきだと思う。". When it matches, the part before "ので、" is cut out as the "grounds description" discourse element, and the part after "ので、" and before "すべきだと思う。" is cut out as the "supporting opinion description" discourse element. The analysis rule data stored in the analysis rule database 2 can be modified or extended as needed.
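One way to realize such a rule is to compile the slot notation into a regular expression. The sketch below is an illustration under stated assumptions: slots are written with ASCII angle brackets, the slot names are kept in Japanese as in the original rule, each slot must match at least one character, and the helper names `compile_rule` and `apply_rule` are hypothetical.

```python
import re

def compile_rule(rule):
    """Compile a rule such as '<A>ので、<B>すべきだと思う。' into (regex, slot names)."""
    # re.split with a capture group alternates literal text and slot names:
    # ['', 'A', 'ので、', 'B', 'すべきだと思う。']
    parts = re.split(r"<([^<>]+)>", rule)
    names = parts[1::2]
    pattern = "".join(
        "(.+?)" if index % 2 else re.escape(part)
        for index, part in enumerate(parts)
    )
    return re.compile(pattern), names

def apply_rule(rule, sentence):
    """Return {discourse element: cut-out text}, or None if the rule does not match."""
    regex, names = compile_rule(rule)
    match = regex.fullmatch(sentence)
    return dict(zip(names, match.groups())) if match else None

RULE_1 = "<根拠記述>ので、<賛成意見記述>すべきだと思う。"
```

Using `fullmatch` means the literal ending of the rule has to close the sentence, which mirrors the way the rule is described as matching whole sentences of a given form.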
[0022]
FIG. 7 shows an example of the opinion sentence data stored in the opinion sentence database 3 when the free-description answers of FIGS. 3 and 4 are processed by the opinion sentence analysis based on the analysis rule data of FIG. 6. Each opinion sentence data record in the opinion sentence database 3 consists of the items "ID number", "discourse element", and "description content". The free-description answer with ID 0001, shown in FIG. 3, matches the analysis rule with application order 1. "Because congestion around △△ in ○○ City is severe," is cut out as the description content for the "grounds description" discourse element, "I think a bypass road should absolutely be developed." is cut out as the description content for the "supporting opinion description" discourse element, and both are stored in the opinion sentence database 3.
[0023]
The free-description answer with ID 0002, shown in FIG. 4, consists of two sentences. The first sentence matches the analysis rule with application order 2, and "The drift of shoppers away from the shopping street around △△ is currently serious." is cut out as the description content for the "grounds description" discourse element. The second sentence matches the analysis rule with application order 3, and "I do not think any further bypass road is necessary." is cut out as the description content for the "opposing opinion description" discourse element and stored in the opinion sentence database 3.
[0024]
FIG. 8 is a flowchart of the issue extraction and relevance calculation process. This process is assumed to start, for example on the instruction of the opinion tallier, once the opinion sentence analysis process has been completed for all questionnaire answers to be analyzed by the opinion analysis device of the present invention and the results are stored in the opinion sentence database 3 as opinion sentence data.
[0025]
When the process starts, the description contents whose discourse element is "supporting opinion description" or "opposing opinion description" are first extracted from each opinion sentence data record in the opinion sentence database 3 (S801). Next, issue extraction and relevance calculation are performed using the extracted description contents (S802). This step can be implemented with existing document clustering techniques. For example, Japanese Patent Application Laid-Open No. 2000-259658 describes a technique for efficiently presenting the results of document clustering. With the technique described in that publication, each cluster can be given expressions that characterize it, and each document can be given a similarity to the features of the cluster to which it belongs.
[0026]
In this embodiment, the description content extracted in the preceding step for one opinion sentence data record is treated as one document, and clustering is performed using the technique described in that publication. The expressions characterizing each cluster are taken as the issue, and the similarity of each document to the features of its cluster is taken as the issue relevance. The results are written to the opinion sentence database 3 (S803).
[0027]
FIG. 9 shows an example of the result of document clustering performed on the description contents extracted from the opinion sentence database 3 in the step described above. The figure shows two of the extracted clusters, whose characterizing expressions are "bypass road, development" and "parking lot, increase". These characterizing expressions are the issues. Each cluster contains multiple opinion sentence data records, and each record is given a similarity to the features of its cluster. This similarity is the issue relevance of the record.
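The text delegates clustering to the technique of JP 2000-259658, which is not described here. Purely as a stand-in, the sketch below assumes the documents of one cluster are already known and already tokenized, takes the most frequent terms of the cluster as its characterizing expression, and uses cosine similarity between a document's term counts and the cluster's summed term counts as the issue relevance. The function names and the choice of cosine similarity are assumptions, not the method of the cited publication.

```python
from collections import Counter
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two term-count mappings."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = sqrt(sum(count * count for count in u.values()))
    norm_v = sqrt(sum(count * count for count in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def describe_cluster(docs, top_k=2):
    """docs: {id_number: list of tokens}, all assumed to belong to one cluster.

    Returns (issue terms, {id_number: issue relevance}).
    """
    centroid = Counter()
    for tokens in docs.values():
        centroid.update(tokens)
    issue = [term for term, _ in centroid.most_common(top_k)]
    relevance = {
        id_number: cosine(Counter(tokens), centroid)
        for id_number, tokens in docs.items()
    }
    return issue, relevance
```

With this stand-in, a document made up only of the cluster's dominant terms scores close to 1, and a document sharing no terms with the cluster scores 0.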
[0028]
FIG. 10 shows an example of the contents of the opinion sentence database 3 when the issue extraction / relevance calculation process has finished. In this example, the opinion sentence data with ID number 0001 has the issue “bypass road, maintenance” and an issue relevance of 0.9, and the opinion sentence data with ID number 0002 has the issue “bypass road, maintenance” and an issue relevance of 0.8. The issue and the issue relevance of each opinion sentence data are written in the row whose discourse element is “full text”.
[0029]
FIG. 11 is a flowchart illustrating the flow of the opinion strength calculation process. In the opinion analysis apparatus of the present invention, this process is assumed to start, for example, on an instruction from the opinion tallier, after the opinion sentence analysis process has been completed for all questionnaire answer sentences to be analyzed and the results have been stored in the opinion sentence database 3 as opinion sentence data.
[0030]
When this process starts, an opinion strength calculation dialog such as the one shown in FIG. 12 is first displayed, and the opinion tallier's instructions on the conditions for calculating the opinion strength are received (S1101). Next, the description content of the discourse element corresponding to the “grounds description” in each opinion sentence data is extracted from the opinion sentence database 3 (S1102). The extracted description content is then morphologically analyzed for each opinion sentence data, and the number of occurrences of words whose part of speech is determined to be “numeral” or “proper noun” is counted (S1103). In practice, only the parts of speech checked in the opinion strength calculation dialog as words to be considered for grounds description specificity are counted.
[0031]
Next, the description content of the discourse element corresponding to the "supporting opinion description" or the "opposing opinion description" in each opinion sentence data is extracted from the opinion sentence database 3 (S1104). Then, for each opinion sentence data, the number of occurrences of the words to be considered for opinion description assertiveness is counted in the extracted description content (S1105). If "system-defined words" is checked in the opinion strength calculation dialog, the words counted are those registered in advance; if "specify words" is checked, the words entered in the text box below it are added.
[0032]
Next, the opinion strength is calculated by a given formula from the occurrence counts of the parts of speech and words obtained in the steps above (S1106), and the result is written to the opinion sentence database 3 (S1107). With a denoting the weight of grounds description specificity and b the weight of opinion description assertiveness, both set in the opinion strength calculation dialog, the opinion strength can be expressed, for example, as follows.
[0033]
(Equation 1)
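The equation image is not reproduced in this text. The following is a reconstruction, not the original drawing: it is the form that reproduces the worked examples in the paragraphs that follow, and the symbols other than a and b are introduced here for illustration. Equation (1) in particular is an assumption, inferred only from the fact that two proper nouns give a specificity of 2.

```latex
\begin{align}
  S_g &= n_{\mathrm{num}} + n_{\mathrm{prop}}
      && \text{grounds description specificity} \tag{1}\\
  S_o &= n_{\mathrm{sys}} + w\, n_{\mathrm{spec}}
      && \text{opinion description assertiveness} \tag{2}\\
  I   &= \frac{a\, S_g + b\, S_o}{a + b}
      && \text{opinion strength} \tag{3}
\end{align}
% n_num, n_prop : occurrences of numerals and proper nouns in the grounds description
% n_sys, n_spec : occurrences of system-defined and tallier-specified words
% w             : weight of the specified words (system-defined words have weight 1)
% Check, ID 0001: S_g = 2, S_o = 1 + 1.5 * 1 = 2.5, I = (2*2 + 3*2.5) / 5 = 2.3
% Check, ID 0002: S_g = 1, S_o = 0,                 I = (2*1 + 3*0)   / 5 = 0.4
```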

[0034]
A specific method of calculating the opinion strength is described using the opinion sentence data shown in FIG. 10 as an example, with the calculation conditions set as in the dialog shown in FIG. 12. For the opinion sentence data with ID number 0001, the description content whose discourse element is “grounds description” contains two proper nouns (“XX city” and “△△”, both names) as words to be considered for grounds description specificity. The grounds description specificity is therefore 2.
[0035]
As for the opinion description assertiveness, the word “should”, which is assumed to be registered in advance as a system-defined word as shown in FIG. 13, appears once in the description content whose discourse element is “supporting opinion description”. The word “absolute”, entered by the opinion tallier as a specified word, also appears once. Since the weight of the specified words is set to 1.5, equation (2) gives an opinion description assertiveness of 1 + 1 × 1.5 = 2.5. Finally, since the weights of grounds description specificity and opinion description assertiveness are set to 2 : 3 in the opinion strength calculation dialog shown in FIG. 12, substituting 2 for a, 3 for b, 2 for the grounds description specificity, and 2.5 for the opinion description assertiveness into equation (3) gives an opinion strength of 2.3.
[0036]
For the opinion sentence data with ID number 0002, the description content whose discourse element is “grounds description” contains one proper noun (“$”) as a word to be considered for grounds description specificity, so the grounds description specificity is 1. Since none of the words to be counted appears in the description content corresponding to the “opposing opinion description”, the opinion description assertiveness is 0. Substituting 2 for a, 3 for b, 1 for the grounds description specificity, and 0 for the opinion description assertiveness into equation (3) therefore gives an opinion strength of 0.4.
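The two worked examples can be reproduced with a short sketch. It assumes that the opinion strength is the weighted average of grounds description specificity and opinion description assertiveness with weights a and b, a form that is inferred from the values 2.3 and 0.4 rather than taken from the equation drawing. The function names and the sample token list are hypothetical, and morphological analysis is replaced by a pre-tokenized list.

```python
def count_occurrences(tokens, vocabulary):
    """Count how many tokens belong to the given set of words."""
    return sum(1 for token in tokens if token in vocabulary)


def opinion_assertiveness(tokens, system_words, specified_words, specified_weight):
    """System-defined words count with weight 1, specified words with specified_weight."""
    return (count_occurrences(tokens, system_words)
            + specified_weight * count_occurrences(tokens, specified_words))


def opinion_strength(specificity, assertiveness, a, b):
    """Assumed form: weighted average with weights a (specificity) and b (assertiveness)."""
    return (a * specificity + b * assertiveness) / (a + b)


# ID 0001: two proper nouns in the grounds description; "should" and "absolute"
# each appear once in the opinion description (hypothetical token list).
opinion_tokens = ["bypass", "absolute", "should", "build"]
assertiveness_0001 = opinion_assertiveness(
    opinion_tokens, {"should"}, {"absolute"}, specified_weight=1.5)
strength_0001 = opinion_strength(2, assertiveness_0001, a=2, b=3)

# ID 0002: one proper noun, no counted words in the opinion description.
strength_0002 = opinion_strength(1, 0, a=2, b=3)
```

Dividing by a + b keeps the result on the same scale as the two inputs, so changing the 2 : 3 ratio in the dialog shifts the balance between the two measures without inflating the score.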
[0037]
FIG. 12 shows an example of the opinion strength calculation dialog. In this example, the opinion strength calculation conditions consist of three items: the calculation conditions for grounds description specificity, the calculation conditions for opinion description assertiveness, and the weights of grounds description specificity and opinion description assertiveness. The calculation conditions for grounds description specificity select the parts of speech of the words to be considered when calculating the degree of specificity of the grounds description. In the dialog shown, numerals and proper nouns are available, and both can be selected.
[0038]
The calculation conditions for opinion description assertiveness select the words to be considered when calculating the degree of assertiveness of the opinion. In the dialog shown, words prepared in advance as shown in FIG. 13 can be used (check “system-defined words”), the opinion tallier can specify words (check “specify words”), or both. When “specify words” is selected, the words to be considered are entered in the text box; several words can be given by separating them with commas. The text box below it takes the degree to which the specified words are considered, entered as a weight: an arbitrary number relative to a weight of 1 for the system-defined words. Pressing the button labeled “start opinion strength calculation” then submits these settings to the opinion analysis apparatus.
[0039]
FIG. 14 shows an example of the opinion sentence database after the calculation results have been written. For each opinion sentence data, the row whose discourse element is “full text” holds the opinion strength, the “grounds description” row holds the grounds description specificity, and the opinion description row holds the opinion description assertiveness.
FIG. 15 is a flowchart illustrating the flow of the opinion chart creation process. When this process starts, the issue of each opinion sentence data is first read from the opinion sentence database 3, and the number of opinion sentence data for each issue is counted (S1501). Next, an opinion chart creation dialog such as the one shown in FIG. 16 is displayed, and the opinion tallier's designation of the issues for which opinion charts are to be created is received (S1502).
[0040]
Next, to create a graph for one of the designated issues, a graph is set up with opinion strength on the horizontal axis and issue relevance on the vertical axis (S1503). One opinion sentence data is then read from the opinion sentence database 3 (S1504), and it is determined whether its issue is the issue of the graph currently being created (S1505). If it is, its opinion strength and issue relevance are read and plotted at the corresponding position on the graph (S1506).
[0041]
After plotting, it is determined whether all the opinion sentence data in the opinion sentence database 3 have been processed (S1507). If not, the process returns to reading the next opinion sentence data (S1504). If so, it is determined whether there is another issue for which a graph is to be created (S1508). If there is, the process moves on to creating the next graph (S1503). If there is not, the process ends, and the created graph data is sent to the opinion tallier client as the opinion chart.
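The loop structure of steps S1501 to S1508 can be sketched as follows. This is an illustration under assumptions, not the apparatus itself: the opinion sentence database is stood in for by a list of dictionaries, the field names are hypothetical, and "plotting" is reduced to collecting (opinion strength, issue relevance) pairs per issue. The third record is invented for the example.

```python
from collections import Counter


def count_issues(records):
    """S1501: count the opinion sentence data belonging to each issue."""
    return Counter(record["issue"] for record in records)


def build_opinion_charts(records, selected_issues):
    """S1503-S1508: collect the points to plot for each designated issue."""
    charts = {}
    for issue in selected_issues:              # one graph per issue (S1503, S1508)
        points = []
        for record in records:                 # read each record (S1504, S1507)
            if record["issue"] == issue:       # issue of the current graph? (S1505)
                # x = opinion strength, y = issue relevance (S1506)
                points.append((record["strength"], record["relevance"]))
        charts[issue] = points
    return charts


records = [
    {"id": "0001", "issue": "bypass road, maintenance", "relevance": 0.9, "strength": 2.3},
    {"id": "0002", "issue": "bypass road, maintenance", "relevance": 0.8, "strength": 0.4},
    # Hypothetical record, not taken from the figures.
    {"id": "0003", "issue": "parking lot, increase", "relevance": 0.7, "strength": 1.0},
]
frequencies = count_issues(records)
charts = build_opinion_charts(records, ["bypass road, maintenance"])
```

The nested loop mirrors the flowchart directly; grouping the records by issue in a single pass would give the same points with less work when many issues are designated.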
[0042]
FIG. 16 shows an example of the opinion chart creation dialog. In this example, the frequencies counted in the previous step are 22 for "bypass road, maintenance", 15 for "parking lot, increase", and 8 for "highway, free".
A check box is provided in front of each issue, and clicking it designates that issue for opinion chart creation. More than one issue can be designated.
[0043]
FIG. 17 shows an example of an opinion chart with opinion sentence data plotted. As shown, the marker of each plotted point can be varied according to whether the opinion is for or against. The distinction is made by whether the discourse element of the opinion sentence data is “supporting opinion description” or “opposing opinion description”. In this example, supporting opinions are plotted with "●" and opposing opinions with "x". In addition, as shown in FIG. 18, a keyword indicating the grounds of each opinion can be placed beside its plotted point. The keyword can be assigned, for example, by morphologically analyzing the description content corresponding to the grounds description in the opinion sentence data and extracting the word that appears first.
[0044]
FIG. 19 adds two reference lines to the opinion chart. One is drawn parallel to the opinion strength axis, for example at the position that separates the upper half of the points when counted from the largest issue relevance. The other is drawn parallel to the issue relevance axis, for example at the position that separates the upper half of the points when counted from the largest opinion strength. With the opinion chart constructed this way, the character of each opinion can be grasped from the region in which its plotted point lies: for example, the upper right region is a set of "relatively majority opinions", the upper left region is a set of "majority opinions that are only hopes and wishes", and the lower right region is a set of "unique but strong opinions".
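The division into regions can be sketched as below. The sketch assumes that "the upper half counted from the largest" means a reference line at the median, which is one possible reading of the text. The region names are positional only, the lower left region is not characterized in the text, and the sample points other than the first two are hypothetical.

```python
from statistics import median


def classify_points(points):
    """points: list of (opinion_strength, issue_relevance) pairs.

    Returns one region name per point, using the medians of both
    coordinates as the two reference lines (an assumed reading).
    """
    strength_line = median(strength for strength, _ in points)
    relevance_line = median(relevance for _, relevance in points)
    regions = []
    for strength, relevance in points:
        vertical = "upper" if relevance >= relevance_line else "lower"
        horizontal = "right" if strength >= strength_line else "left"
        regions.append(f"{vertical} {horizontal}")
    return regions


points = [(2.3, 0.9), (0.4, 0.8), (2.0, 0.3), (0.5, 0.2)]
regions = classify_points(points)
# upper right: high relevance and high strength
# upper left:  high relevance, low strength ("only hopes and wishes")
# lower right: low relevance, high strength ("unique but strong opinions")
```

Because the lines are derived from the plotted data, each chart is split relative to its own distribution, so the regions remain meaningful even when one issue attracts weaker opinions overall.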
[0045]
Further, by linking each plotted point to its opinion sentence data in the opinion sentence database, the description content whose discourse element is “full text” can be displayed when a plotted point is pointed at (clicked) on the display, as shown in FIG. 20. This makes it easy to pick out and read opinion sentence data such as a "minority but strong opinion" on “bypass road, maintenance”.
[0046]
FIG. 21 shows an example of the opinion chart created when a plurality of issues are designated in the opinion chart creation dialog. The graphs created for the individual issues are combined in the manner of a pie chart. In the graph for each issue, the angle between the issue relevance axis and the opinion strength axis can be calculated, for example, by the following equation.
[0047]
(Equation 2)
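The equation image is not reproduced in this text. A form consistent with the surrounding description (a pie-chart-like layout in which the angle is proportional to the number of opinion sentence data belonging to the issue) is sketched here as an assumption; the symbols are introduced for illustration and the original drawing may differ.

```latex
\begin{equation*}
  \theta_i = 360^{\circ} \times \frac{n_i}{\sum_{j} n_j}
\end{equation*}
% theta_i : angle between the two axes in the graph for issue i
% n_i     : number of opinion sentence data belonging to issue i
% The sum runs over all issues designated in the opinion chart creation dialog.
% Example, assuming all three issues of FIG. 16 are designated (22, 15 and 8 data):
%   360 * 22/45 = 176 degrees, 360 * 15/45 = 120 degrees, 360 * 8/45 = 64 degrees
```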

[0048]
Creating the opinion chart in this way makes it possible to view the distribution of opinions on a plurality of issues efficiently. Moreover, because the angle formed by the coordinate axes is proportional to the number of opinion sentence data belonging to the issue, it is also easy to grasp which issues have many opinions.
(Supplementary Note 1) An opinion analysis method comprising: a step of analyzing a free description sentence based on a predetermined analysis rule and extracting an opinion description portion recognized as an opinion; and
a step of extracting words from the opinion description portion in order of appearance frequency.
[0049]
(Supplementary Note 2) An opinion analysis apparatus comprising: opinion analysis means for analyzing a free description sentence based on a predetermined analysis rule and extracting an opinion description portion recognized as an opinion; and
issue extracting means for extracting words from the opinion description portion in order of appearance frequency.
(Supplementary Note 3) An opinion analysis program for causing a computer to function as: opinion analysis means for analyzing a free description sentence based on a predetermined analysis rule and extracting an opinion description portion recognized as an opinion; and
issue extracting means for extracting words from the opinion description portion in order of appearance frequency.
[0050]
(Supplementary Note 4) The opinion analysis program according to Supplementary Note 3, further comprising issue relevance calculating means for calculating the relevance to the words extracted by the issue extracting means.
(Supplementary Note 5) The opinion analysis program according to Supplementary Note 3, wherein the opinion analysis means further extracts a grounds description portion recognized as the grounds for the opinion, the program further comprising opinion strength calculation means for calculating an opinion strength indicating the degree of strength of the opinion in the free description sentence, using a numerical value indicating the assertiveness of the opinion description, calculated by counting the appearance frequency of specific words in the opinion description portion, and a numerical value indicating the specificity of the grounds description, calculated by counting the appearance frequency of specific words in the grounds description portion.
[0051]
(Supplementary Note 6) The opinion analysis program according to Supplementary Note 3, wherein the opinion chart creation means creates a diagram showing the distribution of opinions by plotting, for each free description sentence, the issue relevance calculated by the issue relevance calculation means and the opinion strength calculated by the opinion strength calculation means on a two-dimensional coordinate system.
(Supplementary Note 7) The opinion analysis program according to Supplementary Note 3, wherein the opinion chart creation means divides the two-dimensional coordinate system into a plurality of regions so that the character of an opinion can be determined from the region in which its point plotted on the two-dimensional coordinate system lies.
[0052]
(Supplementary Note 8) The opinion analysis program according to Supplementary Note 3, wherein the opinion chart creation means extracts, for each free description sentence, words representative of the description content other than the extracted issues, and adds them to the created diagram showing the distribution of opinions.
[0053]
[Effect of the invention]
As described above, according to the present invention, document analysis makes it possible to extract only the descriptions of opinions and requests from the free description answer sentences of a questionnaire. By using only those descriptions in the document clustering process, opinion analysis results that take more precise issues into account can be presented.
[0054]
In addition, by calculating, for each opinion sentence data, the issue relevance and an opinion strength based on factors such as the specificity of the grounds, opinions can be evaluated not only as for or against but also by degree, from several viewpoints.
Furthermore, by plotting issue relevance and opinion strength on a graph and presenting it as an opinion chart, opinion sentences meeting conditions that are difficult to handle with conventional techniques, such as "a minority but strong opinion", can be extracted by the simple operation of designating a point on the opinion chart.
[Brief description of the drawings]
FIG. 1 is an overall configuration diagram of a first embodiment.
FIG. 2 is a flowchart showing the flow of the opinion sentence analysis process according to the present invention.
FIG. 3 is example 1 of a questionnaire free description answer sentence.
FIG. 4 is example 2 of a questionnaire free description answer sentence.
FIG. 5 is a flowchart showing the flow of the opinion sentence data storage process according to the present invention.
FIG. 6 is an example of analysis rule data stored in the analysis rule database.
FIG. 7 is an example of opinion sentence data stored in the opinion sentence database.
FIG. 8 is a flowchart showing the flow of the issue extraction / relevance calculation process according to the present invention.
FIG. 9 is an example of a clustering result.
FIG. 10 is an example in which issues and issue relevance are written in the opinion sentence database.
FIG. 11 is a flowchart showing the flow of the opinion strength calculation process according to the present invention.
FIG. 12 is an example of the opinion strength calculation dialog.
FIG. 13 is an example of system-defined words used for opinion description assertiveness.
FIG. 14 is an example in which opinion strength is written in the opinion sentence database.
FIG. 15 is a flowchart showing the flow of the opinion chart creation process according to the present invention.
FIG. 16 is an example of the opinion chart creation dialog.
FIG. 17 is opinion chart example 1.
FIG. 18 is opinion chart example 2.
FIG. 19 is opinion chart example 3.
FIG. 20 is opinion chart example 4.
FIG. 21 is opinion chart example 5.
[Explanation of symbols]
1 Opinion analysis device
2 Analysis rule database
3 Opinion sentence database
4 Communication network
5 Survey respondent client
6 Survey respondent client
7 Survey respondent client
8 Opinion tallier client
10 Opinion analysis program
11 Free description input means
12 Opinion analysis means
13 Issue extraction means
14 Opinion chart creation means
15 Opinion chart output means
16 Issue relevance calculation means
17 Opinion strength calculation means

Claims

1. An opinion analysis method comprising: a step of analyzing a free description sentence based on a predetermined analysis rule and extracting an opinion description portion recognized as an opinion; and
a step of extracting words from the opinion description portion in order of appearance frequency.

2. An opinion analysis apparatus comprising: opinion analysis means for analyzing a free description sentence based on a predetermined analysis rule and extracting an opinion description portion recognized as an opinion; and
issue extracting means for extracting words from the opinion description portion in order of appearance frequency.

3. An opinion analysis program for causing a computer to function as: opinion analysis means for analyzing a free description sentence based on a predetermined analysis rule and extracting an opinion description portion recognized as an opinion; and
issue extracting means for extracting words from the opinion description portion in order of appearance frequency.

4. The opinion analysis program according to claim 3, further comprising issue relevance calculating means for calculating the relevance to the words extracted by the issue extracting means.

5. The opinion analysis program according to claim 3, wherein the opinion analysis means further extracts a grounds description portion recognized as the grounds for the opinion, the program further comprising opinion strength calculation means for calculating an opinion strength indicating the degree of strength of the opinion in the free description sentence, using a numerical value indicating the assertiveness of the opinion description, calculated by counting the appearance frequency of specific words in the opinion description portion, and a numerical value indicating the specificity of the grounds description, calculated by counting the appearance frequency of specific words in the grounds description portion.