JP2005165754A

JP2005165754A - Text mining analysis apparatus, text mining analysis method, and text mining analysis program

Info

Publication number: JP2005165754A
Application number: JP2003404793A
Authority: JP
Inventors: Junko Watanabe; 純子渡辺
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2003-12-03
Filing date: 2003-12-03
Publication date: 2005-06-23

Abstract

PROBLEM TO BE SOLVED: To enable analyses akin to fixed-point observation, and the detection of characteristic tendencies, by comparing and editing a plurality of text mining analysis results.

SOLUTION: A text mining analysis apparatus comprises two or more pieces of text data 21, a feature analysis execution means 31 that applies text mining analysis to each piece of text data, and two or more pieces of analysis result data 22 that hold the feature words obtained by the analysis and one or more predetermined feature degrees corresponding to each feature word. The apparatus has comparison means 32, 33, and 34 which select the analysis result data to be compared based on information input by an analyst; create a comparison list from the selected analysis result data; set a comparison condition based on information input by the analyst; execute a comparison analysis on the comparison list according to the comparison condition; and output the result of the comparison analysis.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to text mining technology for extracting useful information by analyzing text data that accumulates daily. In particular, it relates to a text mining analysis apparatus, a text mining analysis method, and a text mining analysis program that, by comparing and editing a plurality of analysis results, enable fixed-point-observation-style analysis and detection of the emergence of characteristic tendencies.

Conventionally, digitized data containing text that accumulates daily (hereinafter simply "text data"), such as inquiry histories received at contact centers and customer satisfaction questionnaires, has been fed back into the activities of companies and other organizations as the unfiltered voice of the customer.
Such text data is generally organized on a daily or monthly basis and used for various analyses, and it is attracting attention as something that can be put to effective use in marketing, product development, sales activities, and the like.
For this reason, various text mining analysis techniques have been proposed for extracting useful information from such text data.

For example, a text mining analysis technique has been proposed that automatically collects and analyzes questionnaire responses containing free-form natural-language answers over a network such as the Internet, and delivers the analysis results to the requester as knowledge in rule form (see, for example, Patent Document 1).
A text mining analysis technique has also been proposed that reveals tendencies in text data accumulated daily and finds characteristic individual data items (see, for example, Patent Document 2).
Furthermore, a text mining analysis technique has been proposed that makes it easy to analyze the tendencies of a document set by providing a text mining function capable of analyzing the contents of the document set from multiple viewpoints (see, for example, Patent Document 3).

Patent Document 1: Japanese Laid-Open Patent Publication No. 2001-266060
Patent Document 2: JP 2002-215647 A
Patent Document 3: JP 2001-318939 A

However, although these conventional text mining techniques can extract and classify feature information within an individual analysis, they cannot compare the results of multiple analyses. For example, to notice which topics appeared for the first time this month, the individual analysis results have to be cross-checked by hand.
In addition, results analyzed at a different time cannot be reused. To run an analysis that also covers already-analyzed data, analysis data including that data must be prepared and the text mining analysis run again, which takes a great deal of time.

The present invention has been made in view of the above circumstances. Its object is to provide a text mining analysis apparatus, a text mining analysis method, and a text mining analysis program that extract differences and feature points among a plurality of text mining analysis results and output a comparison result displaying them, thereby enabling fixed-point-observation-style analysis of text data and detection of the emergence of characteristic tendencies.

To achieve the above object, the text mining analysis apparatus of the present invention comprises two or more pieces of text data, a feature analysis execution means that performs text mining analysis on each piece of text data, and two or more pieces of analysis result data that hold the feature words obtained by the analysis and one or more predetermined feature degrees corresponding to each feature word. The apparatus has a comparison means that selects, based on input information, the analysis result data to be subjected to comparison analysis; creates a comparison list from the selected analysis result data; sets a comparison condition based on input information; executes a comparison analysis on the comparison list according to the comparison condition; and outputs the comparison result.

With this configuration, the apparatus can not only run text mining analysis on current text data but also reuse results analyzed in the past to perform comparison analysis between analysis results, so that risks, trends, and the like can be noticed and discovered efficiently.
Companies commonly analyze the voice of the customer and apply it to their business activities, but the text data that gathers at a contact center collecting customer information is often enormous in volume.

For example, to compare the same month across ten years, conventional text mining analysis techniques require the text mining analysis to be rerun over ten years of data, which consumes enormous resources.
According to the present invention, analysis results obtained in the past can be used for text mining analysis, so effective analysis can be achieved far more quickly than with conventional techniques.
Here, a predetermined feature degree means, for example, the appearance frequency of the corresponding feature word or any of various statistical measures.

Further, in the text mining analysis apparatus of the present invention, the comparison means, when setting the comparison condition, selects one piece of analysis result data in the comparison list as a target based on input information and designates, from the comparison list, one or more pieces of analysis result data to compare against that target. When executing the comparison analysis, it detects as new words the feature words that are present in the target but absent from the analysis result data being compared against. When outputting the comparison result, it outputs a comparison result in which the new words are highlighted.
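The new-word detection described here amounts to a set difference between the target and the results it is compared against. The following is a minimal sketch under stated assumptions: each analysis result is modeled as a plain mapping from feature word to frequency, and the names `detect_new_words` and `render_with_highlight`, the `*` marker, and the sample data are all hypothetical, not taken from the patent.

```python
from typing import Dict, Iterable, List

# Assumed shape (not from the source): one analysis result maps
# feature word -> frequency.
AnalysisResult = Dict[str, int]


def detect_new_words(target: AnalysisResult,
                     baselines: Iterable[AnalysisResult]) -> List[str]:
    """Return feature words in `target` that occur in none of `baselines`."""
    seen = set()
    for result in baselines:
        seen.update(result)  # collects the feature words (dict keys)
    # Keep the target's own ordering so the output is stable.
    return [word for word in target if word not in seen]


def render_with_highlight(target: AnalysisResult,
                          new_words: Iterable[str],
                          marker: str = "*") -> List[str]:
    """Wrap new words in a marker as a stand-in for highlighted display."""
    new = set(new_words)
    return [f"{marker}{w}{marker}" if w in new else w for w in target]


# Invented sample data for illustration only.
october = {"friendly": 12, "price": 9}
november = {"friendly": 10, "price": 7, "battery": 3}
december = {"friendly": 8, "battery": 6, "recall": 5}

new_words = detect_new_words(december, [october, november])
print(new_words)                                   # ['recall']
print(render_with_highlight(december, new_words))  # ['friendly', 'battery', '*recall*']
```

Because only the stored feature words are compared, the original text data does not need to be analyzed again, which is the reuse benefit the description emphasizes.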

With this configuration, for example, an analysis result created from text data entered as the latest voice of the customer can be selected as the target and compared with analysis results obtained over a given past period. Newly appearing feature words can thus be identified, and the latest trends can be grasped efficiently.
Moreover, because the results obtained as comparison features are highlighted, that is, output and visualized in a form that is easy to grasp intuitively, the comparison result can easily be fed back not only to analysts but also, for example, to all field staff in a company, which promotes use of the comparison result.

Further, in the text mining analysis apparatus of the present invention, the comparison means, when setting the comparison condition, selects from the comparison list one or more feature words to be analyzed based on input information. When executing the comparison analysis, it detects every occurrence of those feature words in the comparison list. When outputting the comparison result, it outputs a comparison result in which the detected feature words are highlighted.

With this configuration, the analyst can easily see in what circumstances a keyword of interest appears in current or past analysis results, and can efficiently capture changes in how it appears.
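As a rough illustration of the keyword-of-interest detection just described, the sketch below scans every analysis result in a comparison list for a set of watched keywords. The comparison list is assumed to be an ordered list of (label, result) pairs; that shape, the function name `find_keywords`, the bracket marker, and the sample data are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed shape (not from the source): an ordered list of
# (label, {feature word: frequency}) pairs stands in for the comparison list.
ComparisonList = List[Tuple[str, Dict[str, int]]]


def find_keywords(comparison_list: ComparisonList,
                  keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Map each analysis result label to the watched keywords it contains."""
    watched = set(keywords)
    return {label: [word for word in result if word in watched]
            for label, result in comparison_list}


# Invented sample data for illustration only.
comparison_list = [
    ("2003-10", {"price": 9, "friendly": 12}),
    ("2003-11", {"battery": 3, "price": 7}),
    ("2003-12", {"recall": 5}),
]

hits = find_keywords(comparison_list, ["price", "recall"])
for label, words in hits.items():
    # Square brackets stand in for the highlighted display.
    print(label, [f"[{w}]" for w in words])
```

Reading the output row by row shows in which analysis results each watched keyword appears and where it drops out.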

Further, in the text mining analysis apparatus of the present invention, the comparison means, when outputting the comparison result, outputs a graph that displays the feature degree corresponding to each detected feature word for each piece of analysis result data in the comparison list.
With this configuration, a graph makes explicit how a keyword of interest appears in current or past analysis results, so the analyst is given a comparison result that is effective for grasping trends and the like.
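The graph output can be thought of as one data series per feature word, with one point per analysis result. The sketch below builds such a series and renders it as a plain-text bar chart; the patent does not specify a chart type, so the rendering, the function names, the use of frequency as the feature degree, and the sample data are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

ComparisonList = List[Tuple[str, Dict[str, int]]]  # assumed shape


def feature_series(comparison_list: ComparisonList, word: str,
                   default: int = 0) -> List[Tuple[str, int]]:
    """Feature degree (here: frequency) of `word` in each analysis result."""
    return [(label, result.get(word, default))
            for label, result in comparison_list]


def text_bar_chart(series: List[Tuple[str, int]], width: int = 20) -> List[str]:
    """Render a series as text bars scaled to the largest value."""
    peak = max((value for _, value in series), default=0)
    lines = []
    for label, value in series:
        bar = "#" * (round(width * value / peak) if peak else 0)
        lines.append(f"{label} | {bar} {value}")
    return lines


# Invented sample data for illustration only.
comparison_list = [
    ("2003-10", {"friendly": 12, "price": 9}),
    ("2003-11", {"friendly": 10, "price": 7, "battery": 3}),
    ("2003-12", {"friendly": 8, "battery": 6, "recall": 5}),
]

series = feature_series(comparison_list, "battery")
print("\n".join(text_bar_chart(series)))
```

A word that is missing from an analysis result is plotted as zero here; whether the actual apparatus treats absence that way is not stated in the source.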

Further, the text mining analysis apparatus of the present invention has an alarm output means that sets a predetermined alarm output criterion based on input information, detects feature words satisfying the alarm output criterion from the comparison list or the comparison result, and outputs alarm information about those feature words.
With this configuration, an alarm can be output for comparison results based on the indicators the analyst particularly wants to track, so such comparison results can be clearly recognized.
For example, by configuring an alarm to be output when a feature word appears whose appearance frequency has increased by at least a predetermined rate, the analyst can grasp trends, risks, and the like even more effectively.
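The increase-rate example can be sketched as a threshold check between two analysis results. The source does not define how the rate is calculated, so the formula (current - previous) / previous, the function name `frequency_increase_alarms`, the choice to skip words with no earlier frequency, and the sample data below are assumptions.

```python
from typing import Dict, List, Tuple


def frequency_increase_alarms(previous: Dict[str, int],
                              current: Dict[str, int],
                              min_rate: float) -> List[Tuple[str, float]]:
    """Return (word, rate) pairs whose frequency grew by at least `min_rate`.

    Assumed definition: rate = (current - previous) / previous.
    Words absent from `previous` have no defined rate and are skipped;
    they would be handled by new-word detection instead.
    """
    alarms = []
    for word, now in current.items():
        before = previous.get(word, 0)
        if before == 0:
            continue
        rate = (now - before) / before
        if rate >= min_rate:
            alarms.append((word, rate))
    # Largest increase first.
    return sorted(alarms, key=lambda item: item[1], reverse=True)


# Invented sample data for illustration only.
november = {"battery": 3, "friendly": 10, "price": 7}
december = {"battery": 6, "friendly": 8, "price": 21, "recall": 5}

alarms = frequency_increase_alarms(november, december, min_rate=1.0)
for word, rate in alarms:
    print(f"ALARM: '{word}' frequency up {rate:.0%}")
```

With a threshold of 1.0 (a doubling), "price" and "battery" trigger alarms, while "friendly" (which decreased) and "recall" (which has no earlier value) do not.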

Further, in the text mining analysis apparatus of the present invention, the comparison means or the alarm output means outputs the comparison result or the alarm information to an output device provided in the apparatus or to an information processing apparatus connected to it by wire or wirelessly.
With this configuration, the analyst can check the comparison result on the display of the apparatus or print it. In addition, by transmitting the comparison result over a communication line to an information sharing system such as a corporate departmental EIP (Enterprise Information Portal), the alarm information entered from the text mining analysis apparatus as awareness information can be used as a breaking-news alarm or the like.

Further, in the text mining analysis apparatus of the present invention, when the comparison means receives a selection input of a feature word in the comparison list, it displays information on the text data from which the selected feature word was extracted.
With this configuration, the analyst can look up what text data contains a feature word shown in the comparison list or the comparison result, and can grasp the voice of the customer starting from new words, keywords of interest, and the like.

The text mining analysis method of the present invention performs comparison analysis of analysis result data using a text mining analysis apparatus that comprises two or more pieces of text data, a feature analysis execution means that performs text mining analysis on each piece of text data, and two or more pieces of analysis result data that hold the feature words obtained by the analysis and one or more predetermined feature degrees corresponding to each feature word. In this method, the text mining analysis apparatus selects, based on input information, the analysis result data to be subjected to comparison analysis; creates a comparison list from the selected analysis result data; sets a comparison condition based on input information; executes a comparison analysis on the comparison list according to the comparison condition; and outputs the comparison result.

Further, in the text mining analysis method of the present invention, the text mining analysis apparatus sets a predetermined alarm output criterion based on input information, detects feature words satisfying the alarm output criterion from the comparison list, and outputs alarm information about those feature words.

With this method, results analyzed in the past can be reused to perform comparison analysis between analysis results, so risks, trends, and the like can be noticed and discovered efficiently.
That is, because past analysis results are used, an analysis that covers past text data does not require rerunning text mining analysis on that text data; the resources used for analysis are reduced and very fast analysis becomes possible.
Furthermore, an alarm can be output when the result of the comparison analysis satisfies a predetermined criterion, so the analyst can grasp risks, trends, and the like more effectively.

The text mining analysis program of the present invention causes a text mining analysis apparatus to perform comparison analysis of analysis result data, the apparatus comprising two or more pieces of text data, a feature analysis execution means that performs text mining analysis on each piece of text data, and two or more pieces of analysis result data that hold the feature words obtained by the analysis and one or more predetermined feature degrees corresponding to each feature word. The program causes the text mining analysis apparatus to select, based on input information, the analysis result data to be subjected to comparison analysis; to create a comparison list from the selected analysis result data; to set a comparison condition based on input information; to execute a comparison analysis on the comparison list according to the comparison condition; and to output the comparison result.

Further, the text mining analysis program of the present invention causes the text mining analysis apparatus to set a predetermined alarm output criterion based on input information, to detect feature words satisfying the alarm output criterion from the comparison list, and to output alarm information about those feature words.

With this program, the text mining analysis apparatus can be made to perform comparison analysis between analysis results using past analysis results and to output the comparison result, and in predetermined cases to output an alarm as well, so the analyst can grasp risks, trends, and the like efficiently.

According to the present invention, text mining analysis can be performed on current text data, and that analysis result can be compared with the results of text mining analyses performed in the past, so risks, trends, and the like can be noticed and discovered efficiently.
This comparison analysis also reveals newly appearing feature words and changes in keywords of interest, so the latest trends can be grasped efficiently.
Furthermore, because the results obtained as comparison features are output and visualized in a form that is easy to grasp intuitively, the comparison result can easily be fed back not only to analysts but also, for example, to all field staff in a company, which promotes use of the comparison result.
In addition, an alarm can be output for comparison results based on the indicators one particularly wants to track, so such comparison results can be clearly recognized and the analyst can grasp trends, risks, and the like even more effectively.

Preferred embodiments of the text mining analysis apparatus according to the present invention are described below with reference to the drawings.
The text mining analysis apparatus of the present invention shown in the following embodiments is operated by a computer under program control. The program sends commands to each component of the computer and causes it to perform the predetermined processing required for the operation of the apparatus, for example feature analysis processing, comparison setting processing, comparison list display processing, and comparison feature extraction processing. In this way, each process and operation of the text mining analysis apparatus of the present invention can be realized by concrete means in which the program and the computer cooperate.
The program is stored in advance on a recording medium such as a ROM or RAM and is executed by having the computer read it from the recording medium installed in the computer; it can also be loaded into the computer via, for example, a communication line.
The recording medium storing the program can be any computer-readable recording means, such as a semiconductor memory, a magnetic disk, or an optical disk.

[First embodiment]
First, the configuration of the first embodiment of the present invention is described with reference to FIGS. 1 to 8. FIG. 1 is a block diagram showing the configuration of the text mining analysis apparatus of this embodiment. FIGS. 2 to 8 show, in order, the text data, analysis result data, comparison target list setting screen, comparison result list, new word extraction display result, keyword-of-interest highlight display result, and graph display result in the text mining analysis apparatus of this embodiment.

As shown in FIG. 1, the text mining analysis apparatus of this embodiment has an input device 10, a storage device 20, a data analysis processing device 30, and an output device 40.
The input device 10 is a device such as a keyboard that a person performing mining analysis on text data (hereinafter simply "the analyst") uses, for example, to input comparison analysis conditions to the data analysis processing device 30.

The storage device 20 is a device that records information. As shown in FIG. 1, it stores a plurality of text data 21 (text data 21-1, ..., text data 21-n) and a plurality of analysis result data 22 (analysis result data 22-1, ..., analysis result data 22-m). The text data 21 and the analysis result data 22 do not necessarily correspond one to one: one piece of analysis result data 22 may be created from multiple pieces of text data 21, and multiple pieces of analysis result data 22 may be created from one piece of text data 21.

The text data 21 is free-form data containing text such as questionnaire answers, inquiries, and complaints, and can be, for example, as shown in FIG. 2.
As shown in the figure, the text data 21 can be a comma-delimited text file known as CSV, a file with a fixed format extracted from a database such as a relational database, or the like, and is organized into columns and rows.

The text data 21 shown in FIG. 2 is an example of data obtained from a questionnaire about a certain product. Its data structure has column items such as "date and time", "product name", "gender", "age group", "rating", "opinion", and "request".
"Date and time" is the date and time at which the data was entered, "product name" is the name of the product the questionnaire concerns, and "gender" and "age group" are attributes of the respondent. "Rating" is the respondent's score for the product (in this example, a five-level rating with a maximum of 5), and "opinion" and "request" are the respondent's opinions and requests regarding the product, written as free text.
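To make the column layout concrete, the following sketch loads questionnaire-style text data from CSV with Python's standard `csv` module. The English column identifiers, the function name `load_text_data`, and the two sample rows are invented for illustration; FIG. 2 itself is not reproduced here.

```python
import csv
import io
from typing import Dict, List, TextIO

# Invented sample rows; the column names are hypothetical English
# renderings of the column items listed in the description.
SAMPLE_CSV = """date_time,product_name,gender,age_group,rating,opinion,request
2003-11-04 10:15,Product A,male,40s,4,Friendly design,Lower the price
2003-11-05 18:40,Product A,female,20s,5,Easy to use,More colors
"""


def load_text_data(stream: TextIO) -> List[Dict[str, object]]:
    """Read questionnaire rows and convert the rating to an integer."""
    rows: List[Dict[str, object]] = []
    for row in csv.DictReader(stream):
        record: Dict[str, object] = dict(row)
        record["rating"] = int(row["rating"])  # five-level score
        rows.append(record)
    return rows


records = load_text_data(io.StringIO(SAMPLE_CSV))
for record in records:
    print(record["gender"], record["age_group"], record["rating"], record["opinion"])
```

The structured columns (gender, age group, rating) are what later allow an analysis to be restricted to a category, while the free-text columns (opinion, request) are what the mining step actually reads.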

As noted above, information other than questionnaires can of course be used as the text data 21. For example, a database of inquiries and complaints received at a contact center, or a database of daily reports written by sales staff, may also be used to good effect. The number of column items in the text data 21 is not limited to those illustrated in FIG. 2 and may run to several hundred, and the number of rows may be enormous, ranging from several hundred to several million.

The analysis result data (hereinafter simply "analysis result") 22 is the execution result output by the feature analysis execution means 31 and can be, for example, as shown in FIG. 3. As shown in the figure, its data structure can have column items such as "feature word", "frequency", "total frequency", and "statistical measure".
It is also preferable to display the analysis result 22 as a diagram or list, and to make it possible to refer from a displayed feature word to the corresponding original text (the text data 21 from which the feature word was extracted). One possible approach is to make each feature word in the analysis result 22 selectable, for example by clicking; when a selection is made, the feature analysis execution means 31 searches the text data 21 using the selected feature word as the key and outputs a window listing the names of the text data 21 that contain the feature word.
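The lookup from a selected feature word back to its source text can be sketched as a keyed search over the stored text data. In this minimal example the text data is assumed to be a mapping from a data set name to its rows, and matching is a case-insensitive substring test; the function name `find_source_texts`, the file names, and the sample rows are hypothetical.

```python
from typing import Dict, List

# Assumed shape (not from the source): text data name -> list of rows,
# each row holding at least a free-text field.
TextDataStore = Dict[str, List[Dict[str, str]]]


def find_source_texts(text_data: TextDataStore, feature_word: str,
                      field: str = "opinion") -> List[str]:
    """Return the names of the text data whose `field` mentions the word."""
    needle = feature_word.lower()
    return [name for name, rows in text_data.items()
            if any(needle in row[field].lower() for row in rows)]


# Invented sample data for illustration only.
text_data = {
    "survey_2003_11.csv": [
        {"opinion": "Friendly design"},
        {"opinion": "Too expensive"},
    ],
    "survey_2003_12.csv": [
        {"opinion": "Battery drains fast"},
    ],
}

print(find_source_texts(text_data, "friendly"))  # ['survey_2003_11.csv']
```

A real system handling Japanese text would match on the segmented words produced by the analysis rather than on raw substrings; the substring test is only a stand-in.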

The data analysis processing device 30 is an information processing device that operates under program control, and has a feature analysis execution means 31, a comparison setting means 32, a comparison list display means 33, and a comparison feature extraction means 34.
The feature analysis execution means 31 performs text mining analysis on the input text data 21 according to conditions set from the input device and outputs the analysis result 22. It corresponds to the text classification engine described in Patent Document 1, a patent document filed by the present applicant.
The feature analysis execution means 31 segments words from the text data 21, analyzes their dependency relations, extracts classification rules from specific keywords and categories, and outputs the result to the analysis result 22 and the output device 40.

For example, FIG. 3 shows a display of the result of extracting, from the text data 21 of FIG. 2, the characteristic words used in the "opinion" column of the records that satisfy the condition "gender" = "male" and "age group" = "40s".
In the figure, feature words such as "friendly" and "young people → prefer" are displayed. As a rule for judging a word to be characteristic, a word that occurs with high frequency can, for example, be judged characteristic. Characteristic words can also be identified by statistically processing the feature degree relative to data outside the target category, which in this example is "gender" = "female" or "age group" = "other than 40s".
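The frequency-based rule can be illustrated with a simple count restricted to one category. This sketch is deliberately simplified: it splits English text on whitespace instead of performing the word segmentation and dependency analysis the description mentions, it reports only frequency and total frequency, and it omits the statistical measure because the source does not define it. The function name `extract_feature_words` and the sample records are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def extract_feature_words(records: List[Record],
                          condition: Callable[[Record], bool],
                          field: str = "opinion",
                          top_n: int = 5) -> List[Tuple[str, int, int]]:
    """Return (word, frequency in category, total frequency) tuples.

    Whitespace tokenization is a stand-in for real word segmentation.
    """
    in_category: Counter = Counter()
    overall: Counter = Counter()
    for row in records:
        words = row[field].lower().split()
        overall.update(words)
        if condition(row):
            in_category.update(words)
    return [(word, freq, overall[word])
            for word, freq in in_category.most_common(top_n)]


# Invented sample data for illustration only.
records = [
    {"gender": "male", "age_group": "40s", "opinion": "friendly design"},
    {"gender": "male", "age_group": "40s", "opinion": "friendly price"},
    {"gender": "female", "age_group": "20s", "opinion": "friendly colors"},
]


def is_male_40s(row: Record) -> bool:
    return row["gender"] == "male" and row["age_group"] == "40s"


features = extract_feature_words(records, is_male_40s, top_n=2)
print(features[0])  # ('friendly', 2, 3)
```

Keeping the total frequency next to the in-category frequency shows why raw frequency alone can mislead: "friendly" is common everywhere, which is the kind of case the comparison against data outside the category is meant to expose.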

The comparison setting means 32 sets the analysis target range and the comparison options used to comparatively analyze the differences among the plurality of analysis results 22 output by the feature analysis execution means 31. The comparison setting means 32 displays a comparison target list setting screen as shown in FIG. 4 and, based on the selection input by the analyst, sets which analysis results 22 are to appear in the comparison list displayed by the comparison list display means 33. The comparison setting means 32 can also set the display order of the analysis results 22 in the comparison list. Although not shown, the comparison setting means 32 can display a setting screen for various comparison options and sets the comparison options based on the analyst's input; the comparison feature extraction means 34 then executes the comparative analysis based on these options.

The comparison list display means 33 outputs a comparison list according to the conditions set by the comparison setting means 32. FIG. 5 shows an example of a comparison result list, which is the comparison list displayed by the comparison list display means 33.
With this user interface, the analysis results can be displayed side by side and sorted by the frequency or statistical measure that serves as the basis of the feature degree. In addition, from a feature word the analyst can refer to the original text in which the word is used, that is, the text data 21 from which the word was extracted. The reference method can be the same as for the analysis result 22. Here, the feature degree means an index for grasping the features of an analysis result.

The comparison feature extraction means 34 extracts differences and distinctive points among the analysis results from the comparison list created by the comparison list display means 33, based on predetermined conditions, and outputs them as a comparison result. The comparison feature extraction means 34 can provide a plurality of analysis methods through the setting of comparison options. For each comparison option, a comparison condition is set based on information input by the analyst, and the comparative analysis of the comparison list is executed according to that condition.
For example, new words can be extracted from the comparison list and output in the form of a table, a figure, or a list. FIG. 6 shows an example of such a new word extraction display result. In the figure, feature words that do not appear in feature analysis results 1 and 2 and appear for the first time in feature analysis result 3 are highlighted so as to be distinguishable from other words.

It is also possible to highlight keywords of interest and graph them. FIG. 7 shows an example of a keyword-of-interest highlight display result, in which “design” is highlighted so as to be distinguishable from other words. Of course, a plurality of feature words can be highlighted. FIG. 8 shows a graph of the change in the feature degree of the keywords of interest selected in this way. In the example of the figure, the change in appearance frequency of the four keywords “design”, “trust”, “ordinary”, and “difficult to use” is shown for September through December, the months corresponding to feature analysis results 1 to 4.

The output device 40 is a device such as a display device or a printing device that outputs the analysis result 22, the comparison list created by the comparison list display means 33, and the various tables and graphs created by the comparison feature extraction means 34.

Next, the processing procedure in the text mining analysis apparatus of the present embodiment will be described with reference to FIG. 1 and FIGS. 9 to 11.
FIGS. 9 to 11 are flowcharts showing, in order, the comparison setting processing procedure, the new word extraction processing procedure, and the keyword-of-interest highlight and graphing processing procedure in the text mining analysis apparatus of this embodiment.

First, the analyst designates the text data 21 to be analyzed from the input device 10, sets the analysis conditions, executes the analysis with the feature analysis execution means 31 to create the analysis result 22, and displays it on the output device 40.
At this time, based on the designation of the text data 21 and the analysis conditions input from the input device 10, the feature analysis execution means 31 extracts words from the text data 21, analyzes their dependency relations, and extracts classification rules based on specific keywords and categories. It creates an analysis result 22 containing the outcome in the storage device 20 and outputs it to the output device 40 for display.

In general, for the history of inquiries received by a contact center, customer satisfaction survey questionnaires, and the like, feature analysis is usually executed on a daily or monthly schedule for the purpose of feeding the findings back into the activities of the company, and the analysis results are successively stored and accumulated in the storage device 20.
When the analyst wishes to perform fixed-point-observation analysis, detect the emergence of a characteristic trend, or the like, based on the analysis results accumulated in this way and the analysis result 22 for the text data 21 whose features are currently of interest, the analyst uses the comparison setting means 32 to set the conditions for displaying a comparison list.

At this time, as shown in FIG. 9, when the analyst selects an analysis result to be subjected to comparative analysis from the accumulated analysis results 22, the comparison setting means 32 receives this selection information (step A1) and adds the result to the comparison target list as shown in FIG. 4 (step A2). These operations are repeated until selection of the desired analysis results is complete (step A3).

Further, when the analyst selects one analysis result from the completed comparison target list and designates its position in the comparison order, the comparison setting means 32 receives the selection information (step A4) and executes the order setting (step A5). These operations are repeated until the setting is complete for all analysis results 22 in the comparison target list (step A6).
Setting the comparison order is particularly effective when the order of comparison should carry meaning, such as when performing trend analysis in chronological order or analysis based on the customer loyalty cycle.
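Steps A1 to A6 amount to choosing a subset of the stored analysis results and fixing their order. The following Python sketch shows one way this could be modeled; the function name `build_comparison_list`, the dictionary layout, and the sample values are assumptions and are not taken from the specification.

```python
def build_comparison_list(stored_results, selected_names, order):
    """Return the selected analysis results in the analyst's comparison order.

    stored_results: {name: {feature_word: feature_degree}} accumulated so far.
    selected_names: names picked by the analyst (steps A1 to A3).
    order: the same names arranged in the desired comparison order (steps A4 to A6).
    """
    targets = []
    for name in selected_names:
        if name not in stored_results:
            raise KeyError(f"unknown analysis result: {name}")
        if name not in targets:  # ignore a result that is selected twice
            targets.append(name)

    if sorted(order) != sorted(targets):
        raise ValueError("order must list every selected result exactly once")
    return [(name, stored_results[name]) for name in order]


# Hypothetical monthly results; the names and feature degrees are invented.
stored = {
    "September": {"design": 12, "trust": 8},
    "October": {"design": 15, "ordinary": 6},
    "November": {"design": 9, "difficult to use": 11},
}
comparison = build_comparison_list(
    stored,
    selected_names=["November", "September", "October"],
    order=["September", "October", "November"],
)
```

Keeping the order as an explicit list, rather than sorting by name or date, reflects the point that the order may follow something other than time, such as a customer loyalty cycle.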

Next, the comparison list display means 33 displays a comparison list as shown in FIG. 5, based on the conditions set as the comparison target list by the comparison setting means 32.
Further, based on the options set by the analyst, the comparison feature extraction means 34 displays on the output device 40 various comparison results, as shown in FIGS. 6 to 8, concerning the differences and distinctive points among the analysis results.

FIG. 10 is a flowchart showing the processing procedure for extracting new words, which is one option of the processing by the comparison feature extraction means 34.
When the analyst, as a comparison condition, selects one target analysis result (hereinafter simply referred to as the target result) from the analysis results displayed in the comparison list and designates a new word extraction range, the comparison feature extraction means 34 receives the selection information for the target result (step B1) and the designation information for the new word extraction range (step B2). The target result is the analysis result that is examined for the appearance of new words. The new word extraction range is the set of analysis results against which each feature word of the target result is checked to determine whether it has already appeared.

Next, the comparison feature extraction means 34 selects one word output as a feature word in the target result (step B3) and determines whether the selected feature word is output as a feature word in any analysis result designated as the new word extraction range. If it does not appear there, the selected feature word is judged to be a new word (step B4).
Display switching information for the selected feature word is then created, and a new word extraction display result is produced in which the display of that feature word in the comparison list is changed to a highlighted display distinguishable from the other feature words (step B5). A list, figure, or the like showing only the new words may also be created.
The above operations are repeated for all feature words in the target result (step B6), and the completed new word extraction display result is output.

The new word extraction display result shown in FIG. 6 is an example in which feature analysis result 3 is selected as the target result and feature analysis results 1 and 2 are designated as the new word extraction range. In this case, the processing is set to extract only feature words that do not appear in feature analysis results 1 and 2 and appear for the first time in feature analysis result 3. As shown in the figure, words such as “children”, “difficult to use”, “store”, and “salesperson” are highlighted as new words.
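The new word extraction of steps B1 to B6 is essentially a set difference between the target result and the extraction range. A minimal Python sketch follows, assuming each analysis result is held as a dictionary from feature word to feature degree; the function name `extract_new_words` and all numeric values are hypothetical.

```python
def extract_new_words(target_result, extraction_range):
    """Return feature words of the target that occur in none of the range results.

    target_result: {feature_word: feature_degree} for the target result (step B1).
    extraction_range: list of such dicts to check against (step B2).
    """
    already_seen = set()
    for result in extraction_range:
        already_seen.update(result)

    new_words = []
    for word in target_result:        # steps B3 and B6: visit every feature word
        if word not in already_seen:  # step B4: absent from the whole range
            new_words.append(word)    # step B5 would flag this word for highlighting
    return new_words


# The four new words follow the FIG. 6 example; the other words and all
# feature degrees are invented for illustration.
result_1 = {"design": 12, "trust": 8}
result_2 = {"design": 15, "ordinary": 6}
result_3 = {"design": 9, "children": 7, "difficult to use": 11, "store": 5, "salesperson": 4}

new_words = extract_new_words(result_3, [result_1, result_2])
```

The returned list can drive either the highlighted comparison list or a separate list showing only the new words.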

FIG. 11 is a flowchart showing the processing procedure for highlighting keywords of interest, which is another option of the processing by the comparison feature extraction means 34.
When the analyst, as a comparison condition, selects a word of concern from the analysis results displayed in the comparison list as a keyword of interest, that is, a word to be analyzed, and adds it to a highlight target list (not shown), the comparison feature extraction means 34 receives the selection information for this keyword (step C1) and adds it to the highlight target list (step C2). This is repeated for the desired number of keywords of interest (step C3).

The comparison feature extraction means 34 then searches, for each keyword added to the highlight target list, whether it is displayed as a feature word in the analysis results in the comparison list. If it is, display switching information for that feature word is created and its display is changed to a highlighted display (step C4).
Based on the search result, the comparison feature extraction means 34 also creates a graph showing the change in the feature degree of each keyword in the highlight target list and displays it on the output device 40 (step C5).

The keyword-of-interest highlight display result shown in FIG. 7 is an example in which “design” is added to the highlight target list and the position of “design” is highlighted in each analysis result in the comparison list. FIG. 8 shows an example in which “trust”, “ordinary”, and “difficult to use” are added to the highlight target list in the same way and the results are graphed. FIG. 8 displays a line graph in which the transition of each keyword's feature degree can be seen at a glance. Of course, the graph type is not limited to a line graph, and it is preferable that it can be changed to other graph types as appropriate.
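Steps C4 and C5 can be sketched in Python as a lookup of each keyword across the ordered comparison list. The function names `highlight_cells` and `keyword_series`, the list-of-pairs layout, and the frequencies are assumptions for illustration; the values are not read from FIG. 7 or FIG. 8.

```python
def highlight_cells(comparison_list, keywords):
    """Step C4: list (result_name, word) cells to switch to highlighted display."""
    wanted = set(keywords)
    return [
        (name, word)
        for name, features in comparison_list
        for word in features
        if word in wanted
    ]


def keyword_series(comparison_list, keywords):
    """Step C5: per-keyword feature degrees in comparison order, None where absent."""
    return {
        keyword: [features.get(keyword) for _, features in comparison_list]
        for keyword in keywords
    }


# Hypothetical comparison list in chronological order.
comparison = [
    ("September", {"design": 12, "trust": 8}),
    ("October", {"design": 15, "ordinary": 6}),
    ("November", {"design": 9, "difficult to use": 11}),
    ("December", {"design": 7, "difficult to use": 14, "trust": 3}),
]
cells = highlight_cells(comparison, ["design"])
series = keyword_series(comparison, ["design", "difficult to use"])
```

Each list in `series` holds one value per analysis result, so it can be handed to any charting tool as one line of a line graph or one group of a bar chart.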

[Second Embodiment]
Next, a second embodiment of the present invention will be described with reference to FIG. 12, which is a block diagram showing the configuration of the text mining analysis apparatus of this embodiment.
The present embodiment differs from the first embodiment in that alarm information is output when a feature appears that exceeds a predetermined alarm output criterion set by the analyst.
In addition to the configuration of the first embodiment shown in FIG. 1, the text mining analysis apparatus of this embodiment has an alarm output means 35 in the data analysis processing device 30. It also has an information sharing server 50 capable of receiving output information from the alarm output means 35 by wire or wirelessly.

The alarm output means 35 sets a predetermined alarm output criterion based on information input by the analyst, detects words satisfying this criterion from the comparison list, and outputs alarm information for them.
In the present embodiment, based on predetermined threshold information input by the analyst, the alarm output means 35 extracts from each analysis result in the comparison list the feature words whose feature degree exceeds the threshold information, creates alarm information, and outputs it to the output device 40 and the information sharing server 50.

The threshold information can be any of various kinds of condition information. For example, it can be an amount of change in the feature degree, such as the rate of increase or decrease in appearance frequency. It can also be an absolute value of the feature degree, so that, for example, feature words exceeding a predetermined total appearance frequency become alarm output targets. Furthermore, whether a specific keyword set in advance has appeared can also be made an alarm output target. These can be made settable as options in the alarm output means 35.
The alarm information is not particularly limited as long as it can give a warning to the analyst. For example, it can be a warning window that lists the feature words extracted as alarm output targets.
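The three kinds of threshold information described above (change in feature degree, absolute value, and appearance of a preset keyword) could be represented as independent options. The Python sketch below is only an illustration; the function name `alarm_reasons`, its parameters, and the reason labels are hypothetical, and the change rate is computed against a single previous frequency as a simplifying assumption.

```python
def alarm_reasons(word, current, previous,
                  min_change_rate=None, min_frequency=None,
                  watch_words=frozenset()):
    """Return the list of criteria that the feature word satisfies.

    current / previous: appearance frequency in the target and in an earlier result.
    min_change_rate: relative increase or decrease that triggers an alarm.
    min_frequency: absolute appearance frequency that triggers an alarm.
    watch_words: preset keywords whose mere appearance triggers an alarm.
    """
    reasons = []
    if min_change_rate is not None and previous > 0:
        rate = abs(current - previous) / previous  # covers increase and decrease
        if rate > min_change_rate:
            reasons.append("change_rate")
    if min_frequency is not None and current > min_frequency:
        reasons.append("absolute")
    if word in watch_words and current > 0:
        reasons.append("keyword")
    return reasons


# Hypothetical frequencies and thresholds.
r1 = alarm_reasons("design", 150, 60, min_change_rate=0.5, min_frequency=100)
r2 = alarm_reasons("recall", 3, 0, min_change_rate=0.5, min_frequency=100,
                   watch_words={"recall"})
r3 = alarm_reasons("store", 40, 50, min_change_rate=0.5)
```

Returning the satisfied criteria, rather than a single flag, would let a warning window show why each feature word was listed.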

The information sharing server 50 can be, for example, an information processing device that works together with an information sharing system such as a corporate department EIP, so that the alarm information input from the text mining analysis apparatus as notice information can be used as a breaking-news alarm or the like.

Next, the alarm output processing procedure in the text mining analysis apparatus of this embodiment will be described with reference to FIG. 13. The figure is a flowchart of this procedure and shows alarm output processing for the keyword appearance increase rate, which is one option of the processing by the alarm output means 35.
First, when the analyst selects a target result from the comparison list displayed by the comparison list display means 33, or from the new word extraction display result, the keyword-of-interest highlight display result, or the like displayed by the comparison feature extraction means 34, and sets a threshold, the alarm output means 35 receives the selection information for the target result (step D1) and sets the threshold (step D2).

Next, based on the analyst's input, the alarm output means 35 selects one feature word from the selected target result (step D3), compares the appearance frequency of that feature word in the target result with its appearance frequency in the earlier analysis results, and determines whether the rate of increase in appearance exceeds the set threshold (step D4).
If the threshold is exceeded, the alarm output means 35 generates alarm information for the feature word (step D5).
This determination is repeated for all feature words included in the target result (step D6), and if any alarm information has been generated, it is output to the output device 40 and the information sharing server 50 (step D7).

As the earlier analysis results used in comparing the appearance frequency of a feature word in the target result with its appearance frequency in earlier analysis results, all analysis results that precede the target result in the comparison list under comparative analysis can be used.
It is also preferable, for example, that in step D1, as in the first embodiment, one or more analysis results to be compared can be designated together with the selection of the target result, and that the appearance frequency of the feature word in the target result be compared with its appearance frequency in the designated analysis results.
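The increase-rate alarm of steps D1 to D7 can be sketched as follows in Python. The specification does not say how the frequencies of several earlier results are combined or how a word with no earlier occurrence is handled, so this sketch assumes the mean of the earlier frequencies as the baseline and treats a word with a zero baseline as an unbounded increase. The names `increase_rate` and `detect_alarms` and the sample data are hypothetical.

```python
import math


def increase_rate(current, baseline):
    """Relative increase of current over baseline; unbounded when baseline is 0."""
    if baseline == 0:
        return math.inf if current > 0 else 0.0
    return (current - baseline) / baseline


def detect_alarms(target_result, earlier_results, threshold):
    """Return alarm records for feature words whose increase rate exceeds threshold.

    target_result: {feature_word: appearance_frequency} chosen in step D1.
    earlier_results: list of such dicts used as the comparison baseline.
    threshold: increase rate set in step D2 (0.5 means +50%).
    """
    alarms = []
    for word, freq in target_result.items():             # steps D3 and D6
        past = [result.get(word, 0) for result in earlier_results]
        baseline = sum(past) / len(past) if past else 0   # assumed: mean of earlier results
        rate = increase_rate(freq, baseline)               # step D4
        if rate > threshold:
            alarms.append({"word": word, "frequency": freq, "increase_rate": rate})  # step D5
    return alarms                                          # step D7: caller outputs these


# Hypothetical frequencies.
earlier = [
    {"design": 20, "difficult to use": 10},
    {"design": 20, "difficult to use": 10},
]
target = {"difficult to use": 30, "design": 22, "store": 5}
alarms = detect_alarms(target, earlier, threshold=0.5)
```

Passing `earlier_results` explicitly covers both variants described above: all results preceding the target in the comparison list, or only the results designated by the analyst.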

The present invention is not limited to the above embodiments, and it goes without saying that various modifications are possible within the scope of the present invention.
For example, in the above embodiments, the comparison feature extraction means 34 performs new word extraction display, keyword-of-interest highlight display, and graph display on the output device 40, but results based on additional comparative analysis indices may of course be displayed as well. It is also preferable that these comparative analysis results can be transmitted to the information sharing server 50 and output there.
Furthermore, the comparison list can be transmitted to the information sharing server 50 together with the alarm information from the alarm output means 35, and other changes can be made as appropriate.

The present invention can be applied to uses such as an information mining apparatus that extracts various kinds of information, such as noteworthy trends and tendencies, risks, and problems, from the voice of the customer, for example inquiries received by a contact center or customer satisfaction survey questionnaires, and a program for implementing such an information mining apparatus on a computer.

FIG. 1 is a block diagram showing the configuration of the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 2 is a diagram showing text data in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 3 is a diagram showing analysis result data in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 4 is a diagram showing the comparison target list setting screen in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 5 is a diagram showing the comparison result list in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 6 is a diagram showing a new word extraction display result in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 7 is a diagram showing a keyword-of-interest highlight display result in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 8 is a diagram showing a graph display result in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 9 is a flowchart showing the comparison setting processing procedure in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 10 is a flowchart showing the new word extraction processing procedure in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 11 is a flowchart showing the keyword-of-interest highlight and graphing processing procedure in the text mining analysis apparatus of the first embodiment of the present invention.
FIG. 12 is a block diagram showing the configuration of the text mining analysis apparatus of the second embodiment of the present invention.
FIG. 13 is a flowchart showing the alarm output processing procedure in the text mining analysis apparatus of the second embodiment of the present invention.

Explanation of symbols

10 Input device
20 Storage device
21 (21-1 to 21-n) Text data
22 (22-1 to 22-m) Analysis result data
30 Data analysis processing device
31 Feature analysis execution means
32 Comparison setting means
33 Comparison list display means
34 Comparison feature extraction means
35 Alarm output means
40 Output device
50 Information sharing server

Claims

A text mining analysis apparatus comprising two or more sets of text data, feature analysis execution means for performing text mining analysis on each set of text data, and two or more sets of analysis result data each holding feature words obtained by the analysis and one or more predetermined feature degrees corresponding to the feature words,
the apparatus further comprising comparison means that selects the analysis result data to be subjected to comparative analysis based on input information, creates a comparison list based on the selected analysis result data, sets a comparison condition based on input information, executes comparative analysis on the comparison list according to the comparison condition, and outputs a comparison result of the comparative analysis.

The text mining analysis apparatus according to claim 1, wherein the comparison means:
in setting the comparison condition, selects one set of analysis result data in the comparison list as a target based on input information, and designates from the comparison list one or more sets of analysis result data to be compared against the target;
in executing the comparative analysis, detects as a new word any feature word that exists in the target but does not exist in the analysis result data to be compared; and
in outputting the comparison result, outputs a comparison result in which the new word is highlighted.

The text mining analysis apparatus according to claim 1 or 2, wherein the comparison means:
in setting the comparison condition, selects one or more feature words to be analyzed from the comparison list based on input information;
in executing the comparative analysis, detects all occurrences of the feature words to be analyzed in the comparison list; and
in outputting the comparison result, outputs a comparison result in which the detected feature words to be analyzed are highlighted.

The text mining analysis apparatus according to claim 3, wherein, in outputting the comparison result, the comparison means outputs a graph displaying, for each set of analysis result data in the comparison list, the feature degree corresponding to the detected feature words to be analyzed.

The text mining analysis apparatus according to any one of claims 1 to 4, further comprising alarm output means that sets a predetermined alarm output criterion based on input information, detects a feature word satisfying the alarm output criterion from the comparison list or the comparison result, and outputs alarm information for the feature word.

The text mining analysis apparatus according to any one of claims 1 to 5, wherein the comparison means or the alarm output means outputs the comparison result or the alarm information to an output device provided in the text mining analysis apparatus or to an information processing device connected to the text mining analysis apparatus by wire or wirelessly.

The text mining analysis apparatus according to any one of claims 1 to 6, wherein, on receiving a selection input of a feature word in the comparison list, the comparison means displays information of the text data from which the selected feature word was extracted.

A text mining analysis method for performing comparative analysis of analysis result data using a text mining analysis apparatus comprising two or more sets of text data, feature analysis execution means for performing text mining analysis on each set of text data, and two or more sets of analysis result data each holding feature words obtained by the analysis and one or more predetermined feature degrees corresponding to the feature words, wherein the text mining analysis apparatus:
selects the analysis result data to be subjected to comparative analysis based on input information;
creates a comparison list based on the selected analysis result data;
sets a comparison condition based on input information;
executes comparative analysis on the comparison list according to the comparison condition; and
outputs a comparison result of the comparative analysis.

The text mining analysis method according to claim 8, wherein the text mining analysis apparatus:
sets a predetermined alarm output criterion based on input information;
detects a feature word satisfying the alarm output criterion from the comparison list; and
outputs alarm information for the feature word.

A text mining analysis program for causing a text mining analysis apparatus, which comprises two or more sets of text data, feature analysis execution means for performing text mining analysis on each set of text data, and two or more sets of analysis result data each holding feature words obtained by the analysis and one or more predetermined feature degrees corresponding to the feature words, to perform comparative analysis of the analysis result data, the program causing the text mining analysis apparatus to:
select the analysis result data to be subjected to comparative analysis based on input information;
create a comparison list based on the selected analysis result data;
set a comparison condition based on input information;
execute comparative analysis on the comparison list according to the comparison condition; and
output a comparison result of the comparative analysis.

The text mining analysis program according to claim 10, further causing the text mining analysis apparatus to:
set a predetermined alarm output criterion based on input information;
detect a feature word satisfying the alarm output criterion from the comparison list; and
output alarm information for the feature word.