JP5332918B2

JP5332918B2 - Classification data recommendation method, program, and apparatus

Info

Publication number: JP5332918B2
Application number: JP2009135132A
Authority: JP
Inventors: 章文中浜
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-06-04
Filing date: 2009-06-04
Publication date: 2013-11-06
Anticipated expiration: 2029-06-04
Also published as: JP2010282416A

Abstract

PROBLEM TO BE SOLVED: To automatically check the classification data set for new text information, and to provide the operator with candidate information when the classification data may have been assigned incorrectly or has not yet been entered.

SOLUTION: A classification code table creation unit 101 generates two files from a population text file 109 of text information and a sample text file 110 of text information to which classification data judged to be correct has been assigned: a classification code table file 104 indicating the correspondence between classification data and classification codes, and a classification code statistical information table file 105 that collects how the classification codes appear in the population text file 109. A classification data recommendation unit 102 extracts classification codes from new text information, extracts the classification data corresponding to those classification codes from the classification code table file 104, extracts statistical information from the classification code statistical information table file 105 for each item of classification data, and then selects and presents candidates for the classification data corresponding to the new text information on the basis of the statistical information.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a technique for storing text information together with classification data that indicates the category of the text information by means of one or more category values.

In recent years, a growing number of companies have been turning their attention to the information accumulated in call centers and analyzing it. Behind this trend is the spread of customer service systems, text mining software, and similar tools, which has put the IT infrastructure of call centers in place.

However, the accumulated information is prone to erroneous input for various reasons. For example, data is often entered during a call, or several cases are entered together after the calls have ended, so the operator frequently does not notice an erroneous input.

Furthermore, the staff of call centers and customer service desks often consist of part-time and temporary workers, and regular staff turnover makes it difficult to keep the quality of input at a constant level.
Examples of erroneous input include the following.
(A) Typographical errors and omissions in text information (information that is typed in directly)
(B) Incorrectly assigned classification data (information selected from a pull-down menu)
(C) Classification data left unset
The cases (A) to (C) above assume input in a customer service system 1801 as shown in FIG. 18. A customer service system is a system in which the operator enters the response start time, customer information, question content, and response content on screen for each case, and which manages the resulting history; it has a response history database 1802.

Although the input screen of such a customer service system differs by industry and type of business, it is in many cases composed of roughly four kinds of information.

(1) Basic information
The part for entering information about the reception of the case, such as the case number, the reception date and time, and the receptionist
(2) Customer information
The part for entering information about the customer, such as the customer ID and the customer name
(3) Inquiry information
The part for entering information about the question content and the question category
(4) Answer information
The part for entering information about the answer content and the answer category

Items (1) to (4) above include items that are typed in directly and items that are selected from pull-down menus, check boxes, and the like. In addition, initial values may be set when the screen starts, or a database may be searched using a specific input value so that other item values are set automatically.

Erroneous input is likely to occur when entering the categories of the question in (3) and the answer in (4) above. In practice, the typographical errors and omissions of (A) usually affect only part of the entered characters. Even when they occur, the text rarely becomes unintelligible, so there is little need to check and correct them each time.

The classification data of (B) and (C), on the other hand, exists because checking the text content by eye is very laborious: its purpose is to let trends be grasped quickly by aggregating the classification data that operators have assigned. The classification data therefore needs to be accurate.

If the accuracy of the classification data is low, the aggregated results are no longer valid. For this reason, some companies check the input data by eye and correct the classification data. In many companies, however, a shortage of staff and the sheer volume of accumulated information leave no choice but to accumulate the data as it was entered. Moreover, the classification data problems of (B) and (C) are more serious than problem (A), because erroneous input continues for as long as the operator does not know the input method or misunderstands it.

JP 2001-060199 A
JP 2002-183194 A
JP 2002-189754 A

However, there has been no effective conventional technique for checking erroneous input of classification data as described above or for supporting the input of classification data.
Accordingly, one aspect of the present invention aims to make it possible to automatically check the classification data set on the input screen for new text information in a customer service system or the like, and to provide the operator with candidate information (recommendations) when the classification data may have been assigned incorrectly or has not been entered.

One example of an aspect is realized as a method of storing text information together with classification data that indicates the category of the text information by means of one or more category values, and has the following configuration.
First, a population text file of text information, each item of which has been given classification data by an operator, and a sample text file of text information that has been given classification data judged to be correct are compared using, as keys, classification codes each composed of an arbitrary combination of predetermined text units. This comparison creates a classification code table file that indicates the correspondence between classification data and classification codes.

Next, for each item of classification data extracted from the classification code table file, a classification code statistical information table file is created. This file collects, as statistical information, how the classification codes associated with that classification data in the classification code table file appear in the population text file.

Next, classification codes are extracted from newly entered text information, and the classification code table file is searched for each extracted classification code, so that the classification data corresponding to that classification code is extracted.

Then, statistical information is extracted from the classification code statistical information table file for each item of extracted classification data, and, on the basis of the extracted statistical information, candidates for the classification data corresponding to the new text information are selected from the extracted classification data and presented.

When new text information is entered, the candidate classification data (recommendation) best suited to it can be presented accurately.

FIG. 1 is a configuration diagram of an embodiment of a customer service system having a classification data recommendation function.
FIG. 2 is an operation flowchart showing the classification code table creation process executed by the classification code table creation unit 101.
FIG. 3 is an operation flowchart showing the control operation of the classification data recommendation display control process executed by the classification data recommendation unit 102.
FIG. 4 is an operation flowchart showing the control operation of the automatic classification code re-creation process executed by the automatic classification code re-creation unit 103.
FIG. 5 is a diagram showing a data configuration example of the population text file 109.
FIG. 6 is a diagram showing a data configuration example of the sample text file 110.
FIG. 7 is a diagram showing the correspondence between classification data and classification levels and classification codes.
FIG. 8 is a diagram showing a data configuration example of the classification code table file 104.
FIG. 9 is a diagram showing a data configuration example of the classification code statistical information table file 105.
FIG. 10 is a diagram showing an input example of response information and examples of the corresponding morphemes, morpheme patterns, and morpheme matrix.
FIG. 11 is a diagram showing a data configuration example of the morpheme matrix file that constitutes the morpheme matrix.
FIG. 12 is a diagram showing a data configuration example of the search result file 107.
FIG. 13 is a diagram showing a data configuration example of the recommendation file 108.
FIG. 14 is an explanatory diagram of the recommendation message display process.
FIG. 15 is a diagram showing a display example of the recommended classification data.
FIG. 16 is a diagram showing a data configuration example of the classification code statistical information table file 105 in the automatic classification code re-creation process.
FIG. 17 is a diagram showing an example of the hardware configuration of a computer that can implement the embodiment of the customer service system.
FIG. 18 is an explanatory diagram of a customer service system.

Embodiments are described in detail below.
FIG. 1 is a configuration diagram of an embodiment of a customer service system having a classification data recommendation function.

The customer service system 100 according to this embodiment includes a classification code table creation unit 101, a classification data recommendation unit 102, and an automatic classification code re-creation unit 103. The classification code table creation unit 101 creates a classification code table file 104 and a classification code statistical information table file 105 in response to a creation request from the operator. When the operator enters data on screen into the response history database 106 while handling a customer, the classification data recommendation unit 102 searches the classification code table file 104 and generates a search result file 107 and a recommendation file 108. The classification data recommendation unit 102 then recommends classification data, that is, presents candidates, on the basis of the contents of these files.

The automatic classification code re-creation unit 103 automatically re-creates the classification code table file 104.
The operation of the embodiment of the customer service system 100 having the above configuration is described in detail below.

FIG. 2 is an operation flowchart showing the classification code table creation process executed by the classification code table creation unit 101 of FIG. 1.
First, the operator specifies the population text file 109 of information source A and the sample text file 110 of information source B. FIG. 5 shows a data configuration example of the population text file 109, and FIG. 6 shows a data configuration example of the sample text file 110.

As shown in FIG. 5, each record (data set) of the population text file 109 consists of text information entered while handling customers over a certain past period, and classification data made up of the category values from category 1 to category N (for example, N = 3) that the operator set for that text information. Because this classification data was set at the operator's discretion, the text information has not necessarily been given an appropriate category. The population text file 109 holds all the data obtained from customer handling over a certain past period (for example, one month), so its volume is large (for example, 10,000 records). To keep this file consistent with the files described later, the value obtained by joining the category values from category 1 to category N that make up the classification data is used as the classification data ID.
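The classification data ID described above can be illustrated with a minimal sketch. The function name, the sample category values, and the "-" separator are assumptions made for illustration; the source only states that the category values are joined.

```python
from typing import Sequence


def classification_data_id(category_values: Sequence[str], sep: str = "-") -> str:
    """Join category 1..N into one ID. The separator is an assumption."""
    return sep.join(category_values)


# Hypothetical record: text information plus category 1..N (here N = 3).
text = "The delivery note cannot be printed"
categories = ("Sales", "Forms", "Printing")
data_id = classification_data_id(categories)
print(data_id)
```

Using one joined value as the key lets the population file, the sample file, and the tables derived from them refer to the same classification data without comparing N separate columns.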

In contrast, as shown in FIG. 6, each record (data set) of the sample text file 110 consists of text information and classification data that has been set optimally for that text information. The text information in the sample text file 110 and the corresponding classification data have been corrected by hand so that they correspond optimally. The sample text file 110 holds, for example, about 50 records per set of classification data. As with the population text file 109, the value obtained by joining the category values from category 1 to category N that make up the classification data is used as the classification data ID.

The population text file 109 and the sample text file 110 may each be a text file such as a CSV file, or a record file in a database system.

The classification code table creation unit 101 of FIG. 1 creates the classification code table file 104 from the population text file 109 (information source A) and the sample text file 110 (information source B) as follows.

First, the classification code table creation unit 101 extracts classification codes from the population text file 109 and the sample text file 110 (step S201 in FIG. 2).
In this embodiment, the classification codes are created using the technique described in Japanese Patent Application No. 2008-258776, filed by the applicant of the present application. The processing performed by this technique is outlined below.

Step 1
Morphological analysis is performed on the text information (see FIG. 6) of each record of the sample text file 110, in which the classification data has been set appropriately, and the collection of sets of two morphemes (two-morpheme sets) that the records contain in common is extracted.

Step 2
Morphological analysis is also performed on the text information (see FIG. 5) of each record of the population text file 109, and the morphemes that make up each text in the population text file 109 are extracted.
The classification level, which indicates the number of extraction iterations, is set to 1.
The entire population text file 109 is selected as the initial set of population texts to be processed.

Step 3
For each two-morpheme set extracted in step 1, the number of records in the sample text file 110 in which the two morphemes occur together (the number of occurrences in the sample) and the number of co-occurrences in the set of population texts being processed (the number of occurrences in the population) are calculated. Then "appearance rate = number of occurrences in the sample ÷ number of occurrences in the population" is calculated.
The two-morpheme sets whose appearance rate falls within a predetermined number of top ranks are selected as the classification codes for the current classification level (initially 1).

Step 4
Each data set consisting of a classification code (= two-morpheme set) selected in step 3 and the current classification level is associated with the classification data that is set for the sample text file 110 records containing that classification code.

Step 5
The classification level is incremented by 1, and the set of text information in the population text file 109 that contains the classification codes (= two-morpheme sets) selected in step 3 becomes the new set of population texts to be processed. Steps 3 and 4 and this step 5 are then executed repeatedly.
In this step 5, the classification code extraction process ends once the change in the number of items of text information in the set of population texts falls to or below a predetermined threshold.
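Steps 1 to 5 above can be sketched roughly as follows. This is an illustrative reading, not the method of the cited application: whitespace splitting stands in for morphological analysis, every pair of tokens that co-occurs in a sample record is treated as a candidate two-morpheme set, and the names `tokenize`, `extract_codes`, `top_k`, `threshold`, and `max_levels` are all assumptions.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Pair = FrozenSet[str]


def tokenize(text: str) -> Set[str]:
    # Assumption: whitespace splitting stands in for morphological analysis.
    return set(text.lower().split())


def extract_codes(
    sample: List[Tuple[str, str]],  # (text, classification data ID)
    population: List[str],          # text information only
    top_k: int = 2,                 # "predetermined number of top ranks" (assumed value)
    threshold: int = 0,             # stop when the population shrinks by <= this
    max_levels: int = 10,           # safety bound, not part of the source
) -> List[Tuple[str, int, Pair]]:
    sample_tok = [(tokenize(t), cid) for t, cid in sample]
    # Step 1: candidate two-morpheme sets taken from the sample records.
    candidates: Set[Pair] = set()
    for tokens, _ in sample_tok:
        candidates |= {frozenset(p) for p in combinations(sorted(tokens), 2)}
    # Step 2: analyse the population; start at level 1 with the whole population.
    current = [tokenize(t) for t in population]
    level, rows = 1, []
    while level <= max_levels and current:
        # Step 3: appearance rate = count in sample / count in current population.
        rates = {}
        for pair in candidates:
            in_population = sum(1 for tokens in current if pair <= tokens)
            if in_population:
                in_sample = sum(1 for tokens, _ in sample_tok if pair <= tokens)
                rates[pair] = in_sample / in_population
        selected = sorted(rates, key=lambda p: (-rates[p], sorted(p)))[:top_k]
        # Step 4: attach the classification data of the matching sample records.
        for pair in selected:
            for cid in sorted({c for tokens, c in sample_tok if pair <= tokens}):
                rows.append((cid, level, pair))
        # Step 5: narrow the population to texts containing a selected code.
        narrowed = [t for t in current if any(p <= t for p in selected)]
        if len(current) - len(narrowed) <= threshold:
            break
        current, level = narrowed, level + 1
    return rows


sample = [("print delivery note", "A"), ("print delivery error", "A")]
population = ["print delivery note", "print delivery error",
              "print invoice total", "delivery delayed today"]
rows = extract_codes(sample, population)
for cid, level, pair in rows:
    print(cid, level, sorted(pair))
```

Each returned row pairs a classification data ID with a classification level and a code, which is the shape of the correspondence that the later table-building step consumes.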

FIG. 7 shows the correspondence between classification data and classification levels and classification codes that is generated by repeating steps 1 to 5 above. A classification code is a candidate for the appropriate set of two morphemes (two-morpheme set) that morphological analysis should find in a text when certain classification data is set for that text. Several classification codes may be associated with one kind of classification data. The same classification code may also be associated with one kind of classification data several times, at different classification levels.

As step 5 above shows, the population texts are narrowed down as the classification level advances during classification code extraction. Consequently, when the morphemes obtained by morphological analysis of newly entered text include the two morphemes of a classification code that has been selected across several classification levels, the following inference can be made: the newly entered text is likely to be closer in content to the sample text file 110 records for which the classification data associated with that classification code is set. It is therefore recommended that this classification data be set for the newly entered text. This is the basic idea behind the classification data recommendation function of this embodiment.

Next, the classification code table creation unit 101 of FIG. 1 creates the classification code table file 104 from the classification codes extracted in step S201 of FIG. 2 (step S202 in FIG. 2). FIG. 8 shows a data configuration example of the classification code table file 104 generated from the correspondence of FIG. 7. "Lowest classification level" data is added to each data set (record) in the format of the FIG. 7 correspondence. This data indicates down to which classification level the corresponding classification code was extracted. For example, the classification code "delivery note-print" in the first record shown in FIG. 8 is data for classification level 1, but the added data shows that this classification code has also been selected at classification level 3. The classification data recommendation unit 102 of FIG. 1 uses this information when it executes the classification code table search process described later.
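One way to derive the "lowest classification level" column is sketched below. The row layout and the names `CodeTableRow` and `build_code_table` are hypothetical, and taking the deepest level at which the same code was selected for the same classification data is an interpretation of the FIG. 8 description.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

Code = Tuple[str, str]


class CodeTableRow(NamedTuple):
    classification_data_id: str
    level: int
    code: Code
    lowest_level: int


def build_code_table(correspondence: List[Tuple[str, int, Code]]) -> List[CodeTableRow]:
    """Add the deepest level reached by each (classification data, code) pair."""
    deepest: Dict[Tuple[str, Code], int] = defaultdict(int)
    for cid, level, code in correspondence:
        deepest[(cid, code)] = max(deepest[(cid, code)], level)
    return [
        CodeTableRow(cid, level, code, deepest[(cid, code)])
        for cid, level, code in correspondence
    ]


# Hypothetical correspondence rows: (classification data ID, level, code).
correspondence = [
    ("A", 1, ("delivery note", "print")),
    ("A", 2, ("delivery note", "print")),
    ("A", 3, ("delivery note", "print")),
    ("A", 1, ("invoice", "print")),
]
table = build_code_table(correspondence)
for row in table:
    print(row)
```

Storing the lowest level on every row means a later search can tell, from any single matching row, how many levels must match for the classification data to count as a hit.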

The classification code table file 104 may be a text file such as a CSV file, or a record file in a database system. One row of FIG. 8 represents one data set, that is, one record. The value obtained by joining the category values from category 1 to category N that make up the classification data is used as the classification data ID.

Next, the classification code table creation unit 101 creates the classification code statistical information table file 105 from the results of the classification code extraction process in step S201 of FIG. 2 (step S203 in FIG. 2).

FIG. 9 shows a data configuration example of the classification code statistical information table file 105. The file consists of a classification data item (the set of category values from category 1 to category N), the classification code statistical information items R1 to R4 and R6 to R8, a classification accuracy item, and a creation date item. The value obtained by joining the category values from category 1 to category N of the classification data item is used as the classification data ID.

On the basis of the processing results of the operation flowchart of FIG. 2, the records of the classification code table file 104 that hold the same classification data in the classification data item are consolidated into one record of the classification code statistical information table file 105. The items of each record being processed in the classification code statistical information table file 105 are generated as follows.

The R1 item holds the number of records in the population text file 109 from which the classification code table file 104 was derived.
The R2 item holds the number of records in the sample text file 110.

The R3 item holds the number of records extracted when the population text file 109 is searched using the one or more classification codes associated with the classification data. More specifically, the classification code table creation unit 101 searches the classification data item of the classification code table file 104 of FIG. 8 using the classification data of the record being processed in the classification code statistical information table file 105. It then extracts the classification code stored in the classification code item (see FIG. 8) of each record found by the search. Next, for each extracted classification code, the classification code table creation unit 101 searches the text information item (see FIG. 5) of the population text file 109 and obtains the number of records that contain both of the two morphemes making up that classification code. Finally, it sums these record counts over all the extracted classification codes and registers the result in the R3 item.

The R4 item holds the number of records, among those retrieved from the population text file 109 when determining the R3 item, for which the same classification data is set as the classification data of the record being processed in the classification code statistical information table file 105.

The classification accuracy item holds the value obtained by dividing the R4 item value by the R3 item value.
The creation date is the date of the record containing the most recently registered text information (FIG. 5) in the population text file 109.
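The R1 to R4 items and the classification accuracy can be sketched as below. This is a simplified illustration under stated assumptions: population records are given as already-analysed morpheme sets, the function name `code_statistics` and the dictionary keys are hypothetical, R4 is summed per classification code in the same way as R3, and an accuracy of 0.0 is returned when R3 is zero (the source does not say what happens in that case).

```python
from typing import Dict, List, Set, Tuple

Code = Tuple[str, str]


def code_statistics(
    cid: str,                                # classification data ID being processed
    codes: List[Code],                       # codes associated with cid in the code table
    population: List[Tuple[Set[str], str]],  # (morphemes, classification data ID)
    sample_size: int,                        # number of records in the sample file
) -> Dict[str, float]:
    r3 = r4 = 0
    for code in codes:
        # Population records that contain both morphemes of this code.
        hits = [rec_cid for morphemes, rec_cid in population if set(code) <= morphemes]
        r3 += len(hits)
        # Of those, the records that already carry the same classification data.
        r4 += sum(1 for rec_cid in hits if rec_cid == cid)
    accuracy = r4 / r3 if r3 else 0.0  # assumption: 0.0 when nothing was retrieved
    return {"R1": len(population), "R2": sample_size, "R3": r3, "R4": r4,
            "accuracy": accuracy}


population = [
    ({"print", "delivery", "note"}, "A"),
    ({"print", "delivery", "error"}, "B"),
    ({"invoice", "total"}, "A"),
]
stats = code_statistics("A", [("delivery", "print")], population, sample_size=2)
print(stats)
```

Read this way, the classification accuracy is the share of population records matched by the codes that operators had already labeled with the same classification data.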

R6 to R8 are used by the automatic classification code re-creation unit 103 of FIG. 1, and their initial values are all 0. They are described in detail later.
Once the classification code table creation unit 101 of FIG. 1 has created the classification code table file 104 and the classification code statistical information table file 105 as described above, the classification data recommendation unit 102 of FIG. 1 becomes operable.

FIG. 3 is an operation flowchart showing the control operation of the classification data recommendation display control process executed by the classification data recommendation unit 102.
First, the operator (see FIG. 1) uses the screen input unit of the customer service system 100 to enter the details of the response to the customer (step S301 in FIG. 3). FIG. 10(a) shows an input example. Classification data (major category, middle category, and so on; finer categories are also possible) is entered together with the text of the question.

In response, the classification data recommendation unit 102 of FIG. 1 first issues one search ID for the content entered in step S301 (step S302 in FIG. 3). A search ID looks like "A00000000000000", for example.

The entered content is stored sequentially in the response history database 106 of FIG. 1 together with the search ID and the creation date. It is used for analysis of the response history and similar purposes, and is also referenced in the automatic classification code re-creation process described later.

Next, the classification data recommendation unit 102 extracts the text data of the question from the content entered in step S301 and performs morphological analysis on it (step S303 in FIG. 3). As a result, morphemes such as those illustrated in FIG. 10(b) are extracted from the text data of the question part of the input example of FIG. 10(a).

Next, from the morphemes extracted in step S303, the classification data recommendation unit 102 selects morpheme patterns each consisting of two morphemes, as illustrated in FIG. 10(c), and creates a morpheme matrix file from them (step S304 in FIG. 3). FIG. 10(d) shows the morpheme matrix conceptually: the morphemes that can appear are arranged along the columns and along the rows, and the intersection of each column and each row represents the morpheme pattern made up of that column's morpheme and that row's morpheme. The cells marked ×, which correspond to pairs of identical morphemes and to pairs that merely reverse the order of another pair, are excluded, and only the morpheme patterns in the cells marked ○ are extracted as the data making up the elements of the morpheme matrix. FIG. 11 shows a data configuration example of the morpheme matrix file that constitutes the morpheme matrix. The morpheme patterns making up the elements of the morpheme matrix are registered for each search ID issued in step S302, that is, for each input entered in step S301.
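The selection of the ○ cells of the morpheme matrix amounts to taking each unordered pair of distinct morphemes once. A minimal sketch follows; the function names and the row layout of the morpheme matrix file are assumptions, and the morphemes are taken as already extracted.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Pattern = Tuple[str, str]


def morpheme_patterns(morphemes: Sequence[str]) -> List[Pattern]:
    """Return each unordered pair of distinct morphemes exactly once."""
    unique = list(dict.fromkeys(morphemes))  # drop repeats, keep first-seen order
    return list(combinations(unique, 2))     # no (a, a) and no reversed duplicates


def morpheme_matrix_rows(search_id: str, morphemes: Sequence[str]) -> List[Tuple[str, Pattern]]:
    """One row per pattern, keyed by the search ID (hypothetical file layout)."""
    return [(search_id, pattern) for pattern in morpheme_patterns(morphemes)]


patterns = morpheme_patterns(["delivery note", "print", "error", "print"])
matrix_rows = morpheme_matrix_rows("A00000000000000", ["delivery note", "print", "error"])
print(patterns)
```

For n distinct morphemes this yields n(n-1)/2 patterns, which matches the count of ○ cells on one side of the diagonal of the matrix.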

Next, the segment data recommendation unit 102 repeatedly executes the following series of processes in the loop from step S305 to step S310.
First, the segment data recommendation unit 102 selects, one at a time, the morpheme patterns that are the elements of the morpheme matrix from the morpheme matrix file obtained in step S304 (step S306 in FIG. 3).

Next, in the classification code table file 104 (see FIG. 8), the segment data recommendation unit 102 extracts the records that have the morpheme pattern selected in step S306 as the value of the classification code item. The segment data recommendation unit 102 then examines the classification hierarchy item and the classification lowest-layer item of the extracted records. Based on this examination, it searches for segment data (= segment ID) for which records have been extracted for every hierarchy level from the first level down to the level indicated by the classification lowest-layer item (the above is step S307 in FIG. 3).

Next, the segment data recommendation unit 102 determines whether the search in step S307 produced a hit (step S308 in FIG. 3). If it did, the unit registers the hit segment data in the search result file 107 in association with the search ID issued in step S302 (step S309 in FIG. 3). FIG. 12 shows an example of the data structure of the search result file 107. In each record of the search result file 107, the search ID item stores the search ID issued in step S302, and the segment data item stores the segment data hit by the search in step S307.
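One possible reading of the lookup in steps S307 to S309 is sketched below. The record layout is an assumption, because FIG. 8 is not reproduced here: each row of the classification code table is taken to be a dict with the hypothetical keys `segment_id`, `level`, `lowest_level`, and `code`. A segment counts as a hit when the rows matching the pattern cover every level from 1 to its lowest level.

```python
from collections import defaultdict


def find_hit_segments(code_table, pattern):
    """Return segment IDs whose levels 1..lowest_level all match `pattern`."""
    levels_hit = defaultdict(set)   # segment_id -> hierarchy levels matched
    lowest = {}                     # segment_id -> lowest hierarchy level
    for row in code_table:
        if row["code"] == pattern:
            levels_hit[row["segment_id"]].add(row["level"])
            lowest[row["segment_id"]] = row["lowest_level"]
    # A hit requires every level from the first to the lowest to be present.
    return [
        seg for seg, levels in levels_hit.items()
        if levels == set(range(1, lowest[seg] + 1))
    ]


# Hypothetical table: segment "A" matches on both levels, "B" on level 1 only.
table = [
    {"segment_id": "A", "level": 1, "lowest_level": 2, "code": ("paper", "jam")},
    {"segment_id": "A", "level": 2, "lowest_level": 2, "code": ("paper", "jam")},
    {"segment_id": "B", "level": 1, "lowest_level": 2, "code": ("paper", "jam")},
    {"segment_id": "B", "level": 2, "lowest_level": 2, "code": ("toner", "low")},
]
search_results = [
    {"search_id": 1, "segment": seg}
    for seg in find_hit_segments(table, ("paper", "jam"))
]
```

The list `search_results` plays the role of the search result file 107, with one record per hit segment under the current search ID.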

The series of processes from step S306 to step S309 is executed repeatedly for each morpheme pattern constituting an element of the morpheme matrix obtained from the customer response input.
When the loop from step S305 to step S310 has finished for all morpheme patterns in the morpheme matrix, the segment data recommendation unit 102 performs the following for each record newly registered in the search result file 107 in step S309 (the processing target record). It searches the classification code statistical information table file 105 for the record whose segment data item equals the segment data of the processing target record (see FIG. 12). It then extracts the classification accuracy value stored in the classification accuracy item of the retrieved record and joins that value to the contents of the processing target record. Finally, it registers in the recommendation file 108 the classification accuracy values up to a predetermined rank (for example, the third) together with the contents of the processing target records joined to them (the above is step S311 in FIG. 3). FIG. 13 shows an example of the data structure of the recommendation file 108. Each record of the recommendation file 108 consists of a search ID item and a segment data item, which store the search ID and segment data of the processing target record (see FIG. 12), and a classification accuracy item, which stores the classification accuracy value.
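The join and ranking of step S311 can be sketched as below. The names `build_recommendations` and `accuracy_by_segment` and the dict-based record layout are hypothetical. Collapsing repeated hits of the same segment data into one entry is also an assumption; the text does not say how duplicates in the search result file are handled.

```python
def build_recommendations(search_id, hit_segments, accuracy_by_segment, top_n=3):
    """Join hit segments with their classification accuracy and keep the top N."""
    joined = [
        {"search_id": search_id, "segment": seg,
         "accuracy": accuracy_by_segment[seg]}
        # dict.fromkeys() drops repeated hits while keeping order (assumption).
        for seg in dict.fromkeys(hit_segments)
        if seg in accuracy_by_segment
    ]
    # Highest classification accuracy first, then cut at the predetermined rank.
    joined.sort(key=lambda rec: rec["accuracy"], reverse=True)
    return joined[:top_n]


# Hypothetical accuracy values standing in for the statistical information table.
accuracy = {"A": 0.6, "B": 0.9, "C": 0.7, "D": 0.1}
recommendations = build_recommendations(1, ["A", "B", "A", "C", "D"], accuracy)
```

The default `top_n=3` mirrors the "for example, the third" rank mentioned in the text; it is a parameter rather than a fixed value.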

Next, the segment data recommendation unit 102 compares the segment data of the response content being input in step S301 (see FIG. 10(a)) with the recommendation file 108 (step S312 in FIG. 3). Specifically, from the recommendation file 108 (see FIG. 13), the unit extracts the records whose search ID item equals the search ID issued in step S302, in descending order of the classification accuracy registered in the classification accuracy item. It then determines which rank among the extracted records, if any, holds segment data (see FIG. 13) matching the segment data in the response content.

Based on the result of the comparison in step S312, the segment data recommendation unit 102 displays a message recommending segment data for the response content being input (step S313 in FIG. 3). FIG. 14 shows the relationship between the comparison in step S312 and the control process, executed in response to the comparison result, for displaying the recommendation message. If the segment data in the response content matches the segment data of the first-ranked record extracted from the recommendation file 108, that is, the record with the highest classification accuracy, no recommendation message is displayed, because the segment data specified by the input person is already the best choice. If the segment data in the response content matches the segment data of a record ranked other than first, that is, a record whose classification accuracy is second or lower, the segment data of the records ranked higher than the matched rank is displayed as a recommendation, because segment data better than the input person's choice exists.
Further, if the segment data in the response content matches none of the records extracted from the recommendation file 108, the input person's choice is judged to be inappropriate, and the segment data candidates of all extracted ranks are displayed as recommendations. FIG. 15 shows an example of the recommendation display presented to the input person by the above processing.
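The three cases of steps S312 and S313 reduce to a small decision function, sketched here under stated assumptions: `ranked_segments` is a hypothetical list of the segment data taken from the recommendation file in descending order of classification accuracy, and an empty result stands for "display no recommendation message".

```python
def segments_to_recommend(entered_segment, ranked_segments):
    """Return the segment data to display as recommendations."""
    if entered_segment in ranked_segments:
        rank = ranked_segments.index(entered_segment)
        # Matched at rank 0: nothing ranks higher, so nothing is displayed.
        # Matched lower down: recommend only the higher-ranked candidates.
        return ranked_segments[:rank]
    # No match at any rank: recommend every extracted candidate.
    return list(ranked_segments)


ranked = ["B", "C", "A"]                    # hypothetical ranking
print(segments_to_recommend("B", ranked))   # entered value is the top candidate
print(segments_to_recommend("A", ranked))   # entered value is ranked third
print(segments_to_recommend("Z", ranked))   # entered value is not a candidate
```

Slicing up to the matched rank covers the first two cases in one expression, since the slice is empty when the match is at the top.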

In the display illustrated in FIG. 15, the input person can adopt any recommended segment data by clicking the "Select" link to the right of that recommendation. The segment data recommendation unit 102 accepts the input person's selection of recommended segment data (step S314 in FIG. 3) and automatically sets the selected segment data in the segment data input form area of the response input screen (step S315 in FIG. 3).

The processing of the segment data recommendation unit 102 then ends.
The embodiment described above makes it possible to present accurately the segment data candidates (recommendations) best suited to newly input response content.

FIG. 4 is an operation flowchart showing the control operation of the automatic classification code re-creation process executed by the automatic classification code re-creation unit 103.
This process uses the classification code statistical information table file 105 to analyze automatically the degree of match of each segment data and to re-create automatically the classification codes of segment data whose degree of match is low. Automating this process eliminates manual checking of segment data, lets the classification codes be learned automatically, and enables recommendation with high accuracy.

In this process, the following check is first executed at regular intervals.
First, in each segment data record of the classification code statistical information table file 105, which has the data structure illustrated in FIG. 9, the automatic classification code re-creation unit 103 initializes the R6, R7, and R8 item values, that is, sets them to 0 (step S401 in FIG. 4).

Next, in each segment data record of the classification code statistical information table file 105, the automatic classification code re-creation unit 103 sets the following calculation result in the R6 item (step S402 in FIG. 4). The R4 and R1 item values are taken from the same record as the R6 item.

R6 item value = R4 item value / R1 item value

The R6 item value calculated in this way is the proportion of the population text file 109, whose record count is the R1 item value, made up of records with matching segment data, whose count is the R4 item value. In other words, it is the expected value.

Next, the automatic classification code re-creation unit 103 retrieves the combination of segment data and creation date of each record from the classification code statistical information table file 105 (see FIG. 9) (step S403 in FIG. 4).

Next, the automatic classification code re-creation unit 103 repeatedly executes the following series of processes for each combination of segment data and creation date retrieved in step S403 (the loop from step S404 to step S408 in FIG. 4).
That is, the automatic classification code re-creation unit 103 first selects one pair of segment data and creation date (step S405 in FIG. 4).

Next, the automatic classification code re-creation unit 103 searches the response history database 106 for the number of response history records whose stored creation date is on or after the creation date of the pair selected in step S405. It then sets that record count in the R7 item of the record corresponding to the pair selected in step S405 in the classification code statistical information table file 105 (the above is step S406 in FIG. 4).

Next, the automatic classification code re-creation unit 103 searches the response history database 106 for the number of response history records whose stored creation date is on or after the creation date of the pair selected in step S405 and whose stored segment data is the same as the segment data of that pair. It then sets that record count in the R8 item of the record corresponding to the pair selected in step S405 in the classification code statistical information table file 105 (the above is step S407 in FIG. 4).
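The two counts set in steps S406 and S407 can be sketched as follows. The in-memory list `history` and the keys `created` and `segment` are hypothetical stand-ins for the response history database 106, and the date comparison is inclusive of the creation date itself.

```python
from datetime import date


def count_r7_r8(history, segment, created_on):
    """Return (R7, R8) for one pair of segment data and creation date."""
    # R7: all history records created on or after the pair's creation date.
    recent = [rec for rec in history if rec["created"] >= created_on]
    # R8: the subset of those records carrying the same segment data.
    matching = [rec for rec in recent if rec["segment"] == segment]
    return len(recent), len(matching)


# Hypothetical history records.
history = [
    {"created": date(2009, 5, 1), "segment": "A"},
    {"created": date(2009, 6, 1), "segment": "A"},
    {"created": date(2009, 6, 2), "segment": "B"},
]
r7, r8 = count_r7_r8(history, "A", date(2009, 6, 1))
```

In a real database the same counts would more likely come from two aggregate queries filtered on the creation date column; the list comprehension here only illustrates the filter conditions.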

The series of processes from step S405 to step S407 is executed for all records in the classification code statistical information table file 105 (the loop from step S404 to step S408 in FIG. 4).

The automatic classification code re-creation unit 103 then extracts from the classification code statistical information table file 105 the segment data that satisfies the following condition (step S409 in FIG. 4).

R8 item value / R7 item value (ratio actually present) < R6 item value (expected ratio) × α

Here, α is a threshold system parameter whose value the administrator can set freely.

FIG. 16 shows an example of the data structure of the classification code statistical information table file 105 during the automatic classification code re-creation process, and corresponds to the data structure example of FIG. 9. For the record shown in this example, the population text file 109 originally held R1 = 10,000 records, of which R4 = 180 carried the segment data of that record. Therefore, R6 = R4 / R1 = 180 / 10,000 = 0.018. The number of records R7 registered in the response history database 106 on or after the latest creation date of that record is 20,000. Of these 20,000 records, the number R8 whose segment data matches the segment data of that record is 100. Accordingly, in step S409, segment data is extracted for records whose actual ratio (R8 / R7 = 100 / 20,000 = 0.005) is smaller than the expected ratio (R6 = 0.018) × α.
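The check of step S409 can be sketched with the record counts from this example. The function name `needs_recreation`, the default α of 1.0, and the handling of zero denominators are assumptions; the text leaves α to the administrator and does not discuss empty populations.

```python
def needs_recreation(r1, r4, r7, r8, alpha=1.0):
    """Return True when the actual ratio falls below the expected ratio x alpha."""
    if r1 == 0 or r7 == 0:
        # Assumption: with no records to compare, nothing is flagged.
        return False
    expected = r4 / r1   # R6: share of the segment data in the population
    actual = r8 / r7     # share of the segment data in the recent history
    return actual < expected * alpha


# Counts from the example record: R1 = 10,000, R4 = 180, R7 = 20,000, R8 = 100.
flagged = needs_recreation(r1=10_000, r4=180, r7=20_000, r8=100)
# A stricter (smaller) alpha tolerates a larger drop before flagging.
flagged_strict = needs_recreation(10_000, 180, 20_000, 100, alpha=0.25)
```

A smaller α makes re-creation rarer, because the actual ratio must fall further below the expected ratio before the segment data is extracted.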

Finally, if segment data satisfying the above condition was extracted in step S409, the automatic classification code re-creation unit 103 executes the following processing. It takes the group of records in the response history database 106 corresponding to R7 as the population text file 109, and the group of records in the response history database 106 corresponding to R8 as the sample text file 110. It then executes the same classification code creation process as described with the operation flowchart of FIG. 2, thereby automatically re-creating the classification code table file 104 and the classification code statistical information table file 105.

FIG. 17 shows an example of the hardware configuration of a computer that can implement the embodiment of the customer response system described above.
The computer shown in FIG. 17 has a CPU 1701, a memory 1702, an input device 1703, an output device 1704, an external storage device 1705, a portable recording medium drive device 1706 into which a portable recording medium 1709 is inserted, and a network connection device 1707, all interconnected by a bus 1708. The configuration shown in the figure is one example of a computer that can implement the above system, and such a computer is not limited to this configuration.

The CPU 1701 controls the entire computer. The memory 1702 is a memory such as a RAM that temporarily holds programs or data stored in the external storage device 1705 (or on the portable recording medium 1709) during program execution, data updates, and the like. The CPU 1701 performs overall control by reading a program into the memory 1702 and executing it.

The input device 1703 consists of, for example, a keyboard, a mouse, and the like, together with their interface control devices. The input device 1703 detects input operations performed by the user with the keyboard, mouse, or the like, and notifies the CPU 1701 of the detection result.

The output device 1704 consists of a display device, a printing device, and the like, together with their interface control devices. The output device 1704 outputs data sent under the control of the CPU 1701 to the display device or printing device.

The external storage device 1705 is, for example, a hard disk storage device. It is used mainly for storing various data and programs.
The portable recording medium drive device 1706 accepts a portable recording medium 1709 such as an optical disk, SDRAM, or CompactFlash, and serves as an auxiliary to the external storage device 1705.

The network connection device 1707 is a device for connecting to a communication line of, for example, a LAN (local area network) or a WAN (wide area network).
The system according to the embodiment is realized by the CPU 1701 executing a program that implements the functions of the blocks shown in FIG. 1 or the functions corresponding to the processing of the operation flowcharts shown in FIGS. 2 to 4. The program may be distributed by being recorded in, for example, the external storage device 1705 or on the portable recording medium 1709, or may be obtained from a network through the network connection device 1707. The data used in each process is, for example, read from the external storage device 1705 into the memory 1702 and processed there.

The embodiment described above displays recommendations when segment data is assigned during the input of customer response information in a customer response system. The technique, however, can be applied widely to any system that assigns segment data when text information of some kind is input.

The technique can be used for inputting segment data in customer response systems used in call centers, customer service desks, and the like.

100, 1801 Customer response system
101 Classification code table creation unit
102 Segment data recommendation unit
103 Automatic classification code re-creation unit
104 Classification code table file
105 Classification code statistical information table file
106, 1802 Response history database
107 Search result file
108 Recommendation file
109 Population text file (information source A)
110 Sample text file (information source B)
1701 CPU
1702 Memory
1703 Input device
1704 Output device
1705 External storage device
1706 Portable recording medium drive device
1707 Network connection device
1708 Bus
1709 Portable recording medium

Claims

1. A segment data recommendation method in which a computer stores text information together with segment data indicating, by one or more segment values, the segment of that text information, wherein the computer:
creates a classification code table file indicating the correspondence between the segment data and classification codes by comparing a population text file of text information to which segment data has been assigned by an input person with a sample text file of text information to which segment data judged to be correct has been assigned, using as the key a classification code composed of an arbitrary combination of predetermined text units;
creates a classification code statistical information table file in which, for each segment data extracted from the classification code table file, the appearance status in the population text file of the classification codes associated with that segment data in the classification code table file is tallied as statistical information;
extracts the classification codes from newly input new text information and, by searching the classification code table file for each extracted classification code, extracts the segment data corresponding to that classification code;
extracts the statistical information from the classification code statistical information table file for each extracted segment data; and
selects, from the extracted segment data and on the basis of the extracted statistical information, candidates for the segment data corresponding to the new text information, and presents them.

2. The segment data recommendation method according to claim 1, wherein the computer further:
accumulates the new text information as a history database;
calculates, for each segment data extracted from the classification code table file, the appearance status of that segment data in the history database;
compares the appearance status with the statistical information extracted from the classification code statistical information table file for that segment data; and
on the basis of the comparison result, determines the population text file and the sample text file from the history database and re-creates the classification code table file and the classification code statistical information table file.

3. The segment data recommendation method according to claim 1 or 2, wherein the text unit is a morpheme.

4. A program for causing a computer that stores text information together with segment data indicating, by one or more segment values, the segment of that text information to execute a process comprising:
creating a classification code table file indicating the correspondence between the segment data and classification codes by comparing a population text file of text information to which segment data has been assigned by an input person with a sample text file of text information to which segment data judged to be correct has been assigned, using as the key a classification code composed of an arbitrary combination of predetermined text units;
creating a classification code statistical information table file in which, for each segment data extracted from the classification code table file, the appearance status in the population text file of the classification codes associated with that segment data in the classification code table file is tallied as statistical information;
extracting the classification codes from newly input new text information and, by searching the classification code table file for each extracted classification code, extracting the segment data corresponding to that classification code;
extracting the statistical information from the classification code statistical information table file for each extracted segment data; and
selecting, from the extracted segment data and on the basis of the extracted statistical information, candidates for the segment data corresponding to the new text information, and presenting them.

5. The program according to claim 4, further causing the computer to execute a process comprising:
accumulating the new text information as a history database;
calculating, for each segment data extracted from the classification code table file, the appearance status of that segment data in the history database;
comparing the appearance status with the statistical information extracted from the classification code statistical information table file for that segment data; and
on the basis of the comparison result, determining the population text file and the sample text file from the history database and re-creating the classification code table file and the classification code statistical information table file.

6. A segment data recommendation device that stores text information together with segment data indicating, by one or more segment values, the segment of that text information, the device comprising:
a classification code table creation unit that creates a classification code table file indicating the correspondence between the segment data and classification codes by comparing a population text file of text information to which segment data has been assigned by an input person with a sample text file of text information to which segment data judged to be correct has been assigned, using as the key a classification code composed of an arbitrary combination of predetermined text units, and that creates a classification code statistical information table file in which, for each segment data extracted from the classification code table file, the appearance status in the population text file of the classification codes associated with that segment data in the classification code table file is tallied as statistical information; and
a segment data recommendation unit that extracts the classification codes from newly input new text information, extracts the segment data corresponding to each extracted classification code by searching the classification code table file for that code, extracts the statistical information from the classification code statistical information table file for each extracted segment data, and, on the basis of the extracted statistical information, selects from the extracted segment data candidates for the segment data corresponding to the new text information and presents them.

7. The segment data recommendation device according to claim 6, further comprising an automatic classification code re-creation unit that accumulates the new text information as a history database, calculates, for each segment data extracted from the classification code table file, the appearance status of that segment data in the history database, compares the appearance status with the statistical information extracted from the classification code statistical information table file for that segment data, and, on the basis of the comparison result, determines the population text file and the sample text file from the history database and re-creates the classification code table file and the classification code statistical information table file.