JP4477587B2

JP4477587B2 - Method for generating operation buttons for computer processing of text data

Info

Publication number: JP4477587B2
Application number: JP2006053616A
Authority: JP
Inventors: 秀博有田; 智章村上
Original assignee: 株式会社エヌジェーケー
Priority date: 2006-02-28
Filing date: 2006-02-28
Publication date: 2010-06-09
Anticipated expiration: 2026-02-28
Also published as: JP2007233631A

Description

The present invention relates to a method for generating operation buttons for computer processing of text data, in which the meaning of text data is recognized and operation buttons (meaning buttons) for searching the text data multidimensionally are automatically generated and displayed on a display device.

In the field of computer-based text data processing, chiefly text mining and knowledge management, advances in Japanese language processing technology have made possible keyword extraction, semantic recognition using general-purpose dictionaries, keyword categorization, and various kinds of analysis and utilization based on these results. Such analysis can now output, in various formats, for example, part-of-speech frequencies, keyword frequencies, relationships between keywords and data attributes (when, who, where, and so on), relationships among keywords, and time-series changes in keyword frequencies, and these outputs can be put to use for different applications.

However, conventional techniques have the following problems: (a) the results of analysis depend on the accuracy of the dictionary; (b) they do not fit specific uses or purposes; and (c) although a dictionary-update function exists, it is difficult in practice to accommodate the needs of each user and each business task. For example, a general-purpose dictionary as-is cannot handle the user-specific keywords ordinarily contained in text data, such as the company's own product names, employee names, and department names, the distinction between the company's own products and other companies' products, the names of product functions and specifications, and the company names and personal names of business partners. Even if the user can add these keywords to the dictionary, there is no mechanism for defining, for each of multiple tasks and uses, the categories and semantic recognition criteria to which those keywords correspond, and for freely separating them for use. For example, a keyword that is identical in notation often differs in meaning or interpretation depending on the field in which it is used, yet a general-purpose dictionary cannot perform such flexible recognition.

Moreover, situations arise that are difficult to handle with conventional techniques based on general-purpose dictionaries. For example, for text data that collects customer inquiries received by telephone, fax, e-mail, or in person, there is a need to classify the inquiries (text data) into “complaints”, “questions”, “requests”, and so on, and to subdivide each further for various analyses. In such a task, the judgment criteria suited to the purpose of the task must be determined in fine detail: which words and expressions are basically judged to be “complaints”; how far words and expressions synonymous or similar to those should be included in “complaints”; and which words and expressions, although basically judged to be “complaints”, should be excluded. These needs are not of a kind that can be settled by consulting a conventional dictionary.

For typical fixed-length record data in databases and files, a technique has been developed that automatically generates operation buttons so that an end user can easily search for, analyze, and utilize the desired data simply by selecting operation buttons (see Patent Document 1). However, this technique cannot handle text-format data, which is a collection of variable-length sentences. It does not allow an end user, simply by selecting an operation button, to easily search for, analyze, and utilize the desired text data based on the meaning of the selected button.
Patent Document 1: Japanese Patent No. 2702416

What is needed is information processing that lets even end users analyze text data easily without specialized knowledge; that is, information processing that automatically displays operation buttons based on the meanings and attributes of the text data and, simply through selection of those buttons, lets users narrow down text data through a multidimensional hierarchy or redisplay it with parallel associations, so that the data can be freely searched, analyzed, and utilized.

Such text data analysis requires a new mechanism to replace the conventional semantic recognition method based on looking up a general-purpose dictionary, which is a universal knowledge base. In other words, what is required is a mechanism that draws on know-how grounded in the user's own knowledge, experience, and wisdom, responds quickly and individually to changes in environment and circumstances, is applicable to both general and special purposes, and processes text data according to criteria for analysis, that is, according to the concept of a “semantic recognition rule”.

Semantic recognition is a mechanism that analyzes the form of text data based on a rule (semantic recognition rule) and assigns the text data to the categories defined in the rule. For example, if a piece of text data is assigned to the categories “complaint” and “printer” defined in the rule, the text data is regarded as having been recognized as having meanings related to “complaint” and “printer”.
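As a minimal sketch of this idea, the following Python fragment assigns one piece of text to every category whose keywords occur in it. The function name `recognize`, the rule contents, and the use of plain substring matching in place of the form analysis described above are all assumptions made for illustration.

```python
def recognize(text: str, rule: dict[str, list[str]]) -> set[str]:
    """Return every category that has at least one keyword occurring in the text.

    Assumption: plain substring matching stands in for the analysis of the
    text's form that the description refers to.
    """
    return {
        category
        for category, keywords in rule.items()
        if any(keyword in text for keyword in keywords)
    }


# Hypothetical rule: category name -> keywords that place a text in that category.
rule = {
    "complaint": ["does not print", "broken"],
    "printer": ["printer", "toner"],
    "question": ["how do I"],
}

categories = recognize("The printer does not print after the update.", rule)
print(sorted(categories))  # ['complaint', 'printer']
```

A text that contains none of the keywords yields an empty set, which corresponds to text data that is not assigned to any category.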

Specifically, a mechanism is first needed that defines categories by associating specific character strings representing meanings unique to the user (specific keywords) with arbitrary categories according to the individual uses of multiple tasks and industries, and that allows a set of such categories to be freely and easily composed into arbitrary rules. Specific keywords may include, for example, the company's own product names, employee names, and department names, other companies' product names, the names of important product functions and specifications, and the company names and personal names of business partners.

Similarly, a mechanism is needed that defines categories by associating common linguistic elements that carry meaning important to the user and are contained in words and expressions (concept keywords) with arbitrary categories according to the individual uses of multiple tasks and industries, and that allows a set of such categories to be freely and easily composed into arbitrary semantic recognition rules. An example of a concept keyword is the linguistic element 「〜ない」 (a negation) or its conjugated forms, which is common to expressions such as 「起動しない」 (“does not start”), 「表示できない」 (“cannot be displayed”), 「戻れなければ」 (“if it cannot return”), and 「印刷されなくて」 (“is not printed, and”), in a case where the task at hand wishes to associate it with the category “complaint” for analysis.

Furthermore, a mechanism is needed that, based on an arbitrary rule (semantic recognition rule) composed by integrating the two mechanisms of specific keywords and concept keywords, analyzes the form of the text data, extracts the two types of keywords, gives them meaning by assigning the extracted keywords to the corresponding categories, and automatically generates operation buttons in which categories and keywords, keywords and keywords of other categories, and keywords and text data are each mutually associated.

The invention described in Patent Document 1 concerning operation buttons can automatically generate operation buttons from data composed of fields corresponding to the categories above and analyze that data. In the case of text data, however, what corresponds to a category is the keywords contained in the text data, and unlike fields, the presence of these keywords is not fixed in advance. Consequently, when text data is to be associated with categories according to the meanings of the keywords it contains, it is not known in advance which keywords, or how many, the text data contains. The result of semantic recognition of text data is therefore associated with an unspecified number of categories, which the invention of Patent Document 1, based on data with fixed fields, cannot handle.

The present invention has been made in view of the above circumstances. Its object is to provide a method for generating operation buttons for computer processing of text data that automatically generates operation buttons (meaning buttons) based on the diverse meanings of text data and displays them on a display device, so that a large volume of text data can be searched from diverse viewpoints corresponding to arbitrary categories and keywords while the hierarchy is narrowed dynamically in multiple dimensions, or searched while multiple button classes and individual buttons are redisplayed in parallel to show the interrelation with keywords of other categories.

To achieve the above object, the method of the present invention for generating operation buttons for computer processing of text data is a method in which operation buttons for computer processing, used to search for arbitrary text data among a plurality of pieces of text data that contain character strings and are stored under file names in a storage device, are generated by processing of a programmed computer. The method comprises: defining categories and keywords, which are expression elements to be matched against character strings in the text data, in association with each other; searching for text data whose character strings contain the keywords; generating, from the categories, the keywords, and the text data found to contain the keywords, buttonized source data comprising a semantic recognition result table, which is a set of result records each having three fields, namely a category field, a keyword field, and a text data file name field, with the category, the keyword, and the file name of the text data found to contain the keyword placed in the respective fields in one-to-one correspondence; generating an analysis button class for each field value of the category field of the buttonized source data, and individual buttons belonging to that analysis button class for the field values of the keyword field; generating a reference button class from the text data file name field of the buttonized source data, and individual buttons belonging to that reference button class for the field values of the text data file name field; and displaying the button classes and the individual buttons on a display device.

This makes it possible to automatically generate operation buttons (meaning buttons) based on the diverse meanings of the text data, from diverse viewpoints corresponding to arbitrary categories and keywords, and to display them on a display device.

Preferably, the analysis button classes and the reference button class are displayed in parallel on the display device together with the individual buttons, and when any individual button belonging to one analysis button class is selected, the result records containing at least one of (i) the keyword corresponding to the field value of the keyword field for the selected individual button and (ii) the other keywords contained in the text data that contains that keyword are extracted from the semantic recognition result table, and the individual buttons belonging to the analysis button classes and the reference button class are generated from the extracted result records and redisplayed.
In this way, each time an individual button is selected, the interrelation of keywords (individual buttons) across other categories (button classes) can be seen.
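The redisplay on a single selection can be sketched as follows. This is one possible reading, not the patented implementation: it assumes that "the other keywords contained in the text data that contains that keyword" means the records of every file in which the selected keyword was found. The names `select_button` and `buttons_from`, the sample table, and the reference class name "target" are illustrative.

```python
# One result record is (category, keyword, file_name).
Record = tuple[str, str, str]

table: list[Record] = [
    ("complaint", "broken", "a.txt"),
    ("product", "printer", "a.txt"),
    ("complaint", "slow", "b.txt"),
    ("product", "printer", "b.txt"),
    ("complaint", "broken", "c.txt"),
    ("product", "scanner", "c.txt"),
]


def select_button(table: list[Record], keyword: str) -> list[Record]:
    """Keep the records of every file that contains the selected keyword."""
    files = {file_name for _, kw, file_name in table if kw == keyword}
    return [record for record in table if record[2] in files]


def buttons_from(records: list[Record]) -> dict[str, set[str]]:
    """Rebuild button classes (keys) and their individual buttons (values)."""
    classes: dict[str, set[str]] = {}
    for category, keyword, file_name in records:
        classes.setdefault(category, set()).add(keyword)   # analysis button class
        classes.setdefault("target", set()).add(file_name)  # reference button class
    return classes


redisplayed = buttons_from(select_button(table, "printer"))
print(redisplayed)
```

After "printer" is selected, the "complaint" class shows only the keywords that co-occur with it ("broken" and "slow"), and "scanner" disappears from the "product" class because no file contains both.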

Preferably, when arbitrary individual buttons of arbitrary analysis button classes are selected in arbitrary order, the result records containing all of the keywords corresponding to the field values of the keyword field for the selected individual buttons are extracted from the semantic recognition result table, and the individual buttons belonging to the analysis button classes and the reference button class are generated from the extracted result records and redisplayed.
As a result, simply by having the user select displayed operation buttons (meaning buttons), text data having a specific meaning can be searched for, and the meaning of specific text data can be examined.
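The narrowing by several selected buttons can be sketched as an intersection over files. Because each result record holds a single keyword, this sketch assumes that "records containing all of the keywords" refers to the records of text data in which every selected keyword was found; the function name `narrow` and the sample table are hypothetical.

```python
# One result record is (category, keyword, file_name).
Record = tuple[str, str, str]

table: list[Record] = [
    ("complaint", "broken", "a.txt"),
    ("product", "printer", "a.txt"),
    ("complaint", "slow", "b.txt"),
    ("product", "printer", "b.txt"),
    ("complaint", "broken", "c.txt"),
    ("product", "scanner", "c.txt"),
]


def narrow(table: list[Record], selected: list[str]) -> list[Record]:
    """Keep the records of files that contain every selected keyword."""
    files = {file_name for _, _, file_name in table}
    for keyword in selected:
        # Each additional selection can only shrink the set of matching files.
        files &= {file_name for _, kw, file_name in table if kw == keyword}
    return [record for record in table if record[2] in files]


result = narrow(table, ["printer", "broken"])
print(result)  # [('complaint', 'broken', 'a.txt'), ('product', 'printer', 'a.txt')]
```

Set intersection is commutative, so the buttons can be selected in any order with the same result, which matches the "arbitrary order" wording above.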

The semantic recognition result table is generated, for example, as follows: the keywords are defined; for each category, references to already-defined keywords are defined individually, thereby defining the category while associating categories and keywords with each other; keywords matching the defined keywords are extracted from the text data and associated with the categories in which references to those keywords are defined; and the table is generated from the result of this association.
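The generation steps just described can be sketched in a few lines of Python. The keyword names, category names, file names, and the function `build_result_table` are assumptions for illustration, and substring matching again stands in for the actual extraction.

```python
# Keyword definitions: keyword name -> strings matched against the text.
keywords = {
    "PrinterX": ["PrinterX", "Printer X"],
    "failure": ["broken", "does not work"],
}
# Category definitions hold references (keyword names), not the strings themselves.
categories = {
    "own product": ["PrinterX"],
    "complaint": ["failure"],
}
# Text data stored under file names (hypothetical contents).
texts = {
    "mail001.txt": "My Printer X is broken.",
    "mail002.txt": "Where can I buy PrinterX?",
    "mail003.txt": "Thanks for the quick reply.",
}


def build_result_table(keywords, categories, texts):
    """Return result records as (category, keyword name, file name) tuples."""
    table = []
    for file_name, body in texts.items():
        for category, references in categories.items():
            for name in references:
                # A keyword counts as found if any of its match strings occurs.
                if any(string in body for string in keywords[name]):
                    table.append((category, name, file_name))
    return table


table = build_result_table(keywords, categories, texts)
for record in table:
    print(record)
```

Note that `mail001.txt` produces two records, `mail002.txt` one, and `mail003.txt` none: the number of records per text is not fixed in advance.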

Preferably, the keywords are defined in two classes: specific keywords, each consisting of a specific character string, and concept keywords, each consisting of a character string that includes an abstracted portion. For a specific keyword, the specific character string is used for matching against character strings in the text data; for a concept keyword, the character string that remains after removing the abstracted portion is used.
A keyword is an expression element to be matched against character strings in the text data. By defining keywords (expression elements) in two classes, specific keywords that are specific character strings and concept keywords that include an abstracted character string, operation buttons can be generated automatically in which specific character strings representing meanings unique to the user (specific keywords) and common linguistic elements that carry meaning important to the user and are contained in words and expressions (concept keywords) are mutually associated.
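The difference between the two keyword classes at matching time can be sketched as follows. The sketch assumes that the abstracted portion is marked with a wave dash (written here as the escape `\u301c`), and the helper names `is_concept`, `match_string`, and `matches` are hypothetical.

```python
# Assumption: a wave dash marks the abstracted portion of a concept keyword.
ABSTRACT_MARK = "\u301c"


def is_concept(keyword: str) -> bool:
    """A keyword containing the mark is treated as a concept keyword."""
    return ABSTRACT_MARK in keyword


def match_string(keyword: str) -> str:
    """Return the string actually matched: the keyword minus its abstracted portion."""
    return keyword.replace(ABSTRACT_MARK, "")


def matches(keyword: str, text: str) -> bool:
    return match_string(keyword) in text


negation = ABSTRACT_MARK + "ない"   # concept keyword for negative expressions
product = "ABC-100"                 # specific keyword (hypothetical product name)

print(matches(negation, "プリンタが起動しない"))  # True
print(matches(negation, "印刷できました"))        # False
print(matches(product, "ABC-100 が届いた"))       # True
```

Plain substring matching does not cover conjugated forms: `matches(negation, "戻れなければ")` is False here, so handling such forms would need the form analysis that the description mentions.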

Preferably, each keyword has a keyword name, individual keywords, and individual exclusion keywords, and the search treats the individual keywords as synonyms of the keyword name while excluding the individual exclusion keywords.
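A keyword with synonyms and exclusions might be modelled as below. The source does not say how exclusion is applied; this sketch assumes that excluded phrases are blanked out of the text before the keyword name and its synonyms are matched. The class `Keyword`, its field names, and the sample phrases are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Keyword:
    name: str                                             # keyword name
    individual: list[str] = field(default_factory=list)  # treated as synonyms of the name
    excluded: list[str] = field(default_factory=list)    # phrases that must not count

    def found_in(self, text: str) -> bool:
        # Assumption: excluded phrases are removed first, so that a synonym
        # occurring only inside an excluded phrase does not produce a match.
        for phrase in self.excluded:
            text = text.replace(phrase, " ")
        return any(term in text for term in [self.name, *self.individual])


problem = Keyword(
    name="problem",
    individual=["trouble", "defect"],
    excluded=["no problem"],
)

print(problem.found_in("there is a defect in the paper tray"))          # True
print(problem.found_in("it works, no problem at all"))                  # False
print(problem.found_in("no problem with setup, but trouble printing"))  # True
```

Whatever term matches, the button would carry the keyword name ("problem"), so synonyms collapse into one individual button.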

According to the present invention, any number of keywords and categories finely tailored to the uses and purposes of a task can be defined as needed, so operation buttons can be generated automatically based on the diverse meanings of text data. As a result, even an end user with little computer experience and no specialized knowledge can, simply by selecting these buttons, search a large volume of text data from diverse viewpoints corresponding to arbitrary categories and keywords while narrowing the hierarchy dynamically in multiple dimensions, or search while multiple button classes and individual buttons are redisplayed in parallel to show the interrelation with keywords of other categories, and can thus freely analyze and utilize text data.

The invention described in Patent Document 1 generates buttons from source data in which the fields corresponding to categories can be specified in advance, so it cannot be applied when the fields are unspecified, that is, when it cannot be determined in advance which categories will appear in the source data. According to the present invention, buttons can be generated automatically even for source data in which the categories that will appear cannot be determined.

Embodiments of the present invention are described below with reference to the drawings. As shown in FIG. 1, the hardware of the computer system for carrying out the present invention, that is, of the operation-button-based text data analysis system described below, consists mainly of a central processing unit 10, a storage device 12, a display device 14, and an input device 16.

As shown in FIG. 2, the software of the operation-button-based text data analysis system consists of four tools: a semantic recognition rule definition tool 20; a semantic recognition tool 22, which determines the meaning of text data according to the semantic recognition rules set with the semantic recognition rule definition tool 20 and creates buttonized source data; an automatic meaning-button generation tool 24, which, from the recognition results obtained by the semantic recognition tool 22, associates the classifications of meaning (categories), the elements that led to the assignment of meaning (keywords), and the text data with one another and generates operation buttons (meaning buttons) that require no specialized knowledge; and a meaning-button analysis tool 26, which displays the operation buttons (meaning buttons) generated by the automatic meaning-button generation tool 24 and, simply by having the user select displayed buttons, makes it possible to search for text data having a specific meaning and to examine the meaning of specific text data.

Hereinafter, the information processing apparatus and program composed of these four means, namely the semantic recognition rule definition tool 20, the semantic recognition tool 22, the automatic meaning-button generation tool 24, and the meaning-button analysis tool 26, are collectively referred to as the “operation-button-based text data analysis system”.

The semantic recognition rule definition tool 20 provides an operating environment in which the user, using the input device 16 and interacting with the computer, registers “semantic recognition rules”, which are the criteria for assigning meaning to text data. One semantic recognition rule consists of a “keyword definition part”, a “category definition part”, an “applied category designation part”, and a “rule name designation part”. To define or designate each of these components, the semantic recognition rule definition tool 20 has a keyword definition function 20a, a category definition function 20b, an applied category designation function 20c, and a rule name designation function 20d. With the semantic recognition rule definition tool 20, the user can register any number of “semantic recognition rules”, save them in the storage device 12 as “semantic recognition rule files”, and update, refer to, and reuse them as needed.

The “keyword definition part” consists of any number of “keywords”, and the keyword definition function 20a provides the means to define them. A keyword is an expression element to be matched against character strings in the text data. In this example there are two types: “specific keywords”, which are specific character strings, and “concept keywords”, which include an abstracted character string.

The category definition function 20b defines the “category definition part”, which consists of any number of “categories”. With the category definition function 20b, any number of keywords are associated with each category. A keyword associated with a category serves as the basis on which the semantic recognition tool 22 assigns text data containing that keyword to that category.

The applied category designation function 20c lets the user designate which of the defined categories are to be used for semantic recognition; the designated categories are stored in the “applied category designation part”.
The “rule name designation part” consists of an arbitrary character string and serves to identify the rule uniquely. The user designates the “rule name” with the rule name designation function 20d.
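The four parts of a semantic recognition rule could be held in a single structure such as the one below. This is a sketch under assumptions: the class `SemanticRecognitionRule`, its field names, and the `validate` and `applied` helpers are not taken from the source; they only mirror the keyword definition part, category definition part, applied category designation part, and rule name designation part.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticRecognitionRule:
    rule_name: str                                                            # rule name designation part
    keyword_definitions: dict[str, list[str]] = field(default_factory=dict)   # keyword name -> match strings
    category_definitions: dict[str, list[str]] = field(default_factory=dict)  # category -> keyword names
    applied_categories: list[str] = field(default_factory=list)               # categories actually used

    def validate(self) -> list[str]:
        """Report references to keywords or categories that were never defined."""
        problems = []
        for category, references in self.category_definitions.items():
            for name in references:
                if name not in self.keyword_definitions:
                    problems.append(f"{category}: undefined keyword {name!r}")
        for category in self.applied_categories:
            if category not in self.category_definitions:
                problems.append(f"applied category {category!r} is not defined")
        return problems

    def applied(self) -> dict[str, list[str]]:
        """Return only the category definitions designated for recognition."""
        return {
            category: self.category_definitions[category]
            for category in self.applied_categories
            if category in self.category_definitions
        }


rule = SemanticRecognitionRule(
    rule_name="support-desk",
    keyword_definitions={"failure": ["broken"], "PrinterX": ["PrinterX"]},
    category_definitions={
        "complaint": ["failure"],
        "own product": ["PrinterX"],
        "question": ["how-to"],
    },
    applied_categories=["complaint", "own product"],
)

print(rule.validate())  # ["question: undefined keyword 'how-to'"]
print(rule.applied())
```

Keeping categories as references to keyword names, rather than copies of the strings, lets one keyword definition be reused by several categories or rules.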

The semantic recognition tool 22 provides an operating environment in which the user, using the input device 16 and interacting with the computer, has the form of the text data analyzed according to the contents of a semantic recognition rule to extract keywords, has the text data given meaning by category based on the extracted keywords, and has buttonized source data created from the result. The semantic recognition tool 22 has an analysis target text data designation function 22a, a semantic recognition rule designation function 22b, and a buttonized source data creation function 22c.

With the analysis target text data designation function 22a, the user designates the text data to be analyzed. “Text data” is a set of “text units”. One text unit contains a body consisting of a character string and, as needed, any number of other items of attribute information. With the semantic recognition rule designation function 22b, the user selects and designates, from among the semantic recognition rule files already created with the semantic recognition rule definition tool 20, one rule to apply to the analysis target text data.

The buttonized source data creation function 22c executes semantic recognition processing using the designated analysis target text data and semantic recognition rule and creates buttonized source data. It extracts the keywords defined in the semantic recognition rule from the text data to be analyzed and assigns each text unit to the categories with which the extracted keywords are associated. From the result, it creates “buttonized source data” and saves it in the storage device 12 as a “buttonized source data file”. As shown in FIG. 10, the buttonized source data consists of a semantic recognition rule file name 52, an analysis target text data storage location 54, and a semantic recognition result table 56. The semantic recognition result table 56 is a set of result records 58, each having at least three fields: a category 58a, a keyword 58b, and a text data file name 58c.
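The layout of the buttonized source data can be sketched as a small container. The class and field names, the sample file names and path, and the choice of JSON as the file format are assumptions; the source specifies only the three components (52, 54, 56) and the three record fields (58a to 58c).

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ResultRecord:
    category: str             # field 58a
    keyword: str              # field 58b
    text_data_file_name: str  # field 58c


@dataclass
class ButtonizedSourceData:
    rule_file_name: str       # semantic recognition rule file name 52
    text_data_location: str   # analysis target text data storage location 54
    result_table: list[ResultRecord] = field(default_factory=list)  # table 56

    def to_json(self) -> str:
        """Serialize to JSON (an assumed format for the saved file)."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


source = ButtonizedSourceData(
    rule_file_name="support_rule.json",    # hypothetical file name
    text_data_location="/data/inquiries",  # hypothetical location
    result_table=[
        ResultRecord("complaint", "broken", "mail001.txt"),
        ResultRecord("product", "printer", "mail001.txt"),
    ],
)

restored = json.loads(source.to_json())
print(restored["result_table"][0])
```

Storing the rule file name and the text location alongside the table keeps each set of results traceable to the rule and the data that produced it.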

The automatic meaning-button generation tool 24 automatically generates, from the buttonized source data, operation buttons (meaning buttons) that let an end user easily analyze the meaning of text data. The operation buttons consist of individual buttons, which correspond to keywords and the like, and button classes, which correspond to categories, text data, and the like; because of their use, they are specifically called “meaning buttons”. The automatic meaning-button generation tool 24 has a button class generation function 24a and an individual button generation function 24b.

The button class generation function 24a scans the buttonized source data, extracts the elements that can become button classes, and generates the button classes. A button class has a button class name and manages the individual buttons associated with it. The buttonized source data contains two kinds of elements that can become button classes. The first is each unique value appearing in the category field of the semantic recognition result table; that is, each distinct category name becomes a button class name. The second is each field of the semantic recognition result table other than the category field and the keyword field; such a field is called a “reference field”. As shown in FIG. 10, if a result record 58 consists of, for example, the three fields category 58a, keyword 58b, and text data file name 58c, then the text data file name 58c field is the reference field and becomes a button class named, for example, “target”. Hereinafter, the former are called “analysis button classes” and the latter “reference button classes”. The number of button classes is therefore the sum of the number of analysis button classes and the number of reference button classes.

The individual button generation function 24b generates individual buttons in two ways. In the first, each unique value appearing in the keyword 58b field of the buttoned source data 50 shown in FIG. 10, that is, each individual keyword name, is used as an individual button name, and the individual button is generated in association with the corresponding button class. In the second, each unique value appearing in the reference field, that is, each individual data item, is used as an individual button name, and the individual button is generated in association with the corresponding reference button class.
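The two generation functions can be illustrated with a minimal Python sketch. It assumes that each result record is a (category, keyword, text data file name) tuple as in the example of FIG. 10; the sample records and the function name `generate_buttons` are hypothetical and are not taken from the specification.

```python
from collections import defaultdict

# Hypothetical result records: (category, keyword, text data file name).
records = [
    ("苦情", "〜ない", "テキストA"),
    ("プリンタ", "PRT-100", "テキストA"),
    ("苦情", "異常終了", "テキストB"),
    ("苦情", "〜ない", "テキストB"),
]

def generate_buttons(records, reference_class_name="対象"):
    """Return {button class name: sorted list of individual button names}."""
    classes = defaultdict(set)
    for category, keyword, file_name in records:
        # Analysis button class: one per unique category name;
        # its individual buttons are the unique keyword names.
        classes[category].add(keyword)
        # Reference button class: one per reference field;
        # its individual buttons are the unique field values.
        classes[reference_class_name].add(file_name)
    return {name: sorted(buttons) for name, buttons in classes.items()}

buttons = generate_buttons(records)
```

With these records the sketch yields two analysis button classes and one reference button class, so the total number of button classes is their sum, as stated above.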

In the invention described in Patent Document 1, buttons are generated from record-format data consisting of a plurality of fields. With that method, for each data item, an individual button belonging to the button class that corresponds to a buttoned field is always generated, and exactly one individual button is generated per field. In this example, by contrast, buttons are generated from data in text (sentence) form that contains the keywords defined in the semantic recognition rule. For a given data item, whether a keyword appears, how many times it appears, which button classes receive individual buttons, and how many individual buttons are generated are all indeterminate. For some text data, no individual button belonging to any button class is generated at all. Thus, in this example the format of the data on which button generation is based differs fundamentally from that of the existing patent, and the button classes and individual buttons are generated from that data by an entirely different method.

The meaning button analysis tool 26 displays the meaning buttons generated by the meaning button automatic generation tool 24 on the display device 14 and lets the user select any button with the input device 16. The meaning button analysis tool 26 has a meaning button parallel relation display function 26a and a text data search and content display function 26b.

The semantic recognition result table holds the relations between each button class and the individual buttons belonging to it, as well as the relations between individual buttons across button classes. The meaning button parallel relation display function 26a reflects these relations in the user's meaning button operations: when the user selects an individual button, the individual buttons of the other button classes that are related to it are redisplayed at the same time, so the user can instantly see, and easily retrieve, the information related to the selected button.
The text data search and content display function 26b lets the user, simply by selecting buttons, search for the text data of any category and extract and display the keywords contained in that text data. This makes a variety of analyses easy, such as grasping the interrelations between categories and keywords. It also makes it easy to display the details of the text data narrowed down by button selection while distinguishing the keywords in it that were the target of the narrowing.
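The behavior of the two functions can be sketched in Python as follows. The sketch assumes that two individual buttons are related when they occur in result records for the same text data file; the specification does not spell out how the relation is computed, so this criterion, the sample records, and the name `select_button` are illustrative assumptions.

```python
REFERENCE_CLASS = "対象"  # assumed name of the reference button class

# Hypothetical result records: (category, keyword, text data file name).
records = [
    ("苦情", "〜ない", "テキストA"),
    ("プリンタ", "PRT-100", "テキストA"),
    ("苦情", "異常終了", "テキストB"),
    ("スキャナ", "SCN-300", "テキストB"),
]

def select_button(records, button_class, button):
    """Return (files narrowed down by the selection, related buttons per class)."""
    if button_class == REFERENCE_CLASS:
        files = {f for _, _, f in records if f == button}
    else:
        files = {f for c, k, f in records if c == button_class and k == button}
    related = {}
    for c, k, f in records:
        if f in files:
            # Keywords that share a file with the selected button.
            related.setdefault(c, set()).add(k)
    return files, related

files, related = select_button(records, "苦情", "〜ない")
```

Selecting the individual button “〜ない” narrows the text data to the files that contain it and, at the same time, yields the individual buttons of the other button classes that would be redisplayed.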

Next, the text data analysis system using operation buttons shown in FIG. 2 is described in more detail with reference to FIGS. 3 to 16.
The format of the text data to be analyzed in this example is assumed to be as follows. A text unit is one file stored in the storage device 12 of the computer; the content of the file is Japanese text consisting of one or more sentences, and the file name is the key that uniquely identifies the text unit. Other conceivable text data formats include relational database tables and various text resources on the Internet, and these can be analyzed in the same way as the format assumed in this example. This example also assumes that the text data analysis system using operation buttons is used at the user support desk of a PC peripheral manufacturer.

FIG. 3 shows an example of the text data used as the analysis target. As the legend in FIG. 3 indicates, a text unit consists of a file name and the content of the text data. These files (text data), which have file names such as “テキストＡ” (“Text A”) and “テキストＢ” (“Text B”), are assumed to be stored at the location “file:/text/user_support/” in the storage device 12 of the computer.

FIG. 4 shows the structure of the semantic recognition rule 30 created with the semantic recognition rule definition tool 20 shown in FIG. 2. In FIG. 4, “1” means exactly one, “*” means one or more, and “**” means zero or more. One semantic recognition rule 30 has one rule name 32, one keyword definition unit 34, one category definition unit 36, and one applied category designation unit 38. In the following example, the rule name is assumed to be “ユーザサポート分析用ルール” (“user support analysis rule”).

The keyword definition unit 34 consists of one or more specific keywords 40 and one or more concept keywords 42. A specific keyword 40 consists of one specific keyword name 40a, one or more specific individual keywords 40b, and zero or more specific individual exclusion keywords 40c. A concept keyword 42 consists of one concept keyword name 42a, one or more concept individual keywords 42b, and zero or more concept individual exclusion keywords 42c. The category definition unit 36 consists of one or more categories 44. Each category 44 holds one category name 44a and references 44b to one or more keywords (specific keywords or concept keywords) already defined in the keyword definition unit. The applied category designation unit 38 consists of references 46 to one or more categories already defined in the category definition unit.
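The structure of FIG. 4 can be modeled, for illustration, with a few Python dataclasses. This is a minimal sketch, not the data model of the specification: the class names, field names, and the `validate` method are assumptions, and only some of the multiplicity constraints described above are checked.

```python
from dataclasses import dataclass, field

@dataclass
class Keyword:
    k_id: str
    kind: str                 # "specific" or "concept"
    name: str
    individual: list[str]     # one or more individual keywords
    excluded: list[str] = field(default_factory=list)  # zero or more

@dataclass
class Category:
    c_id: str
    name: str
    keyword_refs: list[str]   # one or more references to defined keywords

@dataclass
class SemanticRecognitionRule:
    name: str
    keywords: list[Keyword]
    categories: list[Category]
    applied_category_refs: list[str]

    def validate(self):
        """Return a list of violated structural constraints (empty if none)."""
        errors = []
        keyword_ids = {k.k_id for k in self.keywords}
        category_ids = {c.c_id for c in self.categories}
        for k in self.keywords:
            if not k.individual:
                errors.append(f"{k.k_id}: needs at least one individual keyword")
        for c in self.categories:
            if not c.keyword_refs:
                errors.append(f"{c.c_id}: needs at least one keyword reference")
            for ref in c.keyword_refs:
                if ref not in keyword_ids:
                    errors.append(f"{c.c_id}: undefined keyword {ref}")
        for ref in self.applied_category_refs:
            if ref not in category_ids:
                errors.append(f"applied category: undefined category {ref}")
        return errors

rule = SemanticRecognitionRule(
    name="ユーザサポート分析用ルール",
    keywords=[
        Keyword("k-001", "concept", "〜ない", ["〜ない", "〜ません"],
                ["問題ない", "悪くない", "間違いない"]),
        Keyword("k-008", "specific", "問題", ["問題"]),
    ],
    categories=[Category("c-001", "苦情", ["k-001", "k-008"])],
    applied_category_refs=["c-001"],
)
```

Keeping references as identifiers rather than object links mirrors the k_id and c_id references used in the rule definition.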

Table 1 is an example definition of the semantic recognition rule 30. For readability it is shown essentially in XML format, although some closing tags are omitted to avoid clutter. In Table 1, the rule name 32 is first specified as “ユーザサポート分析用ルール” (“user support analysis rule”). Next comes the keyword definition unit, whose first keyword is defined as <キーワード k_id=“k-001” type=“概念” name=“〜ない”>. Here k_id is an identifier that uniquely identifies an individual keyword; “k-001” is specified. type is the kind of keyword; “概念” (concept) is specified, indicating that this keyword is a concept keyword. name is the keyword name; “〜ない” (“not ~”) is specified. The concept keyword “〜ない” is followed by the definitions of its concept individual keywords and concept individual exclusion keywords. The first, <個別 term=“〜ない”/>, is a concept individual keyword; term is the content of the definition, here “〜ない”. The second, <個別 term=“〜ません”/>, is also a concept individual keyword. The third, <個別除外 term=“問題ない”/>, is a concept individual exclusion keyword; term is the content of the definition, here “問題ない” (“no problem”). In this way, two concept individual keywords and three concept individual exclusion keywords are defined for the concept keyword “〜ない”.
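For illustration, the first keyword definition can be read with Python's standard `xml.etree.ElementTree` module. The fragment below is reconstructed from the description of Table 1 and the three exclusion keywords named later in this description; closing tags and straight quotation marks are added so that it is well-formed XML, which is an assumption about the stored form.

```python
import xml.etree.ElementTree as ET

# Reconstructed fragment (assumption: well-formed, with straight quotes).
xml_text = """
<キーワード k_id="k-001" type="概念" name="〜ない">
  <個別 term="〜ない"/>
  <個別 term="〜ません"/>
  <個別除外 term="問題ない"/>
  <個別除外 term="悪くない"/>
  <個別除外 term="間違いない"/>
</キーワード>
""".strip()

elem = ET.fromstring(xml_text)
keyword = {
    "k_id": elem.get("k_id"),
    "type": elem.get("type"),
    "name": elem.get("name"),
    # Individual keywords and individual exclusion keywords, in document order.
    "individual": [e.get("term") for e in elem.findall("個別")],
    "excluded": [e.get("term") for e in elem.findall("個別除外")],
}
```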

Similarly, as shown in Table 2, the second keyword, <キーワード k_id=“k-002” type=“概念” name=“〜ず”>, through the seventh keyword, <キーワード k_id=“k-007” type=“概念” name=“〜か”>, are defined.

Further, as shown in Table 3, the eighth keyword is defined as <キーワード k_id=“k-008” type=“特定” name=“問題”>. The keyword type is “特定” (specific), indicating that this keyword is a specific keyword. The specific keyword “問題” (“problem”) is followed by the definition of one specific individual keyword; no specific individual exclusion keyword is defined. The specific individual keyword is defined as <個別 term=“問題”/>, where term is the content of the definition, here “問題”. The ninth keyword, <キーワード k_id=“k-009” type=“概念” name=“〜？”>, through the sixteenth keyword, <キーワード k_id=“k-016” type=“特定” name=“フィルムスキャン”>, are then defined.

Further, as shown in Table 4, the 17th keyword, <キーワード k_id=“k-017” type=“特定” name=“PRT-100”>, through the 23rd keyword, <キーワード k_id=“k-023” type=“特定” name=“SCN-300”>, are defined.

Further, as shown in Table 5, the 24th keyword, <キーワード k_id=“k-024” type=“概念” name=“予期しない〜”>, and the 25th keyword, <キーワード k_id=“k-025” type=“特定” name=“異常終了”>, are defined.

In this way, 25 keywords are defined in the keyword definition unit in this example, of which 9 are concept keywords and 16 are specific keywords.

Next comes the category definition unit. As shown in Table 6, the first category is defined as <カテゴリ c_id=“c-001” name=“苦情”>. Here c_id is an identifier that uniquely identifies an individual category; “c-001” is specified. name is the category name; “苦情” (“complaint”) is specified. The “complaint” category is followed by definitions that reference keywords defined in the keyword definition unit. The first is <キーワードへの参照 k_id=“k-001”/>, where k_id is the identifier that uniquely references a defined keyword; “k-001” is specified, and it corresponds to the identifier “k-001” of the keyword defined in the keyword definition unit. In all, the “complaint” category references seven keywords, with the k_id values “k-001”, “k-002”, “k-003”, “k-005”, “k-008”, “k-024”, and “k-025”. These seven keywords are thereby defined, in the semantic recognition rule “ユーザサポート分析用ルール” (“user support analysis rule”), as keywords that signify a complaint. In other words, these seven keywords are what cause text data to be classified as a complaint: if a piece of text data contains any of them, the rule classifies it into the “complaint” category.

Similarly, as shown in Table 7, the second category is defined as <カテゴリ c_id=“c-002” name=“質問”> (“question”), the third as <カテゴリ c_id=“c-003” name=“要望”> (“request”), the fourth as <カテゴリ c_id=“c-004” name=“プリンタ”> (“printer”), the fifth as <カテゴリ c_id=“c-005” name=“スキャナ”> (“scanner”), and the sixth as <カテゴリ c_id=“c-006” name=“新製品”> (“new product”). Each of these categories is followed by definitions that reference keywords defined in the keyword definition unit.

In this way, six categories are defined in the category definition unit in this example, and they reference 7, 2, 2, 8, 6, and 2 keywords respectively, for 27 keyword references in total.

In the above definition example, 25 keywords are defined in the keyword definition unit, whereas the categories in the category definition unit contain 27 keyword references in total. This is because a single keyword can be referenced from more than one category; for example, the keyword “k-020” is referenced from both the “printer” and “new product” categories.

Next, the applied category designation unit contains definitions that reference categories defined in the category definition unit. As shown in Table 8, the first is <カテゴリへの参照 c_id=“c-001”/>, where c_id is the identifier that uniquely references a defined category; “c-001” is specified, and it corresponds to the identifier “c-001” of the category defined in the category definition unit. The applied category designation unit goes on to reference the c_id values “c-002”, “c-003”, “c-004”, and “c-006”, for five categories in total. These five categories are thereby defined as the categories used for semantic recognition processing under the semantic recognition rule “ユーザサポート分析用ルール” (“user support analysis rule”). Conversely, a category that is defined in the category definition unit but not referenced in the applied category designation unit, namely “scanner” with c_id “c-005”, is not used for semantic recognition processing.

Although not shown in this example, if keywords and categories are stored in the storage device in a form that can be referenced from multiple semantic recognition rules, the applied category designation function lets categories that were defined once be freely combined in multiple semantic recognition rules, which is even more convenient for the user.
Next, the concept keywords and specific keywords that appeared in the structure and definition example of the semantic recognition rule are described in detail.

First, the mechanism for handling diverse synonyms of a keyword, including orthographic variants, is described. For example, “十分” and “充分” (both “sufficient”), “作る” and “作成する” (both “to make”), and “コンピュータ” and “コンピューター” (both “computer”) are each generally synonymous. Cases like these can be handled with a general-purpose dictionary. However, a user may want to treat a series of keywords such as “エラー” (error), “強制終了” (forced termination), “フリーズ” (freeze), “アベンド” (abend), “アボート” (abort), “中止” (cancel), and “中断” (interrupt) as synonyms and search for all of them at once under the keyword name “異常終了” (abnormal termination). There is thus a need to treat multiple keywords as synonymous in a way that is particular to a field or business. To allow such user-defined synonyms, this example incorporates a mechanism called individual keywords into the semantic recognition rule. A keyword has one or more individual keywords, which indicates that those individual keywords are synonymous under that keyword.

That is, as shown in FIG. 4, one specific keyword 40 has one or more specific individual keywords 40b, and one concept keyword 42 has one or more concept individual keywords 42b. Specifically, three concept individual keywords are specified for the fourth keyword, “〜て下さい” (“please ~”), shown in Table 2, and eight specific individual keywords are specified for the 25th keyword, “異常終了”, shown in Table 5. Here, “〜て下さい”, “〜てくれますか”, and “〜てもらえますか” are defined as synonyms of the keyword “〜て下さい”. Likewise, “異常終了”, “エラー”, “強制終了”, “フリーズ”, “アベンド”, “アボート”, “中止”, and “中断” are defined as synonyms of the keyword “異常終了”.

When text data is matched against keywords defined in this way, there can be strings that match a defined keyword but have a different meaning and should therefore be excluded from extraction. For example, in the semantic recognition rule for user support analysis, three individual exclusion keywords, “問題ない” (“no problem”), “悪くない” (“not bad”), and “間違いない” (“no doubt”), are specified for the first keyword, “〜ない”, shown in Table 1. The negative expression “〜ない” is in most cases appropriate as a keyword signifying a complaint, but there are exceptions: expressions such as “問題ない”, “悪くない”, and “間違いない” should not be interpreted as complaints.

To handle such situations, this example incorporates the mechanism of individual exclusion keywords into the semantic recognition rule, as described above. If a character string in the text data matches an individual keyword of some keyword but also matches an individual exclusion keyword of the same keyword, that string is excluded from extraction.
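The interplay of individual keywords and individual exclusion keywords can be sketched in Python as follows. This is a deliberate simplification: it uses plain substring matching, treats the concept individual keyword “〜ない” as the literal string “ない”, and regards a match as excluded when it lies inside an occurrence of an exclusion keyword. The function name `find_matches` and the sample sentences are hypothetical.

```python
def find_occurrences(text, term):
    """Yield every (start, end) span at which term occurs in text."""
    start = text.find(term)
    while start != -1:
        yield start, start + len(term)
        start = text.find(term, start + 1)

def find_matches(text, individual, excluded):
    """Return sorted (start, term) pairs for individual keywords that are
    not covered by an occurrence of an individual exclusion keyword."""
    blocked = [span for term in excluded for span in find_occurrences(text, term)]
    hits = []
    for term in individual:
        for start, end in find_occurrences(text, term):
            if not any(b <= start and end <= e for b, e in blocked):
                hits.append((start, term))
    return sorted(hits)

# "問題ない" is excluded, "できない" still counts as a match.
complaint_hits = find_matches(
    "印刷は問題ないが、スキャンができない。",
    individual=["ない", "ません"],
    excluded=["問題ない", "悪くない", "間違いない"],
)

# User-defined synonyms grouped under the keyword "異常終了".
abnormal_hits = find_matches(
    "印刷中にフリーズして強制終了した",
    individual=["異常終了", "エラー", "強制終了", "フリーズ",
                "アベンド", "アボート", "中止", "中断"],
    excluded=[],
)
```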

The following describes how individual keywords are realized. A concept individual keyword is defined in order to perform abstract matching against character strings in the text data. For example, the first keyword in Table 1, “〜ない”, contains the abstractly expressed part “〜”, so it cannot be matched by simple string comparison alone.

With this in mind, this example parses the text data into Japanese morphemes and uses three morpheme attributes, “base form” (基本名), “part of speech” (品詞), and “conjugation” (活用), to realize concept individual keywords and the specific individual keywords described below. For example, morphological analysis of the expression “読めない” (“cannot read”) yields, as shown in Table 9, a “読め” part whose base form is “読む”, whose part of speech is “verb”, and whose conjugation is “imperfective form” (未然形). As shown in Table 10, the “ない” part has the base form “ない”, the part of speech “auxiliary verb”, and the conjugation “basic form”. A morpheme for which the content of all of these attributes is specified is called a concrete morpheme. A morpheme for which the content of some or all of the attributes is not specified is called an abstract morpheme.

Given a sequence of consecutive morphemes in Japanese, if one or more of them is an abstract morpheme, the sequence is a concept individual keyword. If, on the other hand, the one or more consecutive morphemes are all concrete morphemes, the sequence is a specific individual keyword. The same applies to concept individual exclusion keywords and specific individual exclusion keywords.

Regarding the individual keyword “〜ない”: the expression “読めない”, for example, is a sequence of the two morphemes “読め” and “ない”, as shown in Tables 9 and 10. Tables 11 and 12 show these morphemes with some of their attributes abstracted. In Table 11 the attribute for the base form, and in Table 12 the attribute for the conjugation, is “―”, indicating that it is unspecified and abstracted. Table 11 is the morpheme of Table 9, the imperfective-form verb “読め”, with its base form attribute abstracted; it matches any verb in the imperfective form regardless of base form, for example the verb parts of “動かず” and “進まない”. Therefore, by placing the morphemes of Tables 11 and 12 in sequence and representing the morpheme whose base form is abstracted with the special character “〜”, the concept individual keyword “〜ない” can be realized.

If none of the morpheme attributes in Table 11 is specified, the result is as shown in Table 13, which matches every morpheme. For example, if the concept individual keyword “〜ない” is set as the sequence of the morphemes of Tables 13 and 12, then in addition to the verb case “読めない”, expressions such as “応答がない” (“there is no response”, with a case particle), “正しくない” (“not correct”, with an adjective), and “問題なし” (“no problem”, with a noun) also match the concept individual keyword.

Concept individual keywords can thus be realized with Japanese morpheme attributes. The way in which the content of the morpheme attributes is specified does not require any special method, so its description is omitted here.
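The matching of abstract and concrete morphemes can be illustrated with a short Python sketch. It assumes that a morphological analyzer has already split the text into (base form, part of speech, conjugation) triples; the analyzer itself is out of scope, so the analyzed sequence below is written by hand from Tables 9 and 10. Representing an unspecified attribute as `None` is a choice made for this sketch; the tables use “―”.

```python
from typing import NamedTuple, Optional

class Morpheme(NamedTuple):
    base: Optional[str]         # 基本名; None means unspecified (abstract)
    pos: Optional[str]          # 品詞
    conjugation: Optional[str]  # 活用

def morpheme_matches(pattern, actual):
    """An unspecified pattern attribute matches any value."""
    return all(p is None or p == a for p, a in zip(pattern, actual))

def find_sequence(pattern_seq, morphemes):
    """Return every index at which pattern_seq matches consecutive morphemes."""
    n = len(pattern_seq)
    return [
        i for i in range(len(morphemes) - n + 1)
        if all(morpheme_matches(p, m)
               for p, m in zip(pattern_seq, morphemes[i:i + n]))
    ]

def is_concept(pattern_seq):
    """A sequence with at least one abstract morpheme is a concept individual keyword."""
    return any(attr is None for m in pattern_seq for attr in m)

# Tables 9 and 10: hand-written analysis of "読めない".
analyzed = [
    Morpheme("読む", "動詞", "未然形"),
    Morpheme("ない", "助動詞", "基本形"),
]

# Tables 11 and 12: the concept individual keyword "〜ない".
pattern = [
    Morpheme(None, "動詞", "未然形"),
    Morpheme("ない", "助動詞", None),
]

positions = find_sequence(pattern, analyzed)
```

Replacing the first pattern element with `Morpheme(None, None, None)` corresponds to Table 13, which matches any preceding morpheme.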

Next, how the user actually defines a semantic recognition rule with the semantic recognition rule definition tool 20 shown in FIG. 2 is described with reference to FIGS. 5 to 8. When the user asks the text data analysis system using operation buttons to start the semantic recognition rule definition tool 20, the system calls the rule name designation function 20d and displays a semantic recognition rule name designation dialog on the display device 14. FIG. 5 shows an example of this display. Here the user has entered the rule name “ユーザサポート分析用ルール” (“user support analysis rule”) with the input device 16. When the user selects the “Next” button, the system saves the specified rule name in the storage device 12 and proceeds to the next step. At this point, the rule name 32 of the semantic recognition rule 30 shown in FIG. 4 is complete.

Next, the text data analysis system calls the keyword definition function 20a and displays a keyword definition dialog on the display device 14. FIG. 6 shows an example of this display. With the keyword definition function 20a, the user specifies a specific keyword or a concept keyword with the input device 16 and selects the “Add” button, whereupon the keyword is defined in the semantic recognition rule. Here the user has selected “Specific” (●) and is about to specify the keyword “プリンタ” (“printer”). To specify a keyword's synonyms, concept individual exclusion keywords, or specific individual exclusion keywords, the user selects the “Details” button, which displays a keyword detail designation dialog on the display device 14; since this involves no special method, its description is omitted here. When the user selects the “Next” button, the system saves the keywords specified with the keyword definition function 20a in the storage device 12 and proceeds to the next step. At this point, the keyword definition unit 34 of the semantic recognition rule 30 shown in FIG. 4 is complete.

When keyword definition is complete, the text data analysis system using operation buttons calls the category definition function 20b and displays a category definition dialog on the display device 14. FIG. 7 shows an example of this display. Here the user has specified a category named “要望” (“request”) with the input device 16. The keywords defined so far with the keyword definition function 20a are listed under the heading “関連付けるキーワードを指定” (“Specify keywords to associate”). By selecting one or more keywords from this list, the user associates them with the category of the specified name. In this example, the keywords “〜て下さい” (“please ~”) and “〜てほしい” (“I would like ~”) are selected for association with the “request” category. When the user selects the “Add” button, the category of the specified name and the keywords associated with it (the “references to defined keywords” in the rule structure of FIG. 4) are defined in the semantic recognition rule. When the user selects the “Next” button, the system saves this definition information in the storage device 12 and proceeds to the next step. At this point, the category definition unit 36 of the semantic recognition rule 30 shown in FIG. 4 is complete.

When category definition is complete, the semantic recognition rule is complete in the storage device 12. The applied category designation function merely selects references to defined categories, so its description is omitted here. To save the completed rule in a semantic recognition rule file, the text data analysis system using operation buttons displays a file save dialog on the display device 14. FIG. 8 shows an example of this display. Here the user has specified the file “file:/rules/user_support_analysis”. When the user selects the “Done” button, the system saves the content defined with each of the rule definition functions in the specified semantic recognition rule file and ends the semantic recognition rule definition process.

The following describes how the user actually uses the semantic recognition tool. When the user requests the text data analysis system using operation buttons to use the semantic recognition tool 22 shown in FIG. 2, the system calls the analysis target text data designation function 22a and the semantic recognition rule designation function 22b and displays the semantic recognition tool dialog on the display device 14. FIG. 9 shows an example of this display.

Here, the user has specified "file:/text/user_support/" as the location where the analysis target text data shown in FIG. 3 is stored, and "file:/rules/user_support_analysis" as the semantic recognition rule file shown in FIG. 8. This completes the designation needed to perform semantic recognition on the text data stored in file:/text/user_support/ according to the semantic recognition rule stored in file:/rules/user_support_analysis.

When the "Start semantic recognition" button shown in FIG. 9 is then selected, the text data analysis system using operation buttons calls the buttoned source data creation function 22c and starts semantic recognition processing. Based on the result, it creates the buttoned source data 50 shown in FIG. 10 and saves it in the storage device 12, in an internal file called the buttoned source data file. When saving is complete, semantic recognition processing ends.

As shown in FIG. 10, the buttoned source data 50 consists of a semantic recognition rule file name 52, an analysis target text data storage location 54, and a semantic recognition result table 56. The semantic recognition result table 56 is a set of result records 58, each of which has at least three fields: category 58a, keyword 58b, and text data file name 58c. The category and the keyword are represented by their identifiers, c_id and k_id. The text data file name 58c is the file name of the text unit shown in the legend of FIG. 3. In FIG. 10, "1" indicates that exactly one instance exists, and "**" indicates that zero or more instances exist.
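The structure described for FIG. 10 can be sketched as a pair of record types. This is only an illustrative model: the class and attribute names (`ResultRecord`, `ButtonedSourceData`, `rule_file`, `text_location`, `results`) are assumptions introduced here, and the on-disk format of the buttoned source data file is not specified in the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ResultRecord:
    """One result record (58) with its three minimum fields."""
    c_id: str  # category identifier (58a)
    k_id: str  # keyword identifier (58b)
    doc: str   # text data file name of the text unit (58c)


@dataclass
class ButtonedSourceData:
    """Buttoned source data (50); attribute names are hypothetical."""
    rule_file: str       # semantic recognition rule file name (52), exactly one
    text_location: str   # analysis target text data storage location (54), exactly one
    results: List[ResultRecord] = field(default_factory=list)  # result table (56), zero or more records


source = ButtonedSourceData(
    rule_file="file:/rules/user_support_analysis",
    text_location="file:/text/user_support/",
)
source.results.append(ResultRecord("c-001", "k-001", "Text A"))
```

Keeping the identifiers rather than the names in each record mirrors the description, where category and keyword names are looked up from the rule definitions when buttons are generated.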

The following describes an example in which the semantic recognition tool creates the buttoned source data 50 according to the designation of FIG. 9.
Table 14 is an example in which the meaning of all the text data from "Text A" to "Text M" in the analysis target text data shown in FIG. 3 is recognized and buttoned source data is created from the result.

Here, focusing on "Text A", we show an example of analyzing the content of text data using the semantic recognition rule definition examples shown in Tables 1 to 8. The content of "Text A" is: 「プリントキング100のプリンタドライバを更新したら、印刷できなくなってしまいました。対処方法を教えて下さい。」 ("After I updated the printer driver for Print King 100, I could no longer print. Please tell me how to deal with this."). Extracting from this text the keywords defined in the rule definition examples yields six keywords: "〜ない" ("not ..."), "〜てしまう" ("ended up ..."), "〜て下さい" ("please ..."), "プリンタ" (printer), "印刷" (print), and "PRT-100". Two of these, "印刷" and "PRT-100", need explanation, because "Text A" contains the character string "プリントキング100" (Print King 100). In the definition example of Table 3, "プリント" is defined as an individual keyword of the keyword "印刷", but "プリントキング" is defined as an individual exclusion keyword of the same keyword, so this character string is not extracted as "印刷". "Text A" separately contains the word "印刷" itself. In the definition example of Table 4, "プリントキング100" is defined as an individual keyword of the keyword "PRT-100", so the character string is extracted as "PRT-100".
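The interaction between individual keywords and individual exclusion keywords can be illustrated with a small sketch. The matching strategy below (blank out every exclusion string, then look for any individual keyword by plain substring search) is an assumption made for illustration; the description does not state how the system implements matching, and concept keywords such as "〜ない" would need pattern handling that is not shown. The function name `keyword_matches` is hypothetical.

```python
def keyword_matches(text, individual_keywords, exclusion_keywords=()):
    """Return True if an individual keyword occurs outside all exclusion keywords."""
    masked = text
    for excluded in exclusion_keywords:
        # Blank out each excluded string so that its substrings cannot match.
        masked = masked.replace(excluded, "\0" * len(excluded))
    return any(keyword in masked for keyword in individual_keywords)


text_a = ("プリントキング100のプリンタドライバを更新したら、"
          "印刷できなくなってしまいました。対処方法を教えて下さい。")

# Without an exclusion, "プリント" matches inside "プリントキング100".
print(keyword_matches(text_a, ["プリント"]))                      # True
# The exclusion keyword "プリントキング" suppresses that match.
print(keyword_matches(text_a, ["プリント"], ["プリントキング"]))  # False
# The product keyword lists the full string as an individual keyword.
print(keyword_matches(text_a, ["プリントキング100"]))             # True
```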

First, "〜ない" is defined in the keyword definition section of the definition example with k_id="k-001" and type="concept", and in the category definition section the category whose keyword reference is k_id="k-001" is defined with c_id="c-001" and category name "Complaint". It follows that "Text A", which contains "〜ない", belongs to the category "Complaint", that the category identifier is "c-001", and that the identifier of the related keyword is "k-001". Creating a result record 58 of the semantic recognition result table 56 of the buttoned source data 50 shown in FIG. 10 from this result gives category="c-001", keyword="k-001", doc="Text A", as shown in the first <result record> of the <result table> in Table 14. Here, doc refers to the file name of the text unit shown in the legend of FIG. 3.

Next, "〜てしまう" is defined in the keyword definition section with k_id="k-003" and type="concept", and in the category definition section the category whose keyword reference is k_id="k-003" is defined with c_id="c-001" and category name "Complaint". It follows that "Text A", which contains "〜てしまう", belongs to the category "Complaint" just as with "〜ない", that the category identifier is "c-001", and that the identifier of the related keyword is "k-003". Creating a result record 58 of the semantic recognition result table 56 of the buttoned source data 50 shown in FIG. 10 from this result gives category="c-001", keyword="k-003", doc="Text A", as shown in the sixth <result record> from the top of the <result table> in Table 14.

Buttoned source data is created in the same way for the remaining keywords, from "〜て下さい" onward. The meaning of the text data is recognized in the same way not only for "Text A" but also for the text units from "Text B" onward.
In this way, using the target text data and the semantic recognition rule specified in FIG. 9, the meaning of the content of every text unit included in the target text data is recognized, and the result is output, following the structure of the buttoned source data 50 shown in FIG. 10, to an internal file called the buttoned source data file and saved in the storage device 12.
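The lookup from an extracted keyword to its category, and from there to a result record, can be sketched as follows. The function name `build_result_records` and the dictionary shapes are assumptions for illustration. The `categories` mapping is only an excerpt limited to the two keyword references the description spells out for "Text A", and the output order (category, then keyword, then text) is likewise an assumption rather than something the description fixes.

```python
def build_result_records(extracted, categories):
    """Return (c_id, k_id, doc) tuples for every keyword hit in a category.

    extracted:  maps a text unit name to the set of keyword ids found in it.
    categories: maps a category id to the keyword ids it references.
    """
    records = []
    for c_id, keyword_refs in categories.items():
        for k_id in keyword_refs:
            for doc in sorted(extracted):
                if k_id in extracted[doc]:
                    records.append((c_id, k_id, doc))
    return records


# Excerpt: "〜ない" is k-001 and "〜てしまう" is k-003, both referenced by c-001.
categories = {"c-001": ["k-001", "k-003"]}
extracted = {"Text A": {"k-001", "k-003"}}

records = build_result_records(extracted, categories)
print(records)
# [('c-001', 'k-001', 'Text A'), ('c-001', 'k-003', 'Text A')]
```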

Once buttoned source data exists, an operating environment based on semantic buttons can be created. When the user requests the text data analysis system using operation buttons to use the semantic button automatic generation tool 24 shown in FIG. 2, the system calls the semantic button automatic generation function, reads the buttoned source data from the storage device 12, automatically generates the semantic buttons, and displays them on the display device 14.

That is, the button class generation function 24a of the semantic button automatic generation tool 24 extracts the categories from the buttoned source data and generates button classes from them. A button class is one of the elements that make up the semantic buttons and is generated for each category. It groups and displays the individual buttons that belong to the same category. An individual button is also one of the elements that make up the semantic buttons and is generated for each keyword. Individual buttons are displayed grouped under the button class corresponding to their category. When an individual button is selected, the system looks at the other keywords contained in the text data that contains the keyword corresponding to that button, and redisplays the individual buttons of the button classes corresponding to the categories of those keywords, narrowed down to the buttons that correspond to those keywords.

When the buttoned source data shown in Table 14 above is used, the categories with the identifiers "c-001", "c-002", "c-003", and "c-004" are mapped to button classes, and, according to the category definitions of the rule definition example, their respective category names "Complaint", "Question", "Request", and "Printer" become the button class names. In addition, regardless of the content of the text data, the doc field (file name of the text unit) of the result table records of the buttoned source data is always the subject of a reference button class. Button classes in the semantic buttons come in two types: analysis and reference. The former are used to analyze text data by selecting individual buttons that correspond to keywords with significant meaning. The latter are used to retrieve related data by selecting individual buttons that correspond to unique data values.

Depending on the application, if the result table records of the buttoned source data contain reference fields other than the analysis target text data, such as "author" or "creation date and time", these can also be treated as categories and turned into reference button classes.

In the rule definition example applied to the buttoned source data, a category named "New product" (c_id="c-006") is defined and designated as an applicable category, but semantic recognition found no text data that falls under it. In this case, no result record with the category "c-006" is output to the result table of the buttoned source data, so no button class is generated for this category either.
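Button class generation as described above depends only on which category identifiers actually occur in the result table. A minimal sketch, assuming result records are `(c_id, k_id, doc)` tuples; the function name is hypothetical, and the keyword id "k-xxx" is a placeholder for identifiers the description does not give.

```python
def generate_button_classes(records, category_names):
    """Map each category id that occurs in the records to its button class name."""
    present = sorted({c_id for c_id, _k_id, _doc in records})
    return {c_id: category_names[c_id] for c_id in present}


# Category names from the rule definition example, including the unused c-006.
category_names = {
    "c-001": "Complaint",
    "c-002": "Question",
    "c-003": "Request",
    "c-004": "Printer",
    "c-006": "New product",
}

# Excerpt of a result table; "k-xxx" stands in for ids not stated in the text.
records = [
    ("c-001", "k-001", "Text A"),
    ("c-002", "k-xxx", "Text B"),
    ("c-003", "k-xxx", "Text A"),
    ("c-004", "k-xxx", "Text A"),
]

button_classes = generate_button_classes(records, category_names)
print(button_classes)
# {'c-001': 'Complaint', 'c-002': 'Question', 'c-003': 'Request', 'c-004': 'Printer'}
```

Because "c-006" never appears in a record, no "New product" button class is produced even though the category is defined.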

The individual button generation function 24b of the semantic button automatic generation tool 24 extracts the keywords from the buttoned source data and generates individual buttons from them. Individual buttons are subordinate to a button class, and the way they are generated differs depending on whether they belong to an analysis button class or a reference button class.

The method of generating the individual buttons that belong to an analysis button class is as follows. Focus on one category present in the result table shown in Table 14 and narrow the table down to the result records that match that category identifier. From the narrowed result records, take the unique keyword identifiers and generate the individual buttons that belong to the button class corresponding to that category. In the buttoned source data shown in Table 14 above, focusing on the category identifier "c-001" (category name "Complaint") narrows the table down to the 11 result records shown in Table 15.

The unique keyword identifiers present in these result records are "k-001", "k-002", "k-003", "k-005", and "k-025", which correspond respectively to the keywords "〜ない" ("not ..."), "〜ず" (a negative form), "〜てしまう" ("ended up ..."), "〜おかしい" ("... is strange"), and "異常終了" (abnormal termination). From these five keywords, the individual buttons belonging to the analysis button class "Complaint", which corresponds to the category identifier "c-001", are generated.
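The narrowing and de-duplication steps for an analysis button class can be sketched as below. The function name is hypothetical, and the records are only a five-record excerpt reconstructed from relationships stated elsewhere in the description (which texts contain "〜ない", "〜てしまう", and "異常終了"); they are not the 11 records of Table 15, so the excerpt yields three buttons rather than five.

```python
def analysis_buttons(records, c_id, keyword_names):
    """Return the individual button names for one analysis button class."""
    # Keep only the records of the category, then take the unique keyword ids.
    k_ids = sorted({k_id for c, k_id, _doc in records if c == c_id})
    return [keyword_names[k_id] for k_id in k_ids]


keyword_names = {
    "k-001": "〜ない",
    "k-002": "〜ず",
    "k-003": "〜てしまう",
    "k-005": "〜おかしい",
    "k-025": "異常終了",
}

# Excerpt only (not Table 15).
records = [
    ("c-001", "k-001", "Text A"),
    ("c-001", "k-001", "Text J"),
    ("c-001", "k-003", "Text A"),
    ("c-001", "k-025", "Text L"),
    ("c-001", "k-025", "Text M"),
]

print(analysis_buttons(records, "c-001", keyword_names))
# ['〜ない', '〜てしまう', '異常終了']
```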

The method of generating reference button classes and their individual buttons is described below. This method is based on the invention described in Patent Document 1. Combining the operation buttons of the present invention with the operation button generation method of the invention described in Patent Document 1 makes it possible to realize even more useful operation buttons.

To generate a button class from a reference field present in the result table shown in Table 14, the values of that field are extracted from all result records. From the unique values among them, the individual buttons belonging to the reference button class corresponding to that field are generated. In the buttoned source data creation example above, focusing on the reference field doc (file name of the text unit), the unique values of this field across all result records are "Text A", "Text B", "Text C", "Text D", "Text E", "Text F", "Text G", "Text H", "Text I", "Text J", "Text K", "Text L", and "Text M", and from these the individual buttons of a reference button class (named "Target") are generated.

The category field of the semantic recognition result table is used to generate analysis button classes, as described above, but it can also be used to generate a reference button class by treating it as a reference field. In this example, focusing on the category field as a reference field, the unique values "c-001", "c-002", "c-003", and "c-004" are extracted from all result records, the corresponding category names are taken from the category definition section of the rule definition example, and from these the individual buttons of a reference button class (named "Main classification") are generated. Because each unique value of this field corresponds to one or more semantic recognition result records, the name of each individual button in this button class is formed by appending "グループ" ("group") to the category name. The individual buttons belonging to this button class are therefore "Complaint group", "Question group", "Request group", and "Printer group".
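Both reference button classes come down to collecting the unique values of one field. The sketch below assumes `(c_id, k_id, doc)` tuples; the function names are hypothetical, the sorted order is a choice made here, and the records are a short excerpt with "k-xxx" as a placeholder for a keyword id the description does not give.

```python
def unique_values(records, field_index):
    """Return the sorted unique values of one field across all result records."""
    return sorted({record[field_index] for record in records})


def main_classification_buttons(records, category_names):
    """Name each category button by appending ' group' to the category name."""
    return [category_names[c_id] + " group" for c_id in unique_values(records, 0)]


category_names = {"c-001": "Complaint", "c-002": "Question"}

# Excerpt only; "k-xxx" is a placeholder keyword id.
records = [
    ("c-001", "k-001", "Text A"),
    ("c-001", "k-001", "Text J"),
    ("c-002", "k-xxx", "Text B"),
]

print(unique_values(records, 2))
# ['Text A', 'Text B', 'Text J']
print(main_classification_buttons(records, category_names))
# ['Complaint group', 'Question group']
```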

When the semantic button automatic generation tool has generated the semantic buttons from the buttoned source data, the text data analysis system using operation buttons displays the semantic button analysis tool dialog on the display device 14. FIG. 11 shows an example of this display.

As shown in FIG. 2, the semantic button analysis tool 26 consists of the semantic button parallel relation display function 26a and the text data search and content display function 26b. The dialog of FIG. 11 is displayed by the semantic button parallel relation display function 26a. "Main classification", "Complaint", "Question", "Request", "Printer", and "Target" in FIG. 11 are button classes displayed in parallel. Of these, the button classes from "Complaint" through "Printer" are analysis button classes corresponding to the categories of the result records, "Main classification" is a reference button class corresponding to the category field, and "Target" is a reference button class corresponding to doc (file name of the text unit). "Complaint group", "Question group", "Request group", and "Printer group", displayed under the "Main classification" button class, are the individual buttons belonging to "Main classification". Likewise, the buttons displayed under each button class from "Complaint" through "Target" are the individual buttons belonging to that button class. The individual buttons of each button class are associated with the individual buttons of the other button classes through the analysis target text data, and are selected in order to narrow down that text data.

When the selection of individual buttons narrows the individual buttons of the "Target" button class down to one, that is, when the file names of the text units are narrowed down to a single file, the text data search and content display function 26b reads that file and displays its content in the "Content of selected text" area at the bottom of the dialog shown in FIG. 11. In FIG. 11 no individual button is selected and the individual buttons of "Target" are not narrowed down to one, so no text content is displayed.

When the individual button "Text A" of the "Target" button class is selected in the state shown in FIG. 11, the semantic button parallel relation display function 26a updates the display of the semantic buttons as a whole as shown in FIG. 12 (the selected "Text A" is shown with a bold frame). Specifically, only the individual buttons related to "Text A", that is, the individual buttons of the button classes that correspond to the keywords contained in that text, are redisplayed.

As shown in FIG. 12, "Text A" is classified into three categories: "Complaint", "Request", and "Printer". The keywords "〜ない" and "〜てしまう", which belong to the category "Complaint", have been extracted, and the corresponding individual buttons are displayed. Similarly, the keyword "〜て下さい", which belongs to the category "Request", has been extracted, and the corresponding individual button is displayed. In addition, the keywords "プリンタ" (printer), "印刷" (print), and "PRT-100", which belong to the category "Printer", have been extracted, and the corresponding individual buttons are displayed. The individual button "Question group" is hidden in "Main classification", and all individual buttons of the "Question" button class are hidden as well, which shows that "Text A" contains no keyword belonging to the category "Question".

Because the text units have been narrowed down to the single file name "Text A", the text data search and content display function 26b displays the content of "Text A". In the content displayed in FIG. 12, the underlined bold portions are the character strings that correspond to the extracted keywords.
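The redisplay for FIG. 12 can be sketched as recomputing, for the texts that remain after a selection, which individual buttons are still related to them. In this sketch records are `(category name, keyword, doc)` tuples and the function name is hypothetical. The six "Text A" records follow the description; the single "Text B" record assumes the keyword "〜か", since the description only says that "Text B" contains "〜か" or "〜?".

```python
def visible_buttons(records, remaining_docs):
    """Return, per button class, the individual buttons related to the remaining texts."""
    visible = {}
    for category, keyword, doc in records:
        if doc in remaining_docs:
            visible.setdefault(category, set()).add(keyword)
    return visible


records = [
    ("Complaint", "〜ない", "Text A"),
    ("Complaint", "〜てしまう", "Text A"),
    ("Request", "〜て下さい", "Text A"),
    ("Printer", "プリンタ", "Text A"),
    ("Printer", "印刷", "Text A"),
    ("Printer", "PRT-100", "Text A"),
    ("Question", "〜か", "Text B"),  # assumed keyword for Text B
]

shown = visible_buttons(records, {"Text A"})
print(sorted(shown))
# ['Complaint', 'Printer', 'Request']
```

The "Question" class has no entry in the result, which corresponds to its individual buttons being hidden once "Text A" is selected.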

When the individual button "〜ない" of the "Complaint" button class is selected in the state shown in FIG. 11, the semantic button parallel relation display function 26a updates the display of the semantic buttons as a whole as shown in FIG. 13. Specifically, only the individual buttons related to "〜ない" are redisplayed, that is, the individual buttons of the button classes that correspond to the other keywords contained in "Text A" and "Text J", the texts that contain the keyword "〜ない".

FIG. 13 shows that, in the "Target" class that references the text data, the texts containing the keyword "〜ない" of the category "Complaint" are "Text A" and "Text J". It also shows that these texts contain the keyword "〜て下さい", which belongs to the category "Request", and the keywords "プリンタ", "印刷", and "PRT-100", which belong to the category "Printer", and that they contain no keyword belonging to the category "Question".

When the individual button "異常終了" (abnormal termination) of the "Complaint" button class is selected in the state shown in FIG. 13, the semantic button parallel relation display function 26a updates the display of the semantic buttons as a whole as shown in FIG. 14. Specifically, only the individual buttons of the button classes that correspond to the other keywords contained in "Text A", "Text J", "Text L", and "Text M", the texts that contain the keyword "〜ない" or "異常終了", are redisplayed.

FIG. 14 shows that, in the "Target" class that references the text data, the texts containing the keyword "〜ない" or "異常終了" of the category "Complaint" are "Text A", "Text J", "Text L", and "Text M". It also shows that these texts contain the keywords "〜て下さい" and "〜てほしい", which belong to the category "Request", and the keywords "プリンタ", "印刷", and "PRT-100", which belong to the category "Printer", and that they contain no keyword belonging to the category "Question".

Furthermore, when the individual button "Question group" of the "Main classification" button class is selected in the state shown in FIG. 11, the semantic button parallel relation display function 26a updates the display of the semantic buttons as a whole as shown in FIG. 15. Specifically, only the individual buttons of the button classes that correspond to the other keywords contained in "Text B", "Text C", "Text G", and "Text H" are redisplayed; these are the texts that contain any keyword belonging to the category "Question", that is, "〜か" or "〜?".

FIG. 15 shows that, in the "Target" class that references the text data, the texts that fall under "Question group" (that is, that contain either of the keywords "〜か" and "〜?") are "Text B", "Text C", "Text G", and "Text H", and that these texts contain at least one of the keywords "プリンタ" (printer), "インク" (ink), and "PRT-200", which belong to the category "Printer". It also shows that they contain no keyword belonging to the categories "Complaint" and "Request".

Furthermore, when the individual button "インク" (ink) of the "Printer" button class is selected in the state shown in FIG. 15, the semantic button parallel relation display function 26a updates the display of the semantic buttons as a whole as shown in FIG. 16. Specifically, only the individual buttons for "Text C" and "Text H", which fall under "Question group" and also contain the keyword "インク", are redisplayed.

FIG. 16 shows that, in the "Target" class that references the text data, the texts that fall under "Question group" and also contain the keyword "インク" of the category "Printer" are "Text C" and "Text H", and that these texts contain no keyword belonging to the categories "Complaint" and "Request".
In the semantic button analysis tool example described above, when multiple individual buttons are selected across different button classes, the narrowing conditions are combined by logical AND, and when multiple individual buttons are selected within the same button class, the narrowing conditions are combined by logical OR.

Other combinations are also conceivable for the logical operation applied to the narrowing conditions when multiple individual buttons are selected: logical OR across different button classes with logical AND within the same button class, logical AND for both, or logical OR for both. All of these can of course be implemented as well.
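The combinations of logical operations described above can be sketched with Python's built-in `all` and `any` standing in for AND and OR. Everything here is an illustrative assumption: the function name `narrow`, the `selection` mapping from a button class to its selected buttons, and the `(category name, keyword, doc)` record shape. The records are a small excerpt consistent with the figures discussed; which of the two "Question" keywords each text contains is not stated, so "〜か" is assumed.

```python
def narrow(records, selection, across=all, within=any):
    """Return the texts that satisfy the selected individual buttons.

    across combines conditions between button classes (default AND);
    within combines conditions inside one button class (default OR).
    """
    buttons = {}
    for category, keyword, doc in records:
        per_class = buttons.setdefault(doc, {})
        per_class.setdefault(category, set()).add(keyword)
        per_class.setdefault("Main classification", set()).add(category + " group")
        per_class.setdefault("Target", set()).add(doc)
    return sorted(
        doc for doc, per_class in buttons.items()
        if across(
            within(button in per_class.get(cls, set()) for button in chosen)
            for cls, chosen in selection.items()
        )
    )


records = [
    ("Complaint", "〜ない", "Text A"),
    ("Complaint", "〜ない", "Text J"),
    ("Complaint", "異常終了", "Text L"),
    ("Complaint", "異常終了", "Text M"),
    ("Question", "〜か", "Text B"),  # question keyword assumed for B, C, G, H
    ("Question", "〜か", "Text C"),
    ("Question", "〜か", "Text G"),
    ("Question", "〜か", "Text H"),
    ("Printer", "インク", "Text C"),
    ("Printer", "インク", "Text H"),
]

# OR within one class, as in FIG. 14.
print(narrow(records, {"Complaint": {"〜ない", "異常終了"}}))
# ['Text A', 'Text J', 'Text L', 'Text M']

# AND across classes, as in FIG. 16.
print(narrow(records, {"Main classification": {"Question group"}, "Printer": {"インク"}}))
# ['Text C', 'Text H']
```

Passing `within=all` or `across=any` gives the alternative combinations; for example, `narrow(records, {"Complaint": {"〜ない", "異常終了"}}, within=all)` returns an empty list for this excerpt because no text contains both keywords.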

As described above, according to this example, operation buttons (semantic buttons) are generated automatically from the various meanings that text data carries. Even an end user with little computer experience and no specialist knowledge can, simply by selecting these buttons, search a large volume of text data from the many viewpoints that correspond to arbitrary categories and keywords, narrowing the hierarchy dynamically and in multiple dimensions, and can also search while viewing the relationships with keywords of other categories, as multiple button classes and individual buttons are redisplayed in parallel.

FIG. 1 is a diagram showing an outline of a computer system for implementing the present invention.
FIG. 2 is a diagram showing the software configuration of the text data analysis system using operation buttons.
FIG. 3 is a diagram showing an example of the text data used as the analysis target.
FIG. 4 is a diagram showing the structure of the semantic recognition rule.
FIG. 5 is a diagram showing an example of the semantic recognition rule name designation dialog.
FIG. 6 is a diagram showing an example of the keyword definition dialog.
FIG. 7 is a diagram showing an example of the category definition dialog.
FIG. 8 is a diagram showing an example of the file save dialog.
FIG. 9 is a diagram showing an example of the semantic recognition dialog.
FIG. 10 is a diagram showing the structure of the buttoned source data.
FIG. 11 is a diagram showing the semantic button analysis tool dialog.
FIG. 12 is a diagram showing the semantic button analysis tool dialog when the individual button "Text A" of the button class "Target" is selected and the display is updated in the state shown in FIG. 11.
FIG. 13 is a diagram showing the semantic button analysis tool dialog when the individual button "〜ない" of the button class "Complaint" is selected and the display is updated in the state shown in FIG. 11.
FIG. 14 is a diagram showing the semantic button analysis tool dialog when the individual button "異常終了" of the button class "Complaint" is selected and the display is updated in the state shown in FIG. 13.
FIG. 15 is a diagram showing the semantic button analysis tool dialog when the individual button "Question group" of the button class "Main classification" is selected and the display is updated in the state shown in FIG. 11.
FIG. 16 is a diagram showing the semantic button analysis tool dialog when the individual button "インク" of the button class "Printer" is selected and the display is updated in the state shown in FIG. 15.

Explanation of symbols

10 Central processing unit
12 Storage device
14 Display device
16 Input device
20 Meaning recognition rule definition tool
20a Keyword definition function
20b Category definition function
20c Applicable category specification function
20d Rule name specification function
22 Meaning recognition tool
22a Analysis target text data specification function
22b Meaning recognition rule specification function
22c Buttonized source data creation function
24 Meaning button automatic generation tool
24a Button class generation function
24b Individual button generation function
26 Analysis tool using meaning buttons
26a Meaning button parallel related display function
26b Text data search and content display function
30 Meaning recognition rule
32 Rule name
34 Keyword definition unit
36 Category definition unit
38 Applicable category specification unit
40 Specific keyword
40a Specific keyword name
40b Specific individual keyword
40c Specific individual exclusion keyword
42 Concept keyword
42a Concept keyword name
42b Concept individual keyword
42c Concept individual exclusion keyword
44 Category
44a Category name
44b Reference to a defined keyword
46 Reference to a defined category
50 Buttonized source data
52 Meaning recognition rule file name
54 Analysis target text data storage location
56 Meaning recognition result table
58 Result record
58a Category
58b Keyword
58c Text data file name

Claims

1. An operation button generation method for computer processing of text data, in which operation buttons for computer processing, used to retrieve arbitrary text data from a plurality of text data stored in a storage device under file names containing character strings, are generated by processing of a programmed computer, the method comprising:
defining a category and a keyword, the keyword being an expression element for matching against character strings in the text data;
searching for text data whose character strings contain the keyword;
generating, from the category, the keyword, and the text data retrieved as containing the keyword, buttonized source data having a meaning recognition result table, the table having three fields, namely a category field, a keyword field, and a text data file name field, and being a set of result records whose fields hold, in one-to-one correspondence, the category, the keyword, and the file name of the text data retrieved as containing the keyword;
generating an analysis button class corresponding to each field value of the category field of the buttonized source data, and generating individual buttons belonging to the analysis button class corresponding to the field values of the keyword field;
generating a reference button class from the text data file name field of the buttonized source data, and generating individual buttons belonging to the reference button class corresponding to the field values of the text data file name field; and
displaying the button classes and the individual buttons on a display device.
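
The steps of claim 1 can be pictured with a short Python sketch. It is illustrative only and is not taken from the specification: the names `ResultRecord`, `build_result_table`, and `generate_buttons`, the plain substring match, the sample rules and texts, and the label "Reference" for the reference button class are all assumptions.

```python
from typing import NamedTuple


class ResultRecord(NamedTuple):
    """One result record of the meaning recognition result table."""
    category: str   # category field
    keyword: str    # keyword field
    file_name: str  # text data file name field


def build_result_table(rules, texts):
    """Build the result table.

    rules: {category: [keyword, ...]}  (assumed rule representation)
    texts: {file_name: text}           (assumed stand-in for stored files)
    """
    table = []
    for category, keywords in rules.items():
        for keyword in keywords:
            for file_name, text in texts.items():
                if keyword in text:  # assumed: plain substring match
                    table.append(ResultRecord(category, keyword, file_name))
    return table


def generate_buttons(table, reference_class="Reference"):
    """Return {button class: [individual button labels]}.

    Each category value becomes an analysis button class whose individual
    buttons are its keywords; the file names become the individual buttons
    of one reference button class (its label is a hypothetical choice).
    """
    buttons = {}
    for rec in table:
        labels = buttons.setdefault(rec.category, [])
        if rec.keyword not in labels:
            labels.append(rec.keyword)
    files = []
    for rec in table:
        if rec.file_name not in files:
            files.append(rec.file_name)
    buttons[reference_class] = files
    return buttons


if __name__ == "__main__":
    # Hypothetical sample data, not from the patent.
    rules = {"Printer": ["ink"], "Complaint": ["abnormal end"]}
    texts = {"A.txt": "ink cartridge; abnormal end", "B.txt": "ink is low"}
    print(generate_buttons(build_result_table(rules, texts)))
```

A display layer would then render each dictionary key as a button class and each list entry as an individual button. The sketch leaves that part out.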

2. The operation button generation method for computer processing of text data according to claim 1, wherein the analysis button classes and the reference button class are displayed in parallel on the display device together with their individual buttons, and when any individual button belonging to one analysis button class is selected, the result records that include the keyword corresponding to the field value of the keyword field for the selected individual button, and at least one of the other keywords included in the text data that includes that keyword, are extracted from the meaning recognition result table, and the individual buttons belonging to the analysis button classes and the reference button class are generated on the basis of the extracted result records and redisplayed.

3. The operation button generation method for computer processing of text data according to claim 1, wherein, when any individual buttons of any analysis button classes are selected in any order, the result records including all of the keywords corresponding respectively to the field values of the keyword field for the selected individual buttons are extracted from the meaning recognition result table, and the individual buttons belonging to the analysis button classes and the reference button class are generated on the basis of the extracted result records and redisplayed.
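
One possible reading of the narrowing described in claims 2 and 3 is sketched below in Python. The function name `narrow`, the tuple layout, and the sample table are hypothetical. The sketch also assumes that "extracting" means keeping every result record of each text data file that contains all selected keywords, which is an interpretation rather than something the claims state in these words.

```python
def narrow(table, selected):
    """Keep the result records of files that contain every selected keyword.

    table:    list of (category, keyword, file_name) tuples
    selected: iterable of (category, keyword) pairs, one per selected
              individual button; the selection order does not matter
    """
    # Index: which files does each (category, keyword) pair occur in?
    files_by_button = {}
    for category, keyword, file_name in table:
        files_by_button.setdefault((category, keyword), set()).add(file_name)

    # Intersect the file sets of all selected buttons.
    matching = None
    for button in selected:
        files = files_by_button.get(button, set())
        matching = files if matching is None else matching & files

    if matching is None:  # nothing selected: keep the whole table
        return list(table)
    # Keep all records of the matching files, so the other keywords found
    # in those files remain available as buttons after redisplay.
    return [rec for rec in table if rec[2] in matching]


if __name__ == "__main__":
    # Hypothetical sample table, not from the patent.
    table = [
        ("Complaint", "abnormal end", "A.txt"),
        ("Printer", "ink", "A.txt"),
        ("Printer", "ink", "B.txt"),
        ("Complaint", "not", "B.txt"),
    ]
    print(narrow(table, [("Printer", "ink"), ("Complaint", "abnormal end")]))
```

Regenerating the button classes from the narrowed records then yields the redisplayed dialog. Because set intersection is commutative, selecting the same buttons in a different order gives the same result.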

4. The operation button generation method for computer processing of text data according to claim 1, wherein:
the keywords are defined;
the categories are defined, with the categories and the keywords associated with each other, by separately defining, for each category, references to the already defined keywords;
keywords that match the defined keywords are extracted from the text data and each is associated with the category in which a reference to that keyword is defined; and
the meaning recognition result table is generated on the basis of the result of the association.

5. The operation button generation method for computer processing of text data according to any one of claims 1 to 4, wherein the keywords are classified and defined as specific keywords, each composed of a specific character string, and concept keywords, each composed of a character string that includes an abstracted portion; for a specific keyword, the specific character string itself is used for matching against character strings in the text data, and for a concept keyword, the character string obtained by removing the abstracted portion from the character string that includes it is used for matching against character strings in the text data.
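
The distinction between specific and concept keywords in claim 5 might be sketched as follows in Python. The marker "〜" for the abstracted portion is taken from the button label "〜ない" in the drawings, but treating it as the general notation is an assumption. The helper names and sample strings are likewise hypothetical.

```python
ABSTRACT_MARK = "〜"  # assumed notation for the abstracted portion


def is_concept_keyword(keyword: str) -> bool:
    """A keyword containing the abstracted portion is a concept keyword."""
    return ABSTRACT_MARK in keyword


def match_string(keyword: str) -> str:
    """Return the literal string that is compared with the text.

    Specific keyword: the character string itself.
    Concept keyword:  the string with the abstracted portion removed.
    """
    return keyword.replace(ABSTRACT_MARK, "")


def matches(keyword: str, text: str) -> bool:
    """True if the keyword's match string occurs in the text."""
    literal = match_string(keyword)
    return bool(literal) and literal in text


if __name__ == "__main__":
    # Concept keyword "〜ない" matches any text ending a phrase in "ない".
    print(matches("〜ない", "印刷できない"))
    # Specific keyword "インク" is matched as-is.
    print(matches("インク", "インクがない"))
```

Exclusion keywords (reference numerals 40c and 42c) are deliberately left out, since their exact matching behavior is not described in this section.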