JP6526470B2

JP6526470B2 - Pre-construction method of vocabulary semantic patterns for text analysis and response system

Info

Publication number: JP6526470B2
Application number: JP2015086484A
Authority: JP
Inventors: ジョンフンジャン; ジュンホゴ
Original assignee: 株式会社ワイズナット
Priority date: 2015-02-23
Filing date: 2015-04-21
Publication date: 2019-06-05
Anticipated expiration: 2035-04-21
Also published as: KR101589621B1; JP2016157407A

Description

The present invention relates to a method of constructing LSP (Lexico-Semantic Pattern) knowledge, which forms the basis of a system that analyzes the meaning of natural-language text and responds to it, and in particular to a method of constructing LSP knowledge for a speech recognition system.

Technology that lets machines recognize and react to human speech is applied in many areas of everyday life. Well-known examples are Apple's Siri (registered trademark) and Google Now, in which a machine (a smartphone) recognizes a person's voice and either responds or executes various control commands. Such systems analyze the user's input sentence using text mining technology to grasp its meaning, then generate and output a response that fits the user's intent. The same approach is applied not only in Siri and Google Now but across a variety of artificial intelligence systems, such as robot systems and natural language processing systems for keyword extraction, text summarization, and the like.

For a question-and-answer system to analyze user input text, the text goes through natural language analysis steps such as morphological and syntactic analysis. Research of this kind has a long history and has developed along two lines: pattern-based analysis and statistics-based analysis. Pattern-based analysis turns words, morphemes, and syntactic constructions that recur across many sentences into patterns in LSP (lexical semantic pattern) form, and analyzes sentences by assigning a meaning to each LSP. Patent Document 1 discloses a lexical semantic pattern reconstruction method for recognizing Korean syntax using LSP technology; that patent was completed by the inventors of the present invention.

As a pattern-based analysis method, an LSP is a grammar rule that expresses syntactic structure together with information such as vocabulary, morphemes, and parts of speech, and it is well known in natural language processing. LSP technology enables one-dimensional syntactic analysis of natural language for which syntactic analysis is otherwise difficult. In such a pattern-based method, however, the system administrator must define the LSP knowledge in advance to suit the questions that will be input. The issue is not limited to building LSP knowledge. Once text analysis of an input question finishes, the resulting information is used in a response generation step. Therefore, unless responses that fit the questions have been built efficiently in advance, the system may fail to provide response information or may output a wrong response.

In short, to provide a high-quality service with a text analysis and response system, both the LSPs, which are the basic knowledge for analyzing questions, and the response data that fits those questions must be well constructed in advance. The inventors therefore studied at length how to build LSP knowledge efficiently, and as a result completed the present invention.

Republic of Korea Patent No. 1409298

An object of the present invention is to provide a multi-stage LSP knowledge construction method for responding efficiently to user questions, and thereby to establish an environment for providing an effective question-and-answer system service. Other objects of the present invention that are not stated explicitly will additionally be considered within the scope that can be readily inferred from the following detailed description and its effects.

To solve this problem, the present invention provides a method of pre-building lexical semantic patterns for a text analysis and response system, the method being performed by an administrator terminal of a question-and-answer system and comprising:
(a) defining in advance concepts, each being a set to which lexical semantic patterns matched against an input sentence from a questioner terminal belong;
(b) collecting sample data, being sentences that the lexical semantic patterns are to cover, and classifying it according to the concepts;
(c) defining semantic features, which are the basic units constituting the meaning of a concept, to build a semantic feature dictionary, and structuring one or more entries, being words having the same meaning, into one set belonging to each semantic feature;
(d) building the lexical semantic patterns for recognizing the sample data by combining the semantic features with predetermined grammatical expressions; and
(e) building in advance, for each concept, response data that responds to the input sentence from the questioner terminal.

In the method of pre-building lexical semantic patterns for a text analysis and response system according to a preferred embodiment of the present invention, the concepts of step (a) preferably have a hierarchical structure.

In the method according to a preferred embodiment of the present invention, step (b) further includes adding or modifying a concept when no concept corresponding to the sample data has been defined.

In the method according to a preferred embodiment of the present invention, the lexical semantic patterns of step (d) use grammatical expressions for recognizing the sample data together with the semantic features of step (c).

In the method according to a preferred embodiment of the present invention, the parts of the response data of step (e) that should change according to the input sentence from the questioner terminal are preferably designated as variables.

According to the present invention, the LSP knowledge of a question-and-answer system can be built effectively, and the knowledge can of course also be managed and maintained efficiently. Effects not explicitly mentioned here, namely the effects described in the following specification that are expected from the technical features of the invention and their provisional effects, are to be treated as if described in this specification.

FIG. 1 is a block diagram showing an example system configuration in one scenario that uses an LSP (lexical semantic pattern)-based question-and-answer system.
FIG. 2 is a flowchart schematically showing the overall process of the present invention according to a preferred embodiment.
FIG. 3 shows an example screen layout of the administrator terminal when building concepts according to the present invention.
FIG. 4 shows an example of a semantic feature dictionary table 200 in which semantic features are defined by the method of the present invention.
FIG. 5 shows an example structure of the entry table 201 for the semantic feature "meeting", numbered 500 in FIG. 4.
FIG. 6 shows an example structure of an LSP construction table 300 generated by the method of the present invention.
The attached drawings are given by way of illustration to aid understanding of the technical idea of the present invention and do not limit its scope.

Specific details for carrying out the present invention are described below with reference to the attached drawings. In describing the invention, a detailed description of related known functions is omitted where they are obvious to those skilled in the art and could obscure the gist of the invention.

FIG. 1 shows one scenario among the use cases of an LSP (lexical semantic pattern)-based question-and-answer system. The illustrated example mainly concerns input sentences given by voice, but the LSP knowledge construction method of the present invention can also be used when the questioner's input sentence is entered as text rather than voice.

When the questioner accesses the user device 10 and enters an input sentence, it is converted into text by the speech recognizer 11 built into the user device 10. The question-and-answer system 12 of the user device 10 analyzes the input sentence using the pre-built LSP knowledge and presents response data that fits the question to the questioner through the response output unit 13.

The data and program code needed to run the question-and-answer system 12 are built in a storage device. In one preferred embodiment of the present invention, they are built in the memory of the user device 10, as in FIG. 1. In another preferred embodiment, the storage device in which the question-and-answer system 12 is built is located outside the user device 10; in that case the user device 10 uses the question-and-answer system 12 residing in the external storage device over network communication to output response data to the questioner.

The method of the present invention concerns building, in advance and in stages, the LSP knowledge that makes up the question-and-answer system 12, and it is carried out by the administrator terminal 20. The LSP knowledge described below includes the databases used to analyze the questioner's input sentence and extract response data, and the software modules for using the data stored in those databases. As for actually using the LSP knowledge built in advance by the present invention, the functions and interactions of the hardware and software in FIG. 1 and its many variations are either known art or include modifications and various improvements of known art.

FIG. 2 illustrates the overall process of the LSP knowledge construction method according to a preferred embodiment of the present invention. It is also the process of the method of pre-building LSPs for a text analysis and response system. Each step is performed by the administrator terminal, a computing system in which the hardware and software modules are installed.

First, concepts are built (S10). A concept serves as a set to which lexical semantic patterns belong, and it is the unit that determines which response is given to an input sentence. That is, the response sentence output to the user is the one registered in the concept to which the LSP matched against the questioner terminal's input sentence belongs.

Preferably, the concepts defined and built in S10 have a hierarchical structure. FIG. 3 shows the concept generation screen 100, in which a plurality of concepts form a hierarchy. For example, a common concept is defined with the concepts period, time, place, attendee, and filtering as its subcategories, and period, time, place, and attendee are in turn defined and registered under the filtering concept. In building concepts this way, the meanings of sentences are classified and defined in layers, from broad categories of meaning down to detailed meanings. A concept may or may not have LSPs. In the present invention, however, every LSP generated below necessarily belongs to a concept.
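The hierarchy described above can be sketched as a small tree. This is a minimal illustration only: the `Concept` class, its fields, and the slash-separated path syntax are assumptions made for this example, not structures defined in the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """Hypothetical node: a named set of LSPs with optional sub-concepts."""
    name: str
    lsps: List[str] = field(default_factory=list)      # may stay empty
    children: Dict[str, "Concept"] = field(default_factory=dict)

    def add_child(self, name: str) -> "Concept":
        # Reuse an existing child so registering twice is harmless.
        if name not in self.children:
            self.children[name] = Concept(name)
        return self.children[name]

    def find(self, path: str) -> "Concept":
        # Resolve a slash-separated path such as "filtering/time".
        node = self
        for part in path.split("/"):
            node = node.children[part]
        return node


# Rebuild the example from the text: common with five subcategories,
# and four further concepts registered under filtering.
common = Concept("common")
for sub in ("period", "time", "place", "attendee", "filtering"):
    common.add_child(sub)
for sub in ("period", "time", "place", "attendee"):
    common.find("filtering").add_child(sub)
```

Keeping each node's LSP list separate from its children reflects the statement that a concept may exist without LSPs of its own while still grouping more detailed concepts.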

By structuring things in advance so that many LSPs belong to a concept, each concept becomes a set of LSPs. LSPs that can analyze text with similar content can thus be bundled into one concept and managed more effectively.

To build the LSPs belonging to each concept, sample data, that is, the text they are to cover, must be secured. Sample data is collected and classified according to the concepts (S20). The more sample data is collected, the more refined the concepts and LSPs that can be built, which directly affects the performance of the question-and-answer system. Each piece of collected sample data is classified under one of the concepts already built; if a piece is difficult to classify under any particular concept, that is, if no concept corresponds to it, a concept is added or modified.
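The bookkeeping side of S20 can be sketched as follows. In the specification the classification itself is done at the administrator terminal; the code only records the decision and adds a concept when none exists yet. The names `known_concepts`, `samples_by_concept`, `classify`, and the concept name "schedule" are hypothetical.

```python
from collections import defaultdict

# Concepts already built in S10 (path-style names are an assumption).
known_concepts = {"reservation/meal", "reservation/meeting"}

# Concept name -> sample sentences classified under it.
samples_by_concept = defaultdict(list)


def classify(sample: str, concept: str) -> bool:
    """Record a sample under a concept; return True if the concept was newly added."""
    added = concept not in known_concepts
    if added:
        # No matching concept was defined, so one is added (step S20).
        known_concepts.add(concept)
    samples_by_concept[concept].append(sample)
    return added


classify("Tell me a good place to eat lunch for a team meal.", "reservation/meal")
# "schedule" is a hypothetical concept that did not exist beforehand.
classify("I have to set up a schedule for the project meeting...", "schedule")
```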

For convenience of explanation, consider sample data such as the following sentences:
(A) "Tell me a good place to eat lunch for a team meal."
(B) "I'm planning a project workshop, but where would be good?"
(C) "I have to set up a schedule for the project meeting..."

For the question-and-answer system to run effectively, vocabulary items that are different words but share the same meaning need to be structured. To that end, semantic features, the basic units constituting the meaning of a concept, are defined and a semantic feature dictionary is built (S30).

A semantic feature is one of the basic units that make up an LSP, and the semantic feature dictionary bundles one or more entries having the same meaning into one set.

Taking the sample sentences above, sentence (A) is composed of semantic features such as "request", "restaurant", and "purpose". Each semantic feature contains entries, for example "request (tell me)", "restaurant (place to eat)", and "purpose (team meal, lunch)". Sentence (B) is composed of semantic features such as "meeting", "purpose", and "where", and sentence (C) of semantic features such as "purpose" and "wish". The concept that covers these sentences is "reservation". From these few sample sentences, then, this concept is composed of semantic features such as "request", "restaurant", "meeting", "purpose", "where", and "wish".

In FIG. 3, the meal concept under "reservation" is the concept of "restaurant reservation", and it is composed of semantic features such as "restaurant", "request", "wish", and "where". The meeting concept under "reservation" may be composed with the "meeting" semantic feature added in place of the "restaurant" semantic feature.

This is explained further with reference to FIG. 4, which shows an example of a semantic feature dictionary table 200 in which semantic features are defined. The description below takes as an example "meeting", the semantic feature 201 numbered 500 in the semantic feature dictionary table 200.

"Conference", "project + conference", "assembly", "meeting", "project + meeting", "discussion", and "debate" have the same meaning. These words can therefore be bundled as entries of the semantic feature 201 "meeting_n": as in the entry table 210 of FIG. 5, they are grouped into a single set of entries, and that set is structured as a subordinate set of the semantic feature 201 meeting_n.

A semantic feature thus plays the role of a dictionary: lexical entries having the same meaning are added to a defined semantic feature, so each semantic feature becomes a set of entries. A domain includes semantic features formed from keywords as well as semantic features for predicate expressions.
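A semantic feature dictionary of this kind maps naturally onto a dictionary of sets. The sketch below is illustrative only: the entries are English glosses of the words listed for meeting_n, and `SEMANTIC_FEATURES`, `add_entry`, and `features_of` are hypothetical names rather than part of the described system.

```python
# Semantic feature name -> set of entries that share its meaning.
SEMANTIC_FEATURES = {
    "meeting_n": {
        "conference", "project conference", "assembly",
        "meeting", "project meeting", "discussion", "debate",
    },
}


def add_entry(feature: str, word: str) -> None:
    """Add a lexical entry to a semantic feature, creating the feature if needed."""
    SEMANTIC_FEATURES.setdefault(feature, set()).add(word)


def features_of(word: str) -> list:
    """Return the names of all semantic features that contain the word."""
    return sorted(
        name for name, entries in SEMANTIC_FEATURES.items() if word in entries
    )


# A newly observed synonym only needs to be added once, as an entry.
add_entry("meeting_n", "workshop")
```

Because patterns refer to the feature name rather than to individual words, adding an entry here would extend every pattern that uses the feature without editing the patterns themselves.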

In a lexical semantic pattern, a semantic feature is written with the symbol "@", as in "@meeting_n". Once the semantic feature dictionary has been built, it is used to build lexical semantic patterns (LSPs) for the sample data collected and classified earlier (S40).

When building an LSP, not only semantic features but also expressions such as clauses, morphemes, syllables, dictionaries, and variables based on various grammatical expressions, along with various operators, can be used. As noted above, in the present invention every LSP is made to belong to some concept.

Because S30 is executed first, an LSP that expresses one representative sentence pattern can recognize as many sentences as there are combinations of the entries of the semantic features that make up that LSP.
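The combinatorial effect can be made concrete with a toy expansion. The pattern syntax here is deliberately reduced to literal words and "@feature" tokens separated by spaces; the operators and part-of-speech symbols of the actual LSP grammar are not modeled, and the entries in `FEATURE_ENTRIES` are hypothetical.

```python
from itertools import product

# Hypothetical entries; in the described method they come from step S30.
FEATURE_ENTRIES = {
    "request": ["tell me", "recommend"],
    "restaurant": ["a cafeteria", "a restaurant", "a diner"],
}


def slots(pattern: str, features: dict) -> list:
    """Turn each token into its list of alternatives (a literal has one)."""
    return [
        features[tok[1:]] if tok.startswith("@") else [tok]
        for tok in pattern.split()
    ]


def expand(pattern: str, features: dict) -> list:
    """List every sentence the simplified pattern covers."""
    return [" ".join(choice) for choice in product(*slots(pattern, features))]


sentences = expand("@request @restaurant nearby", FEATURE_ENTRIES)
```

With two entries for request and three for restaurant, this single pattern covers 2 × 3 = 6 sentences, and each entry added to a feature multiplies the coverage again.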

FIG. 6 shows an example of the LSP construction table 300 of the present invention. The LSP construction table 300 contains part of the LSPs for the representative sentence patterns of example sentences (A), (B), and (C) from the sample data of S20. The basic components of an LSP include vocabulary, parts of speech, and morphemes; the meanings of the symbols (operators and parts of speech) used in FIG. 6 are explained in Table 1.

The questioner's input sentence is analyzed by the question-and-answer system, and when an LSP construct matching the analysis result is found, the system outputs the corresponding response data to the questioner. The response data is built in advance for this purpose (S50).
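The lookup from input sentence to response can be sketched end to end under heavy simplification. Here an LSP is reduced to a tuple of semantic features that must all appear in the input, and matching is plain substring search; the real system relies on morphological analysis and the LSP grammar. All names, entries, and response texts below are hypothetical.

```python
# Semantic feature -> entries (hypothetical, English for illustration).
FEATURES = {
    "request": {"tell me", "recommend"},
    "restaurant": {"cafeteria", "restaurant"},
    "meeting_n": {"meeting", "workshop"},
    "where": {"where"},
}

# Each simplified LSP belongs to exactly one concept.
LSPS = [
    ("meal", ("request", "restaurant")),
    ("meeting", ("meeting_n", "where")),
]

# Response data registered per concept.
RESPONSES = {
    "meal": "There is a restaurant nearby.",
    "meeting": "Here is a place that suits a workshop.",
}


def respond(sentence: str):
    """Return the response of the first concept whose LSP matches, else None."""
    text = sentence.lower()
    for concept, required in LSPS:
        # The LSP matches when every required feature has an entry in the text.
        if all(any(entry in text for entry in FEATURES[f]) for f in required):
            return RESPONSES[concept]
    return None
```

Returning `None` when nothing matches corresponds to the failure case the background section warns about, where no suitable LSP or response was prepared in advance.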

Preferably, response sentences are built in advance for each concept. That a questioner's input sentence matches an LSP means that the concept the sentence falls under has been identified, because S20 is executed beforehand and the LSP constructs built in S40 are therefore classified by concept. It is accordingly preferable that the response data built in advance in S50 also be classified and registered by concept. The parts that should change according to the input sentence are preferably designated as variables. The response data corresponding to the example sentences of the sample data presented in S20 is registered in advance as follows:
(a) "There is @restaurant within @distance nearby."
(b) "There is no better place than @distance for a workshop location."
(c) "The schedule of @project is @when."
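Filling such variables at response time amounts to simple template substitution. The sketch below assumes that a variable is "@" followed by word characters and that unknown variables are left untouched; the `fill` function and the sample values are hypothetical.

```python
import re

# A variable is "@" followed by letters, digits, or underscores (an assumption).
_VARIABLE = re.compile(r"@(\w+)")


def fill(template: str, values: dict) -> str:
    """Replace each @variable with its value; leave unknown variables as-is."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return _VARIABLE.sub(substitute, template)


reply = fill(
    "The schedule of @project is @when.",
    {"project": "Project X", "when": "3 pm on Friday"},
)
```

Leaving an unresolved variable visible, rather than dropping it silently, makes a missing value easy to spot when the response data is reviewed.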

Each block described so far represents a module, segment, or portion of code that contains one or more executable instructions for performing a specified logical function. Note also that in some alternative implementations the functions noted in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in reverse order, depending on the functionality involved. For example, the sample data collection of S20 may be performed before S10, or after S30. And although S20 naturally comes before S40, sample data may also be added as needed.

The method of pre-building lexical semantic patterns for a text analysis and response system according to the various preferred embodiments of the present invention may be implemented in the form of program instructions executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer executes using an interpreter or the like. A hardware device may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

The scope of protection of the present invention is not limited to the description and wording of the embodiments explicitly described above. It is added once more that the scope of protection is likewise not limited by changes or substitutions that are obvious in the technical field to which the invention belongs.

Claims

A method of pre-building lexical semantic patterns for a text analysis and response system, performed by an administrator terminal of a question-and-answer system, the method comprising:
(a) defining in advance concepts, each being a set to which lexical semantic patterns matched against an input sentence from a questioner terminal belong;
(b) collecting sample data, being sentences that the lexical semantic patterns are to cover, and classifying it according to the concepts;
(c) defining semantic features, which are the basic units constituting the meaning of a concept, to build a semantic feature dictionary, and structuring one or more entries, being words having the same meaning, into one set belonging to each semantic feature;
(d) building the lexical semantic patterns for recognizing the sample data by combining the semantic features with predetermined grammatical expressions; and
(e) building in advance, for each concept, response data that responds to the input sentence from the questioner terminal.

The method according to claim 1, wherein the concept of step (a) has a hierarchical structure.

The method according to claim 1, further comprising, in step (b), adding or modifying a concept when no concept corresponding to the sample data has been defined.

The method according to claim 1, wherein, in the response data of step (e), a part that should change according to the input sentence from the questioner terminal is designated as a variable.