JP2870259B2

JP2870259B2 - Japanese sentence analysis method

Info

Publication number: JP2870259B2
Application number: JP3274165A
Authority: JP
Inventors: Shinichi Doi (土井伸一)
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1991-10-22
Filing date: 1991-10-22
Publication date: 1999-03-17
Anticipated expiration: 2014-03-17
Also published as: JPH05113994A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a Japanese sentence analysis method that analyzes an input Japanese sentence and outputs a semantic representation such as an interlingua, for use in machine translation systems, Japanese text revision support systems, Japanese-language man-machine interfaces, and the like.

[0002]

[Prior Art] Current Japanese language processing systems, machine translation systems among them, can analyze simple sentences to a considerable degree. Complex and compound sentences, however, cannot be analyzed adequately, because they have features that simple sentences lack: "the number of elements is large, so the number of possible dependency combinations becomes enormous" and "ellipsis, anaphoric expressions, sharing, and the like occur frequently".

[0003] Analyzing these completely requires knowledge such as contextual information and common sense. However, as suggested by the fact that humans can infer the structure even of sentences containing many unknown words, clues for the structural analysis of complex and compound sentences are thought to be present, in some form, in the surface sentence itself. Accordingly, as a way to estimate the structure of complex and compound sentences prior to syntactic and semantic analysis, by focusing on those words in the surface text that serve to define the intra-sentence context structure, the following have been proposed: Lexical Discourse Grammar (Shinichiro Kamei and Kazunori Muraki: "A Proposal of Lexical Discourse Grammar", IEICE Technical Group on Language Understanding and Communication, 456-7, pp. 1-5 (1986)), a context estimation method using it (Japanese Patent Application No. 61-75820), and a context analysis method (Japanese Patent Application No. 61-216510). FIG. 8 shows one embodiment of the context analysis method based on Lexical Discourse Grammar.

[0004] This context analysis method is characterized in that its context analysis means comprises: context format holding means, which holds information, collected in advance, about the vocabulary that defines intra-sentence context structure, with the relations among those expressions formalized; context structure matching means, which detects candidate context structures in the input sentence prior to syntactic and semantic analysis by matching the sequence of dictionary entries obtained from morphological analysis of the input sentence against the context information held in the context format holding means; and context structure estimation means, which estimates the most plausible candidate from among the multiple context structure candidates obtained by the context structure matching means. In particular, the method focuses on function words such as conjunctive particles, which indicate the relations between the constituent simple sentences of a complex or compound sentence. Information about each function word, such as its "strength as a break within the sentence", is held in the context format holding means and used to estimate the sentence structure. FIG. 9 is a table in which connective forms between predicates, such as conjunctive particles, are classified into levels according to what elements the clause carrying each connective form can contain.

[0005]

[Problems to Be Solved by the Invention] According to the prior art described above, the quality and efficiency of syntactic and semantic analysis can be improved by using function words such as conjunctive particles as an information source to estimate, prior to syntactic and semantic analysis, the dependency relations between the constituent simple sentences of a complex or compound sentence.

[0006] However, the relations between the constituent simple sentences of a complex or compound sentence are also expressed, apart from conjunctive particles and the like, through sharing: "an element common to several simple sentences appears in only one place and is omitted elsewhere". Particularly important is the sharing of the topic marked by the binding particle "は" (wa) and of the nominative marked by the case particle "が" (ga). The prior art provides no mechanism for estimating the sharing of these topics and nominatives.

[0007] Moreover, although the conditions for topic and nominative sharing in complex and compound sentences have been described in various ways, all such descriptions treat sharing as a problem of syntactic analysis, semantic analysis, or discourse analysis. No technique has existed for estimating topic and nominative sharing at the point where morphological analysis has finished, before syntactic and semantic analysis begin.

[0008] An object of the present invention is to provide a Japanese sentence analysis method that, by collecting in advance information about the vocabulary that governs the sharing of a topic or nominative among multiple predicates, identifies topic and nominative sharing in Japanese complex and compound sentences accurately, and that, by estimating this sharing prior to syntactic and semantic analysis, reduces the ambiguity of those analyses and thereby improves the accuracy and speed of Japanese sentence analysis.

[0009]

[Means for Solving the Problems] The Japanese sentence analysis method of the present invention is a Japanese sentence analysis method provided with input reading means for reading an input Japanese sentence, and with morphological analysis means, syntactic analysis means, and semantic analysis means that analyze the input sentence using a dictionary lookup function and the dictionary information obtained by the lookup. It is characterized by comprising: topic/nominative sharing vocabulary information holding means, which holds information, collected in advance, about the vocabulary that governs the sharing among multiple predicates of the topic marked by the binding particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese; and topic/nominative sharing estimation means, which uses the vocabulary information held in that holding means to estimate whether topic or nominative sharing is present. Based on the estimate made by the topic/nominative sharing estimation means, the method identifies topic and nominative sharing in Japanese complex and compound sentences accurately and, by estimating the presence of such sharing prior to syntactic and semantic analysis, reduces the ambiguity of those analyses and improves the accuracy and speed of Japanese sentence analysis.

[0010]

[Operation] In the Japanese sentence analysis method of the present invention, when an input Japanese sentence is read by the input reading means, the morphological analysis means analyzes the input sentence using the dictionary lookup function and the dictionary information obtained by the lookup, and outputs a morphological analysis result.

[0011] The topic/nominative sharing vocabulary information holding means holds information, collected in advance, about the vocabulary that governs the sharing among multiple predicates of the topic marked by the binding particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese.

[0012] The topic/nominative sharing estimation means takes as input the morphological analysis result output by the morphological analysis means, searches that result for vocabulary held in the topic/nominative sharing vocabulary information holding means, uses the information about that vocabulary to estimate whether topic or nominative sharing is present, and outputs the morphological analysis result annotated with the estimate.

[0013] The syntactic analysis means and the semantic analysis means take as input the output of the topic/nominative sharing estimation means, perform syntactic and semantic analysis using the morphological analysis result and the estimated topic/nominative sharing information, and output the analysis result.

[0014]

[Embodiments] Embodiments of the present invention will now be described with reference to FIGS. 1 to 7.

[0015] FIG. 1 is a block diagram showing an embodiment of the invention as set forth in claim 1.

[0016] When an input Japanese sentence is read by input reading means 1, morphological analysis means 3, referring to dictionary 2, analyzes the input sentence using the dictionary lookup function and the dictionary information obtained by the lookup, and outputs a morphological analysis result.

[0017] Topic/nominative sharing vocabulary information holding means 7 holds information, collected in advance, about the vocabulary that governs the sharing among multiple predicates of the topic marked by the binding particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese.

[0018] Topic/nominative sharing estimation means 8 takes as input the morphological analysis result output by morphological analysis means 3, searches that result for vocabulary held in topic/nominative sharing vocabulary information holding means 7, uses the information about that vocabulary to estimate whether topic or nominative sharing is present, and outputs the morphological analysis result annotated with the estimate.

[0019] Syntactic analysis means 4 takes as input the output of topic/nominative sharing estimation means 8, performs syntactic analysis using the morphological analysis result and the estimated topic/nominative sharing information, and outputs a syntactic analysis result.

[0020] Semantic analysis means 5 takes as input the output of syntactic analysis means 4, performs semantic analysis using the morphological analysis result, the syntactic analysis result, and the estimated topic/nominative sharing information, and outputs analysis result 6, such as an interlingua or other semantic representation.
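The data flow of FIG. 1 (input reading means 1, morphological analysis means 3, then topic/nominative sharing estimation means 8, whose output feeds means 4 and 5) can be pictured as a chain of small stages. The Python below is only a minimal sketch under stated assumptions, not the patented implementation: the `Morpheme` class, the toy `DICTIONARY`, the greedy longest-match segmentation, and the single entry in `SHARING_LEXICON` are all invented for illustration, and the syntactic and semantic stages are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Morpheme:
    surface: str
    pos: str
    notes: dict = field(default_factory=dict)  # annotations added by later stages


# Toy stand-in for dictionary 2 (hypothetical entries).
DICTIONARY = {
    "彼": "noun", "は": "binding-particle", "が": "case-particle",
    "話し": "verb", "ながら": "conjunctive-particle",
    "帰っ": "verb", "た": "auxiliary",
}

# Toy stand-in for holding means 7 (hypothetical content).
SHARING_LEXICON = {"ながら": "nominative shared with main clause"}


def morphological_analysis(sentence):
    """Means 3 (sketch): greedy longest-match lookup against DICTIONARY."""
    result, i = [], 0
    while i < len(sentence):
        for j in range(len(sentence), i, -1):
            if sentence[i:j] in DICTIONARY:
                result.append(Morpheme(sentence[i:j], DICTIONARY[sentence[i:j]]))
                i = j
                break
        else:
            raise ValueError(f"unknown word at position {i}")
    return result


def estimate_sharing(morphemes):
    """Means 8 (sketch): annotate morphemes that appear in SHARING_LEXICON."""
    for m in morphemes:
        if m.surface in SHARING_LEXICON:
            m.notes["sharing"] = SHARING_LEXICON[m.surface]
    return morphemes


def analyze(sentence):
    """Means 1 -> 3 -> 8; means 4 and 5 would consume this annotated output."""
    text = sentence.rstrip("。")
    return estimate_sharing(morphological_analysis(text))
```

Calling `analyze("彼は話しながら帰った。")` returns six morphemes, of which only "ながら" carries a `sharing` note. The point of the design is that the annotation travels with the morphological analysis result, so later stages can use it without repeating the lookup.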

[0021] FIG. 2 is a block diagram showing an embodiment of the invention as set forth in claim 2.

[0022] Context analysis means 9 here, consisting of context format holding means 10, context structure matching means 11, and context structure estimation means 12, is the same as the context analysis means 9, consisting of context format holding means 10, context structure matching means 11, and context structure estimation means 12, described in the prior-art context analysis method (Japanese Patent Application No. 61-216510), one embodiment of which is shown in FIG. 8.

[0023] Context format holding means 10 holds information, collected and described in advance, about the vocabulary that defines intra-sentence context structure. One example is information that classifies the connective forms between predicates (continuative suspended forms, conjunctive particles, nouns functioning as conjunctive particles, and so on) into levels, either according to what elements the clause carrying each connective form can contain or according to the relative inclusion relations between clauses. FIG. 9 shows an example of a classification into levels based on what elements the clause carrying each connective form can contain. Using this level information, context structure estimation means 12 can estimate the intra-sentence context structure, that is, the dependency structure between predicates, prior to syntactic and semantic analysis.

[0024] In this example of the invention, topic/nominative sharing estimation means 8 estimates the presence of topic or nominative sharing using, in addition to the vocabulary information held in topic/nominative sharing vocabulary information holding means 7, the context structure that context structure estimation means 12 has estimated from the vocabulary information described in context format holding means 10, such as the level classification of connective forms. Information relating the context structure used by topic/nominative sharing estimation means 8 to topic and nominative sharing is held in advance in topic/nominative sharing vocabulary information holding means 7. As an example, FIG. 3 shows part of the information relating the connective forms between predicates to the conditions under which a topic or nominative is shared between a main clause and a subordinate clause.

[0025] The operation of this embodiment is described below using concrete sentence analysis examples.

[0026] 1) 彼は話しながら帰った。 (Kare wa hanashinagara kaetta. "He went home while talking", with "he" marked as topic by "は".)

[0027] 2) 彼が話しながら帰った。 (Kare ga hanashinagara kaetta. "He went home while talking", with "he" marked as nominative by "が".)

[0028] When sentence 1) or 2) is read by input reading means 1, morphological analysis means 3 performs morphological analysis. The conjunctive particle "ながら" (nagara) can express either simultaneous action or concession: attached to an action predicate it expresses simultaneous action, and attached to a stative predicate it expresses concession. Since "話す" (hanasu, "talk") is an action predicate, "ながら" is determined at this morphological analysis stage to express simultaneous action. Context analysis means 9 then performs context analysis. First, using the information exemplified in FIG. 9 and held in context format holding means 10, context structure matching means 11 determines that the level of the simultaneous-action conjunctive particle "ながら" is 5. Context structure estimation means 12 attaches this information to the morphological analysis result for "ながら", and this becomes the output of context analysis means 9. Next, the vocabulary information exemplified in FIG. 3 and held in topic/nominative sharing vocabulary information holding means 7 shows that a clause at this level can have neither its own "は" nor its own "が"; that is, the nominative of the main clause and that of the subordinate clause are always shared. Topic/nominative sharing estimation means 8 attaches this information to the context analysis result and outputs it. Using this information, syntactic analysis means 4 and semantic analysis means 5 can readily determine that:

・"彼は" or "彼が" attaches both to "話しながら" ("while talking") and to "帰った" ("went home");
・the nominative of "話しながら" and the nominative of "帰った" are both "彼" ("he").

This estimate also reduces the ambiguity of syntactic and semantic analysis in advance, improving the accuracy and speed of the analysis.
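The chain of lookups in this example (predicate type, then the sense of "ながら", then its level in a FIG. 9-style table, then the sharing rule in a FIG. 3-style table) can be written as plain table lookups. The sketch below is illustrative only. The level 5 for simultaneous-action "ながら" and the rule for that level are taken from the text; the table layout and every name are assumptions, and the level of the concessive sense is not stated in the text, so it is deliberately left out.

```python
# Sense of "ながら" by the type of predicate it attaches to (as described in the text).
NAGARA_SENSE = {"action": "simultaneous", "stative": "concessive"}

# Excerpt of a FIG. 9-style table: (connective, sense) -> level.
# Only the entry stated in the text is filled in.
CONNECTIVE_LEVEL = {("ながら", "simultaneous"): 5}

# Excerpt of a FIG. 3-style table: level -> what the clause may contain.
SHARING_RULE = {5: {"own_wa": False, "own_ga": False}}


def nominative_shared(connective, predicate_type):
    """Return True if the subordinate clause must share the main clause's
    nominative, False if it need not, and None if the tables have no entry."""
    sense = NAGARA_SENSE.get(predicate_type) if connective == "ながら" else None
    level = CONNECTIVE_LEVEL.get((connective, sense))
    rule = SHARING_RULE.get(level)
    if rule is None:
        return None
    # A clause that can hold neither its own "は" nor its own "が"
    # has to take its nominative from the main clause.
    return not rule["own_wa"] and not rule["own_ga"]
```

For sentences 1) and 2), `nominative_shared("ながら", "action")` yields `True`. Returning `None` for missing entries keeps "no information" distinct from "not shared", which matters when the estimate is only a hint for the later analysis stages.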

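[0029] 3) 彼は酔っていると言った。 (Kare wa yotte iru to itta. Word for word: "he [topic] / is drunk / [quotative] / said"; who is drunk and who spoke are ambiguous.)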

[0030] When sentence 3) is read by input reading means 1, morphological analysis means 3 performs morphological analysis and passes the result to context analysis means 9. First, using the information exemplified in FIG. 9 and held in context format holding means 10, context structure matching means 11 determines that the level of the quotative "と" (to) is 0. Context structure estimation means 12 attaches this information to the morphological analysis result for "と", and this becomes the output of context analysis means 9. Next, the vocabulary information exemplified in FIG. 3 and held in topic/nominative sharing vocabulary information holding means 7 shows that the nominative of a clause at this level is independent of the topic and nominative of the main clause. Topic/nominative sharing estimation means 8 attaches this information to the context analysis result and outputs it. Using this information, syntactic analysis means 4 and semantic analysis means 5 can determine the following:

・"彼は" may attach either to "酔っている" ("is drunk") or to "言った" ("said").

[0031] ・When "彼は" attaches to "言った", who is drunk is entirely independent of "彼".

[0032] ・Sentence 3) therefore has three interpretations, and the choice among them cannot be made without other information.

Indeed, given the contexts in 4), 6), and 8), the portion corresponding to 3) takes the meaning of 5), 7), and 9), respectively.

[0033] 4) 彼はどうしたのかと私が彼女に尋ねたところ、彼は酔っていると言った。 (When I asked her what was the matter with him, (she) said he is drunk.)

[0034] 5) 「彼は酔っている」と（彼女は）言った。 ((She) said, "He is drunk.")

[0035] 6) どうしたのかと私が彼に尋ねたところ、彼は酔っていると言った。 (When I asked him what was the matter, he said (he) is drunk.)

[0036] 7) 彼は「（自分は）酔っている」と言った。 (He said, "(I) am drunk.")

[0037] 8) 彼女はどうしたのかと私が彼に尋ねたところ、彼は酔っていると言った。 (When I asked him what was the matter with her, he said (she) is drunk.)

[0038] 9) 彼は「（彼女は）酔っている」と言った。 (He said, "(She) is drunk.")
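The three readings of sentence 3), realized as 5), 7), and 9), follow mechanically once the quoted clause's nominative is treated as independent of the main clause. As a rough illustration, they can be enumerated as pairs of "who said" and "who is drunk". The `Reading` type and the placeholder strings below are assumptions made for this sketch and do not appear in the patent.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    main_nominative: str    # who "said"
    quoted_nominative: str  # who "is drunk"


def readings(topic, unknown="(unspecified)", other="(another person)"):
    """Readings of 'TOPIC は ... と MAIN-PREDICATE' when the quoted clause's
    nominative is independent of the main clause's topic and nominative."""
    return [
        # "TOPIC は" attaches to the quoted predicate; the sayer is unexpressed. Cf. 5).
        Reading(main_nominative=unknown, quoted_nominative=topic),
        # "TOPIC は" attaches to the main predicate; the omitted quoted
        # nominative happens to be the same person. Cf. 7).
        Reading(main_nominative=topic, quoted_nominative=topic),
        # "TOPIC は" attaches to the main predicate; the omitted quoted
        # nominative is someone else. Cf. 9).
        Reading(main_nominative=topic, quoted_nominative=other),
    ]
```

`readings("彼")` returns all three candidates. Nothing in the sentence itself ranks them, which is why the text says the choice needs outside information such as the contexts in 4), 6), and 8).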

[0039] FIG. 4 is a block diagram showing an embodiment of the invention as set forth in claim 3.

[0040] Here, nominative person-restriction vocabulary information holding means 13 holds information about vocabulary whose nominative is subject to a person restriction, such as emotion predicates, as illustrated in FIG. 5 for the emotion adjective "痛い" (itai, "painful"). This information is explained first.

[0041] A sentence is always written from some psychological viewpoint of the speaker. Emotion predicates, along with giving-and-receiving expressions and the like, are among the expressions closely tied to this viewpoint. They include emotion verbs such as "憎む" (hate), "恐れる" (fear), and "おびえる" (be frightened), and emotion adjectives such as "暑い" (hot), "欲しい" (want), and "寂しい" (lonely) (Takashi Masuoka and Yukinori Takubo: "Basic Japanese Grammar", Kurosio Publishers (1989)). Emotion adjectives in particular are known to "require that the viewpoint be placed on the subject" (Masatake Muraki: "Viewpoint and Semantic Structure", Lectures on Contemporary Language 1, "The Basic Structure of Japanese", Sanseido (1983)); that is, a person restriction on the nominative exists, and principles such as "with the interrogative 'か' (ka), only the second person can be used" have been proposed (Akira Yanabu: "Comparative Japanese Studies", Japan Translator Training Center (1979)). FIG. 5 summarizes, on this basis, the person restrictions on the nominative of emotion adjectives, taking "痛い" as an example. Expressions marked ○ can be used; expressions marked × cannot. Note *1: usable in a figurative sense, as in "彼は（私にとって）頭が痛い" ("he is a headache (to me)"). Note *2: the form "彼は痛がっている" ("he appears to be in pain") is possible.

[0042] Next, topic/nominative sharing vocabulary information holding means 7 holds information about topic and nominative sharing when such an emotion predicate appears in a quoted clause. In the embodiment of claim 2 above, sentence 3) was used to show that the nominative of a quoted clause is independent of the topic and nominative of the main clause. Japanese verbs that take the quotative "と" (to) fall into a "言う" (say) class ("語る", "話す", "答える", etc.) and a "思う" (think) class ("考える", "感じる", "疑う", etc.) (Hideo Teramura: "Japanese Grammar (Vol. 2)", National Language Research Institute (1981)). A characteristic of Japanese is that "what the subject of '言う' uttered appears, as uttered, in the '... と' clause" (Hideo Teramura: "Syntax and Meaning of Japanese II", Kurosio Publishers (1984)). That is, quotation behaves like direct speech even when no quotation brackets (「」) are present, and there is no phenomenon such as the agreement of tense between main clause and quoted clause found in English and other languages. The nominative person restrictions listed in FIG. 5 therefore apply unchanged to emotion predicates inside quoted clauses. By exploiting this restriction, topic/nominative sharing estimation means 8 can estimate topic sharing between a main clause and a quoted clause, which would otherwise be independent. The estimation procedure is explained below with a concrete example.

[0043] 10) 彼は寒いと言った。 (Kare wa samui to itta. Word for word: "he [topic] / is cold / [quotative] / said".)

[0044] Analyzing 10) in the same way as sentence 3) yields the following three interpretations, corresponding to 5), 7), and 9).

[0045] 11) 「彼は寒い。」と（誰かが）言った。 ((Someone) said, "He is cold.")

[0046] 12) 彼は「（私は）寒い。」と言った。 (He said, "(I) am cold.")

[0047] 13) 彼は「（誰かが）寒い。」と言った。 (He said, "(Someone) is cold.")

According to FIG. 5, however, "the nominative of '寒い' (samui, 'cold') is restricted to the first person". Interpretations 11) and 13), in which the directly uttered content would be ungrammatical, are therefore impossible, and the correct interpretation is 12). The same holds for 14).

[0048] 14) 彼が寒いと言った。 (Kare ga samui to itta; the same sentence as 10) with "が" in place of "は".)

It follows that "when a predicate whose nominative is restricted to the first person appears in a quoted clause, the nominative is always shared between the main clause and the quoted clause". Information of this kind is held in topic/nominative sharing vocabulary information holding means 7, and topic/nominative sharing estimation means 8 uses it to estimate topic sharing between the main clause and the quoted clause.

[0049] The operation of this embodiment is described concretely below using the above example.

[0050] When sentence 10) is read by input reading means 1, morphological analysis means 3 performs morphological analysis. Topic/nominative sharing estimation means 8 then estimates topic and nominative sharing. First, the information on the nominative restriction of "寒い", of the kind shown in FIG. 5 and held in nominative person-restriction vocabulary information holding means 13, yields the information that "the nominative of '寒い' is restricted to the first person". Next, from the information held in topic/nominative sharing vocabulary information holding means 7 about topic and nominative sharing when such an emotion predicate appears in a quoted clause, it is estimated that the nominative is shared here between the main clause and the quoted clause. This estimate is attached to the morphological analysis result and output, so that syntactic analysis means 4 and semantic analysis means 5 obtain only the correct interpretation 12) from among 11), 12), and 13). Excluding interpretations 11) and 13) in advance also improves the accuracy and speed of the analysis.
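The pruning step just described can be pictured as a filter over candidate readings: work out the grammatical person of the quoted clause's nominative from the quoted speaker's point of view, then keep only the readings that the predicate's person restriction allows. The sketch below is a hedged illustration. The restriction of "寒い" to the first person comes from the text; the table layout, the candidate encoding, and the function names are assumptions.

```python
# FIG. 5-style person restriction (hypothetical excerpt): predicate -> allowed persons.
PERSON_RESTRICTION = {"寒い": {1}}  # first person only, as stated in the text

# Candidate readings of 10) 彼は寒いと言った: (label, speaker, experiencer).
CANDIDATES = [
    ("11", "someone", "彼"),        # (someone) said, "He is cold."
    ("12", "彼", "彼"),             # he said, "(I) am cold."
    ("13", "彼", "someone else"),   # he said, "(someone) is cold."
]


def person_in_quote(speaker, experiencer):
    """Person of the experiencer inside the direct quotation: first person
    if the experiencer is the speaker, otherwise treated as non-first (3)."""
    return 1 if experiencer == speaker else 3


def admissible(predicate, candidates):
    """Keep only the readings allowed by the predicate's person restriction.
    Predicates without an entry impose no restriction."""
    allowed = PERSON_RESTRICTION.get(predicate)
    if allowed is None:
        return list(candidates)
    return [c for c in candidates if person_in_quote(c[1], c[2]) in allowed]
```

With these tables, `admissible("寒い", CANDIDATES)` keeps only reading 12), while a predicate with no entry, such as "酔っている", keeps all three, matching the treatment of sentence 3).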

[0051] FIG. 6 is a block diagram showing an embodiment of the invention as set forth in claim 4.

[0052] Here, complement nominative restriction vocabulary information holding means 14 holds information about vocabulary that controls the nominative of a complement clause, such as words that express a strong declaration of the speaker's will and words by which the speaker requests an action of another person; examples are shown in FIG. 7. With this information, topic/nominative sharing estimation means 8 can estimate the presence of topic or nominative sharing.

[0053] The operation of this embodiment is described below using concrete sentence analysis examples.

[0054] 15) 患者達は国に補償を求めることを決めた。 (The patients decided to seek compensation from the state.)

[0055] 16) 患者達は国に補償することを求めた。 (The patients demanded that the state pay compensation.)

[0056] When sentence 15) is read by input reading means 1, morphological analysis means 3 performs morphological analysis. The vocabulary information exemplified in FIG. 7 and held in complement nominative restriction vocabulary information holding means 14 shows that "with '〜することを決める' ('decide to do ...'), the nominative of the complement clause coincides with that of the main clause". Topic/nominative sharing estimation means 8 attaches this information to the morphological analysis result and outputs it. Using this information, syntactic analysis means 4 and semantic analysis means 5 can determine that:

・the nominative of "求める" ("seek") and the nominative of "決めた" ("decided") are both "患者達" ("the patients").

[0057] For 16), on the other hand, the vocabulary information exemplified in FIG. 7 and held in complement nominative restriction vocabulary information holding means 14 shows that "with '〜に〜することを求める' ('demand of ... that ... do ...'), the nominative of the complement clause is expressed as the ni-case element of the main clause". Using this information, syntactic analysis means 4 and semantic analysis means 5 can determine that:

・the nominative of "求めた" ("demanded") is "患者達" ("the patients") and the nominative of "補償する" ("compensate") is "国" ("the state").
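The two patterns contrasted in 15) and 16) amount to a small control lexicon: each main verb names the main-clause slot that supplies the nominative of its complement clause. The sketch below is a hedged illustration. The two entries follow the rules quoted in the text, while the `CONTROL` table, the case labels `"ga"` and `"ni"`, and the function name are assumptions made for this example.

```python
# FIG. 7-style control lexicon (hypothetical excerpt; only the two
# patterns discussed in the text).
CONTROL = {
    "決める": "main_nominative",  # complement nominative = main-clause nominative
    "求める": "main_ni",          # complement nominative = main-clause ni-case element
}


def complement_nominative(main_verb, main_args):
    """Return the filler of the complement clause's nominative.

    main_args maps case labels of the main clause ("ga", "ni", ...) to
    their fillers; a topic marked by "は" is assumed to have already been
    assigned to the "ga" slot."""
    source = CONTROL.get(main_verb)
    if source == "main_nominative":
        return main_args.get("ga")
    if source == "main_ni":
        return main_args.get("ni")
    return None  # no control information: leave the nominative undecided


# 15) 患者達は [国に補償を求めること] を決めた
who_seeks = complement_nominative("決める", {"ga": "患者達"})
# 16) 患者達は国に [補償すること] を求めた
who_compensates = complement_nominative("求める", {"ga": "患者達", "ni": "国"})
```

Here `who_seeks` is "患者達" and `who_compensates` is "国", matching the determinations in [0056] and [0057]. Verbs absent from the table return `None`, so the later stages are not handed a guess.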

【００５８】なお、請求項２に記載した文脈形式保持手
段１０・文脈構造照合手段１１・文脈構造推定手段１２
からなる文脈解析手段９、請求項３に記載した主格人称
制限語彙情報保持手段１３、請求項４に記載した補文主
格制限語彙情報保持手段１４はすべてお互いに独立であ
り、自由に組み合わせて用いることができる。また組み
合わせて用いることにより、主題・主格の共有認定をよ
り正確に行うことができ、また、日本語文の解析の精度
及び速度をさらに向上させることができる。The context analysis means 9 described in claim 2, which consists of the context format holding means 10, the context structure matching means 11, and the context structure estimating means 12; the nominative person restriction vocabulary information holding means 13 described in claim 3; and the complement nominative restriction vocabulary information holding means 14 described in claim 4 are all independent of one another and can be freely combined. Using them in combination makes it possible to determine subject/nominative sharing more accurately and to further improve the accuracy and speed of Japanese sentence analysis.

【００５９】さらに、ここでは実施例としていずれも主
題・主格の共有推定を形態素解析の直後に行う例を示し
たが、これは形態素解析を行った後であれば、任意の段
階で行うことができる。Further, although each of the embodiments described here performs the subject/nominative sharing estimation immediately after morphological analysis, the estimation can be performed at any stage after morphological analysis.

【００６０】[0060]

【発明の効果】以上説明した通り、本発明によれば、日
本語の複文・重文を解析する際に、主題・主格の共有認
定をより正確に行うことができる。また、構文解析・意
味解析に先立って主題・主格の共有を推定することによ
り構文解析・意味解析の曖昧性を減じて、日本語文の解
析の精度及び速度を向上させることができる。EFFECTS OF THE INVENTION As described above, according to the present invention, subject/nominative sharing can be determined more accurately when analyzing Japanese complex and compound sentences. Further, by estimating subject/nominative sharing prior to syntactic analysis and semantic analysis, the ambiguity of syntactic and semantic analysis is reduced, and the accuracy and speed of Japanese sentence analysis can be improved.

[Brief description of the drawings]

【図１】請求項１に記載した本発明の一実施例を示すブ
ロック図である。FIG. 1 is a block diagram showing one embodiment of the present invention described in claim 1;

【図２】請求項２に記載した本発明の一実施例を示すブ
ロック図である。FIG. 2 is a block diagram showing one embodiment of the present invention described in claim 2;

【図３】請求項２に記載した本発明が主題・主格共有語
彙情報保持手段８において保持している、用言間の接続
形と主文・従属節間の主題・主格の共有条件との関連を
示す情報の一部を、表の形で示したものである。FIG. 3 is a table showing part of the information, held in the subject/nominative shared vocabulary information holding means 8 of the invention described in claim 2, that relates the connective forms between predicates to the conditions for subject/nominative sharing between a main clause and a subordinate clause.

【図４】請求項３に記載した本発明の一実施例を示すブ
ロック図である。FIG. 4 is a block diagram showing one embodiment of the present invention described in claim 3;

【図５】請求項３に記載した本発明が主題・主格共有語
彙情報保持手段８において保持している、感情述語など
主格に人称制限がある語彙に関する情報を、“痛い”と
いう感情形容詞を例に取って示した表である。FIG. 5 is a table showing, with the emotion adjective "痛い" (painful) as an example, the information held in the subject/nominative shared vocabulary information holding means 8 of the invention described in claim 3 on vocabulary whose nominative is subject to person restrictions, such as emotion predicates.

【図６】請求項４に記載した本発明の一実施例を示すブ
ロック図である。FIG. 6 is a block diagram showing an embodiment of the present invention described in claim 4.

【図７】請求項４に記載した本発明が主題・主格共有語
彙情報保持手段８において保持している、話者の強い意
志表明を表す語や話者が他人の行為を要求する語など補
文の主格をコントロールする語彙に関する情報を、表の
形で示したものである。FIG. 7 is a table showing the information held in the subject/nominative shared vocabulary information holding means 8 of the invention described in claim 4 on vocabulary that controls the nominative of a complement clause, such as words expressing a strong intention of the speaker and words with which the speaker requests an action by another person.

【図８】従来技術である語彙文脈文法に基づいた文脈解
析方式（特願昭６１−２１６５１０）の一実施例を示す
ブロック図である。FIG. 8 is a block diagram showing one embodiment of a prior-art context analysis method (Japanese Patent Application No. 61-216510) based on the lexical discourse grammar.

【図９】文脈形式保持手段１１に保持されている、各接
続形が付随する節において節の内部にどのような要素を
含み得るかを基準として用言間の接続形をレベル分けし
た情報をまとめた表である。FIG. 9 is a table summarizing the information held in the context format holding means 11, in which the connective forms between predicates are classified into levels according to what elements the clause accompanying each connective form can contain.

[Explanation of symbols]

1 入力読み込み手段 (Input reading means)
2 辞書 (Dictionary)
3 形態素解析手段 (Morphological analysis means)
4 構文解析手段 (Syntactic analysis means)
5 意味解析手段 (Semantic analysis means)
6 解析結果 (Analysis result)
7 主題・主格共有語彙情報保持手段 (Subject/nominative shared vocabulary information holding means)
8 主題・主格共有推定手段 (Subject/nominative share estimating means)
9 文脈解析手段 (Context analysis means)
10 文脈形式保持手段 (Context format holding means)
11 文脈構造照合手段 (Context structure matching means)
12 文脈構造推定手段 (Context structure estimating means)
13 主格人称制限語彙情報保持手段 (Nominative person restriction vocabulary information holding means)
14 補文主格制限語彙情報保持手段 (Complement nominative restriction vocabulary information holding means)

Claims

(57) [Claims]

1. A Japanese sentence analysis method for use in a Japanese processing system comprising input reading means for reading an input Japanese sentence, morphological analysis means for looking up the input sentence in a dictionary and analyzing the input sentence using the dictionary information obtained by the lookup, syntactic analysis means, semantic analysis means, and a memory, the method being characterized by comprising: subject/nominative shared vocabulary information holding means in which information on vocabulary that governs the sharing, among a plurality of predicates, of the subject marked by the particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese is collected in advance and held in the memory; and subject/nominative share estimating means for estimating whether subject/nominative sharing exists, using the vocabulary information held in the subject/nominative shared vocabulary information holding means; wherein the existence of subject/nominative sharing is estimated prior to syntactic analysis and semantic analysis.

2. A Japanese sentence analysis method for use in a Japanese processing system comprising input reading means for reading an input Japanese sentence, morphological analysis means for looking up the input sentence in a dictionary and analyzing the input sentence using the dictionary information obtained by the lookup, syntactic analysis means, semantic analysis means, and a memory, the method being characterized by comprising: context format holding means in which information on vocabulary that defines the intra-sentence context structure is collected in advance, the relationships among those expressions are formalized, and the result is held in the memory; context structure matching means for detecting context structure candidates in the input sentence, prior to syntactic analysis and semantic analysis, by comparing the sequence of dictionary contents obtained as a result of morphological analysis of the input sentence with the context information held in the context format holding means; context structure estimating means for estimating the most probable context structure candidate from among the plurality of context structure candidates in the input sentence obtained by the context structure matching means; subject/nominative shared vocabulary information holding means in which information on vocabulary that governs the sharing, among a plurality of predicates, of the subject marked by the particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese is collected in advance and held in the memory; and subject/nominative share estimating means for estimating whether subject/nominative sharing exists, using the vocabulary information held in the subject/nominative shared vocabulary information holding means and the context structure estimated by the context structure estimating means; wherein the existence of subject/nominative sharing is estimated prior to syntactic analysis and semantic analysis.

3. A Japanese sentence analysis method for use in a Japanese processing system comprising input reading means for reading an input Japanese sentence, morphological analysis means for looking up the input sentence in a dictionary and analyzing the input sentence using the dictionary information obtained by the lookup, syntactic analysis means, semantic analysis means, and a memory, the method being characterized by comprising: subject/nominative shared vocabulary information holding means in which information on vocabulary that governs the sharing, among a plurality of predicates, of the subject marked by the particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese is collected in advance and held in the memory; nominative person restriction vocabulary information holding means that holds in the memory information on vocabulary whose nominative is subject to person restrictions, such as emotion predicates; and subject/nominative share estimating means for estimating, when vocabulary whose nominative is subject to a person restriction appears in a quotation clause, whether subject/nominative sharing exists, using the vocabulary information held in the subject/nominative shared vocabulary information holding means and the vocabulary information held in the nominative person restriction vocabulary information holding means; the method being characterized by estimating the existence of subject/nominative sharing.

4. A Japanese sentence analysis method for use in a Japanese processing system comprising input reading means for reading an input Japanese sentence, morphological analysis means for looking up the input sentence in a dictionary and analyzing the input sentence using the dictionary information obtained by the lookup, syntactic analysis means, semantic analysis means, and a memory, the method being characterized by comprising: subject/nominative shared vocabulary information holding means in which information on vocabulary that governs the sharing, among a plurality of predicates, of the subject marked by the particle "は" (wa) or the nominative marked by the case particle "が" (ga) in Japanese is collected in advance and held in the memory; complement nominative restriction vocabulary information holding means that holds in the memory information on vocabulary that controls the nominative of a complement clause, such as words expressing a strong intention of the speaker and words with which the speaker requests an action by another person; and subject/nominative share estimating means for estimating whether subject/nominative sharing exists, using the vocabulary information held in the subject/nominative shared vocabulary information holding means and the vocabulary information held in the complement nominative restriction vocabulary information holding means; wherein the existence of subject/nominative sharing is estimated prior to syntactic analysis and semantic analysis.